summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers
Commit message (Collapse)AuthorAgeFilesLines
* i965: Advertise our vertex shader texture units.Kenneth Graunke2011-12-191-1/+1
| | | | | | | | | | | | | Previously, we advertised 0 VS texture units. Now that we have proper support for using the sampling engine in the VS, we can advertise 16, which is conveniently the number required for OpenGL 3.0. v2: Enable on Gen4. I hacked up my tests to not use flat ivec varyings and they pass. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Implement EXT_texture_swizzle support for VS texturing.Kenneth Graunke2011-12-192-1/+52
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Add texture related data to brw_vs_prog_key.Kenneth Graunke2011-12-192-0/+11
| | | | | | | | Now that this is all factored out, it's trivial to do. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Only set brw_wm_prog_key data for samplers used by the WM.Kenneth Graunke2011-12-191-1/+3
| | | | | | | | | This should avoid state-dependent FS recompiles when samplers that are only used by the VS change. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Factor out texturing related data from brw_wm_prog_key.Kenneth Graunke2011-12-197-115/+168
| | | | | | | | | | The idea is to reuse this for the VS and (in the future) GS as well. v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> [v1] Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Add support for texel offsets.Kenneth Graunke2011-12-193-2/+23
| | | | | | | | | | | The visit() half computes the values to put in the header based on the IR and simply stuffs that in the vec4_instruction; the emit() half uses this to set up the message header. This works out well since emit() can use brw_reg directly and access individual DWords without kludgery. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Factor out texture offset bitfield computation.Kenneth Graunke2011-12-193-18/+26
| | | | | | | | | We'll want to reuse this for the VS, and it's complex enough that I'd rather not cut and paste it. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Implement vec4_visitor::visit(ir_texture *).Kenneth Graunke2011-12-191-7/+120
| | | | | | | | | | | | | | | This translates the GLSL compiler's IR into vec4_instruction IR, generating code to load coordinates, LOD info, shadow comparitors, and so on into the appropriate message registers. It turns out that the SIMD4x2 parameters are identical on Gen 5-7, and the Gen4 code is similar enough that, unlike in the FS, it's easy enough to support all generations in a single function. v2: Load zeros for missing coordinates (fixing vs-texelFetch-sampler1D and 2D on G45), and fix G45 message length for shadow comparisons. Signed-off-by: Kenneth Graunke <[email protected]>
* i965/vs: Implement vec4_visitor::generate_tex().Kenneth Graunke2011-12-192-0/+110
| | | | | | | | | | This is the part that takes the vec4_instruction IR and turns it into actual Gen ISA. v2: Add Gen4 messages, don't retype m0 to UW. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Add missing SIMD4x2 sample_l_c message #defines.Kenneth Graunke2011-12-191-0/+1
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't minify depth when setting up cube map miptrees on Gen4.Kenneth Graunke2011-12-191-1/+2
| | | | | | | | | | | | | | | | | | | | Prior to Ironlake, cube maps were stored as 3D textures. In recent refactoring, we removed a separate "layers" parameter in favor of using depth. Unfortunately, depth was getting minified, which is only correct for actual 3D textures. Fixes piglit tests: - bugs/crash-cubemap-order - fbo/fbo-cubemap - texturing/cubemap Also changes texturing/cubemap npot from abort to fail. This hasn't seen a full test run since Piglit on Mesa master hangs GM45 a lot. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Add support for GL_ARB_depth_buffer_float under 3.0 override.Eric Anholt2011-12-194-1/+20
| | | | | | | | This is not exposed generally yet because some of the swrast paths hit in piglit (drawpixels, copypixels, blit) aren't yet converted to MapRenderbuffer. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add separate stencil/HiZ setup for MESA_FORMAT_Z32_FLOAT_X24S8.Eric Anholt2011-12-193-15/+20
| | | | | | | | | This is a little more unusual than the separate MESA_FORMAT_S8_Z24 support, because in addition to storing the real stencil data in a MESA_FORMAT_S8 miptree, we also make the Z miptree be MESA_FORMAT_Z32_FLOAT instead of the requested format. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use the miptree format for texture surface format choice.Eric Anholt2011-12-192-2/+2
| | | | | | | | With separate stencil GL_DEPTH32F_STENCIL8, the miptree will have a really different format (MESA_FORMAT_Z32_FLOAT) from the teximage (MESA_FORMAT_Z32_FLOAT_X24S8). Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for mapping Z32_FLOAT_X24S8 fake packed depth/stencil.Eric Anholt2011-12-191-5/+17
| | | | | | | | The format handling here is tricky, because we're not actually generating a Z32_FLOAT_X24S8 miptree, so we're guessing the format that GL wants based on seeing Z32_FLOAT with a separate stencil. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop creating the wrapped depth irb.Eric Anholt2011-12-192-111/+8
| | | | | | | All the operations were just trying to get at irb->wrapped_depth->mt, which is the same as irb->mt now. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Properly demote the depth mt format for fake packed depth/stencil.Eric Anholt2011-12-194-3/+19
| | | | | | | | | | | gen7 only supports the non-packed formats, even if you associate a real separate stencil buffer -- otherwise it's as if the depth test always fails. This requires a little bit of care in the match_texture_image case, since the miptree format no longer matches the texture image format. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Reuse intel_miptree_match_image().Eric Anholt2011-12-191-9/+6
| | | | | | | | This little bit of logic was duplicated, which isn't much, but I was going to need to duplicate a bit of additional logic in the next commit. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop creating the wrapped stencil irb.Eric Anholt2011-12-195-78/+67
| | | | | | | | | | | There were only two places it was really used at this point, which was in the batchbuffer emit of the separate stencil packets for gen6/7. Just write in the ->stencil_mt reference in those two places and ditch all this flailing around with allocation and refcounts. v2: Fix separate stencil on gen7. Reviewed-by: Kenneth Graunke <[email protected]>
* osmesa: fix RGB565 renderingAlex Galakhov2011-12-191-0/+4
| | | | Signed-off-by: Brian Paul <[email protected]>
* i965/vs: Add a new dst_reg constructor for file, number, type, and mask.Kenneth Graunke2011-12-181-0/+10
| | | | | | | | This will be especially useful for loading texturing parameters, where I need to (for example) reference m3.xz<D>. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Add vec4_instruction::is_tex() query.Kenneth Graunke2011-12-182-0/+11
| | | | | | | Copy and pasted from fs_inst::is_tex(), but without TXB. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename texturing ops from FS_OPCODE to SHADER_OPCODE, except TXB.Kenneth Graunke2011-12-185-46/+48
| | | | | | | | We'll be reusing most of these for the VS shortly. The one exception is TXB (texturing with LOD bias), which is explicitly forbidden in the VS. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Don't swizzle the results of textureSize().Kenneth Graunke2011-12-181-0/+3
| | | | | | | | | Fixes a regression since d2235b0f4681f75d562131d655a6d7b7033d2d8b, in my new textureSize sampler(1DArrayShadow|2DShadow|2DArrayShadow) piglit tests, though I'm not honestly sure how this ever worked. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meta: use _mesa_prepare_mipmap_level() in the mipmap generation codeBrian Paul2011-12-161-35/+12
| | | | | | | See previous commit for more information. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: implement DrawTransformFeedback from ARB_transform_feedback2Marek Olšák2011-12-153-7/+14
| | | | | | | | | | | | | | It's like DrawArrays, but the count is taken from a transform feedback object. This removes DrawTransformFeedback from dd_function_table and adds the same function to GLvertexformat (with the function parameters matching GL). The vbo_draw_func callback has a new parameter "struct gl_transform_feedback_object *tfb_vertcount". The rest of the code just validates states and forwards the transform feedback object into vbo_draw_func.
* i965: Drop separate stencil assertions in update_draw_buffer().Eric Anholt2011-12-141-16/+0
| | | | | | | The comment said they deserved to be in emit_depthbuffer, and at this point they were all there already. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Simplify and touch up the FBO completeness test.Eric Anholt2011-12-141-18/+21
| | | | | | | | | Now that we have miptrees for everything, we can more easily test for !has_separate_stencil completeness. Also, test for whether the stencil rb is the wrong kind of format for separate stencil, or if we are trying to do packed to different images of a single miptree. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Remove another renderbuffer allocation path.Eric Anholt2011-12-141-8/+4
| | | | | | | | | | Now there's the thing that CALLOCs and sets up window system vtable, and the thing that CALLOCs and sets up user renderbuffer vtable. The user renderbuffer vtable gets replaced later by intel_renderbuffer_update_wrapper for wrapped renderbuffers (things with name == ~0). Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Make the separate stencil RB storage path match texture more.Eric Anholt2011-12-141-76/+52
| | | | | | | There were too many things making intel_renderbuffer *s and tweaking their bits. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Move S8 width/height alignment to miptree creation.Eric Anholt2011-12-143-55/+22
| | | | | | | | We were doing it in the caller in the renderbuffer code, but it was missed in the separate stencil creation for textures. Apparently our testing was using renderbuffers or pre-aligned sizes. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Drop check for wrapped_depth in RB mapping.Eric Anholt2011-12-141-1/+1
| | | | | | | This used to be needed because irb->mt would be unset for fake packed depth/stencil, but no longer. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Fix uninitialized values in debug output for renderbuffer mapping.Eric Anholt2011-12-141-1/+1
|
* radeon: stop using _DepthBuffer, _StencilBuffer fieldsBrian Paul2011-12-132-9/+8
| | | | Reviewed-by: Eric Anholt <[email protected]>
* nouveau: stop using _DepthBuffer, _StencilBuffer fieldsBrian Paul2011-12-136-13/+14
| | | | Reviewed-by: Eric Anholt <[email protected]>
* mesa,intel: use _mesa_image_offset() for PBOsnobled2011-12-081-2/+3
| | | | | | | | This avoids forming invalid pointers needlessly, which even if never dereferenced is undefined behavior. It also makes _mesa_validate_pbo_access() more comprehensible. Reviewed-by: Brian Paul <[email protected]>
* mesa/drivers: use new swrast renderbuffer functionsBrian Paul2011-12-0812-62/+74
| | | | Reviewed-by: Eric Anholt <[email protected]>
* mesa: rewrite accum buffer supportBrian Paul2011-12-082-2/+3
| | | | | | | | | | | | | Implemented in terms of renderbuffer mapping/unmapping and format packing/unpacking functions. The swrast and state tracker code for implementing accumulation are unused and will be removed in the next commit. v2: don't use memcpy() in _mesa_clear_accum_buffer() v3: don't allocate MAX_WIDTH arrays, be more careful with mapping flags Reviewed-by: Eric Anholt <[email protected]>
* mesa: remove the ctx->Driver.IsTextureResident() hookBrian Paul2011-12-081-1/+0
| | | | | | | No driver implemented this and we always returned "True" for residence queries. Reviewed-by: Ian Romanick <[email protected]>
* mesa: remove TextureMemCpy driver hookBrian Paul2011-12-081-1/+0
| | | | There's probably no reason to use a special version of memcpy() anymore.
* i965 gen6: Implement pass-through GS for transform feedback.Paul Berry2011-12-076-46/+208
| | | | | | | | | | | | | | | | | | | | | | In Gen6, transform feedback is accomplished by having the geometry shader send vertex data to the data port using "Streamed Vertex Buffer Write" messages, while simultaneously passing vertices through to the rest of the graphics pipeline (if rendering is enabled). This patch adds a geometry shader program that simply passes vertices through to the rest of the graphics pipeline. The rest of transform feedback functionality will be added in future patches. To make the new geometry shader easier to test, I've added an environment variable "INTEL_FORCE_GS". If this environment variable is enabled, then the pass-through geometry shader will always be used, regardless of whether transform feedback is in effect. On my Sandy Bridge laptop, I'm able to enable INTEL_FORCE_GS with no Piglit regressions. Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Eric Anholt <[email protected]>
* i965: Clean up misleading defines for DWORD 2 of URB_WRITE header.Paul Berry2011-12-075-24/+59
| | | | | | | | | | | | | R02_PRIM_END and R02_PRIM_START don't actually refer to bits in DWORD 2 of R0 (as the name, and comments in the code, would seem to indicate). Actually they refer to bits in DWORD 2 of the header for URB_WRITE messages. This patch renames the defines to reflect what they actually mean. It also addes a define URB_WRITE_PRIM_TYPE_SHIFT, which previously was just hardcoded in .c files. Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gs: Clean up dodgy register re-use, at the cost of a few MOVs.Paul Berry2011-12-072-65/+111
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to this patch, in the Gen4 and Gen5 GS, we used GRF 0 (called "R0" in the code) as a staging area to prepare the message header for the FF_SYNC and URB_WRITE messages. This cleverly avoided an unnecessary MOV operation (since the initial value of GRF 0 contains data that needs to be included in the message header), but it made the code confusing, since GRF 0 could no longer be relied upon to contain its initial value once the GS started preparing its first message. This patch avoids confusion by using a separate register ("header") as the staging area, at the cost of one MOV instruction. Worse yet, prior to this patch, the GS would completely overwrite the contents of GRF 0 with the writeback data it received from a completed FF_SYNC or URB_WRITE message. It did this because DWORD 0 of the writeback data contains the new URB handle, and that neds to be included in DWORD 0 of the next URB_WRITE message header. However, that caused the rest of the message header to be corrupted either with undefined data or zeros. Astonishingly, this did not produce any known failures (probably by dumb luck). However, it seems really dodgy--corrupting FFTID in particular seems likely to cause GPU hangs. This patch avoids the corruption by storing the writeback data in a temporary register and then copying just DWORD 0 to the header for the next message. This costs one extra MOV instruction per message sent, except for the final message. Also, this patch moves the logic for overriding DWORD 2 of the header (which contains PrimType, PrimStart, PrimEnd, and some other data that we don't care about yet). This logic is now in the function brw_gs_overwrite_header_dw2() rather than in brw_gs_emit_vue(). This saves one MOV instruction in brw_gs_quads() and brw_gs_quad_strip(), and paves the way for the Gen6 GS, which will need more complex logic to override DWORD 2 of the header. Finally, the function brw_gs_alloc_regs() contained a benign bug: it neglected to increment the register counter when allocating space for the "temp" register. This turned out not to have any effect because the temp register wasn't used on Gen4 and Gen5, the only hardware models (so far) to require a GS program. Now, all the registers allocated by brw_gs_alloc_regs() are actually used, and properly accounted for. Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Allocate URB space for GSPaul Berry2011-12-073-12/+63
| | | | | | | | | | | | | | When the GS is not in use, the entire URB space is available for the VS. When the GS is in use, we split the URB space 50/50. The 50/50 split is probably not optimal--we'll probably want tune this for performance in a future patch. For example, in most situations, it's probably worth allocating more than 50% of the space to the VS, since VS space is used for vertex caching. But for now this is good enough. Based on previous work by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Set the maximum number of GS URB entries on Sandybridge.Kenneth Graunke2011-12-071-0/+2
| | | | | | | | | | | We never filled this in before because we didn't care. I'm skeptical these are correct; my sources indicate that both the VS and GS # of entries are 256 on both GT1 and GT2. I'm also loathe to change it and break stuff. Reviewed-by: Paul Berry <[email protected]>
* i965: Only convert if/else to conditional adds prior to Gen6.Paul Berry2011-12-071-2/+28
| | | | | | | | | | | | | | | | | | | | | | | | | Normally when outputting instructions in SPF (single program flow) mode, we convert IF and ELSE instructions to conditional ADD instructions applied to the IP register. On platforms prior to Gen6, flow control instructions cause an implied thread switch, so this is a significant savings. However, according to the SandyBridge PRM (Volume 4 part 2, p79): [Errata DevSNB{WA}] - When SPF is ON, IP may not be updated by non-flow control instructions. So we have to disable this optimization on Gen6. On later platforms, there is no significant benefit to converting flow control instructions to ADDs, so for the sake of consistency, this patch disables the optimization on later platforms too. The reason we never noticed this problem before is that so far we haven't needed to use SPF mode on Gen6. However, later patches in this series will introduce a Gen6 GS program which uses SPF mode. Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gs: Remove unnecessary mapping of key->primitive.Paul Berry2011-12-072-16/+7
| | | | | | | | | | | | | Previously, GS generation code contained a lookup table that mapped primitive types POLYGON, TRISTRIP, and TRIFAN to TRILIST, mapped LINESTRIP to LINELIST, and left all other primitives unchanged. This was silly, because we never generate a GS program for those primitive types anyhow. This patch removes the unnecessary lookup table. Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Set Ivybridge's is_array SURFACE_STATE bit.Kenneth Graunke2011-12-071-1/+2
| | | | | | | | | | | Fixes piglit tests fbo-array, fbo-depth-array, fbo-generatemipmap-array, and array-texture, as well as the array variants of my new textureSize and texelFetch tests. Not a candidate for 7.11 because EXT_texture_array wasn't supported. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* i965: Return BRW_DEPTHBUFFER_D32_FLOAT as the null-depthbuffer format.Kenneth Graunke2011-12-071-0/+3
| | | | | | | | | | | Fixes many crashes on Ivybridge due to upload_sf_state calling brw_depthbuffer_format without an actual depth buffer. This was a recent regression on master. +3992 piglits on Ivybridge. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* intel: Update comment about how depth/stencil miptrees are handled.Eric Anholt2011-12-071-6/+18
| | | | | | | This evolved over several commits, and I also wanted to document some new information about how we handle formats. Reviewed-by: Chad Versace <[email protected]>