path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* i965 gen6: Turn on transform feedback extension unconditionally. | Paul Berry | 2011-12-20 | 1 | -1/+1
  Previously, we only enabled transform feedback when MESA_GL_VERSION_OVERRIDE was 3.0 or greater, since transform feedback support was not completely finished, so it didn't make sense to advertise support for it unless absolutely necessary. Now that transform feedback is fully implemented on gen6, we can enable this extension unconditionally.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Implement transform feedback queries. | Paul Berry | 2011-12-20 | 3 | -0/+54
  This patch adds software-based PRIMITIVES_GENERATED and TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries that work by keeping track of the number of primitives that are sent down the pipeline, and adjusting as necessary to account for the way each primitive type is tessellated.

  In the long run we'll want to replace this with a hardware-based implementation, because the software approach won't work with geometry shaders or primitive restart. However, at the moment, we don't have the necessary kernel support to implement a hardware-based query (we would need the kernel to save GPU registers when context switching, so that drawing performed by another process doesn't get counted).

  Fixes Piglit tests EXT_transform_feedback/query-primitives_generated-* and EXT_transform_feedback/query-primitives-written-*.

  Reviewed-by: Kenneth Graunke <[email protected]>
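The software counting described in this message can be illustrated with a small sketch. It is not the driver's code: the enum and the function name are hypothetical stand-ins, and the per-mode rules assume that quads, quad strips and polygons are split into triangles before they are counted.

```c
#include <assert.h>

/* Hypothetical stand-ins for the GL primitive modes; not Mesa's own enums. */
enum prim_mode {
    PRIM_POINTS, PRIM_LINES, PRIM_LINE_LOOP, PRIM_LINE_STRIP,
    PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_TRIANGLE_FAN,
    PRIM_QUADS, PRIM_QUAD_STRIP, PRIM_POLYGON
};

/* Number of points, lines, or triangles produced by one draw call of
 * `count` vertices, assuming quads and polygons are split into triangles. */
unsigned count_tessellated_primitives(enum prim_mode mode, unsigned count)
{
    switch (mode) {
    case PRIM_POINTS:         return count;
    case PRIM_LINES:          return count / 2;
    case PRIM_LINE_LOOP:      return count >= 2 ? count : 0;
    case PRIM_LINE_STRIP:     return count >= 2 ? count - 1 : 0;
    case PRIM_TRIANGLES:      return count / 3;
    case PRIM_TRIANGLE_STRIP:
    case PRIM_TRIANGLE_FAN:
    case PRIM_POLYGON:        return count >= 3 ? count - 2 : 0;
    case PRIM_QUADS:          return (count / 4) * 2;
    case PRIM_QUAD_STRIP:     return count >= 4 ? ((count - 2) / 2) * 2 : 0;
    }
    assert(!"unknown primitive mode");
    return 0;
}
```

A query would then accumulate this value over every draw call issued while it is active. Geometry shaders and primitive restart change the number of primitives in ways that cannot be derived from the vertex count alone, which is the limitation the message points out.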
* i965: Convert if/else to switch statements in brw_queryobj.c | Paul Berry | 2011-12-20 | 1 | -6/+30
  Previously, i965 only supported two query types: GL_TIME_ELAPSED_EXT and GL_SAMPLES_PASSED_ARB, and it distinguished between the two using if/else statements that compared query->Base.Target to GL_TIME_ELAPSED_EXT. This patch changes the if/else statements to switch statements so that we can add more query types without having to have a chain of else-ifs.

  Reviewed-by: Kenneth Graunke <[email protected]>
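The shape of that refactoring can be sketched as follows. The enums and the function are hypothetical and only mirror the two query types named in the message; the real code switches on query->Base.Target.

```c
#include <assert.h>

/* Hypothetical query kinds, standing in for GL_TIME_ELAPSED_EXT and
 * GL_SAMPLES_PASSED_ARB. */
enum query_target { QUERY_TIME_ELAPSED, QUERY_SAMPLES_PASSED };

/* Hypothetical description of where each query gets its value from. */
enum counter_source { SOURCE_TIMESTAMP, SOURCE_OCCLUSION_COUNTER };

/* One case per query type: adding a new type means adding a new case
 * instead of extending a chain of else-ifs. */
enum counter_source counter_for_query(enum query_target target)
{
    switch (target) {
    case QUERY_TIME_ELAPSED:
        return SOURCE_TIMESTAMP;
    case QUERY_SAMPLES_PASSED:
        return SOURCE_OCCLUSION_COUNTER;
    }
    assert(!"unsupported query target");
    return SOURCE_TIMESTAMP;
}
```

With no default label, many compilers can warn when a newly added enumerator is not handled, which is one practical advantage of the switch form.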
* i965 gen6: Ensure correct transform feedback indices on new batch. | Paul Berry | 2011-12-20 | 5 | -8/+72
  We don't currently have kernel support for saving GPU registers on a context switch, so if multiple processes are performing transform feedback at the same time, their SVBI registers will interfere with each other. To avoid this situation, we keep a software shadow of the state of the SVBI 0 register (which is the only register we use), and re-upload it on every new batch.

  The function that updates the shadow state of SVBI 0 is called brw_update_primitive_count, since it will also be used to update the counters for the PRIMITIVES_GENERATED and TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN queries.

  Reviewed-by: Kenneth Graunke <[email protected]>
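A minimal model of the shadow-register idea, assuming a single register and ignoring how the value is actually uploaded to the hardware. The struct and function names are hypothetical and are not the driver's.

```c
#include <assert.h>

/* Hypothetical model of the one hardware register shared by all processes. */
struct fake_gpu {
    unsigned svbi0;
};

/* Per-context software shadow of what SVBI 0 should contain. */
struct xfb_context {
    unsigned svbi0_shadow;
};

/* Called whenever this context emits vertices to transform feedback. */
void xfb_record_vertices(struct xfb_context *ctx, unsigned num_vertices)
{
    assert(ctx != 0);
    ctx->svbi0_shadow += num_vertices;
}

/* Called at the start of every new batch: re-upload the shadow so that
 * whatever another process left in the register is overwritten. */
void xfb_new_batch(const struct xfb_context *ctx, struct fake_gpu *gpu)
{
    assert(ctx != 0 && gpu != 0);
    gpu->svbi0 = ctx->svbi0_shadow;
}
```

If another process changes the register between two batches, the next call to xfb_new_batch() restores this context's value before any of its drawing commands run.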
* mesa: Add a function to query whether a meta-op is in progress. | Paul Berry | 2011-12-20 | 2 | -0/+13
  This is needed by i965 to ensure that transform feedback counters are not incremented during meta-ops.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Implement rasterizer discard. | Paul Berry | 2011-12-20 | 3 | -0/+37
  This patch enables rasterizer discard functionality (a part of transform feedback) in Gen6, by generating an alternate GS program when rasterizer discard is active. Instead of forwarding vertices down the pipeline, the alternate GS program uses a URB Write message to deallocate the URB entry that was allocated by FF sync and terminate the thread.

  Note: parts of the Sandy Bridge PRM seem to imply that we could do this more efficiently, by clearing the GEN6_GS_RENDERING_ENABLE bit, and not allocating a URB entry at all. However, it's not clear how we are supposed to terminate the thread if we do that. Volume 2 part 1, section 4.5.4, says "GS threads must terminate by sending a URB_WRITE message with the EOT and Complete bits set.", and my experiments so far confirm that.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Implement bounds checking for transform feedback output. | Kenneth Graunke | 2011-12-20 | 4 | -0/+52
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Paul Berry <[email protected]>
* i965: Flush pipeline on EndTransformFeedback. | Paul Berry | 2011-12-20 | 3 | -0/+22
  A common use case for transform feedback is to perform one draw operation that writes transform feedback output to a buffer, followed by a second draw operation that consumes that buffer as vertex input. Since vertex input is consumed at an earlier pipeline stage than writing transform feedback output, we need to flush the pipeline to ensure that the transform feedback output is completely written before the data is consumed.

  In an ideal world, we would do some dependency tracking, so that we would only flush the pipeline if the next draw call was about to consume data generated by a previous draw call in the same batch. However, since we don't have that sort of dependency tracking infrastructure right now, we just unconditionally flush the buffer every time glEndTransformFeedback() is called. This will cause a performance hit compared to the ideal case (since we will sometimes flush the pipeline unnecessarily), but fortunately the performance hit will be confined to circumstances where transform feedback is in use.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965 gen6+: Make intel_batchbuffer_emit_mi_flush() actually flush. | Paul Berry | 2011-12-20 | 1 | -1/+2
  Prior to this patch, the function intel_batchbuffer_emit_mi_flush() was a bit of a misnomer. On Gen4+, when not using the blit engine, it didn't actually flush the pipeline--it simply generated a PIPE_CONTROL command with the necessary bits set to flush GPU caches. This was usually sufficient, since in most situations where intel_batchbuffer_emit_mi_flush() was called, all we really cared about was ensuring cache coherency.

  However, with the advent of OpenGL 3.0, there are two cases in which data output by one stage of the pipeline might be consumed, in a later draw operation, by an earlier stage of the pipeline:

  (a) When using textures in the vertex shader.
  (b) When drawing with a vertex buffer that was previously generated using transform feedback.

  This patch addresses case (a) by changing intel_batchbuffer_emit_mi_flush() so that on Gen6+, it sets the PIPE_CONTROL_CS_STALL bit (this forces the pipeline to actually flush). (Case (b) will be addressed by the next patch in the series).

  This is not an ideal solution--in a perfect world, the driver would have some buffer dependency tracking so that we would only have to flush the pipeline in the two cases above. Until that dependency tracking is implemented, however, it seems prudent to have intel_batchbuffer_emit_mi_flush() actually flush the pipeline, so that we get correct rendering, at the expense of a (hopefully small) performance hit.

  The change is only applied to Gen6+, since at the moment only Gen6+ supports the OpenGL 3.0 features that make a full pipeline flush necessary.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965 gen6: Turn on transform feedback extension. | Paul Berry | 2011-12-20 | 1 | -0/+3
  This patch advertises support for EXT_transform_feedback on Intel Gen6. Since transform feedback support is not completely finished yet, for now we only advertise support for it when MESA_GL_VERSION_OVERRIDE is 3.0 or greater (since transform feedback is required by GL version 3.0).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6: Initial implementation of transform feedback. | Paul Berry | 2011-12-20 | 15 | -10/+417
  This patch adds basic transform feedback capability for Gen6 hardware. This consists of several related pieces of functionality:

  (1) In gen6_sol.c, we set up binding table entries for use by transform feedback. We use one binding table entry per transform feedback varying (this allows us to avoid doing pointer arithmetic in the shader, since we can set up the binding table entries with the appropriate offsets and surface pitches to place each varying at the correct address).

  (2) In brw_context.c, we advertise the hardware capabilities, which are as follows:

      MAX_TRANSFORM_FEEDBACK_INTERLEAVED_COMPONENTS 64
      MAX_TRANSFORM_FEEDBACK_SEPARATE_ATTRIBS        4
      MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS    16

  OpenGL 3.0 requires these values to be at least 64, 4, and 4, respectively. The reason we advertise a larger value than required for MAX_TRANSFORM_FEEDBACK_SEPARATE_COMPONENTS is that we have already set aside 64 binding table entries, so we might as well make them all available in both separate attribs and interleaved modes.

  (3) We set aside a single SVBI ("streamed vertex buffer index") for use by transform feedback. The hardware supports four independent SVBI's, but we only need one, since vertices are added to all transform feedback buffers at the same rate. Note: at the moment this index is reset to 0 only when the driver is initialized. It needs to be reset to 0 whenever BeginTransformFeedback() is called, and otherwise preserved.

  (4) In brw_gs_emit.c and brw_gs.c, we modify the geometry shader program to output transform feedback data as a side effect.

  (5) In gen6_gs_state.c, we configure the geometry shader stage to handle the SVBI pointer correctly.

  Note: ordering of vertices is not yet correct for triangle strips (alternate triangles are improperly oriented). This will be addressed in a future patch.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
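The arithmetic behind the advertised limits in item (2) can be checked with a short sketch. The constant and function names are hypothetical abbreviations of the GL limits. The code only shows that 4 separate attribs of 16 components each add up to the same 64 components that interleaved mode allows, which matches the 64 binding table entries mentioned in the message.

```c
#include <assert.h>
#include <stdbool.h>

/* Values advertised in the commit message (names shortened here). */
enum {
    MAX_INTERLEAVED_COMPONENTS = 64,
    MAX_SEPARATE_ATTRIBS       = 4,
    MAX_SEPARATE_COMPONENTS    = 16,
    XFB_BINDING_TABLE_ENTRIES  = 64  /* entries set aside, per the message */
};

/* Total components that separate-attribs mode can write per vertex. */
unsigned separate_mode_total_components(void)
{
    return MAX_SEPARATE_ATTRIBS * MAX_SEPARATE_COMPONENTS;
}

/* The minimums quoted in the message for OpenGL 3.0: 64, 4 and 4. */
bool limits_meet_gl30_minimums(void)
{
    return MAX_INTERLEAVED_COMPONENTS >= 64 &&
           MAX_SEPARATE_ATTRIBS >= 4 &&
           MAX_SEPARATE_COMPONENTS >= 4;
}

/* Both modes stay within the entries that were reserved. */
bool limits_fit_binding_table(void)
{
    assert(XFB_BINDING_TABLE_ENTRIES > 0);
    return separate_mode_total_components() <= XFB_BINDING_TABLE_ENTRIES &&
           MAX_INTERLEAVED_COMPONENTS <= XFB_BINDING_TABLE_ENTRIES;
}
```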
* i965 gs: Move vue_map to brw_gs_compile. | Paul Berry | 2011-12-20 | 2 | -3/+4
  This patch moves the geometry shader VUE map from a local variable in compile_gs_prog() to a field in the brw_gs_compile struct, so that it will be available while compiling the geometry shader. This is necessary in order to support transform feedback on Gen6, because the Gen6 geometry shader code that supports transform feedback needs to be able to inspect the VUE map in order to find the correct vertex data to output.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965 gen6+: Use 1-wide null operands for IF instructions | Paul Berry | 2011-12-20 | 1 | -4/+4
  The Sandy Bridge PRM, volume 4, part 2, section 5.3.10 ("Register Region Restrictions") contains the following restriction on the execution size and operand width of instructions:

  "3. ExecSize must be equal to or greater than Width."

  When emitting an IF instruction in single program flow mode on Gen6+, we use an ExecSize of 1, therefore the Width of each operand must also be 1.

  Reviewed-by: Kenneth Graunke <[email protected]>
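The quoted restriction reduces to a one-line predicate. The function below is a hypothetical validity check written for illustration, not part of the driver's instruction emitter.

```c
#include <assert.h>
#include <stdbool.h>

/* Restriction 3 as quoted above: ExecSize must be equal to or greater
 * than Width. Both values are element counts and must be non-zero. */
bool region_width_ok(unsigned exec_size, unsigned width)
{
    assert(exec_size > 0 && width > 0);
    return exec_size >= width;
}
```

For an IF emitted with an ExecSize of 1, the only width that passes this check is 1, which is why the patch switches the null operands to be 1-wide.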
* i965: Advertise our vertex shader texture units. | Kenneth Graunke | 2011-12-19 | 1 | -1/+1
  Previously, we advertised 0 VS texture units. Now that we have proper support for using the sampling engine in the VS, we can advertise 16, which is conveniently the number required for OpenGL 3.0.

  v2: Enable on Gen4. I hacked up my tests to not use flat ivec varyings and they pass.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Implement EXT_texture_swizzle support for VS texturing. | Kenneth Graunke | 2011-12-19 | 2 | -1/+52
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Add texture related data to brw_vs_prog_key. | Kenneth Graunke | 2011-12-19 | 2 | -0/+11
  Now that this is all factored out, it's trivial to do.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Only set brw_wm_prog_key data for samplers used by the WM. | Kenneth Graunke | 2011-12-19 | 1 | -1/+3
  This should avoid state-dependent FS recompiles when samplers that are only used by the VS change.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Factor out texturing related data from brw_wm_prog_key. | Kenneth Graunke | 2011-12-19 | 7 | -115/+168
  The idea is to reuse this for the VS and (in the future) GS as well.

  v2: Include yuvtex data since we're not dropping GL_MESA_ycbycr.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]> [v1]
  Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Add support for texel offsets. | Kenneth Graunke | 2011-12-19 | 3 | -2/+23
  The visit() half computes the values to put in the header based on the IR and simply stuffs that in the vec4_instruction; the emit() half uses this to set up the message header. This works out well since emit() can use brw_reg directly and access individual DWords without kludgery.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/fs: Factor out texture offset bitfield computation. | Kenneth Graunke | 2011-12-19 | 3 | -18/+26
  We'll want to reuse this for the VS, and it's complex enough that I'd rather not cut and paste it.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965/vs: Implement vec4_visitor::visit(ir_texture *). | Kenneth Graunke | 2011-12-19 | 1 | -7/+120
  This translates the GLSL compiler's IR into vec4_instruction IR, generating code to load coordinates, LOD info, shadow comparators, and so on into the appropriate message registers.

  It turns out that the SIMD4x2 parameters are identical on Gen 5-7, and the Gen4 code is similar enough that, unlike in the FS, it's easy enough to support all generations in a single function.

  v2: Load zeros for missing coordinates (fixing vs-texelFetch-sampler1D and 2D on G45), and fix G45 message length for shadow comparisons.

  Signed-off-by: Kenneth Graunke <[email protected]>
* i965/vs: Implement vec4_visitor::generate_tex(). | Kenneth Graunke | 2011-12-19 | 2 | -0/+110
  This is the part that takes the vec4_instruction IR and turns it into actual Gen ISA.

  v2: Add Gen4 messages, don't retype m0 to UW.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Add missing SIMD4x2 sample_l_c message #defines. | Kenneth Graunke | 2011-12-19 | 1 | -0/+1
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Don't minify depth when setting up cube map miptrees on Gen4. | Kenneth Graunke | 2011-12-19 | 1 | -1/+2
  Prior to Ironlake, cube maps were stored as 3D textures. In recent refactoring, we removed a separate "layers" parameter in favor of using depth. Unfortunately, depth was getting minified, which is only correct for actual 3D textures.

  Fixes piglit tests:
  - bugs/crash-cubemap-order
  - fbo/fbo-cubemap
  - texturing/cubemap

  Also changes texturing/cubemap npot from abort to fail.

  This hasn't seen a full test run since Piglit on Mesa master hangs GM45 a lot.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
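The bug can be illustrated with the usual mipmap size rule, in which each level halves the previous one down to a minimum of 1. The helper names below are hypothetical. The sketch assumes that a cube map stored as a 3D texture keeps its face count in the depth field, so that field must stay constant across levels.

```c
#include <assert.h>
#include <stdbool.h>

/* Size of one dimension at a given mip level: halve per level, never below 1. */
unsigned minify(unsigned size, unsigned level)
{
    assert(level < 32);
    unsigned s = size >> level;
    return s > 0 ? s : 1;
}

/* Depth to use for a miptree level. For a real 3D texture the depth shrinks
 * with the level; for a cube map the "depth" is the number of faces and
 * must not shrink. */
unsigned level_depth(bool is_cube_map, unsigned depth0, unsigned level)
{
    return is_cube_map ? depth0 : minify(depth0, level);
}
```

Minifying a cube map's depth of 6 would leave level 1 with 3 faces and level 3 with only 1, which is the kind of layout error the patch removes.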
* i965: Add support for GL_ARB_depth_buffer_float under 3.0 override. | Eric Anholt | 2011-12-19 | 4 | -1/+20
  This is not exposed generally yet because some of the swrast paths hit in piglit (drawpixels, copypixels, blit) aren't yet converted to MapRenderbuffer.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add separate stencil/HiZ setup for MESA_FORMAT_Z32_FLOAT_X24S8. | Eric Anholt | 2011-12-19 | 3 | -15/+20
  This is a little more unusual than the separate MESA_FORMAT_S8_Z24 support, because in addition to storing the real stencil data in a MESA_FORMAT_S8 miptree, we also make the Z miptree be MESA_FORMAT_Z32_FLOAT instead of the requested format.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Use the miptree format for texture surface format choice. | Eric Anholt | 2011-12-19 | 2 | -2/+2
  With separate stencil GL_DEPTH32F_STENCIL8, the miptree will have a really different format (MESA_FORMAT_Z32_FLOAT) from the teximage (MESA_FORMAT_Z32_FLOAT_X24S8).

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add support for mapping Z32_FLOAT_X24S8 fake packed depth/stencil. | Eric Anholt | 2011-12-19 | 1 | -5/+17
  The format handling here is tricky, because we're not actually generating a Z32_FLOAT_X24S8 miptree, so we're guessing the format that GL wants based on seeing Z32_FLOAT with a separate stencil.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop creating the wrapped depth irb. | Eric Anholt | 2011-12-19 | 2 | -111/+8
  All the operations were just trying to get at irb->wrapped_depth->mt, which is the same as irb->mt now.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Properly demote the depth mt format for fake packed depth/stencil. | Eric Anholt | 2011-12-19 | 4 | -3/+19
  gen7 only supports the non-packed formats, even if you associate a real separate stencil buffer -- otherwise it's as if the depth test always fails.

  This requires a little bit of care in the match_texture_image case, since the miptree format no longer matches the texture image format.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Reuse intel_miptree_match_image(). | Eric Anholt | 2011-12-19 | 1 | -9/+6
  This little bit of logic was duplicated, which isn't much, but I was going to need to duplicate a bit of additional logic in the next commit.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Stop creating the wrapped stencil irb. | Eric Anholt | 2011-12-19 | 5 | -78/+67
  There were only two places it was really used at this point, which was in the batchbuffer emit of the separate stencil packets for gen6/7. Just write in the ->stencil_mt reference in those two places and ditch all this flailing around with allocation and refcounts.

  v2: Fix separate stencil on gen7.

  Reviewed-by: Kenneth Graunke <[email protected]>
* osmesa: fix RGB565 rendering | Alex Galakhov | 2011-12-19 | 1 | -0/+4
  Signed-off-by: Brian Paul <[email protected]>
* i965/vs: Add a new dst_reg constructor for file, number, type, and mask. | Kenneth Graunke | 2011-12-18 | 1 | -0/+10
  This will be especially useful for loading texturing parameters, where I need to (for example) reference m3.xz<D>.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/vs: Add vec4_instruction::is_tex() query. | Kenneth Graunke | 2011-12-18 | 2 | -0/+11
  Copy and pasted from fs_inst::is_tex(), but without TXB.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Rename texturing ops from FS_OPCODE to SHADER_OPCODE, except TXB. | Kenneth Graunke | 2011-12-18 | 5 | -46/+48
  We'll be reusing most of these for the VS shortly. The one exception is TXB (texturing with LOD bias), which is explicitly forbidden in the VS.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965/fs: Don't swizzle the results of textureSize(). | Kenneth Graunke | 2011-12-18 | 1 | -0/+3
  Fixes a regression since d2235b0f4681f75d562131d655a6d7b7033d2d8b, in my new textureSize sampler(1DArrayShadow|2DShadow|2DArrayShadow) piglit tests, though I'm not honestly sure how this ever worked.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* meta: use _mesa_prepare_mipmap_level() in the mipmap generation code | Brian Paul | 2011-12-16 | 1 | -35/+12
  See previous commit for more information.

  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* mesa: implement DrawTransformFeedback from ARB_transform_feedback2 | Marek Olšák | 2011-12-15 | 3 | -7/+14
  It's like DrawArrays, but the count is taken from a transform feedback object.

  This removes DrawTransformFeedback from dd_function_table and adds the same function to GLvertexformat (with the function parameters matching GL).

  The vbo_draw_func callback has a new parameter "struct gl_transform_feedback_object *tfb_vertcount".

  The rest of the code just validates states and forwards the transform feedback object into vbo_draw_func.
* i965: Drop separate stencil assertions in update_draw_buffer(). | Eric Anholt | 2011-12-14 | 1 | -16/+0
  The comment said they deserved to be in emit_depthbuffer, and at this point they were all there already.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Simplify and touch up the FBO completeness test. | Eric Anholt | 2011-12-14 | 1 | -18/+21
  Now that we have miptrees for everything, we can more easily test for !has_separate_stencil completeness. Also, test for whether the stencil rb is the wrong kind of format for separate stencil, or if we are trying to do packed to different images of a single miptree.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Remove another renderbuffer allocation path. | Eric Anholt | 2011-12-14 | 1 | -8/+4
  Now there's the thing that CALLOCs and sets up window system vtable, and the thing that CALLOCs and sets up user renderbuffer vtable. The user renderbuffer vtable gets replaced later by intel_renderbuffer_update_wrapper for wrapped renderbuffers (things with name == ~0).

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Make the separate stencil RB storage path match texture more. | Eric Anholt | 2011-12-14 | 1 | -76/+52
  There were too many things making intel_renderbuffer *s and tweaking their bits.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Move S8 width/height alignment to miptree creation. | Eric Anholt | 2011-12-14 | 3 | -55/+22
  We were doing it in the caller in the renderbuffer code, but it was missed in the separate stencil creation for textures. Apparently our testing was using renderbuffers or pre-aligned sizes.

  Reviewed-by: Kenneth Graunke <[email protected]>
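Doing the alignment in one place, at miptree creation, might look like the sketch below. The helper names and the alignment value of 64 are assumptions made for illustration. The message does not state the actual alignment, so S8_ALIGN should be read as a placeholder.

```c
#include <assert.h>

/* Round `value` up to the next multiple of `alignment` (a power of two). */
unsigned align_up(unsigned value, unsigned alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Placeholder alignment for S8 surfaces; an assumption, not taken from the
 * commit message. */
enum { S8_ALIGN = 64 };

struct extent {
    unsigned width;
    unsigned height;
};

/* Hypothetical creation-time helper: every caller, renderbuffer or texture,
 * gets the same aligned size because the rounding happens here. */
struct extent s8_miptree_extent(unsigned width, unsigned height)
{
    struct extent e = { align_up(width, S8_ALIGN), align_up(height, S8_ALIGN) };
    return e;
}
```

Sizes that are already multiples of the alignment pass through unchanged, which may be why testing with pre-aligned sizes did not expose the missing case.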
* intel: Drop check for wrapped_depth in RB mapping. | Eric Anholt | 2011-12-14 | 1 | -1/+1
  This used to be needed because irb->mt would be unset for fake packed depth/stencil, but no longer.

  Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Fix uninitialized values in debug output for renderbuffer mapping. | Eric Anholt | 2011-12-14 | 1 | -1/+1
* radeon: stop using _DepthBuffer, _StencilBuffer fields | Brian Paul | 2011-12-13 | 2 | -9/+8
  Reviewed-by: Eric Anholt <[email protected]>
* nouveau: stop using _DepthBuffer, _StencilBuffer fields | Brian Paul | 2011-12-13 | 6 | -13/+14
  Reviewed-by: Eric Anholt <[email protected]>
* mesa,intel: use _mesa_image_offset() for PBOs | nobled | 2011-12-08 | 1 | -2/+3
  This avoids forming invalid pointers needlessly, which even if never dereferenced is undefined behavior. It also makes _mesa_validate_pbo_access() more comprehensible.

  Reviewed-by: Brian Paul <[email protected]>
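The general technique, validating an access with integer offsets instead of with pointers that may point outside the buffer, can be sketched as follows. The functions are hypothetical and do not reproduce the signature of _mesa_image_offset() or _mesa_validate_pbo_access().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Byte offset of pixel (row, col) from the start of an image, computed as a
 * plain integer so no out-of-range pointer is ever formed. */
size_t pixel_offset(size_t row_stride, size_t row, size_t col, size_t bytes_per_pixel)
{
    assert(bytes_per_pixel > 0);
    return row * row_stride + col * bytes_per_pixel;
}

/* True if an image that starts `base_offset` bytes into a buffer of
 * `buffer_size` bytes, and extends `end_offset` bytes from there, lies
 * entirely inside the buffer. Written to avoid unsigned overflow. */
bool pbo_access_ok(size_t buffer_size, size_t base_offset, size_t end_offset)
{
    if (base_offset > buffer_size)
        return false;
    return end_offset <= buffer_size - base_offset;
}
```

Only after such a check passes would the caller add the offset to the mapped buffer pointer, so every pointer that is formed points into the buffer or one byte past its end.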
* mesa/drivers: use new swrast renderbuffer functions | Brian Paul | 2011-12-08 | 12 | -62/+74
  Reviewed-by: Eric Anholt <[email protected]>