path: root/src/mesa/drivers
Commit message | Author | Age | Files | Lines
* mesa: allocate gl_debug_state on demand | Brian Paul | 2014-02-08 | 1 | -1/+5
  We don't need to allocate all the state related to GL_ARB_debug_output until some aspect of that extension is actually needed.

  The sizeof(gl_debug_state) is huge (~285KB on 64-bit systems), not even counting the 54(!) hash tables and lists that it contains. This change reduces the size of gl_context alone from 431KB to 145KB on 64-bit systems and from 277KB to 78KB on 32-bit systems.

  Reviewed-by: Kenneth Graunke <[email protected]>
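The on-demand allocation described above can be sketched as follows. This is a minimal illustration, not the Mesa implementation: struct debug_state, struct context, get_debug_state() and free_debug_state() are hypothetical stand-ins for the much larger gl_debug_state and gl_context.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the bulky per-context debug state. */
struct debug_state {
   int enabled;
   char storage[1024];   /* placeholder for the large tables */
};

/* Hypothetical stand-in for the context: it only holds a pointer. */
struct context {
   struct debug_state *debug;   /* NULL until the extension is first used */
};

/* Return the debug state, allocating it (zero-filled) on first use.
 * Returns NULL if the allocation fails. */
static struct debug_state *
get_debug_state(struct context *ctx)
{
   assert(ctx != NULL);
   if (!ctx->debug)
      ctx->debug = calloc(1, sizeof(*ctx->debug));
   return ctx->debug;
}

/* Release the debug state, if it was ever allocated. */
static void
free_debug_state(struct context *ctx)
{
   assert(ctx != NULL);
   free(ctx->debug);
   ctx->debug = NULL;
}
```

A context that never touches the extension pays only for the pointer; every entry point that needs the state goes through the accessor and must handle a NULL return.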
* i965: Label JIP and UIP in Broadwell shader disassembly. | Kenneth Graunke | 2014-02-07 | 1 | -2/+6
  This makes it obvious which number is which.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Don't disassemble UIP field for Broadwell WHILE instructions. | Kenneth Graunke | 2014-02-07 | 1 | -2/+1
  The WHILE instruction doesn't have UIP. It only has JIP.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Don't print source registers for Broadwell flow control. | Kenneth Graunke | 2014-02-07 | 1 | -13/+14
  The bits which normally contain the source register descriptions actually contain the JIP/UIP jump targets, which we already printed.

  Interpreting JIP/UIP as source registers results in some really creepy looking output, like IF statements with acc14.4<0,1,0>UD sources.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Fix fast depth clear values on Broadwell. | Kenneth Graunke | 2014-02-07 | 1 | -1/+4
  Broadwell's 3DSTATE_CLEAR_PARAMS packet expects a floating point value regardless of format. This means we need to stop converting it to UNORM.

  Storing the value as float would make sense, but since we already have a uint32_t field, this patch continues shoehorning it into that. In a sense, this makes mt->depth_clear_value the DWord you emit in the packet, rather than the clear value itself.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
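Storing a float clear value in an existing uint32_t field amounts to keeping its raw bit pattern. A minimal sketch, assuming a 32-bit IEEE-754 float; float_as_dword() is a hypothetical helper name, not a function from the driver:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the bit pattern of a float as a DWord,
 * i.e. the value that would be written into the packet unchanged. */
static uint32_t
float_as_dword(float f)
{
   uint32_t bits;

   /* The sketch assumes float is 32 bits wide. */
   assert(sizeof(f) == sizeof(bits));

   /* memcpy avoids the aliasing problems of a pointer cast. */
   memcpy(&bits, &f, sizeof(bits));
   return bits;
}
```

With this representation, a clear value of 1.0f is stored as 0x3f800000 rather than as a format-dependent UNORM integer.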
* i965: Enable ARB_texture_gather for one component on Gen6. | Chris Forbes | 2014-02-08 | 2 | -1/+3
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Emit shader w/a for Gen6 gather | Chris Forbes | 2014-02-08 | 2 | -0/+32
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Emit shader w/a for Gen6 gather | Chris Forbes | 2014-02-08 | 2 | -0/+35
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add surface format overrides for Gen6 gather | Chris Forbes | 2014-02-08 | 1 | -5/+32
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add Gen6 gather wa to sampler key | Chris Forbes | 2014-02-08 | 2 | -0/+32
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add some informative debug when the X Server botches DRI2 GetBuffers. | Eric Anholt | 2014-02-07 | 1 | -1/+11
  We've had various bug reports over the years where miptrees are missing, and when I screwed it up while adding DRI2 to the modesetting driver, I figured I should put the info necessary for debug here.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove redundant check in blitter-based glBlitFramebuffer(). | Eric Anholt | 2014-02-07 | 1 | -10/+0
  The intel_miptree_blit() code checks the format for us now, plus it handles xrgb vs argb for us.

  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix Gen8+ disassembly of half float subregister numbers. | Kenneth Graunke | 2014-02-07 | 1 | -0/+1
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Use the new brw_load_register_mem helper for draw indirect. | Kenneth Graunke | 2014-02-07 | 1 | -31/+22
  This makes it work on Broadwell, too.

  v2: Drop bogus double write to 3DPRIM_BASE_VERTEX register (caught by Chris Forbes).

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Implement a brw_load_register_mem helper function. | Kenneth Graunke | 2014-02-07 | 2 | -0/+32
  This saves some boilerplate and hides the OUT_RELOC/OUT_RELOC64 distinction.

  Placing the function in intel_batchbuffer.c is rather arbitrary; there wasn't really an obvious place for it.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Chris Forbes <[email protected]>
* i965: Fix INTEL_DEBUG=vs for fixed-function/ARB programs. | Kenneth Graunke | 2014-02-07 | 2 | -4/+4
  Since commit 9cee3ff562f3e4b51bfd30338fd1ba7716ac5737, INTEL_DEBUG=vs has caused a NULL pointer dereference for fixed-function/ARB programs.

  In the vec4 generators, "prog" is a gl_program, and "shader_prog" is the gl_shader_program. This is different than the FS visitor.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* i965: Add missing null check in fs_visitor::dead_code_eliminate_local() | Juha-Pekka Heikkila | 2014-02-07 | 1 | -0/+4
  Signed-off-by: Juha-Pekka Heikkila <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* i965/vs: Fix typo in brw_compute_vue_map | Chris Forbes | 2014-02-05 | 1 | -1/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix register types in dump_instructions(). | Kenneth Graunke | 2014-02-05 | 4 | -2/+32
  This regressed when I converted BRW_REGISTER_TYPE_* to be an abstract type that doesn't match the hardware description. dump_instruction() was using reg_encoding[] from brw_disasm.c, which no longer matches (and was incorrect for Gen8+ anyway).

  This patch introduces a new function to convert the abstract enum values into the letter suffix we expect.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reported-by: Matt Turner <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965/fs: Assume FBO rendering in precompile if MRT. | Chris Forbes | 2014-02-06 | 1 | -4/+5
  If multiple color outputs are written, this shader is unlikely to be useful with a winsys framebuffer.

  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Guess nr_color_regions better in precompile | Chris Forbes | 2014-02-06 | 1 | -1/+3
  Signed-off-by: Chris Forbes <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move intel_prepare_render() above first buffer access | Kristian Høgsberg | 2014-02-05 | 2 | -4/+4
  The driver is supposed to ensure buffers before any drawing operation, but in do_blit_drawpixels() and do_blit_copypixels() we inspect the buffer format before calling intel_prepare_render().

  That was covered up by the unconditional call to intel_prepare_render() in intelMakeCurrent(), but we now only do this on the initial intelMakeCurrent call for a context (to get the size for the initial viewport values).

  https://bugs.freedesktop.org/show_bug.cgi?id=74083

  Signed-off-by: Kristian Høgsberg <[email protected]>
  Tested-by: Alexander Monakov <[email protected]>
* i965/cs: Allow ARB_compute_shader to be enabled via env var. | Paul Berry | 2014-02-05 | 2 | -1/+13
  This will allow testing of compute shader functionality before it is completed.

  To enable ARB_compute_shader functionality in the i965 driver, set INTEL_COMPUTE_SHADER=1.

  Reviewed-by: Jordan Justen <[email protected]>
* i965/cs: Create the brw_compute_program struct, and the code to initialize it. | Paul Berry | 2014-02-05 | 2 | -0/+19
  v2: Fix comment.

  Reviewed-by: Jordan Justen <[email protected]>
* i965/blorp: do not use unnecessary hw-blending support | Topi Pohjolainen | 2014-02-04 | 1 | -20/+0
  This is really not needed as blorp blit programs already sample XRGB normally and get alpha channel set to 1.0 automatically by the sampler engine. This is simply copied directly to the payload of the render target write message and hence there is no need for any additional blending support from the pixel processing pipeline.

  The blending formula is anyway broken for color components: it multiplies the color component with itself (blend factor is the component itself). Alpha blending in turn would not fix the alpha to one independent of the source but would simply use the source alpha as is instead (1.0 * src_alpha + 0.0 * dst_alpha).

  Quoting Eric:

  "If we want to actually make the no-alpha-bits-present thing work, we need to override the bits in the surface state or in the generated code. In the normal draw path, it's done for sampling by the swizzling code in brw_wm_surface_state.c, and the blending overrides is just to fix up the alpha blending stage which doesn't pay attention to that for the destination surface."

  If one modifies piglit test gl-3.2-layered-rendering-blit to use color component values other than zero or one, this change will kick in on IVB.

  No regressions on IVB.

  This is effectively a revert of c0554141a9b831b4e614747104dcbbe0fe489b9d:

    i965/blorp: Support overriding destination alpha to 1.0.

    Currently, Blorp requires the source and destination formats to be equal. However, we'd really like to be able to blit between XRGB and ARGB formats; our BLT engine paths have supported this for a long time.

    For ARGB -> XRGB, nothing needs to occur: the missing alpha is already interpreted as 1.0. For XRGB -> ARGB, we need to smash the alpha channel to 1.0 when writing the destination colors. This is fairly straightforward with blending.

    For now, this code is never used, as the source and destination formats still must be equal. The next patch will relax that restriction.

    NOTE: This is a candidate for the 9.1 branch.

  Reviewed-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
  Signed-off-by: Topi Pohjolainen <[email protected]>
* i965: Use brw_bo_map[_gtt]() in intel_miptree_map_raw(). | Kenneth Graunke | 2014-02-03 | 1 | -8/+2
  This moves the intel_batchbuffer_flush before the drm_intel_bo_busy call, which is a change in behavior. However, the old behavior was broken.

  In the future, we may want to only flush if the batchbuffer references the BO being mapped. That's certainly more typical.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Carl Worth <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Use brw_bo_map() in intel_texsubimage_tiled_memcpy(). | Kenneth Graunke | 2014-02-03 | 1 | -7/+1
  This additionally measures the time stalled, while also simplifying the code.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Carl Worth <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Create drm_intel_bo_map wrappers with performance warnings. | Kenneth Graunke | 2014-02-03 | 2 | -0/+46
  Mapping a buffer is a common place where we could stall the CPU.

  In a few places, we've added special code to check whether a buffer is busy and log the stall as a performance warning. Most of these give no indication of the severity of the stall, though, since measuring the time is a small hassle.

  This patch introduces a new brw_bo_map() function which wraps drm_intel_bo_map, but additionally measures the time stalled and reports a performance warning. If performance debugging is not enabled, it simply maps the buffer with negligible overhead.

  We also add a similar wrapper for drm_intel_gem_bo_map_gtt().

  This should make it easy to add performance warnings in lots of places.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Carl Worth <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
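The wrapper pattern described above (map directly when performance debugging is off, otherwise time the call and warn) can be sketched generically. This is an illustration under stated assumptions, not the brw_bo_map() code: the callback types, map_with_stall_warning(), the threshold parameter, and the fake_map()/fake_clock_ns() helpers are all hypothetical, and the clock is injected so the sketch does not depend on any particular timer API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

typedef int (*map_fn)(void *bo);          /* returns 0 on success */
typedef uint64_t (*clock_ns_fn)(void);    /* monotonic time in nanoseconds */

/* Map 'bo' through 'map'. With perf_debug set, also time the call and
 * print a warning if it took longer than threshold_ns. The measured time
 * is written to *stall_ns (0 when nothing was measured). */
static int
map_with_stall_warning(void *bo, map_fn map, clock_ns_fn now,
                       bool perf_debug, uint64_t threshold_ns,
                       uint64_t *stall_ns)
{
   assert(map != NULL && now != NULL && stall_ns != NULL);

   *stall_ns = 0;
   if (!perf_debug)
      return map(bo);   /* fast path: no timing overhead */

   uint64_t start = now();
   int ret = map(bo);
   *stall_ns = now() - start;

   if (*stall_ns > threshold_ns)
      fprintf(stderr, "perf: mapping stalled for %.3f ms\n",
              *stall_ns / 1e6);
   return ret;
}

/* Fakes for illustration: a map that always succeeds and a clock that
 * advances by 2 ms on every call. */
static int
fake_map(void *bo)
{
   (void) bo;
   return 0;
}

static uint64_t
fake_clock_ns(void)
{
   static uint64_t t;
   t += 2000000;
   return t;
}
```

Keeping the timing inside one wrapper means call sites get a stall duration for free instead of each one open-coding a busy check.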
* mesa: Rename _mesa_..._array_obj functions to _mesa_..._vao. | Kenneth Graunke | 2014-02-03 | 2 | -4/+4
  _mesa_update_vao_client_arrays() is less of a mouthful than _mesa_update_array_object_client_arrays(), and generally clearer.

  Generated by:

    $ find . -type f -print0 | xargs -0 sed -i \
      's/_mesa_\([^_]*\)_array_object/_mesa_\1_vao/g'

  with manual whitespace and indentation fixes applied.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* mesa: Rename "struct gl_array_object" to gl_vertex_array_object. | Kenneth Graunke | 2014-02-03 | 1 | -1/+1
  I considered replacing it with "gl_vao", but spelling it out seemed to fit better with Mesa's traditional style. Mesa doesn't shy away from long type names - consider gl_transform_feedback_object, gl_fragment_program_state, gl_uniform_buffer_binding, and so on.

  Completely generated by:

    $ find . -type f -print0 | xargs -0 sed -i \
      's/gl_array_object/gl_vertex_array_object/g'

  v2: Rerun command to resolve conflicts with Ian's meta patches.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* mesa: Rename ArrayObj to VAO and DefaultArrayObj to DefaultVAO. | Kenneth Graunke | 2014-02-03 | 1 | -69/+69
  When reading through the Mesa drawing code, it's not immediately obvious to me that "ArrayObj" (gl_array_object) is the Vertex Array Object (VAO) state. The comment above the structure explains this, but readers still have to remember this and translate accordingly.

  Out of context, "array object" is fairly vague. Even in context, "array" has a lot of meanings: glDrawArrays, vertex data stored in user arrays, gl_client_arrays, gl_vertex_attrib_arrays, and so on.

  Using the term "VAO" immediately associates these fields with the OpenGL concept, clarifying the situation and aiding programmer sanity.

  Completely generated by:

    $ find . -type f -print0 | xargs -0 sed -i \
      -e 's/ArrayObj;/VAO;/g' \
      -e 's/->ArrayObj/->VAO/g' \
      -e 's/Array\.ArrayObj/Array.VAO/g' \
      -e 's/Array\.DefaultArrayObj/Array.DefaultVAO/g'

  v2: Rerun command to resolve conflicts with Ian's meta patches.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jordan Justen <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* meta: Silence several 'unused parameter' warnings [tag: 10.1-branchpoint] | Ian Romanick | 2014-02-02 | 1 | -21/+17
  Silences many GCC warnings of the form:

    drivers/common/meta.c: In function 'cleanup_temp_texture':
    drivers/common/meta.c:1208:41: warning: unused parameter 'ctx' [-Wunused-parameter]
    drivers/common/meta.c: In function 'setup_ff_blit_framebuffer':
    drivers/common/meta.c:1453:46: warning: unused parameter 'ctx' [-Wunused-parameter]
    drivers/common/meta.c: In function 'meta_glsl_blit_cleanup':
    drivers/common/meta.c:1998:43: warning: unused parameter 'ctx' [-Wunused-parameter]
    drivers/common/meta.c: In function 'meta_glsl_clear_cleanup':
    drivers/common/meta.c:2287:44: warning: unused parameter 'ctx' [-Wunused-parameter]
    drivers/common/meta.c: In function 'setup_ff_generate_mipmap':
    drivers/common/meta.c:3365:45: warning: unused parameter 'ctx' [-Wunused-parameter]
    drivers/common/meta.c: In function 'meta_glsl_generate_mipmap_cleanup':
    drivers/common/meta.c:3556:54: warning: unused parameter 'ctx' [-Wunused-parameter]

  There are a couple other similar warnings, but they are less trivial. I want to investigate these further before axing them.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* meta: Don't use fixed-function to decompress array textures | Ian Romanick | 2014-02-02 | 1 | -3/+20
  Array textures can't be used with fixed-function, so don't. Instead, just drop the decompress request on the floor. This is no worse than what was done previously because generating the GL error (in _mesa_set_enable) broke everything anyway.

  A later patch will get GL_TEXTURE_2D_ARRAY targets working.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* meta: Use NDC in decompress_texture_image | Ian Romanick | 2014-02-02 | 1 | -9/+8
  There is no need to use pixel coordinates, and using NDC directly will simplify the GLSL paths.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* meta: Consistently use non-Apple VAO functions | Ian Romanick | 2014-02-02 | 1 | -4/+4
  For these objects, meta was already using the non-Apple function to delete the objects. Everywhere else in the file uses _mesa_GenVertexArrays and _mesa_BindVertexArrays.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Cc: "9.1 9.2 10.0" <[email protected]>
* meta: Fall back to software for GetTexImage of compressed GL_TEXTURE_CUBE_MAP_ARRAY | Ian Romanick | 2014-02-02 | 1 | -1/+2
  The hardware decompression path isn't even close to being able to handle this. This converts the crash (assertion failure) in "EXT_texture_compression_s3tc/getteximage-targets S3TC CUBE_ARRAY" to a plain old failure.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Cc: "9.1 9.2 10.0" <[email protected]>
* meta: Release resources used by _mesa_meta_DrawPixels | Ian Romanick | 2014-02-02 | 1 | -0/+19
  _mesa_meta_DrawPixels creates a VAO and (potentially) two fragment programs, but none of them are ever released. Leaking piles of memory is generally frowned upon.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Cc: "9.1 9.2 10.0" <[email protected]>
* meta: Release resources used by decompress_texture_image | Ian Romanick | 2014-02-02 | 1 | -0/+21
  decompress_texture_image creates an FBO, an RBO, a VBO, a VAO, and a sampler object, but none of them are ever released. Later patches will add program objects, exacerbating the problem. Leaking piles of memory is generally frowned upon.

  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Cc: "9.1 9.2 10.0" <[email protected]>
* mesa: remove target param from ctx->Driver.TexParameter() | Brian Paul | 2014-02-02 | 3 | -5/+4
  Not really used anywhere.

  Reviewed-by: Kenneth Graunke <[email protected]>
* swrast: use _mesa_get_current_tex_object() in swrastSetTexBuffer2() | Brian Paul | 2014-02-02 | 1 | -3/+1
  Reviewed-by: Kenneth Graunke <[email protected]>
* radeon: use _mesa_get_current_tex_object() in radeonSetTexBuffer2() | Brian Paul | 2014-02-02 | 1 | -4/+2
  Reviewed-by: Kenneth Graunke <[email protected]>
* r200: use _mesa_get_current_tex_object() in r200SetTexBuffer2() | Brian Paul | 2014-02-02 | 1 | -4/+2
  Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Silence unused variable 'ctx' warning. | Kenneth Graunke | 2014-01-31 | 1 | -1/+0
  Somehow I missed this before pushing the Broadwell PS state upload code.

  Signed-off-by: Kenneth Graunke <[email protected]>
* i965: Fix math instruction hstride assertions on Broadwell. | Kenneth Graunke | 2014-01-31 | 1 | -1/+1
  In the final revision of my gen8_generator patch, I updated the MATH instruction's assertion from (dst.hstride == 1) to check that source and destination hstride matched. Unfortunately, I didn't test this enough, and many Piglit tests fail this test.

  The documentation indicates that "scalar source is also supported", which we believe means <0,1,0> access mode (hstride == 0). If hstride is non-zero, then it must match the destination register.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
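The relaxed rule stated above (a source hstride of zero is allowed; otherwise it must match the destination) can be written as a small predicate. This is a sketch of the rule as the message describes it, not the assertion from the driver: struct reg and math_hstride_ok() are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical register description; only the field the rule needs. */
struct reg {
   unsigned hstride;
};

/* A MATH source is acceptable if it is scalar (hstride == 0) or if its
 * horizontal stride matches the destination's. */
static bool
math_hstride_ok(struct reg dst, struct reg src)
{
   return src.hstride == 0 || src.hstride == dst.hstride;
}

/* How a generator might use the predicate. */
static void
check_math_operands(struct reg dst, struct reg src)
{
   assert(math_hstride_ok(dst, src));
}
```

A check that only compared src.hstride with dst.hstride would reject the scalar-source case, which is the failure the message describes.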
* i965: Disable 3DSTATE_WM_HZ_OP fields. | Kenneth Graunke | 2014-01-31 | 2 | -0/+10
  Eric believes this to be wrong and unnecessary, as the command is supposed to emit an implicit rectangle primitive. However, empirically the pixel pipeline is completely unreliable without it. So for now, it stays until someone comes up with a better solution.

  We'll need to do better than this when we implement multisampling, HiZ, or fast clears...but for now, this will do.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
* i965: Update GS state for Broadwell. | Kenneth Graunke | 2014-01-31 | 5 | -1/+155
  This is quite similar to the Gen7 code. The main changes:
  - 48-bit relocations
  - Thread count is specified as U/2-1 instead of U-1.
  - An extra DWord (DW9) with clip planes, URB entry output length/offsets
  - We need to program the "Expected Vertex Count" (VerticesIn)

  v2: Set the number of binding table entries so they can be prefetched (requested by Eric Anholt).

  v3: Add a WARN_ONCE for a missing workaround.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
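The thread-count change listed above is a one-line difference in encoding. A minimal sketch, assuming U is the maximum thread count and that the U/2-1 form applies from Gen8 (Broadwell) on; gs_max_threads_field() and its gen parameter are hypothetical:

```c
#include <assert.h>

/* Hypothetical helper: encode a maximum thread count U for the GS state
 * packet. Gen7 uses U - 1; Broadwell (Gen8) uses U / 2 - 1. */
static unsigned
gs_max_threads_field(unsigned max_threads, int gen)
{
   /* Both encodings need at least two threads to avoid underflow here. */
   assert(max_threads >= 2);

   return gen >= 8 ? max_threads / 2 - 1 : max_threads - 1;
}
```

For an illustrative U of 64, the field is 63 with the Gen7 encoding and 31 with the Broadwell encoding.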
* i965: Update multisampling state for Broadwell. | Kenneth Graunke | 2014-01-31 | 7 | -1/+135
  On previous platforms, 3DSTATE_MULTISAMPLE contained the number of samples, pixel location, and the positions of each sample within a pixel for each multisampling mode (4x and 8x). It was also a non-pipelined command, presumably since changing the sample positions is fairly drastic.

  Broadwell improves upon this by splitting the sample positions out into a separate non-pipelined state packet, 3DSTATE_SAMPLE_PATTERN. With that removed, 3DSTATE_MULTISAMPLE becomes a pipelined state packet.

  Broadwell also supports 2x and 16x multisampling, in addition to the 4x and 8x supported by Gen7. This patch, however, does not implement 2x and 16x.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Chad Versace <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Update 3DSTATE_{DEPTH,STENCIL,...}_BUFFER and such for Broadwell. | Kenneth Graunke | 2014-01-31 | 4 | -1/+180
  The amount of cut and paste from Gen7 is rather ugly, and should probably be cleaned up in the future. Even the Gen7 code is in need of some tidying though; many of the function parameters aren't used on platforms that use level/layer rather than tile offsets. Tidying both can be left to a future patch series.

  This at least gets things going.

  v2: Rebase on Paul's rename of NumLayers -> MaxNumLayers.

  v3: Shift QPitch by 2 when storing it in the packet. Bits 14:0 store bits 16:2 of the actual value. Fixes tests.

  v4: Add missing stencil buffer QPitch.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
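The v3 note above describes a shifted field: bits 14:0 of the packet field hold bits 16:2 of the actual QPitch. A minimal sketch of that packing; qpitch_field() is a hypothetical helper, and the precondition that the value fits in 17 bits is an assumption of the sketch:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: pack a QPitch value into a 15-bit field that
 * stores bits 16:2 of the actual value. */
static uint32_t
qpitch_field(uint32_t qpitch)
{
   /* Assumed precondition: nothing above bit 16 is set. */
   assert((qpitch >> 17) == 0);

   /* Drop the low two bits and keep the remaining 15. */
   return (qpitch >> 2) & 0x7fff;
}
```

Writing the unshifted value into the field would make the hardware see a QPitch four times too large.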
* i965: Update BLEND_STATE for Broadwell. | Kenneth Graunke | 2014-01-31 | 4 | -1/+216
  v2: Allow logic ops on all surface types. The UNORM restriction was lifted with Haswell and I simply hadn't noticed. Also, add missing BRW_NEW_STATE_BASE_ADDRESS dirty bit. Both caught by Eric Anholt.

  v3: Fix swapped per-RT DWord pairs. Eliminates bizarre hacks.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* i965: Update SF_CLIP_VIEWPORT for Broadwell. | Kenneth Graunke | 2014-01-31 | 4 | -1/+121
  It has additional fields to support clipping to the viewport even if guardband clipping is enabled.

  v2: Update for viewport array changes.

  v3: No, seriously, update for viewport array changes.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]> [v1]