summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: don't make 1d staging textures linearIlia Mirkin2014-09-011-1/+0
| | | | | | | | | | | Experimentally, the sampler doesn't appear to like these, neither as buffer nor as rect textures. So remove 1D from the list of texture types to make linear when used for staging. This fixes the OSD in mplayer for VDPAU. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nv50: zero out unbound samplersIlia Mirkin2014-09-011-2/+5
| | | | | | | | | Samplers are only defined up to num_samplers, so set all samplers above nr to NULL so that we don't try to read them again later. Tested-by: Christian Ruppert <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* nvc0/ir: avoid infinite recursion when finding first uses of texIlia Mirkin2014-09-012-8/+29
| | | | | | | | | | | In certain circumstances, findFirstUses could end up doubling back on instructions it had already processed, resulting in an infinite recursion. Avoid this by keeping track of already-visited instructions. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83079 Tested-by: Tobias Klausmann <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2 10.3" <[email protected]>
* freedreno/ir3: add DDX/DDYRob Clark2014-09-011-4/+53
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: don't keep IR aroundRob Clark2014-09-011-1/+6
| | | | | | | Once we've assembled the shader, no need to keep the intermediate around. Signed-off-by: Rob Clark <[email protected]>
* i965/fs: Don't segfault when debug-logging a null programJason Ekstrand2014-09-011-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/vec4: Don't segfault when debug-logging a null programJason Ekstrand2014-09-011-2/+2
| | | | | Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: implement EXPCLEAR optimization for depthMarek Olšák2014-09-015-2/+23
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: initialize HTILE to fully-expanded stateMarek Olšák2014-09-011-1/+3
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: implement fast depth clearMarek Olšák2014-09-014-2/+21
| | | | Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: move DB_RENDER_CONTROL into draw_vboMarek Olšák2014-09-015-58/+46
| | | | | | So that I can add fast depth clear. Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: disable occlusion queries if they are not neededMarek Olšák2014-09-011-0/+8
| | | | | | | We always left them enabled, which turned off HiZ in some cases. This should improve performace with Hyper-Z. Reviewed-by: Michel Dänzer <[email protected]>
* r600g,radeonsi: force fast stencil and HTILE stencil off, fixing a Hyper-Z hangMarek Olšák2014-09-012-9/+14
| | | | | | | | | | | | | This should be as fast as no HTILE for stencil. I think we can still get full performance with depth-only rendering even if stencil is present in the buffer but not used, but I'm not 100% sure. This may be revisited when HiS and fast stencil clear are implemented. This fixes a hang in Brutal Legend. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64471 Reviewed-by: Michel Dänzer <[email protected]>
* r600g: set VGT_ENHANCE=4 on R7xxMarek Olšák2014-09-012-0/+2
| | | | | | | This is a golden setting on RV740, but there is a hw bug which recommends setting it on all R7xx chipsets. Acked-by: Michel Dänzer <[email protected]>
* r600g: expose AMD_vertex_shader_layer and *_viewport_index on R600-R700Marek Olšák2014-09-011-1/+1
| | | | | | already implemented Acked-by: Michel Dänzer <[email protected]>
* r600g: fix layered clearMarek Olšák2014-09-011-1/+2
| | | | | Cc: [email protected] Acked-by: Michel Dänzer <[email protected]>
* r600g: some DB bug workarounds for R6xx DB flushingMarek Olšák2014-09-011-0/+7
| | | | Acked-by: Michel Dänzer <[email protected]>
* r600g: enable fast depth clear for array textures and cubemapsMarek Olšák2014-09-011-1/+2
| | | | | | I have a piglit test that hits this. Acked-by: Michel Dänzer <[email protected]>
* r600g: use HTILE allocator from SIMarek Olšák2014-09-013-47/+23
| | | | | | | | | | | | It's almost the same. This enables tiling for HTILE. It also enables Hyper-Z for other texture targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT). 2D array depth textures are tested by Unigine Sanctuary and my new piglit test. Acked-by: Michel Dänzer <[email protected]>
* r600g: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX for EG/CM, inline other fieldsMarek Olšák2014-09-011-9/+12
| | | | | | | | This fixes rendering to non-zero layer/face/slice with HTILE. v2: added the assertion Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: set DB_DEPTH_SIZE.HEIGHT_TILE_MAX, inline other fieldsMarek Olšák2014-09-011-9/+8
| | | | | | | | | | This fixes rendering to a non-zero layer/face/slice with HTILE. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685 v2: added the assertion Reviewed-by: Michel Dänzer <[email protected]>
* r600g: Implement sm5 geometry shader instancingGlenn Kennard2014-09-014-3/+15
| | | | | | Requires Evergreen or later hardware. Signed-off-by: Glenn Kennard <[email protected]>
* glsl_to_tgsi: allocate and enlarge arrays for temporaries on demandMarek Olšák2014-09-011-18/+33
| | | | | | | | | | | This fixes crashes if the number of temporaries is greater than 4096. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184 v2: added fail paths for realloc failures Cc: 10.2 10.3 [email protected] Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/pb_bufmgr_cache: limit the size of cacheMarek Olšák2014-09-014-8/+30
| | | | | | This should make a machine which is running piglit more responsive at times. e.g. streaming-texture-leak can easily eat 600 MB because of how fast it creates new textures.
* pipe-loader: use the correct screen indexMarek Olšák2014-09-011-2/+18
|
* egl/dri2: use the correct screen indexMarek Olšák2014-09-012-10/+30
| | | | Required for multi-GPU configuration where each GPU has its own X screen.
* docs: Mark ARB_compute_shader as work in progressJordan Justen2014-09-011-2/+2
| | | | Signed-off-by: Jordan Justen <[email protected]>
* i965/fs: don't use ir->shadow_comparitor in emit_texture_*Connor Abbott2014-09-012-7/+5
| | | | | Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: don't pass ir_variable * to emit_samplepos_setup()Connor Abbott2014-09-013-5/+4
| | | | | | | | We were only using it to get at its type, which we already know because it's a builtin variable. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: don't pass ir_variable * to emit_frontfacing_interpolation()Connor Abbott2014-09-014-6/+6
| | | | | | | | | | We were only using it to get at its type, which we already know because it's a builtin variable. v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix GPU hangs when INTEL_DEBUG=no16 is set.Kenneth Graunke2014-08-311-1/+2
| | | | | | | | The replicated data clear shader needs to be SIMD16, or else the GPU will hang. So, compile it even if INTEL_DEBUG=no16 is set. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* mesa: fix make tarballsEmil Velikov2014-09-012-2/+3
| | | | | | | | | | | | | Current method of generating distribution tar-balls involves manually invoking make + target name in the appropriate places. This temporary solution is used until we get 'make dist' working. Currently it does not work, as in order to have the target (which is also a filename) available in the final Makefile we need to add a PHONY target + use the correct target name. Cc: "10.2 10.3" <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* i965/vec4: Remove try_emit_saturateAbdiel Janulgue2014-08-312-22/+0
| | | | | | | | | Now that saturate is implemented natively as an instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: Refactor try_emit_saturateAbdiel Janulgue2014-08-311-15/+8
| | | | | | | | | | | | v3: Since the fs backend can emit saturate as a separate instruction, there is no need to detect for min/max instructions and to rewrite the instruction tree accordingly. On the other hand, we don't need to emit a separate saturated mov either when the expression generating src can do saturate directly. v4: Add can_do_saturate() check before enabling saturate modifer (Ken) Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* ir_to_mesa, glsl_to_tgsi: Remove try_emit_saturateAbdiel Janulgue2014-08-312-99/+0
| | | | | | | | | Now that saturate is implemented natively as instruction, we can cut down on unneeded functionality. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/vec4: Allow propagation of instructions with saturate flag to selAbdiel Janulgue2014-08-311-27/+58
| | | | | | | | | | | | | | | | | When sel conditon is bounded within 0 and 1.0. This allows code as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: - Syntax clarifications in inst->saturate assignment - Remove extra parenthesis when assigning src_reg value from copy_entry (Matt Turner) v4: - Take channels into consideration when propagating saturated instructions. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: Allow propagation of instructions with saturate flag to selAbdiel Janulgue2014-08-311-1/+17
| | | | | | | | | | | | | | When sel conditon is bounded within 0 and 1.0. This allows code as: mov.sat a b sel.ge dst a 0.25F To be propagated as: sel.ge.sat dst b 0.25F v3: Syntax clarifications in inst->saturate assignment (Matt Turner) Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize clamp(x, b, 1.0), where b > 0.0 as max(saturate(x),b)Abdiel Janulgue2014-08-311-0/+23
| | | | | | | | | | | | v2: - Output max(saturate(x),b) instead of saturate(max(x,b)) - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is > 0.0 and inner constant is 1.0. - Fix comments to show that the optimization is a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize clamp(x, 0.0, b), where b < 1.0 as min(saturate(x),b)Abdiel Janulgue2014-08-311-0/+39
| | | | | | | | | | | v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin - Make sure we do component-wise comparison for vectors (Ian Romanick) v3: - Add missing condition where the outer constant value is zero and inner constant is < 1 - Fix comments to reflect we are doing a commutative operation (Matt Turner) Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Optimize clamp(x, 0, 1) as saturate(x)Abdiel Janulgue2014-08-311-0/+36
| | | | | | | | | | | v2: - Check that the base type is float (Ian Romanick) v3: - Make sure comments reflect that we are doing a commutative operation - Add missing condition where the inner constant is 1.0 and outer constant is 0.0 - Make indexing of operands easier to read (Matt Turner) Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Implement saturate as ir_unop_saturateAbdiel Janulgue2014-08-311-5/+1
| | | | | | | | | Now that we have the ir_unop_saturate implemented as a single instruction, generate the correct simplified expression. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* yi965/vec4: Add support for ir_unop_saturateAbdiel Janulgue2014-08-311-0/+4
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/fs: Add support for ir_unop_saturateAbdiel Janulgue2014-08-312-0/+5
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* ir_to_mesa, glsl_to_tgsi: Add support for ir_unop_saturateAbdiel Janulgue2014-08-312-0/+12
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* ir_to_mesa, glsl_to_tgsi: lower ir_unop_saturateAbdiel Janulgue2014-08-312-2/+9
| | | | | | | | Needed when vertex programs doesn't allow saturate Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* glsl: Add a pass to lower ir_unop_saturate to clamp(x, 0, 1)Abdiel Janulgue2014-08-312-0/+30
| | | | | | Signed-off-by: Abdiel Janulgue <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add constant evaluation of ir_unop_saturateAbdiel Janulgue2014-08-311-0/+6
| | | | | | | | v2: Use CLAMP macro (Ian Romanick) Signed-off-by: Abdiel Janulgue <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add ir_unop_saturateAbdiel Janulgue2014-08-313-0/+4
| | | | | | Signed-off-by: Abdiel Janulgue <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4/fs: Count loops in shader debugAbdiel Janulgue2014-08-312-4/+8
| | | | | Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>
* i965/vec4: inline generate_vec4_instruction() within generate_code()Abdiel Janulgue2014-08-312-316/+296
| | | | | | | | | | Suggested by Matt. This patch combines and moves back the code-generation functions from generate_vec4_instruction() into generate_code(). Makes generate_code() a bit larger, but helps us to count loops in a straightforward manner. Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Abdiel Janulgue <[email protected]>