Commit message [Author, Age, Files, Lines]
* i965/vec4: Perform CSE on CMP(N) instructions. [Matt Turner, 2014-07-06, 1 file, -1/+16]
| Port of commit b16b3c87 to the vec4 code.
|
| No shader-db improvements, but might as well. The fs backend saw an improvement because it's scalar and multiple identical CMP instructions were generated by the SEL peepholes.
* i965/vec4: Don't emit null MOVs in CSE. [Matt Turner, 2014-07-06, 1 file, -5/+7]
| Port of commit 219b43c6 to the vec4 code.
* i965/vec4: Improve CSE performance by expiring some available expressions. [Matt Turner, 2014-07-06, 1 file, -0/+20]
| Port of commit 5daf867f to the vec4 code.
* i965/vec4: Add basic common subexpression elimination. [Kenneth Graunke, 2014-07-06, 4 files, -0/+236]
| [mattst88]: Modified to perform CSE on instructions with the same writemask. Offered no improvement before.
|
| total instructions in shared programs: 1995633 -> 1995185 (-0.02%)
| instructions in affected programs: 14410 -> 13962 (-3.11%)
|
| Reviewed-by: Matt Turner <[email protected]>
| Signed-off-by: Kenneth Graunke <[email protected]>
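The commit above adds local common subexpression elimination keyed on instructions with the same writemask. As a rough illustration of the general technique (not the Mesa implementation, which is C++ and works on vec4 IR), a minimal Python sketch might look like the following. The `Inst` type, the register names, and the conservative whole-register invalidation are all assumptions made for this example.

```python
from typing import NamedTuple


class Inst(NamedTuple):
    """Hypothetical, simplified instruction: not the real vec4_instruction."""
    opcode: str
    dst: str
    srcs: tuple
    writemask: str = "xyzw"


def local_cse(block):
    """Replace repeated expressions inside one basic block with MOVs."""
    available = {}  # (opcode, srcs, writemask) -> register holding the result
    out = []
    for inst in block:
        key = (inst.opcode, inst.srcs, inst.writemask)
        prev = available.get(key) if inst.opcode != "MOV" else None
        if prev is not None:
            # Same opcode, sources and writemask were already computed:
            # copy the earlier result instead of recomputing it.
            out.append(Inst("MOV", inst.dst, (prev,), inst.writemask))
        else:
            out.append(inst)
        # Writing inst.dst expires every expression that reads it or
        # whose saved result lives in it (conservative, per register).
        available = {
            k: v for k, v in available.items()
            if inst.dst not in k[1] and v != inst.dst
        }
        if prev is None and inst.opcode != "MOV" and inst.dst not in inst.srcs:
            available[key] = inst.dst
    return out
```

For example, two `ADD` instructions with sources `("r0", "r1")` and the same writemask collapse into one `ADD` plus a `MOV`, while an `ADD` that follows a write to `r0` is left alone because its inputs have changed.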
* i965: Fix warnings introduced in commit e24ef5ab. [Matt Turner, 2014-07-06, 1 file, -2/+1]
| Reviewed-by: Kenneth Graunke <[email protected]>
* gallium/radeon: use PRIX64 instead of PRIu64 [Christian König, 2014-07-06, 2 files, -2/+2]
| We want hex values here, not decimals.
|
| Signed-off-by: Christian König <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* i965: Move assembly annotation functions to intel_asm_annotation.c. [Matt Turner, 2014-07-05, 4 files, -61/+67]
| It's C. Compile it as such.
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename intel_asm_printer -> intel_asm_annotation. [Matt Turner, 2014-07-05, 8 files, -7/+7]
| The #ifndef include guards already said the right thing :)
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make backend_instruction usable from C. [Matt Turner, 2014-07-05, 1 file, -4/+7]
| With a hack to place an exec_node in the struct in C to be at the same location as the inherited exec_node in C++.
|
| Acked-by: Topi Pohjolainen <[email protected]>
* i965/cfg: Make cfg_t usable from C. [Matt Turner, 2014-07-05, 3 files, -8/+6]
| Acked-by: Topi Pohjolainen <[email protected]>
* i965: Repack backend_instruction struct. [Matt Turner, 2014-07-05, 1 file, -7/+5]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a brw_predicate enum. [Matt Turner, 2014-07-05, 6 files, -31/+35]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a brw_conditional_mod enum. [Matt Turner, 2014-07-05, 18 files, -43/+54]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move common fields into backend_instruction. [Matt Turner, 2014-07-05, 3 files, -25/+13]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use enum brw_reg_type for register types. [Matt Turner, 2014-07-05, 7 files, -13/+14]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move is_zero/one/null/accumulator into backend_reg. [Matt Turner, 2014-07-05, 6 files, -93/+44]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Make a common backend_reg class. [Matt Turner, 2014-07-05, 4 files, -42/+36]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop imm union from visitor register classes. [Matt Turner, 2014-07-05, 2 files, -14/+0]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use immediate storage in brw_reg for visitor regs. [Matt Turner, 2014-07-05, 6 files, -41/+37]
| Reviewed-by: Topi Pohjolainen <[email protected]>
* docs: add news item for mesa-demos 8.2.0 release [Andreas Boll, 2014-07-05, 1 file, -0/+8]
* glsl: Fix merging of layout(invocations) with other qualifiers [Chris Forbes, 2014-07-05, 1 file, -0/+10]
| If another layout qualifier appeared to the left of `invocations` in the GS input layout declaration, the invocation count would be dropped on the floor.
|
| Fixes the piglit tests:
|   spec/ARB_transform_feedback3/arb_transform_feedback3-ext_interleaved_two_bufs_gs_max
|   spec/ARB_gpu_shader5/arb_gpu_shader5-invocation-id
|   spec/ARB_gpu_shader5/compiler/correct-multiple-layout-qualifier-invocations.geom
|   spec/ARB_gpu_shader5/execution/invocations-conflicting
|
| Signed-off-by: Chris Forbes <[email protected]>
| Tested-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
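The bug described above is a merge that loses one field when qualifiers are combined left to right. A minimal sketch of a merge that carries every field forward is shown below. It is written in Python for brevity and does not mirror Mesa's `ast_type_qualifier` code; the dictionary representation, the field names, and the choice to raise on conflicting values are assumptions of this example.

```python
def merge_layout_qualifiers(qualifiers):
    """Merge layout qualifiers left to right without dropping any field.

    Each qualifier is a hypothetical dict such as {"invocations": 4}.
    """
    merged = {}
    for qualifier in qualifiers:
        for name, value in qualifier.items():
            if name in merged and merged[name] != value:
                # Assumption: conflicting values are reported, not overwritten.
                raise ValueError(f"conflicting values for {name!r}")
            merged[name] = value
    return merged
```

With this shape, `merge_layout_qualifiers([{"prim_type": "triangles"}, {"invocations": 4}])` keeps the invocation count even though another qualifier appears to its left.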
* nvc0: add a memory barrier when there are persistent UBOs [Ilia Mirkin, 2014-07-03, 5 files, -4/+57]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "10.2" <[email protected]>
* nv50: do an explicit flush on draw when there are persistent buffers [Ilia Mirkin, 2014-07-03, 3 files, -2/+50]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "10.2" <[email protected]>
* nv50: disable dedicated ubo upload method [Ilia Mirkin, 2014-07-03, 1 file, -0/+7]
| The hardware allows multiple simultaneous renders with the same memory-backed constbufs but with each invocation having different values. However in order for that to work, the data has to be streamed in via the right constbuf slot. We weren't doing that for UBOs.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "10.2 10.1" <[email protected]>
* gallium: rename PIPE_CAP_TGSI_VS_LAYER to also have _VIEWPORT [Ilia Mirkin, 2014-07-03, 17 files, -18/+20]
| Now that this cap is used to determine the availability of both, adjust its name to reflect the new reality.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* mesa/st: enable AMD_vertex_shader_viewport_index [Ilia Mirkin, 2014-07-03, 3 files, -1/+7]
| The assumption is that any driver capable of emitting layer from the vertex shader and supporting viewports should be able to also handle emitting viewport index from the vertex shader.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Tested-by: Tobias Droste <[email protected]>
* r600g: allow vs to write to gl_ViewportIndex [Ilia Mirkin, 2014-07-03, 1 file, -0/+17]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Tested-by: Tobias Droste <[email protected]>
* svga: Don't unnecessarily reemit BindGBShader commands v2 [Thomas Hellstrom, 2014-07-03, 3 files, -20/+8]
| The Linux winsys can no longer relocate shader code, so avoid reemitting BindGBShader commands. They are costly.
|
| v2: Correctly handle errors from SVGA3D_BindGBShader()
|
| Reported-by: Michael Banack <[email protected]>
| Signed-off-by: Thomas Hellstrom <[email protected]>
| Tested-by: Brian Paul <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Jakob Bornecrantz <[email protected]>
* radeon/llvm: Allocate space for kernel metadata operands [Aaron Watry, 2014-07-03, 1 file, -3/+7]
| Previously, we were assuming that kernel metadata nodes only had 1 operand.
|
| Kernels which have attributes can have more than 1, e.g.:
|   !0 = metadata !{void (i32 addrspace(1)*)* @testKernel, metadata !1}
|   !1 = metadata !{metadata !"work_group_size_hint", i32 4, i32 1, i32 1}
|
| Attempting to get the kernel without the correct number of attributes led to memory corruption and luxrays crashing out.
|
| Fixes the cl/program/execute/attributes.cl piglit test.
|
| Signed-off-by: Aaron Watry <[email protected]>
| Reviewed-by: Tom Stellard <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76223
| CC: "10.2" <[email protected]>
* glsl: fix duplicated layout qualifier detection for GS [Samuel Iglesias Gonsalvez, 2014-07-03, 1 file, -6/+16]
| This patch fixes the duplicated layout qualifier detection for geometry shader's layout qualifiers.
|
| Also it makes the detection code more legible by defining allowed_duplicates_mask variable.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80778
|
| Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
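The commit message mentions an `allowed_duplicates_mask` variable that makes the duplicate check easier to read. A sketch of how such a mask-based check can work is given below. Only the name `allowed_duplicates_mask` comes from the log: the flag names, the bit layout, and which flags are allowed to repeat are hypothetical and chosen purely for illustration.

```python
# Hypothetical qualifier flags, one bit each (not Mesa's real bit layout).
PRIM_TYPE = 1 << 0
MAX_VERTICES = 1 << 1
INVOCATIONS = 1 << 2
LOCATION = 1 << 3

# Assumption for this sketch: these qualifiers may legally appear twice.
ALLOWED_DUPLICATES_MASK = PRIM_TYPE | MAX_VERTICES | INVOCATIONS


def has_illegal_duplicate(existing_flags, new_flags):
    """True if a flag outside the allowed mask is set on both sides."""
    return (existing_flags & new_flags & ~ALLOWED_DUPLICATES_MASK) != 0
```

Keeping the permitted repeats in one mask means the check stays a single expression, and adding a newly permitted qualifier only touches the mask definition.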
* svga: add switch cases for PIPE_SHADER_CAP_DOUBLES [Brian Paul, 2014-07-03, 1 file, -0/+4]
| Signed-off-by: Brian Paul <[email protected]>
* st/xa: Don't close the drm fd on failure v2 [Thomas Hellstrom, 2014-07-03, 1 file, -1/+6]
| If XA fails to initialize with pipe_loader enabled, the pipe_loader's cleanup function will close the drm file descriptor. That's pretty bad because the file descriptor will probably be the X server driver's only connection to drm. Temporarily solve this by dup()'ing the file descriptor before handing it over to the pipe loader.
|
| This fixes freedesktop.org bugzilla bug #80645.
|
| v2: Fix CC addresses.
|
| Cc: "10.2" <[email protected]>
| Signed-off-by: Thomas Hellstrom <[email protected]>
| Reviewed-by: Jakob Bornecrantz <[email protected]>
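The workaround above hands the loader a duplicate of the descriptor so that a cleanup path which closes it cannot take away the caller's own connection. The idea can be sketched with Python's `os.dup`, which wraps the same POSIX call. The function names `init_xa` and `failing_loader` are invented for this example and do not correspond to functions in the XA state tracker.

```python
import os


def failing_loader(fd):
    """Stand-in for a loader whose cleanup closes the fd it was given."""
    os.close(fd)
    return None  # signals that initialization failed


def init_xa(fd, loader):
    """Give the loader its own copy of fd; the caller's fd stays open."""
    loader_fd = os.dup(fd)  # the loader owns (and may close) this copy
    return loader(loader_fd)
```

After `init_xa(fd, failing_loader)` returns `None`, the caller can still use `fd`, because only the duplicate was closed.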
* Revert "radeonsi: Use dma_copy when possible for si_blit." [Michel Dänzer, 2014-07-03, 1 file, -19/+0]
| This reverts commit 5d5c20920e0e570742a497aa047e99a2fa3c04f2.
|
| Caused visual corruption, see e.g. https://bugs.freedesktop.org/show_bug.cgi?id=80827#c1
* i965: expose AMD_vertex_shader_viewport_index on gen7+ [Ilia Mirkin, 2014-07-02, 2 files, -1/+4]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* glsl: add support for AMD_vertex_shader_viewport_index [Ilia Mirkin, 2014-07-02, 4 files, -0/+8]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
| Tested-by: Tobias Droste <[email protected]>
* mesa: add support for AMD_vertex_shader_viewport_index [Ilia Mirkin, 2014-07-02, 2 files, -0/+2]
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
| Tested-by: Tobias Droste <[email protected]>
* mesa/st: enable ARB_fragment_layer_viewport [Ilia Mirkin, 2014-07-02, 3 files, -1/+3]
| If multiple viewports are supported, that implies the presence of a GS and layered rendering, so we can enable ARB_fragment_layer_viewport as well.
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* i965/gen6: Add a spec citation about push constant packet requirements. [Eric Anholt, 2014-07-02, 1 file, -1/+8]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Add a comment about null renderbuffer surfaces and why they exist. [Eric Anholt, 2014-07-02, 3 files, -2/+22]
| I noticed this when trying to find comments about pull constant buffers.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Update a ton of comments about constant buffers. [Eric Anholt, 2014-07-02, 4 files, -32/+74]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Merge VS/GS and WM pull constant buffer upload paths. [Eric Anholt, 2014-07-02, 4 files, -53/+42]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/gen6+: Merge VS/GS and WM push constant buffer upload paths. [Eric Anholt, 2014-07-02, 4 files, -66/+52]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Move dispatch_grf_start_reg and first_curbe_grf into stage_prog_data. [Eric Anholt, 2014-07-02, 14 files, -35/+36]
| I wanted to access this value from stage-generic code, so stop storing it under two different names.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix state flags for gen4/5 CURBE. [Eric Anholt, 2014-07-02, 1 file, -8/+8]
| If we had some NOS affecting VS compilation that resulted in optimization changing the set of constants to be uploaded, we might not have reuploaded the constants.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Remove a dead define. [Eric Anholt, 2014-07-02, 1 file, -2/+0]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reuse libdrm's header for AUB definitions. [Eric Anholt, 2014-07-02, 4 files, -71/+7]
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix stale comments about the state cache. [Eric Anholt, 2014-07-02, 2 files, -2/+9]
| This changed in the state streaming work years ago.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Fix stale binding table comment. [Eric Anholt, 2014-07-02, 1 file, -2/+0]
| I recently moved the code from the mentioned location right into this file.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Drop the memcmp for finding duplicated CURBE uploads. [Eric Anholt, 2014-07-02, 4 files, -50/+2]
| At this point, the extra copy of the data and memcmp are as expensive as just re-uploading.
|
| Note: now that we'll always upload, and brw_constant_buffer watches BRW_NEW_BATCH anyway, we don't need to explicitly unref the old curbe_bo at batch reset time.
|
| No significant performance difference on glamor copywinwin10 (n=55), despite that test having a 98% hit rate on the cache.
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Reuse intel_upload.c for gen4/5 constant buffers. [Eric Anholt, 2014-07-02, 3 files, -31/+7]
| No performance difference on glamor with copywinwin10 (n=40) on my gm45.
|
| Reviewed-by: Kenneth Graunke <[email protected]>