Commit message | Author | Age | Files | Lines
* i965/fs: Perform CSE on texture operations. | Matt Turner | 2014-06-17 | 1 | -1/+10
|
| Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.
|
| instructions in affected programs: 792 -> 744 (-6.06%)
|
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/fs: Copy propagate from load_payload. | Matt Turner | 2014-06-17 | 1 | -0/+22
|
| But only into non-load_payload instructions. Otherwise we would prevent
| register coalescing from combining identical payloads.
* i965/fs: Perform CSE on load_payload instructions if it's not a copy. | Matt Turner | 2014-06-17 | 1 | -0/+18
|
| Since CSE creates instructions, if we let CSE generate things register
| coalescing can't remove, bad things will happen. Only let CSE combine
| non-copy load_payloads.
|
| E.g., allow CSE to handle this
|
|    load_payload vgrf4+0, vgrf5, vgrf6
|
| but not this
|
|    load_payload vgrf4+0, vgrf5+0, vgrf5+1
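The distinction between the two load_payload examples above can be sketched as a small predicate. This is only an illustration: `struct src_reg` and `is_copy_payload()` are hypothetical names, and the sketch assumes a "copy" means every source is consecutive offsets of one VGRF starting at offset 0.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified source operand: a virtual GRF number plus an
 * offset into it (vgrf5+1 -> reg = 5, reg_offset = 1). */
struct src_reg {
   int reg;
   int reg_offset;
};

/* Returns true when the sources are vgrfN+0, vgrfN+1, ... in order, i.e.
 * the load_payload merely copies one large VGRF. */
static bool
is_copy_payload(const struct src_reg *srcs, size_t count)
{
   assert(srcs != NULL && count > 0);

   for (size_t i = 0; i < count; i++) {
      if (srcs[i].reg != srcs[0].reg || srcs[i].reg_offset != (int)i)
         return false;
   }
   return true;
}
```

Under that assumption, CSE would only consider a load_payload for which the predicate returns false.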
* i965/fs: Support register coalescing on LOAD_PAYLOAD operands. | Matt Turner | 2014-06-17 | 1 | -10/+54
|
* i965/fs: Emit load_payload instead of multiple MOVs for large VGRFs. | Matt Turner | 2014-06-17 | 1 | -12/+21
|
* i965/fs: Only consider real sources when comparing instructions. | Matt Turner | 2014-06-17 | 1 | -4/+15
|
* i965/fs: Apply cube map array fixup and restore the payload. | Matt Turner | 2014-06-17 | 1 | -1/+14
|
| So that we don't have partial writes to a large VGRF. Will be cleaned up
| by register coalescing.
* i965/fs: Use LOAD_PAYLOAD in emit_texture_gen7(). | Matt Turner | 2014-06-17 | 1 | -62/+73
|
* i965/fs: Lower LOAD_PAYLOAD and clean up. | Matt Turner | 2014-06-17 | 2 | -0/+39
|
| Clean up with register_coalesce()/dead_code_eliminate().
* i965/fs: Add SHADER_OPCODE_LOAD_PAYLOAD. | Matt Turner | 2014-06-17 | 5 | -0/+33
|
| Will be used to simplify the handling of large virtual GRFs in SSA form.
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
* glsl: type check between switch init-expression and case | Tapani Pälli | 2014-06-17 | 1 | -3/+45
|
| Patch adds a type check between switch init-expression and case label
| and performs an implicit signed->unsigned type conversion when possible.
|
| v2: add GLSL spec reference, do implicit conversion if possible (Matt)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79724
| Reviewed-by: Matt Turner <[email protected]>
* nv50/ir: Remove NV50_SEMANTIC_VIEWPORTINDEX | Tobias Klausmann | 2014-06-16 | 2 | -2/+1
|
| Use TGSI_SEMANTIC_VIEWPORT_INDEX for the last consumer.
|
| Signed-off-by: Tobias Klausmann <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* docs: update GL3.txt, relnotes: mark GL_ARB_viewport_array as done for nvc0 | Tobias Klausmann | 2014-06-16 | 2 | -1/+2
|
| Signed-off-by: Tobias Klausmann <[email protected]>
| Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: implement multiple viewports/scissors, enable ARB_viewport_array | Tobias Klausmann | 2014-06-16 | 7 | -63/+113
|
| Signed-off-by: Tobias Klausmann <[email protected]>
| [imirkin: mark things dirty on ctx switch, 3d blit]
| Reviewed-by: Ilia Mirkin <[email protected]>
* nv50: make sure to mark first scissor dirty after blit | Ilia Mirkin | 2014-06-16 | 1 | -0/+1
|
| Signed-off-by: Ilia Mirkin <[email protected]>
| Cc: "10.2" <[email protected]>
* i965: Use 8x4 aligned rectangles for HiZ operations on Broadwell. | Kenneth Graunke | 2014-06-16 | 1 | -4/+16
|
| Like on Haswell, we need to use 8x4 aligned rectangle primitives for
| hierarchical depth buffer resolves and depth clears. See the comments
| in brw_blorp.cpp's brw_hiz_op_params() constructor. (The Broadwell
| documentation confirms that this is still necessary.)
|
| This patch makes the Broadwell code follow the same behavior as Chad
| and Jordan's Gen7 BLORP code. Based on a patch by Topi Pohjolainen.
|
| This fixes es3conform's framebuffer_blit_functionality_scissor_blit
| test, with no Piglit regressions.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
| Cc: "10.2" <[email protected]>
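As a rough illustration of the 8x4 alignment described above, the rectangle dimensions can be rounded up to multiples of 8 and 4. The names `align_up()`, `struct rect` and `hiz_aligned_rect()` are hypothetical and not taken from the driver; this is a sketch of the arithmetic only.

```c
#include <assert.h>

/* Round value up to the next multiple of alignment (a power of two). */
static unsigned
align_up(unsigned value, unsigned alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical rectangle type used only for this sketch. */
struct rect {
   unsigned width;
   unsigned height;
};

/* Width is padded to a multiple of 8, height to a multiple of 4. */
static struct rect
hiz_aligned_rect(unsigned width, unsigned height)
{
   struct rect r = { align_up(width, 8), align_up(height, 4) };
   return r;
}
```

For example, a 13x5 miplevel would be covered by a 16x8 rectangle, while an 8x4 one is left unchanged.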
* i965: Make INTEL_DEBUG=mip print out whether HiZ is enabled. | Kenneth Graunke | 2014-06-16 | 1 | -0/+2
|
| We only enable HiZ for miplevels which are aligned on 8x4 blocks. When
| debugging HiZ failures, it's useful to know whether a particular
| miplevel is using HiZ or not.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
* glsl/cs: Fix local_size_y and local_size_z | Jordan Justen | 2014-06-16 | 1 | -1/+1
|
| flags.q.local_size has 3 bits. One each for x, y and z.
|
| Fixes piglit's:
|  * spec/ARB_compute_shader/linker/mismatched_local_work_sizes
|  * spec/ARB_compute_shader/compiler/default_local_size.comp
|  * spec/ARB_compute_shader/compiler/work_group_size_too_large
|  * spec/ARB_compute_shader/compiler/gl_WorkGroupSize_matches_layout.comp
|
| This was regressed in 738c9c3c.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Chris Forbes <[email protected]>
* main/extensions: Only parse MESA_EXTENSION_OVERRIDE once | Jordan Justen | 2014-06-16 | 1 | -74/+40
|
| Previously, we would parse MESA_EXTENSION_OVERRIDE each time a context
| was created. Now we will save the results of that parsing and use it
| during context initialization.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* main/extensions: Build list of extensions that can't be disabled | Jordan Justen | 2014-06-16 | 1 | -5/+20
|
| This will allow us to utilize the early MESA_EXTENSION_OVERRIDE parsing
| at the later extension string initialization step.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* main/extensions: Create extra extensions override string | Jordan Justen | 2014-06-16 | 1 | -0/+38
|
| This will allow us to utilize the early MESA_EXTENSION_OVERRIDE parsing
| at the later extension string initialization step.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* i965/cs: Use override structure rather than separate env var | Jordan Justen | 2014-06-16 | 2 | -4/+2
|
| In 25268b93, we added a new environment variable
| (INTEL_COMPUTE_SHADER) to allow some constant values to be upgraded for
| the ARB_compute_shader extension.
|
| Now, we can look to see if the extension was enabled via the
| MESA_EXTENSION_OVERRIDE environment variable.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* main/extensions: Add early extension override structures | Jordan Justen | 2014-06-16 | 3 | -0/+59
|
| During the early one_time_init phase of context creation, we initialize
| two global gl_extensions structures. We read the
| MESA_EXTENSION_OVERRIDE environment variable, and store positive and
| negative overrides in two structures:
|
|  * struct gl_extensions _mesa_extension_override_enables
|  * struct gl_extensions _mesa_extension_override_disables
|
| These are filled before the driver initializes extensions and
| constants, therefore the driver can make adjustments based on the
| desired overrides.
|
| This can be useful during development of a new extension where the
| extension is only partially ready. The driver can't actually advertise
| support for the extension, but if it sees that the override is set for
| the extension, then it can expose more supported parts of the
| extension, such as upgrading context constants.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* main/extensions: Create a context-less set_extensions function | Jordan Justen | 2014-06-16 | 1 | -5/+20
|
| We will add new gl_extensions structures that capture the environment
| variable extension overrides and are available early in context
| creation. This will allow a driver to take actions during its
| initialization based on the extension overrides.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
* main/extensions: Don't advertise unknown extensions overrides with (-) | Jordan Justen | 2014-06-16 | 1 | -1/+1
|
| Previously setting:
|
|    MESA_EXTENSION_OVERRIDE=-GL_MESA_ham_sandwich
|
| Would cause Mesa to advertise support for the GL_MESA_ham_sandwich
| extension, even though the override specifically asked for it to be
| disabled.
|
| Signed-off-by: Jordan Justen <[email protected]>
| Reviewed-by: Anuj Phogat <[email protected]>
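The behavior described above can be illustrated with a small token classifier. It is a sketch under assumptions: the `known_extensions` table, `is_known_extension()` and `advertise_unknown_override()` are hypothetical, and a token without a prefix is assumed to mean "enable".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical list of extensions the driver knows about. */
static const char *const known_extensions[] = {
   "GL_ARB_compute_shader",
   "GL_ARB_viewport_array",
};

static bool
is_known_extension(const char *name)
{
   size_t n = sizeof(known_extensions) / sizeof(known_extensions[0]);
   for (size_t i = 0; i < n; i++) {
      if (strcmp(known_extensions[i], name) == 0)
         return true;
   }
   return false;
}

/* Decide whether one override token names an unknown extension that
 * should be appended to the advertised extension string. A leading '-'
 * asks for the extension to be disabled, so it must never be added. */
static bool
advertise_unknown_override(const char *token)
{
   assert(token != NULL);

   bool enable = true;
   if (token[0] == '+') {
      token++;
   } else if (token[0] == '-') {
      enable = false;
      token++;
   }
   return enable && token[0] != '\0' && !is_known_extension(token);
}
```

With this check, `-GL_MESA_ham_sandwich` is rejected while `+GL_MESA_ham_sandwich` would still be appended.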
* radeonsi: fixup sizes of shader resource and sampler arrays | Marek Olšák | 2014-06-16 | 1 | -2/+2
|
| This was wrong for a very long time. I wonder if the array size has any
| effect on anything.
|
| Reviewed-by: Christian König <[email protected]>
* scons: Link libGL.so against xcb-dri2. | José Fonseca | 2014-06-16 | 1 | -1/+1
|
| Fixing undefined xcb_dri2_* symbols.
|
| Trivial.
* r600g/radeonsi: Remove default case from PIPE_COMPUTE_CAP_* switch | Michel Dänzer | 2014-06-16 | 1 | -4/+3
|
| This way, the compiler warns about unhandled caps.
|
| Reviewed-by: Marek Olšák <[email protected]>
* docs: update ARB_explicit_uniform_location status | Tapani Pälli | 2014-06-16 | 2 | -1/+2
|
| + modify release notes for 10.3
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Petri Latvala <[email protected]>
* Enable GL_ARB_explicit_uniform_location in the drivers. | Tapani Pälli | 2014-06-16 | 3 | -0/+3
|
| v2: enable also for i915 (Ian)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Petri Latvala <[email protected]>
* glsl: parser changes for GL_ARB_explicit_uniform_location | Tapani Pälli | 2014-06-16 | 4 | -0/+54
|
| Patch adds a preprocessor define for the extension and stores explicit
| location data for uniforms during AST->HIR conversion. It also sets
| layout token to be available when having the extension in place.
|
| v2: change parser check to require GLSL 330 or enabling
|     GL_ARB_explicit_attrib_location (Ian)
| v3: fix the check and comment in AST->HIR (Petri)
|
| Signed-off-by: Tapani Pälli <[email protected]>
* glsl: add enable bit for ARB_explicit_uniform_location | Tapani Pälli | 2014-06-16 | 2 | -0/+3
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* mesa: support inactive uniforms in glUniform* functions | Tapani Pälli | 2014-06-16 | 1 | -0/+15
|
| Support inactive uniforms that have explicit location set in
| glUniform* functions.
|
| v2: remove unnecessary extension check, use new define (Ian)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glsl/linker: assign explicit uniform locations | Tapani Pälli | 2014-06-16 | 1 | -5/+56
|
| Patch refactors the existing uniform processing so explicit locations
| are taken into account during variable processing. These locations are
| temporarily stored in gl_uniform_storage before actual locations are
| set.
|
| UNMAPPED_UNIFORM_LOC marks unset location so that we can use 0 as a
| valid explicit location.
|
| When locations are set, UniformRemapTable is first populated with
| uniforms that have explicit location set (inactive and active ones),
| rest are put after explicit location slots.
|
| v2: introduce define for locations that have not been set yet (Ian)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
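The two-step assignment described above can be sketched as follows. This is a simplification, not the linker's code: `UNMAPPED_LOC` stands in for UNMAPPED_UNIFORM_LOC with an assumed value of -1, `assign_uniform_locations()` is a hypothetical name, and every uniform is assumed to occupy exactly one slot.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for UNMAPPED_UNIFORM_LOC; the value -1 is an assumption. It
 * must differ from 0 so that 0 stays usable as an explicit location. */
#define UNMAPPED_LOC (-1)

/* explicit_loc[i] is the location requested for uniform i, or
 * UNMAPPED_LOC. Writes the final locations and returns the number of
 * remap slots in use. */
static int
assign_uniform_locations(const int *explicit_loc, int *final_loc,
                         size_t count)
{
   assert(explicit_loc != NULL && final_loc != NULL);

   int next_free = 0;

   /* Pass 1: honor explicit locations and find the first slot past them. */
   for (size_t i = 0; i < count; i++) {
      final_loc[i] = explicit_loc[i];
      if (explicit_loc[i] != UNMAPPED_LOC && explicit_loc[i] >= next_free)
         next_free = explicit_loc[i] + 1;
   }

   /* Pass 2: place the remaining uniforms after the explicit slots. */
   for (size_t i = 0; i < count; i++) {
      if (final_loc[i] == UNMAPPED_LOC)
         final_loc[i] = next_free++;
   }
   return next_free;
}
```

Doing the explicit pass first means an automatically assigned uniform can never land on a slot the shader reserved.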
* glsl/linker: initialize explicit uniform locations | Tapani Pälli | 2014-06-16 | 2 | -0/+119
|
| Patch initializes the UniformRemapTable for explicit locations. This
| needs to happen before optimizations to make sure all inactive
| uniforms get their explicit locations correctly.
|
| v2: fix initialization bug, introduce define for inactive uniforms (Ian)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: add glsl_type::uniform_locations() helper function | Tapani Pälli | 2014-06-16 | 2 | -0/+32
|
| This function calculates the number of unique values from
| glGetUniformLocation for the elements of the type.
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* mesa: add new enum MAX_UNIFORM_LOCATIONS | Tapani Pälli | 2014-06-16 | 5 | -0/+12
|
| Patch adds new implementation dependent value required by the
| GL_ARB_explicit_uniform_location extension. Default value for user
| assignable locations is calculated as sum of MaxUniformComponents for
| each stage.
|
| v2: fix descriptor in get_hash_params.py (Petri)
| v3: simpler formula for calculating initial value (Ian)
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
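The default described above is a plain sum over shader stages. A minimal sketch, assuming the per-stage limits are available as an array (the function name and the array layout are hypothetical):

```c
#include <assert.h>
#include <stddef.h>

/* Sum the per-stage uniform component limits to get a default number of
 * user-assignable uniform locations. */
static unsigned
default_max_uniform_locations(const unsigned *max_components_per_stage,
                              size_t num_stages)
{
   assert(max_components_per_stage != NULL || num_stages == 0);

   unsigned total = 0;
   for (size_t i = 0; i < num_stages; i++)
      total += max_components_per_stage[i];
   return total;
}
```

For three stages that each allow 4096 components, this yields 12288 locations.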
* mesa: add enable bit for ARB_explicit_uniform_location | Tapani Pälli | 2014-06-16 | 2 | -0/+2
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* glapi: add GL_ARB_explicit_uniform_location | Tapani Pälli | 2014-06-16 | 1 | -0/+6
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
* i965/vec4: Use the sampler for pull constant loads on Broadwell. | Kenneth Graunke | 2014-06-15 | 1 | -8/+8
|
| We've used the LD sampler message for pull constant loads on earlier
| hardware for some time, and also were already using it for the FS on
| Broadwell. This patch makes us use it for Broadwell VS/GS as well.
|
| I believe that when I wrote this code in 2012, we still used the data
| port in some cases, and I somehow neglected to convert it while
| rebasing.
|
| Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
| (n = 17). Many other applications should benefit similarly: this speeds
| up uniform array access in the VS, which is commonly used for skinning
| shaders, among other things.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
| Tested-by: Ben Widawsky <[email protected]>
| Cc: "10.2" <[email protected]>
* i965: Add missing newlines to a few perf_debug messages. | Kenneth Graunke | 2014-06-15 | 1 | -2/+2
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Cc: "10.2" <[email protected]>
* i965: Drop Broadwell perf_debugs about missing MOCS that aren't missing. | Kenneth Graunke | 2014-06-15 | 2 | -4/+0
|
| I actually added MOCS support for these things, but forgot to delete
| the corresponding perf_debug() warnings.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Cc: "10.2" <[email protected]>
* i965: Add missing MOCS setup for 3DSTATE_INDEX_BUFFER on Broadwell. | Kenneth Graunke | 2014-06-15 | 1 | -3/+1
|
| Somehow I missed this when adding all of the other MOCS values.
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Cc: "10.2" <[email protected]>
* i965/vec4: Fix dead code elimination for VGRFs of size > 1. | Kenneth Graunke | 2014-06-15 | 1 | -1/+2
|
| When faced with code such as:
|
|    mov vgrf31.0:UD, 960D
|    mov vgrf31.1:UD, vgrf30.xxxx:UD
|
| The dead code eliminator didn't consider reg_offsets, so it decided
| that the second instruction was writing to the same register as the
| first one, and eliminated the first one. But they're actually
| different registers.
|
| This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above
| code, vgrf31.0 represents the offset into the shader_time buffer where
| the data should be written, and vgrf31.1 represents the actual time
| data. With a completely undefined offset, results were...unexpected.
|
| I think this is probably one of the few cases (maybe only case) where
| we generate multiple MOVs to a large VGRF. Normally, we just use them
| as texturing results; the other SEND-from-GRF uses a size 1 VGRF.
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Cc: [email protected]
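The comparison at the heart of this fix can be sketched in isolation. The struct and function below are hypothetical simplifications, assuming a destination is identified by a VGRF number plus an offset within it (vgrf31.1 -> reg = 31, reg_offset = 1):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified destination operand. */
struct dst_reg {
   int reg;
   int reg_offset;
};

/* A later write only makes an earlier one dead if both target the same
 * VGRF *and* the same offset within it. Comparing reg alone would treat
 * vgrf31.0 and vgrf31.1 as one register. */
static bool
overwrites_same_register(const struct dst_reg *earlier,
                         const struct dst_reg *later)
{
   assert(earlier != NULL && later != NULL);
   return earlier->reg == later->reg &&
          earlier->reg_offset == later->reg_offset;
}
```

With the offset included, the two MOVs in the example are seen as writes to different registers and the first one is kept.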
* i965: Add SHADER_OPCODE_SHADER_TIME_ADD to dump_instructions() decode. | Kenneth Graunke | 2014-06-15 | 1 | -0/+2
|
| "shader_time_add" is a lot more informative than "op152".
|
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* glsl: Fix clang mismatched-tags warnings with glsl_type. | Vinson Lee | 2014-06-15 | 1 | -1/+1
|
| Fix clang mismatched-tags warnings introduced with commit
| 4f5445a45d3ed02e00a061b10c943c0b079c6020.
|
| ./glsl_symbol_table.h:37:1: warning: class 'glsl_type' was previously declared as a struct [-Wmismatched-tags]
| class glsl_type;
| ^
| ./glsl_types.h:86:8: note: previous use is here
| struct glsl_type {
|        ^
| ./glsl_symbol_table.h:37:1: note: did you mean struct here?
| class glsl_type;
| ^~~~~
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* mesa/drivers: Fix clang constant-logical-operand warnings. | Vinson Lee | 2014-06-14 | 4 | -13/+13
|
| This patch fixes several clang constant-logical-operand warnings such
| as the following.
|
| ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: warning: use of logical '||' with constant operand [-Wconstant-logical-operand]
|    if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
|                               ^  ~~~~~~~~~~~
| ../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: note: use '|' for a bitwise operation
|    if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
|                               ^~
|                               |
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* glsl: Correct more typos | Chris Forbes | 2014-06-15 | 2 | -2/+2
|
| Signed-off-by: Chris Forbes <[email protected]>
* radeon/compute: Always report at least 1 compute unit | Tom Stellard | 2014-06-13 | 1 | -1/+1
|
| Some apps will abort if they detect 0 compute units. This fixes
| crashes in some OpenCV tests.
* meta_blit: properly compute texture width for the CopyTexSubImage fallback | Jason Ekstrand | 2014-06-13 | 1 | -1/+1
|
| Cc: "10.2" <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>