Commit message | Author | Age | Files | Lines
* st/dri/drm: remove __driDriverExtensions and driDriverAPI | Emil Velikov | 2014-07-10 | 4 | -22/+29
| | | | | | | | | | | | ... and use libmegadriver_stub as their provider. Teach scons how to build the library archive and use it. v2: scons: fix build on a drm-less system. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: cleanup conversion leftovers | Emil Velikov | 2014-07-10 | 2 | -38/+4
| | | | | | | | | | With all the users converted to __driGetExtensions_* we can have only a single inclusion of the required header + define. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: update scons build to handle __driDriverGetExtensions_vmwgfx | Emil Velikov | 2014-07-10 | 1 | -0/+5
| | | | | | | | | | Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Cc: Jakob Bornecrantz <jakob@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_vmwgfx | Emil Velikov | 2014-07-10 | 2 | -0/+17
| | | | | | | | | | | | Identical to previous commits - will bring us a step closer to megadrivers. Cc: Jose Fonseca <jfonseca@vmware.com> Cc: Brian Paul <brianp@vmware.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_i965 symbol | Emil Velikov | 2014-07-10 | 2 | -0/+17
| | | | | | | | | | | Identical to previous commits - will bring us a step closer to megadrivers. Cc: Chia-I Wu <olv@lunarg.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_i915 symbol | Emil Velikov | 2014-07-10 | 2 | -0/+17
| | | | | | | | | | | Identical to previous commits - will bring us a step closer to megadrivers. Cc: Stephane Marchesin <stephane.marchesin@gmail.com> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_freedreno symbol | Emil Velikov | 2014-07-10 | 2 | -0/+17
| | | | | | | | | | | Identical to previous two commits - will bring us a step closer to megadrivers. Cc: Rob Clark <robclark@freedesktop.org> Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_(r300|r600|radeonsi) symbols | Emil Velikov | 2014-07-10 | 2 | -0/+41
| | | | | | | | | | | | The symbol is introduced by the mesa megadrivers, and adding gallium support for it will allow us to merge st/dri/drm and st/dri/sw. Resulting in a single dri library across all of gallium. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* targets/dri: Add __driDriverGetExtensions_nouveau symbol | Emil Velikov | 2014-07-10 | 5 | -0/+46
| | | | | | | | | | | | | The symbol is introduced by the mesa megadrivers, and adding gallium support for it will allow us to merge st/dri/drm and st/dri/sw. Resulting in a single dri library across gallium. v2: Rebase on top of gallium dri3. Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com> Tested-by: Rob Clark <robclark@freedesktop.org> Tested-by: Thomas Helland <thomashelland90 at gmail.com> Acked-by: Tom Stellard <thomas.stellard@amd.com>
* tgsi: add interpolation location modifier support to text parser | Ilia Mirkin | 2014-07-09 | 1 | -0/+17
| | | | Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* mesa/st: add per sample shading state to fp key and set interpolation | Ilia Mirkin | 2014-07-09 | 3 | -1/+11
| | | | | | | | | | | This enables a gallium driver not to care about the semantics of ARB_sample_shading vs ARB_gpu_shader5 sample attributes. When ARB_sample_shading-style sample shading is enabled, all of the fp inputs are marked for interpolation at the sample location. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* gallium: switch dedicated centroid field to interpolation location | Ilia Mirkin | 2014-07-09 | 17 | -31/+57
| | | | | | | | The new location field can be either center, centroid, or sample, which indicates the location that the shader should interpolate at. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* meta: Call glObjectLabel before linking. | Kenneth Graunke | 2014-07-09 | 1 | -1/+1
| | | | | | | | | | | i965 precompiles shaders at link time, and prints a disassembly if INTEL_DEBUG=vs,gs,fs, including the shader name. However, blit shaders were showing up as "unnamed" since we hadn't set a name prior to linking. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Jordan Justen <jordan.l.justen@intel.com> Reviewed-by: Matt Turner <mattst88@gmail.com>
* ff_fragment_shader: Access glsl_types directly. | Kenneth Graunke | 2014-07-09 | 1 | -15/+15
| | | | | | | | | Originally, we didn't have direct accessors for all of the GLSL types, so the only way to get at them was to use the symbol table. Now, we can just get at them directly, which is simpler and faster. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Ben Widawsky <ben@bwidawsk.net>
* st/mesa: add PIPE_FORMAT_R10G10B10A2_UNORM to format_map table | Brian Paul | 2014-07-09 | 1 | -1/+2
| | | | | | as a candidate for the GL_RGB10_A2 internal texture format. Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* st/mesa: add some missing MESA/PIPE_FORMAT_R10G10B10A2_UNORM switch cases | Brian Paul | 2014-07-09 | 1 | -0/+4
| | | | Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* glsl/glcpp: Don't choke on an empty pragma | Carl Worth | 2014-07-09 | 3 | -1/+8
| | | | | | | | | | | | | | | | | | | | | | | | | The lexer was insisting that there be at least one character after "#pragma" and before the end of the line. This caused an error for a line consisting only of "#pragma", which violates at least the following sentence from the GLSL ES Specification 3.00.4: The scope as well as the effect of the optimize and debug pragmas is implementation-dependent except that their use must not generate an error. [Page 12 (Page 28 of PDF)] and likely the following sentence from that specification and also in GLSLangSpec 4.30.6: If an implementation does not recognize the tokens following #pragma, then it will ignore that pragma. Add a "make check" test to ensure no future regressions. This change fixes at least part of the following Khronos GLES3 CTS test: preprocessor.pragmas.pragma_vertex Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl/glcpp: Promote "extra token at end of directive" from warning to error | Carl Worth | 2014-07-09 | 3 | -1/+14
| | | | | | | | | | | | | | | | We've always warned about this case, but a recent conformance test expects this to be an error that causes compilation to fail. Make it so. Also add a "make check" test to ensure these errors are generated. This fixes the following Khronos GLES3 conformance tests: invalid_conditionals.tokens_after_ifdef_vertex invalid_conditionals.tokens_after_ifdef_fragment invalid_conditionals.tokens_after_ifndef_vertex invalid_conditionals.tokens_after_ifndef_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl/glcpp: Once again report undefined macro name in error message. | Carl Worth | 2014-07-09 | 3 | -38/+86
| | | | | | | | | | | While writing the previous commit message, I just felt bad documenting the shortcoming of the change, (that undefined macro names would not be reported in error messages). Fix this by preserving the first-encountered undefined macro name and reporting that in any resulting error message. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* glsl/glcpp: Add short-circuiting for || and && in #if/#elif for OpenGL ES. | Carl Worth | 2014-07-09 | 4 | -30/+140
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The GLSL ES Specification 3.00.4 says: #if, #ifdef, #ifndef, #else, #elif, and #endif are defined to operate as for C++ except for the following: ... • Undefined identifiers not consumed by the defined operator do not default to '0'. Use of such identifiers causes an error. [Page 11 (page 127 of the PDF file)] as well as: The semantics of applying operators in the preprocessor match those standard in the C++ preprocessor with the following exceptions: • The 2nd operand in a logical and ('&&') operation is evaluated if and only if the 1st operand evaluates to non-zero. • The 2nd operand in a logical or ('||') operation is evaluated if and only if the 1st operand evaluates to zero. If an operand is not evaluated, the presence of undefined identifiers in the operand will not cause an error. (Note that neither of these deviations from C++ preprocessor behavior applies to non-ES GLSL, at least as of specification version 4.30.6). The first portion of this, (generating an error for an undefined macro in an (short-circuiting to squelch errors), was not implemented previously, but is implemented in this commit. A test is added for "make check" to ensure this behavior. Note: The change as implemented does make the error message a bit less precise, (it just states that an undefined macro was encountered, but not the name of the macro). This commit fixes the following Khronos GLES3 conformance tests: undefined_identifiers.valid_undefined_identifier_1_vertex undefined_identifiers.valid_undefined_identifier_1_fragment undefined_identifiers.valid_undefined_identifier_2_vertex undefined_identifiers.valid_undefined_identifier_2_fragment Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
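The short-circuit rules quoted in this commit message can be illustrated with a small Python sketch. It is not the glcpp implementation: the tuple-based expression tree, the `evaluate` function, and the `UndefinedMacroError` class are hypothetical names invented for this example. It only shows that an undefined identifier raises an error when its operand is actually evaluated, and is ignored when short-circuiting skips it.

```python
class UndefinedMacroError(Exception):
    """Raised when an evaluated operand names a macro that is not defined."""


def evaluate(expr, macros):
    """Evaluate a tiny #if expression tree under the GLSL ES rules quoted above.

    expr is a nested tuple (a hypothetical representation for this sketch):
      ("int", 3), ("ident", "FOO"), ("defined", "FOO"),
      ("and", lhs, rhs), ("or", lhs, rhs)
    macros maps macro names to integer values.
    """
    kind = expr[0]
    if kind == "int":
        return expr[1]
    if kind == "defined":
        # The defined operator consumes the identifier, so it never errors.
        return 1 if expr[1] in macros else 0
    if kind == "ident":
        if expr[1] not in macros:
            raise UndefinedMacroError(expr[1])
        return macros[expr[1]]
    if kind == "and":
        # The 2nd operand is evaluated only if the 1st is non-zero.
        if evaluate(expr[1], macros) == 0:
            return 0
        return 1 if evaluate(expr[2], macros) != 0 else 0
    if kind == "or":
        # The 2nd operand is evaluated only if the 1st is zero.
        if evaluate(expr[1], macros) != 0:
            return 1
        return 1 if evaluate(expr[2], macros) != 0 else 0
    raise ValueError("unknown node: %r" % (kind,))
```

With no macros defined, `defined(FOO) && FOO` and `1 || UNDEF` evaluate without error because the second operand is skipped, while `0 || UNDEF` raises `UndefinedMacroError`.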
* glsl/glcpp: Fix glcpp to properly lex entire "preprocessing numbers" | Carl Worth | 2014-07-09 | 3 | -0/+81
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The preprocessor defines a notion of a "preprocessing number" that starts with either a digit or a decimal point, and continues with zero or more of digits, decimal points, identifier characters, or the sign symbols, ('-' and '+'). Prior to this change, preprocessing numbers were lexed as some combination of OTHER and IDENTIFIER tokens. This had the problem of causing undesired macro expansion in some cases. We add tests to ensure that the undesired macro expansion does not happen in cases such as: #define e +1 #define xyz -2 int n = 1e; int p = 1xyz; In either case these macro definitions have no effect after this change, so that the numeric literals, (whether valid or not), will be passed on as-is from the preprocessor to the compiler proper. This fixes the following Khronos GLES3 CTS tests: preprocessor.basic.correct_phases_vertex preprocessor.basic.correct_phases_fragment v2. Thanks to Anuj Phogat for improving the original regular expression, (which accepted a '+' or '-', where these are only allowed after one of [eEpP]). I also expanded the test to exercise this. v3. Also fixed regular expression to require at least one digit at the beginning (after an optional period). Otherwise, a string such as ".xyz" was getting sucked up as a preprocessing number, (where obviously this should be a field access). Again, I expanded the test to exercise this. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
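The lexing rule described in this commit message (including the v2 and v3 refinements) can be approximated with a Python regular expression. This is a sketch based only on the wording above, not a copy of the pattern in glcpp; `PP_NUMBER` and `lex_pp_number` are names made up for the example.

```python
import re

# Optional period, one required digit, then any mix of digits, periods,
# identifier characters, or a sign that directly follows one of [eEpP].
# The sign alternative is listed first because Python's re tries
# alternatives in order: "e+" must be attempted before a lone "e" is
# consumed as an identifier character.
PP_NUMBER = re.compile(r"\.?[0-9](?:[eEpP][-+]|[._a-zA-Z0-9])*")


def lex_pp_number(text):
    """Return the preprocessing number at the start of text, or None."""
    match = PP_NUMBER.match(text)
    return match.group(0) if match else None
```

Under this sketch `1e` and `1xyz` are each consumed as a single token, so macros named `e` or `xyz` are never seen as separate identifiers, while `.xyz` is not treated as a number at all.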
* glsl/glcpp: Fix glcpp to catch garbage after #if 1 ... #else | Carl Worth | 2014-07-09 | 7 | -16/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, a line such as: #else garbage would flag an error if it followed "#if 0", but not if it followed "#if 1". We fix this by setting a new bit of state (lexing_else) that allows the lexer to defer switching to the <SKIP> start state until after the NEWLINE following the #else directive. A new test case is added for: #if 1 #else garbage #endif which was untested before, (and did not generate the desired error). This fixes the following Khronos GLES3 CTS tests: tokens_after_else_vertex tokens_after_else_fragment Reviewed-by: Matt Turner <mattst88@gmail.com> Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
* glsl/glcpp: Fixup glcpp tests for redefining a macro with whitespace changes. | Carl Worth | 2014-07-09 | 3 | -1/+40
| | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the test suite was expecting the compiler to allow a redefinition of a macro with whitespace added, but gcc is more strict and allows only for changes in the amounts of whitespace, (but insists that whitespace exist or not in exactly the same places). See: https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html: These definitions are effectively the same: #define FOUR (2 + 2) #define FOUR         (2    +    2) #define FOUR (2 /* two */ + 2) but these are not: #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) This change adjusts the existing "redefine-macro-legitimate" test to work with the more strict understanding, and adds a new "redefine-whitespace" test to verify that changes in the position of whitespace are flagged as errors. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
* glsl/glcpp: Fix preprocessor error condition for macro redefinition | Anuj Phogat | 2014-07-09 | 1 | -8/+10
| | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch specifically fixes the redefinition condition for white space changes. #define and #undef functionality in GLSL follows the standard for C++ preprocessors for macro definitions. From https://gcc.gnu.org/onlinedocs/cpp/Undefining-and-Redefining-Macros.html: These definitions are effectively the same: #define FOUR (2 + 2) #define FOUR         (2    +    2) #define FOUR (2 /* two */ + 2) but these are not: #define FOUR (2 + 2) #define FOUR ( 2+2 ) #define FOUR (2 * 2) #define FOUR(score,and,seven,years,ago) (2 + 2) Fixes Khronos GLES3 CTS tests: invalid_object_whitespace_vertex invalid_object_whitespace_fragment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Carl Worth <cworth@cworth.org>
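The whitespace rule from the GCC documentation cited above can be sketched in Python. This is a text-level approximation for object-like macros only, assuming that a comment counts as whitespace and that only the amount of whitespace, not its position, may differ. A real preprocessor compares token sequences; `normalize_replacement` and `is_compatible_redefinition` are hypothetical helpers, not glcpp functions.

```python
import re


def normalize_replacement(body):
    """Canonicalize a macro replacement list for redefinition comparison."""
    # A comment is treated as a single piece of whitespace.
    body = re.sub(r"/\*.*?\*/", " ", body, flags=re.S)
    # Any run of whitespace collapses to one space; leading and trailing
    # whitespace is not significant.
    return re.sub(r"\s+", " ", body).strip()


def is_compatible_redefinition(old_body, new_body):
    """True when two replacement lists differ only in amounts of whitespace."""
    return normalize_replacement(old_body) == normalize_replacement(new_body)
```

Applied to the examples above, `(2    +    2)` and `(2 /* two */ + 2)` are compatible with `(2 + 2)`, while `( 2+2 )` and `(2 * 2)` are not. The function-like `FOUR(score,and,seven,years,ago)` case is outside what this sketch models.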
* glsl/glcpp: Add test to ensure compiler won't allow #undef for some builtins | Carl Worth | 2014-07-09 | 2 | -0/+10
| | | | | | | Currently verifying that an #undef of __FILE__, __LINE__, or __VERSION__ will generate an error. Reviewed-by: Anuj Phogat <anuj.phogat@gmail.com>
* glsl/glcpp: Do not allow undefining the built-in macros | Anuj Phogat | 2014-07-09 | 1 | -0/+6
| | | | | | | | | | | | | | | | | Fixes piglit tests in spec/glsl-es-3.00/compile: undef-__FILE__.vert undef-GL_ES.vert undef-__LINE__.vert undef-__VERSION__.vert Also, fixes Khronos GLES3 CTS tests: undefine_invalid_object_1_vertex undefine_invalid_object_1_fragment undefine_invalid_object_2_vertex undefine_invalid_object_2_fragment Signed-off-by: Anuj Phogat <anuj.phogat@gmail.com> Reviewed-by: Carl Worth <cworth@cworth.org>
* gallium/u_blitter: fix some shader memory leaks | Brian Paul | 2014-07-09 | 1 | -0/+9
| | | | | | | The _msaa shaders weren't getting freed. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* tgsi: properly parse indirect dimension references (e.g. for UBOs) | Ilia Mirkin | 2014-07-09 | 1 | -0/+7
| | | | | Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Reviewed-by: Brian Paul <brianp@vmware.com>
* radeonsi: fix order of r600_need_dma_space and r600_context_bo_reloc | Christian König | 2014-07-09 | 1 | -1/+2
| | | | | Signed-off-by: Christian König <christian.koenig@amd.com> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* st/mesa: fix geometry shader memory leak | Brian Paul | 2014-07-09 | 1 | -0/+1
| | | | | | | | Spotted by Charmaine Lee. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com> Reviewed-by: Charmaine Lee <charmainel@vmware.com>
* mesa: fix geometry shader memory leaks | Brian Paul | 2014-07-09 | 2 | -0/+4
| | | | | | | Spotted by Charmaine Lee. Cc: "10.2" <mesa-stable@lists.freedesktop.org> Reviewed-by: Marek Olšák <marek.olsak@amd.com>
* st/mesa: minor simplification of some state atom assignments | Brian Paul | 2014-07-09 | 2 | -7/+4
|
* st/mesa: minor fix-up in st_GetSamplePosition() | Brian Paul | 2014-07-09 | 1 | -2/+4
| | | | | If the driver doesn't implement get_sample_position(), let's return some non-garbage values.
* mesa: use float to silence MSVC warning in _mesa_GetMultisamplefv() | Brian Paul | 2014-07-09 | 1 | -1/+1
|
* nvc0: allocate more space before a counter is configured | Samuel Pitoiset | 2014-07-08 | 1 | -2/+3
| | | | | | | | | On nvc0, a counter can have up to 6 sources instead of only one for nve4+. This fixes a crash when a counter uses more than one source. Signed-off-by: Samuel Pitoiset <samuel.pitoiset@gmail.com> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nv50/ir: use unordered_set instead of list to keep track of var uses | Tobias Klausmann | 2014-07-08 | 4 | -9/+10
| | | | | | | | | | | The set of variable uses does not need to be ordered in any way, and removing/adding elements is a fairly common operation in various optimization passes. This shortens runtime of piglit test fp-long-alu to ~22s from ~4h Signed-off-by: Tobias Klausmann <tobias.johannes.klausmann@mni.thm.de> Reviewed-by: Ilia Mirkin <imirkin@alum.mit.edu>
* i965/disasm: Fix disassembly of the any16h/all16h predicates. | Kenneth Graunke | 2014-07-08 | 1 | -1/+1
| | | | | | | | BRW_PREDICATE_ALIGN1_ANY16H was incorrectly being disassembled as "all16h", and ALL16H would probably print as "(null)". Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
* glsl: Fix the foreach_in_list_reverse macro. | Kenneth Graunke | 2014-07-08 | 1 | -3/+3
| | | | | | | | | | | We clearly don't want to start at the head and walk backwards; we want to start at the last real element before the tail sentinel. If the list is empty, tail_pred will be the head sentinel, and we'll stop. Nothing uses this function, so I guess nobody noticed it was broken. Signed-off-by: Kenneth Graunke <kenneth@whitecape.org> Reviewed-by: Matt Turner <mattst88@gmail.com>
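The reverse-walk logic described in this commit message can be shown with a small Python sketch of a sentinel-bracketed list. It assumes two separate sentinel nodes for simplicity, which is not necessarily the layout Mesa's list type uses; `Node`, `SentinelList`, `push_tail`, and `iter_reverse` are illustrative names only.

```python
class Node:
    def __init__(self, value=None):
        self.value = value
        self.prev = None
        self.next = None


class SentinelList:
    """Doubly linked list bracketed by head and tail sentinel nodes."""

    def __init__(self):
        self.head = Node()  # head sentinel, carries no data
        self.tail = Node()  # tail sentinel, carries no data
        self.head.next = self.tail
        self.tail.prev = self.head

    def push_tail(self, value):
        """Append value just before the tail sentinel."""
        node = Node(value)
        last = self.tail.prev
        last.next = node
        node.prev = last
        node.next = self.tail
        self.tail.prev = node

    def iter_reverse(self):
        """Yield values from the last real element back to the first."""
        # Start at the last real element before the tail sentinel,
        # not at the head.
        node = self.tail.prev
        # For an empty list tail.prev is the head sentinel, so the loop
        # body never runs.
        while node is not self.head:
            yield node.value
            node = node.prev
```

Starting from the element before the tail sentinel and stopping at the head sentinel handles both the empty and non-empty cases without a special check.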
* radeonsi: mark MSAA config state as dirty at the beginning of CS | Marek Olšák | 2014-07-08 | 1 | -0/+1
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81020 Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
* gallium: fix u_default_transfer_inline_write for textures | Marek Olšák | 2014-07-08 | 1 | -2/+2
| | | | | | | | | This doesn't fix any known issue. In fact, radeon drivers ignore all the discard flags for textures and implicitly do "discard range" for any write transfer. Cc: mesa-stable@lists.freedesktop.org Reviewed-by: Roland Scheidegger <sroland@vmware.com>
* i965: Remove artificial dependency between math instructions. | Matt Turner | 2014-07-08 | 1 | -1/+2
| | | | | | ... on Gen6+. I'm not actually sure which class Gen6 fits into. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* i965/fs: Track dependencies in instruction scheduling per reg offset. | Matt Turner | 2014-07-08 | 1 | -8/+15
| | | | | | | | | | | | | | | | | | | | | Previously instruction scheduling tracked dependencies on a per-register basis. This meant that there was an artificial dependency between interpolation instructions writing into the same virtual register. Instruction scheduling would insert a number of instructions between the two instructions in this example, when they are actually independent. linterp vgrf8+0.0:F, hw_reg2:F, hw_reg3:F, hw_reg6:F linterp vgrf8+1.0:F, hw_reg2:F, hw_reg3:F, hw_reg6+16:F This led to cases where the first texture coordinate is interpolated at the beginning of the shader, but the second is done immediately before the texture operation that uses it as a source. After this change, the artificial dependency is removed and the interpolation instructions are scheduled together. Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
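The difference between per-register and per-offset tracking can be sketched in Python. This is a toy model that only considers write-after-write dependencies. The instruction tuples and the names `find_write_dependencies`, `per_register`, and `per_reg_offset` are assumptions made for the example and do not mirror the i965 scheduler's data structures.

```python
def find_write_dependencies(instructions, key):
    """Return (earlier, later) index pairs that write the same tracked location.

    instructions: list of (opcode, dest_reg, dest_offset) tuples
                  (a hypothetical form for this sketch).
    key: function mapping an instruction to the location used for tracking.
    """
    last_writer = {}
    deps = []
    for index, inst in enumerate(instructions):
        loc = key(inst)
        if loc in last_writer:
            deps.append((last_writer[loc], index))
        last_writer[loc] = index
    return deps


def per_register(inst):
    # Old behaviour: every write to vgrf8 lands on the same location.
    return inst[1]


def per_reg_offset(inst):
    # New behaviour: vgrf8+0 and vgrf8+1 are distinct locations.
    return (inst[1], inst[2])


# The two interpolation instructions from the commit message.
program = [
    ("linterp", "vgrf8", 0),
    ("linterp", "vgrf8", 1),
]
```

With `per_register` the second `linterp` depends on the first, and with `per_reg_offset` the two are independent and free to be scheduled together.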
* configure: Don't special case Cygwin to use gnu99, define _XOPEN_SOURCE instead | Jon TURNEY | 2014-07-08 | 1 | -9/+2
| | | | | | | | | | | | | | Revert "build: Build on Cygwin with gnu99 instead of c99." and define _XOPEN_SOURCE appropriately. This reverts commit 53e36d333c9b619c1a5fe9a8d2d08665654b0234. Since Cygwin 1.7.18 (April 2013), its headers correctly prototype strtoll() when using -std=c99, and correctly prototype strdup() when _XOPEN_SOURCE is defined appropriately, so this workaround is no longer needed. Signed-off-by: Jon TURNEY <jon.turney@dronecode.org.uk> Cc: Vinson Lee <vlee@freedesktop.org>
* ilo: fix fence reference counting | Chia-I Wu | 2014-07-08 | 1 | -12/+9
| | | | The old code was complicated, and was wrong when *ptr is NULL.
* i965: Extend compute-to-mrf pass to understand blocks of MOVs | Kristian Høgsberg | 2014-07-07 | 1 | -10/+53
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current compute-to-mrf pass doesn't handle blocks of MOVs. Shaders that end with a texture fetch followed by an fb write are left like this: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g2<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: mov(8) g113<1>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000028: mov(8) g114<1>F g3<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000030: mov(8) g115<1>F g4<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000038: mov(8) g116<1>F g5<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000040: sendc(8) null g113<8,8,1>F render ( RT write, 0, 4, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; This patch lets compute-to-mrf recognize blocks of MOVs and match them to instructions (typically SEND) that write multiple registers. With this, the above shader becomes: 0x00000000: pln(8) g6<1>F g4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000008: pln(8) g7<1>F g4.4<0,1,0>F g2<8,8,1>F { align1 WE_normal 1Q compacted }; 0x00000010: send(8) g113<1>UW g6<8,8,1>F sampler (1, 0, 0, 1) mlen 2 rlen 4 { align1 WE_normal 1Q }; 0x00000020: sendc(8) null g113<8,8,1>F render ( RT write, 0, 20, 12) mlen 4 rlen 0 { align1 WE_normal 1Q EOT }; which is the bulk of the shader db results: total instructions in shared programs: 987040 -> 986720 (-0.03%) instructions in affected programs: 844 -> 524 (-37.91%) GAINED: 0 LOST: 0 The optimization also applies to MRT shaders that write the same color value to multiple RTs, in which case we can eliminate four MOVs in a similar fashion. See fbo-drawbuffers2-blend in piglit for an example. No measurable performance impact. No piglit regressions. Signed-off-by: Kristian Høgsberg <krh@bitplanet.net>
* nvc0/ir: fill offset in properly for TXD | Ilia Mirkin | 2014-07-08 | 1 | -13/+43
| | | | | | | | Apparently TXD wants its offset differently than TEX, accepting it in the upper bits of the layer index. Unclear what happens when this is combined with indirect sampler indexing. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu>
* nvc0/ir: use manual TXD when offsets are involved | Ilia Mirkin | 2014-07-08 | 1 | -1/+2
| | | | | | | | | | Something about how we're implementing offsets for TXD is wrong, just flip to the generic quadop-based implementation in that case. This is the minimal fix appropriate for backporting. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>
* nvc0/ir: do quadops on the right texture coordinates for TXD | Ilia Mirkin | 2014-07-08 | 1 | -2/+3
| | | | | | | | handleTEX moves the layer as the first argument. This makes sure that the quadops deal with the texture coordinates. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>
* nv50/ir: ignore bias for samplerCubeShadow on nv50 | Ilia Mirkin | 2014-07-08 | 1 | -0/+10
| | | | | | | | Unfortunately there's no good way to do this on the nv50 shader isa. Dropping the bias seems preferable to doing the compare post-filtering. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>
* nv50/ir: retrieve shadow compare from first arg | Ilia Mirkin | 2014-07-08 | 1 | -1/+1
| | | | | | | | This can only happen with texture(samplerCubeShadow, bias), where the compare will be in the first argument. Signed-off-by: Ilia Mirkin <imirkin@alum.mit.edu> Cc: <mesa-stable@lists.freedesktop.org>