path: root/src/mesa/program
Commit message | Author | Age | Files | Lines
* mesa: Add string_to_uint_map facade class | Ian Romanick | 2011-10-04 | 2 | -1/+120
| | | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Acked-by: Paul Berry <[email protected]>
* mesa: Add hash_table_replace | Ian Romanick | 2011-10-04 | 2 | -0/+40
| | | | | | | hash_table_replace doesn't use get_node to avoid having to hash the key twice. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
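The point about hashing the key only once can be illustrated with a toy chained table. This is a minimal sketch, not Mesa's C implementation: the class `ChainedTable`, its `hash_calls` counter, and the bucket layout are all hypothetical and exist only to show that a replace operation can locate the bucket once and then either overwrite or append.

```python
class ChainedTable:
    """Toy chained hash table (hypothetical; not Mesa's hash_table)."""

    def __init__(self, num_buckets=8):
        self.buckets = [[] for _ in range(num_buckets)]
        self.hash_calls = 0  # counts how many times a key is hashed

    def _bucket(self, key):
        # Every lookup of a bucket costs exactly one hash computation.
        self.hash_calls += 1
        return self.buckets[hash(key) % len(self.buckets)]

    def find(self, key):
        for k, v in self._bucket(key):
            if k == key:
                return v
        return None

    def replace(self, key, value):
        # Hash once, then reuse the same bucket for both the search
        # and the insertion, instead of find() followed by insert().
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)
                return
        bucket.append((key, value))


table = ChainedTable()
table.replace("color", 1)  # key absent: appended
table.replace("color", 2)  # key present: overwritten in place
```

Composing a separate lookup and insert would hash the key twice per call, which is the cost the commit message says it avoids.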
* ir_to_mesa: Don't assertion fail on integer modulus. | Kenneth Graunke | 2011-10-02 | 1 | -1/+4
| | | | | | | | | | | Drivers implementing GLSL 1.30 want to do integer modulus, and until we can stop generating code via ir_to_mesa, it's easier to make it silently generate rubbish code. Multiply will do. Signed-off-by: Kenneth Graunke <[email protected]> Tested-by: Ian Romanick <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: s/INLINE/inline/ | Brian Paul | 2011-10-01 | 3 | -17/+17
| | | | | | | INLINE is still seen in some files (some generated files, etc) but this is a good start. Acked-by: Kenneth Graunke <[email protected]>
* mesa: Refactor hash_table_{find,remove} to share some code | Ian Romanick | 2011-09-30 | 1 | -16/+16
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Document an odd side-effect of hash_table_insert | Ian Romanick | 2011-09-30 | 1 | -0/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Remove unused field gl_program::Varying | Ian Romanick | 2011-09-30 | 2 | -7/+0
| | | | | | | Lots of things set and copy this field around, but nothing uses it. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Use linker_error instead of fail_link | Ian Romanick | 2011-09-30 | 1 | -14/+3
| | | | | | | See also 8aadd89. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: add _NEW_CURRENT_ATTRIB in _mesa_program_state_flags() | Brian Paul | 2011-09-30 | 1 | -2/+5
| | | | | | | | | | | | If color material mode is enabled, constant buffer entries related to the material coefficients will depend on glColor. So add _NEW_CURRENT_ATTRIB to the bitset returned for material-related constants in _mesa_program_state_flags(). This fixes a bug exercised by the new piglit draw-arrays-colormaterial test. Note: This is a candidate for the 7.11 branch.
* ir_to_mesa: Don't assertion fail on remaining GLSL 1.30 ops. | Eric Anholt | 2011-09-28 | 1 | -2/+10
| | | | | | | | | | For hardware drivers, we only have ir_to_mesa called for the purposes of potential swrast fallbacks (basically never on a 1.30 driver), which we don't really care about. This will allow 1.30 to be implemented without rewriting swrast for it. Reviewed-by: Chad Versace <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add a flag to indicate whether a program uses gl_ClipDistance. | Paul Berry | 2011-09-23 | 1 | -0/+2
| GLSL 1.30 requires us to use gl_ClipDistance for clipping if the vertex shader contains a static write to it, and otherwise use user-defined clipping planes. Since the driver needs to behave differently in these two cases, we need a flag to record whether the shader has written to gl_ClipDistance.
| The new flag is called UsesClipDistance. We initially store it in gl_shader_program (since that is the data structure that is available when we check to see whether gl_ClipDistance was written to), and we later copy it to a flag with the same name in gl_vertex_program, since that is a more convenient place for the driver to access it (in i965, at least).
| Reviewed-by: Eric Anholt
| Tested-by: Brian Paul
* i965/fs: Implement texelFetch() on Ironlake and Sandybridge. | Kenneth Graunke | 2011-09-19 | 1 | -3/+2
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: fix shadow2DArray comparison | Marek Olšák | 2011-09-10 | 1 | -3/+14
| | | | | | The depth should be in W. v2: adjust the assertion, add a comment
* nvprogram: Silence "warning: unused parameter ‘ctx’" | Ian Romanick | 2011-09-09 | 1 | -1/+1
|
* mesa: Replace the EmitNoIfs compiler flag with a MaxIfDepth flag. | Bryan Cain | 2011-08-31 | 1 | -4/+4
| | | | | | | This is a better, more fine-grained way of lowering if statements. Fixes the game And Yet It Moves on nv50. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Use a separate div_to_mul_rcp lowering flag for integers. | Bryan Cain | 2011-08-31 | 1 | -1/+1
| | | | | | | | | | | | | | Using multiply and reciprocal for integer division involves potentially lossy floating point conversions. This is okay for older GPUs that represent integers as floating point, but undesirable for GPUs with native integer division instructions. TGSI, for example, has UDIV/IDIV instructions for integer division, so it makes sense to handle this directly. Likewise for i965. Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Bryan Cain <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
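The lossy conversion mentioned above can be seen with a small model of a multiply-by-reciprocal division carried out in single precision. This is only a sketch under stated assumptions: `to_float32` and `div_via_rcp` are hypothetical helpers that emulate float32 rounding with Python's `struct` module, and they are not the lowering pass itself.

```python
import struct


def to_float32(x):
    """Round a Python float to the nearest IEEE-754 single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def div_via_rcp(a, b):
    """Model a / b as float32(a) * float32(1 / b), truncated back to int."""
    fa = to_float32(float(a))
    rcp = to_float32(1.0 / to_float32(float(b)))
    return int(to_float32(fa * rcp))


# 2**24 + 1 has no exact float32 representation, so the value is
# already changed by the int -> float conversion, before any division.
n = 2**24 + 1
exact = n // 1
approx = div_via_rcp(n, 1)
```

Here `approx` differs from `exact` even though the divisor is 1, which is the kind of error a native integer divide (such as TGSI's UDIV/IDIV) does not introduce.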
* mesa: Make the gl_constant_value's bool occupy the same space as float/int. | Eric Anholt | 2011-08-30 | 1 | -1/+1
| | | | | | | | | At least for Intel, all our uniform components are of uint32_t size, either float or signed or unsigned int. For uploading uniform data in the driver, it's much easier to upload a full dword per uniform element instead of trying to pick out the bool byte and then fill in the top 3 bytes of pad with 0. Reviewed-by: Kenneth Graunke <[email protected]>
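A rough model of the layout idea, using `ctypes`: when the boolean member is a full 32-bit field, writing it overwrites the whole dword, so the value can be uploaded as-is. The class name `ConstantValue` and the field types below are assumptions for illustration and are not copied from Mesa's headers.

```python
import ctypes


class ConstantValue(ctypes.Union):
    """Hypothetical stand-in for a 4-byte uniform component."""
    _fields_ = [
        ("f", ctypes.c_float),
        ("b", ctypes.c_uint32),  # bool kept in a full dword (assumed type)
        ("i", ctypes.c_int32),
        ("u", ctypes.c_uint32),
    ]


value = ConstantValue()
value.f = 1.0        # leave a non-zero bit pattern in all four bytes
value.b = 1          # the dword-sized bool overwrites every byte
raw_dword = value.u  # what a driver would upload for this component
```

With a single-byte bool, the remaining three bytes would keep whatever was stored before and would need to be zeroed separately before upload.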
* Change return type of try_emit_* methods to bool. | Kai Wasserbäch | 2011-08-25 | 1 | -4/+4
| | | | | | | | | | | Ian Romanick explained (Message-Id: <[email protected]>), that the return type of non-API methods shouldn't use GLboolean but a standard C++ bool. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Bryan Cain <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Kai Wasserbäch <[email protected]>
* i965/fs: Implement textureSize (TXS) on Gen5+. | Kenneth Graunke | 2011-08-23 | 1 | -2/+5
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add a new ir_txs (textureSize) opcode to ir_texture. | Kenneth Graunke | 2011-08-23 | 1 | -0/+1
| | | | | | | | One unique aspect of TXS is that it doesn't have a coordinate. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ir_to_mesa: Remove incorrect usage of the 'struct' keyword on classes. | Kenneth Graunke | 2011-08-19 | 1 | -2/+2
| | | | Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: set Q=1 for OPCODE_TEX execution | Brian Paul | 2011-08-19 | 1 | -0/+8
| | | | | | | | | | | Q should not be significant for OPCODE_TEX, but it winds up getting passed to the compute_lambda() function. Make sure it's 1.0 to prevent garbage values, which is effectively what we get when the swizzle is coord.xyzz (which is what GLSL gives us). Part of the fix for piglit's fbo-generatemipmap-array test. Reviewed-by: Ian Romanick <[email protected]>
* mesa: Bump instruction execution limit to 65536 | Ian Romanick | 2011-08-16 | 1 | -1/+1
| | | | | | | | | | | Shader Model 3.0[1] requires that shaders be able to execute at least 65536 instructions. Bump Mesa maxExec to that limit. This allows several vertex shaders in the OpenGL ES 2.0 conformance test suite to run to completion. 1: http://en.wikipedia.org/wiki/High_Level_Shader_Language Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add partial constant propagation pass for Mesa IR | Ian Romanick | 2011-08-16 | 3 | -0/+456
| | | | | | | | | | | | This cleans up some code generated by the IR-to-Mesa pass for i915. In particular, some shaders involving arrays of constant matrices result in really bad code. v2: Silence several warnings from merging the gl_constant_value work. Fix DP[23] folding. Add support for a bunch more opcodes that appear in piglit runs on i915. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Emit a MAD(b, -a, b) for !a && b | Ian Romanick | 2011-08-16 | 1 | -0/+52
| !a && b occurs frequently when nested if-statements have been flattened. It should also be possible to use a MAD for (a && b) || c, though that would require a MAD_SAT.
| Reviewed-by: Eric Anholt
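The identity behind this lowering is b * (-a) + b = b * (1 - a), which equals !a && b when a and b are restricted to 0.0 and 1.0. A short sketch of that arithmetic follows; `mad` and `not_a_and_b` are hypothetical helper names that model the instruction and are not code from ir_to_mesa.

```python
def mad(x, y, z):
    """Model of a multiply-add instruction: x * y + z."""
    return x * y + z


def not_a_and_b(a, b):
    """!a && b on logic values 0.0 / 1.0, written as MAD(b, -a, b)."""
    return mad(b, -a, b)


# Truth table over the four possible inputs.
truth_table = {
    (a, b): not_a_and_b(a, b)
    for a in (0.0, 1.0)
    for b in (0.0, 1.0)
}
```

Only the input a = 0.0, b = 1.0 yields 1.0, so a single instruction replaces a separate NOT followed by an AND.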
* ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGE | Ian Romanick | 2011-08-16 | 1 | -1/+12
| The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z != b.z || a.w != b.w). Logical-or is implemented using addition (followed by clamping to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives !bool(int(a.x != b.x) + int(a.y != b.y) + int(a.z != b.z) + int(a.w != b.w)). This can be implemented using a dot-product with a vector of all 1.0. After the dot-product, the value will be an integer in the range [0,4].
| Previously a SEQ instruction was used to clamp the resulting logic value to [0,1] and invert the result. Using an SGE instruction on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SEQ instruction. It must be emulated using two SGE instructions and a MUL. On these architectures, the single SGE saves two instructions.
| Reviewed-by: Eric Anholt
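The DP4/SGE lowering described in this commit can be checked with a small arithmetic model. The helpers `sne`, `dp4`, and `sge` below are hypothetical Python stand-ins for the corresponding instructions, with SGE assumed to return 1.0 when its first operand is greater than or equal to its second.

```python
ONES = (1.0, 1.0, 1.0, 1.0)


def sne(a, b):
    """Per-component set-on-not-equal: 1.0 where components differ."""
    return tuple(1.0 if x != y else 0.0 for x, y in zip(a, b))


def dp4(a, b):
    """Four-component dot product."""
    return sum(x * y for x, y in zip(a, b))


def sge(x, y):
    """Set-on-greater-or-equal: 1.0 if x >= y, else 0.0."""
    return 1.0 if x >= y else 0.0


def all_equal(a, b):
    # The dot product counts mismatching components (0 to 4).
    # Its negation is >= 0 only when that count is zero.
    mismatches = dp4(sne(a, b), ONES)
    return sge(-mismatches, 0.0)


same = all_equal((1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 4.0))
one_off = all_equal((1.0, 2.0, 3.0, 4.0), (1.0, 2.0, 3.0, 5.0))
```

The single SGE both inverts the result and clamps it to 0.0 or 1.0, which is why no separate SEQ is needed.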
* ir_to_mesa: Implement ir_binop_any_nequal using DP4 w/saturate or DP4 w/SLT | Ian Romanick | 2011-08-16 | 1 | -2/+20
| | | | | | | | | The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) || (a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x != b.x, a.y != b.y, a.z != b.z, a.w != b.w)). Implement the any() part the same way the regular ir_unop_any is implemented. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_unop_any using DP4 w/saturate or DP4 w/SLT | Ian Romanick | 2011-08-16 | 1 | -4/+23
| This is just like the ir_binop_logic_or case. The operation ir_unop_any is (a.x || a.y || a.z || a.w). Logical-or is implemented using addition (followed by clamping to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives (a.x + a.y + a.z + a.w). This can be implemented using a dot-product with a vector of all 1.0.
| Previously an SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the dot-product has the same effect. Adding the saturate to the dot-product is free, so (at least) one instruction is saved.
| In a vertex shader, using an SLT on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions.
| Reviewed-by: Eric Anholt
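Both variants of the any() lowering can be modelled in a few lines. The functions below are hypothetical stand-ins for the instructions named in the commit (DP4, saturate, SLT), and the names `any_fragment` and `any_vertex` are invented for this sketch.

```python
ONES = (1.0, 1.0, 1.0, 1.0)


def dp4(a, b):
    """Four-component dot product."""
    return sum(x * y for x, y in zip(a, b))


def saturate(x):
    """Clamp to [0, 1], as a saturate modifier would."""
    return min(max(x, 0.0), 1.0)


def slt(x, y):
    """Set-on-less-than: 1.0 if x < y, else 0.0."""
    return 1.0 if x < y else 0.0


def any_fragment(v):
    # Fragment-shader form: the dot product counts true components,
    # and the saturate clamps that count to 0.0 or 1.0.
    return saturate(dp4(v, ONES))


def any_vertex(v):
    # Vertex-shader form: the negated count is below zero exactly
    # when at least one component is true.
    return slt(-dp4(v, ONES), 0.0)


# Every bvec4 expressed as 0.0 / 1.0 logic values.
ALL_INPUTS = [
    (a, b, c, d)
    for a in (0.0, 1.0)
    for b in (0.0, 1.0)
    for c in (0.0, 1.0)
    for d in (0.0, 1.0)
]
```

Both forms agree with a plain logical any() on all sixteen inputs, so the choice between them depends only on which instruction or modifier the target supports cheaply.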
* ir_to_mesa: Make ir_to_mesa_visitor::emit_dp return the instruction | Ian Romanick | 2011-08-16 | 1 | -7/+7
| | | | Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_logic_or using an add w/saturate or add w/SLT | Ian Romanick | 2011-08-16 | 1 | -4/+21
| Logical-or is implemented using addition (followed by clamping to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives a + b, which has a result in the range [0, 2]. Previously an SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the add has the same effect. Adding the saturate to the add is free, so (at least) one instruction is saved.
| In a vertex shader, using an SLT on the negation of the add result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions.
| Reviewed-by: Eric Anholt
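For the two-operand case, the same reasoning reduces to a single add. A minimal sketch, with `logic_or_fragment` and `logic_or_vertex` as hypothetical names for the two lowerings:

```python
def saturate(x):
    """Clamp to [0, 1], as a saturate modifier would."""
    return min(max(x, 0.0), 1.0)


def slt(x, y):
    """Set-on-less-than: 1.0 if x < y, else 0.0."""
    return 1.0 if x < y else 0.0


def logic_or_fragment(a, b):
    # a + b lies in [0, 2]; saturating folds 2.0 back to 1.0.
    return saturate(a + b)


def logic_or_vertex(a, b):
    # -(a + b) is negative exactly when either input is 1.0.
    return slt(-(a + b), 0.0)


LOGIC_VALUES = (0.0, 1.0)
pairs = [(a, b) for a in LOGIC_VALUES for b in LOGIC_VALUES]
```

The only input where the raw sum leaves [0,1] is a = b = 1.0, and that is the case the saturate or the SLT corrects.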
* ir_to_mesa: Implement ir_unop_logic_not using 1-x | Ian Romanick | 2011-08-16 | 1 | -1/+7
| | | | | | | Since our logic values are 0.0 (false) and 1.0 (true), 1.0 - x accurately implements logical not. Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add a convenience interface for register allocator conflicts setup. | Eric Anholt | 2011-08-10 | 2 | -0/+23
|
* mesa: whitespace changes | Brian Paul | 2011-08-08 | 1 | -5/+8
|
* ir_to_mesa: Replace open-coded swizzle_for_size() | Eric Anholt | 2011-08-05 | 1 | -8/+1
|
* ir_to_mesa: Try to avoid emitting a MOV_SAT to saturate an expression tree. | Eric Anholt | 2011-08-05 | 1 | -4/+24
| | | | | | Fixes a regression in codegen quality for ff_fragment_shader conversion to GLSL -- glean texCombine produces 7.5% fewer Mesa IR instructions.
* prog_optimize: Add support for saturates to _mesa_merge_mov_into_inst. | Eric Anholt | 2011-08-05 | 1 | -3/+5
| | | | | This fixes the remaining regression from ff_fragment_shader in Mesa IR instruction count, to now being a 1.9% win overall.
* mesa: pass correct constant type to _mesa_fetch_state() | Brian Paul | 2011-08-04 | 1 | -1/+1
| | | | Fixes assorted warnings about float vs. gl_constant_value pointers.
* mesa: use gl_constant_value type in ARB program parser | Brian Paul | 2011-08-04 | 2 | -29/+30
|
* Merge branch 'glsl-to-tgsi' | Bryan Cain | 2011-08-04 | 9 | -54/+101
|\ | | | | | | | | | | Conflicts: src/mesa/state_tracker/st_atom_pixeltransfer.c src/mesa/state_tracker/st_program.c
| * mesa, glsl_to_tgsi: add native support for integers in shaders | Bryan Cain | 2011-08-01 | 2 | -3/+30
| | | | | | | | | | Disabled by default on all drivers. To enable it, change ctx->GLSLVersion to 130 in st_extensions.c. Currently, softpipe is the only driver with integer support.
| * mesa: support boolean and integer-based parameters in prog_parameter | Bryan Cain | 2011-08-01 | 9 | -49/+68
| | | | | | | | | | | | The functionality is not used by anything yet, and the glUniform functions will need to be reworked before this can reach its full usefulness. It is nonetheless a step towards integer support in the state tracker and classic drivers.
| * mesa: fix segfault when no Mesa IR is generated | Bryan Cain | 2011-08-01 | 1 | -2/+3
| |
* | ir_to_mesa: Emit warnings instead of errors for IR that can't be lowered | Ian Romanick | 2011-08-02 | 1 | -4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rely on the driver to do the right thing. This probably means falling back to software. Page 88 of the OpenGL 2.1 spec specifically says: "A shader should not fail to compile, and a program object should not fail to link due to lack of instruction space or lack of temporary variables. Implementations should ensure that all valid shaders and program objects may be successfully compiled, linked and executed." There is no provision for saying "No" to a valid shader that is difficult for the hardware to handle, so stop doing that. On i915 this causes a large number of piglit tests to change from FAIL to WARN. The warning is because the driver still emits messages to stderr like "i915_program_error: Unsupported opcode: BGNLOOP". It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag on i915 (and probably other hardware that can't handle loops). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Use linker_error instead of fail_link | Ian Romanick | 2011-08-02 | 1 | -31/+22
| | | | | | | | | | | | | | | | The functions were almost identical. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversion | Ian Romanick | 2011-07-23 | 1 | -0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leaving the unused registers with other values caused assertion failures and other problems in places that blindly iterate over all sources. brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr != 0' failed. Fixes i965 piglit: vs-uniform-array-mat[234]-col-row-rd vs-uniform-array-mat[234]-index-col-row-rd vs-uniform-array-mat[234]-index-row-rd vs-uniform-mat[234]-col-row-rd Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructor | Ian Romanick | 2011-07-23 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes i965 piglit: vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Fixes swrast piglit: fs-temp-array-mat[234]-col-row-wr fs-temp-array-mat[234]-index-col-row-wr fs-temp-array-mat[234]-index-row-wr fs-temp-mat[234]-col-row-wr vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Add each relative address to the previous | Ian Romanick | 2011-07-23 | 1 | -0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes many cases of accessing arrays of matrices using non-constant indices at each level. Fixes i965 piglit: vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd Fixes swrast piglit: fs-temp-array-mat[234]-index-col-rd fs-temp-array-mat[234]-index-col-row-rd fs-temp-array-mat[234]-index-col-wr fs-uniform-array-mat[234]-index-col-rd fs-uniform-array-mat[234]-index-col-row-rd fs-varying-array-mat[234]-index-col-rd fs-varying-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd vs-uniform-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-rd vs-varying-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-wr Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: fix a warning that a variable may be uninitialized | Marek Olšák | 2011-07-15 | 1 | -0/+3
| |
* | mesa: split _mesa_reference_program() into hot/cold paths. | Dave Airlie | 2011-07-14 | 2 | -7/+18
| | Inline the hot path, where the reference remains the same. This shouldn't penalise the slow path at all, but it improves the hot path because we no longer have to jump to the function. It also moves some assert checks under an #ifndef NDEBUG.
| | Minor clean-ups added by Brian.
| | Signed-off-by: Dave Airlie
| | Signed-off-by: Brian Paul
* | ir_to_mesa: typo fix in a comment. | Eric Anholt | 2011-07-11 | 1 | -3/+3
| |