summaryrefslogtreecommitdiffstats
path: root/src/mesa/program
Commit message (Collapse)AuthorAgeFilesLines
* i965/fs: Implement texelFetch() on Ironlake and Sandybridge.Kenneth Graunke2011-09-191-3/+2
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: fix shadow2DArray comparisonMarek Olšák2011-09-101-3/+14
| | | | | | The depth should be in W. v2: adjust the assertion, add a comment
* nvprogram: Silence "warning: unused parameter ‘ctx’"Ian Romanick2011-09-091-1/+1
|
* mesa: Replace the EmitNoIfs compiler flag with a MaxIfDepth flag.Bryan Cain2011-08-311-4/+4
| | | | | | | This is a better, more fine-grained way of lowering if statements. Fixes the game And Yet It Moves on nv50. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Use a separate div_to_mul_rcp lowering flag for integers.Bryan Cain2011-08-311-1/+1
| | | | | | | | | | | | | | Using multiply and reciprocal for integer division involves potentially lossy floating point conversions. This is okay for older GPUs that represent integers as floating point, but undesirable for GPUs with native integer division instructions. TGSI, for example, has UDIV/IDIV instructions for integer division, so it makes sense to handle this directly. Likewise for i965. Reviewed-by: Ian Romanick <[email protected]> Signed-off-by: Bryan Cain <[email protected]> Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: Make the gl_constant_value's bool occupy the same space as float/int.Eric Anholt2011-08-301-1/+1
| | | | | | | | | At least for Intel, all our uniform components are of uint32_t size, either float or signed or unsigned int. For uploading uniform data in the driver, it's much easier to upload a full dword per uniform element instead of trying to pick out the bool byte and then fill in the top 3 bytes of pad with 0. Reviewed-by: Kenneth Graunke <[email protected]>
* Change return type of try_emit_* methods to bool.Kai Wasserbäch2011-08-251-4/+4
| | | | | | | | | | | Ian Romanick explained (Message-Id: <[email protected]>), that the return type of non-API methods shouldn't use GLboolean but a standard C++ bool. Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Bryan Cain <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Signed-off-by: Kai Wasserbäch <[email protected]>
* i965/fs: Implement textureSize (TXS) on Gen5+.Kenneth Graunke2011-08-231-2/+5
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* glsl: Add a new ir_txs (textureSize) opcode to ir_texture.Kenneth Graunke2011-08-231-0/+1
| | | | | | | | One unique aspect of TXS is that it doesn't have a coordinate. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ir_to_mesa: Remove incorrect usage of the 'struct' keyword on classes.Kenneth Graunke2011-08-191-2/+2
| | | | Signed-off-by: Kenneth Graunke <[email protected]>
* mesa: set Q=1 for OPCODE_TEX executionBrian Paul2011-08-191-0/+8
| | | | | | | | | | | Q should not be significant for OPCODE_TEX, but it winds up getting passed to the compute_lambda() function. Make sure it's 1.0 to prevent garbage values, which is effectively what we get when the swizzle is coord.xyzz (which is what GLSL gives us). Part of the fix for piglit's fbo-generatemipmap-array test. Reviewed-by: Ian Romanick <[email protected]>
* mesa: Bump instruction execution limit to 65536Ian Romanick2011-08-161-1/+1
| | | | | | | | | | | Shader Model 3.0[1] requires that shaders be able to execute at least 65536 instructions. Bump Mesa maxExec to that limit. This allows several vertex shaders in the OpenGL ES 2.0 conformance test suite to run to completion. 1: http://en.wikipedia.org/wiki/High_Level_Shader_Language Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add partial constant propagation pass for Mesa IRIan Romanick2011-08-163-0/+456
| | | | | | | | | | | | This cleans up some code generated by the IR-to-Mesa pass for i915. In particular, some shaders involving arrays of constant matrices result in really bad code. v2: Silence several warnings from merging the gl_constant_value work. Fix DP[23] folding. Add support for a bunch more opcodes that appear in piglit runs on i915. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Emit a MAD(b, -a, b) for !a && bIan Romanick2011-08-161-0/+52
| | | | | | | | !a && b occurs frequently when nexted if-statements have been flattened. It should also be possible use a MAD for (a && b) || c, though that would require a MAD_SAT. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGEIan Romanick2011-08-161-1/+12
| | | | | | | | | | | | | | | | | | | The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z != b.z || a.w != b.w). Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives !bool((int(a.x != b.x) + int(a.y == b.y) + int(a.z == b.z) + int(a.w == b.w)). This can be implemented using a dot-product with a vector of all 1.0. After the dot-product, the value will be an integer on the range [0,4]. Previously a SEQ instruction was used to clamp the resulting logic value to [0,1] and invert the result. Using an SGE instruction on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SEQ instruction. It must be emulated using two SGE instructions and a MUL. On these architectures, the single SGE saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_any_nequal using DP4 w/saturate or DP4 w/SLTIan Romanick2011-08-161-2/+20
| | | | | | | | | The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) || (a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x != b.x, a.y != b.y, a.z != b.z, a.w != b.w)). Implement the any() part the same way the regular ir_unop_any is implemented. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_unop_any using DP4 w/saturate or DP4 w/SLTIan Romanick2011-08-161-4/+23
| | | | | | | | | | | | | | | | | | | | | | This is just like the ir_binop_logic_or case. The operation ir_unop_any is (a.x || a.y || a.z || a.w). Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives (a.x + a.y + a.z + a.w). This can be implemented using a dot-product with a vector of all 1.0. Previously a SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the dot-product has the same effect. Adding the saturate to the dot-product is free, so (at least) one instruction is saved. In a vertex shader, using an SLT on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Make ir_to_mesa_visitor::emit_dp return the instructionIan Romanick2011-08-161-7/+7
| | | | Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_logic_or using an add w/saturate or add w/SLTIan Romanick2011-08-161-4/+21
| | | | | | | | | | | | | | | | | | | Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives a + b which has a result on the range [0, 2]. Previously a SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the add has the same effect. Adding the saturate to the add is free, so (at least) one instruction is saved. In a vertex shader, using an SLT on the negation of the add result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_unop_logic_not using 1-xIan Romanick2011-08-161-1/+7
| | | | | | | Since our logic values are 0.0 (false) and 1.0 (true), 1.0 - x accurately implements logical not. Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add a convenience interface for register allocator conflicts setup.Eric Anholt2011-08-102-0/+23
|
* mesa: whitespace changesBrian Paul2011-08-081-5/+8
|
* ir_to_mesa: Replace open-coded swizzle_for_size()Eric Anholt2011-08-051-8/+1
|
* ir_to_mesa: Try to avoid emitting a MOV_SAT to saturate an expression tree.Eric Anholt2011-08-051-4/+24
| | | | | | Fixes a regression in codegen quality for ff_fragment_shader conversion to GLSL -- glean texCombine produces 7.5% fewer Mesa IR instructions.
* prog_optimize: Add support for saturates to _mesa_merge_mov_into_inst.Eric Anholt2011-08-051-3/+5
| | | | | This fixes the remaining regression from ff_fragment_shader in Mesa IR instruction count, to now being a 1.9% win overall.
* mesa: pass correct constant type to _mesa_fetch_state()Brian Paul2011-08-041-1/+1
| | | | Fixes assorted warnings about float vs. gl_constant_value pointers.
* mesa: use gl_constant_value type in ARB program parserBrian Paul2011-08-042-29/+30
|
* Merge branch 'glsl-to-tgsi'Bryan Cain2011-08-049-54/+101
|\ | | | | | | | | | | Conflicts: src/mesa/state_tracker/st_atom_pixeltransfer.c src/mesa/state_tracker/st_program.c
| * mesa, glsl_to_tgsi: add native support for integers in shadersBryan Cain2011-08-012-3/+30
| | | | | | | | | | Disabled by default on all drivers. To enable it, change ctx->GLSLVersion to 130 in st_extensions.c. Currently, softpipe is the only driver with integer support.
| * mesa: support boolean and integer-based parameters in prog_parameterBryan Cain2011-08-019-49/+68
| | | | | | | | | | | | The functionality is not used by anything yet, and the glUniform functions will need to be reworked before this can reach its full usefulness. It is nonetheless a step towards integer support in the state tracker and classic drivers.
| * mesa: fix segfault when no Mesa IR is generatedBryan Cain2011-08-011-2/+3
| |
* | ir_to_mesa: Emit warnings instead of errors for IR that can't be loweredIan Romanick2011-08-021-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rely on the driver to do the right thing. This probably means falling back to software. Page 88 of the OpenGL 2.1 spec specifically says: "A shader should not fail to compile, and a program object should not fail to link due to lack of instruction space or lack of temporary variables. Implementations should ensure that all valid shaders and program objects may be successfully compiled, linked and executed." There is no provision for saying "No" to a valid shader that is difficult for the hardware to handle, so stop doing that. On i915 this causes a large number of piglit tests to change from FAIL to WARN. The warning is because the driver still emits messages to stderr like "i915_program_error: Unsupported opcode: BGNLOOP". It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag on i915 (and probably other hardware that can't handle loops). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Use Add linker_error instead of fail_linkIan Romanick2011-08-021-31/+22
| | | | | | | | | | | | | | | | The functions were almost identical. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversionIan Romanick2011-07-231-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leaving the unused registers with other values caused assertion failures and other problems in places that blindly iterate over all sources. brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr != 0' failed. Fixes i965 piglit: vs-uniform-array-mat[234]-col-row-rd vs-uniform-array-mat[234]-index-col-row-rd vs-uniform-array-mat[234]-index-row-rd vs-uniform-mat[234]-col-row-rd Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructorIan Romanick2011-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes i965 piglit: vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Fixes swrast piglit: fs-temp-array-mat[234]-col-row-wr fs-temp-array-mat[234]-index-col-row-wr fs-temp-array-mat[234]-index-row-wr fs-temp-mat[234]-col-row-wr vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Add each relative address to the previousIan Romanick2011-07-231-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes many cases of accessing arrays of matrices using non-constant indices at each level. Fixes i965 piglit: vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd Fixes swrast piglit: fs-temp-array-mat[234]-index-col-rd fs-temp-array-mat[234]-index-col-row-rd fs-temp-array-mat[234]-index-col-wr fs-uniform-array-mat[234]-index-col-rd fs-uniform-array-mat[234]-index-col-row-rd fs-varying-array-mat[234]-index-col-rd fs-varying-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd vs-uniform-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-rd vs-varying-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-wr Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: fix a warning that a variable may be uninitializedMarek Olšák2011-07-151-0/+3
| |
* | mesa: split _mesa_reference_program() into hot/cold paths.Dave Airlie2011-07-142-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | inline the hotpath of the reference remaining the same. This shouldn't penalise the slow path at all but improve the hot path so we don't have to jump to the function. It also moves some assert checks under an #ifndef NDEBUG. Minor clean-ups added by Brian. Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* | ir_to_mesa: typo fix in a comment.Eric Anholt2011-07-111-3/+3
| |
* | ir_to_mesa: Allocate temporary instructions on the visitor's ralloc contextIan Romanick2011-07-061-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | And don't delete them. Let ralloc clean them up. Deleting the temporary IR leaves dangling references in the prog_instruction. That results in a bad dereference when printing the IR with MESA_GLSL=dump. NOTE: This is a candidate for the 7.10 and 7.11 branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38584 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | ir_to_mesa: "Support" u2f, i2u, and u2i operations by doing nothing.Kenneth Graunke2011-06-291-1/+3
|/ | | | | | | | | Mesa IR actually stores all numbers as floating point, so this is totally a farce, but we may as well keep it going. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: set parameter list StateFlags field in _mesa_layout_parameters()Pierre-Eric Pelloux-Prayer2011-05-271-0/+1
| | | | | | | | | | | | When using _mesa_layout_parameters, all params copied in the 'layout' output in the PASS 1 don't modify StateFlags (because they are simply memcpy'ed). This patch fixes the problem, assuring output gl_prog_param_list StateFlags field is the same as the input one. NOTE: This is a candidate for the 7.10 branch. Signed-off-by: Brian Paul <[email protected]>
* mesa: Include shader target in dumps of GLSL source.Eric Anholt2011-05-271-1/+2
| | | | | | This makes automatic parsing of MESA_GLSL=dump output easier. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Don't append fog code for programs that don't output color.José Fonseca2011-05-111-0/+6
| | | | | | | | | Fixes fdo 36919. NOTE: This is a candidate for the stable branches. It should be cherry-picked to the sames branches that 3aa21f93dc1329c6f956277f2746c2a0bdae5446 was.
* ir_to_mesa: Emit TXD instruction.Kenneth Graunke2011-05-091-2/+11
| | | | | | | Mesa already supports this because of NV_fragment_program. Signed-off-by: Kenneth Graunke <[email protected]> Tested-by: Marek Olšák <[email protected]>
* mesa: document instructions ir_to_mesa emitsMarek Olšák2011-05-091-14/+14
| | | | | | | | | | | | | | | | | | | GLSL stopped using: BRA, EXP, LOG, LRP, NRM3, NRM4, XPD. GLSL started using: KIL, SCS, SSG, SWZ. (omg why SWZ? isn't proc_src_register flexible enough?) GLSL doesn't use these opcodes some Radeons do support: ARR, DP2A, DST, LRP, XPD. These opcodes are now unused: AND, NOT, NRM3, NRM4, OR, XOR. (plus maybe the NV extensions which are unused by Gallium) In addition to that, we don't use two-dimensional indirect addressing, which the Mesa IR can do.
* mesa: replace ONE_DIV_LN2 constant with M_LOG2EMatt Turner2011-05-061-1/+1
| | | | | | | | | | | | | 1/ln(2) is equivalent to log2(e), so define it as such. log2(e) = ln(e)/ln(2) = 1/ln(2) Worst of all, the definitions for M_LOG2E and ONE_DIV_LN2 (right beside each other!) weren't the same. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Matt Turner <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa,st/mesa: fix WPOS adjustmentChristoph Bumiller2011-05-032-3/+3
| | | | Tested-by: Marek Olšák <[email protected]>
* ir_to_mesa: remove set-but-unused variablesMarek Olšák2011-05-011-3/+2
|
* ra: Add ra_set_node_reg()Tom Stellard2011-04-302-4/+25
| | | | | | | | This function can be used to avoid creating single register classes for input/payload registers. This makes optimistic coloring less likely to fail. Reviewed-by: Eric Anholt <[email protected]>