summaryrefslogtreecommitdiffstats
path: root/src/mesa/program
Commit message (Collapse)AuthorAgeFilesLines
* mesa: Bump instruction execution limit to 65536Ian Romanick2011-08-161-1/+1
| | | | | | | | | | | Shader Model 3.0[1] requires that shaders be able to execute at least 65536 instructions. Bump Mesa maxExec to that limit. This allows several vertex shaders in the OpenGL ES 2.0 conformance test suite to run to completion. 1: http://en.wikipedia.org/wiki/High_Level_Shader_Language Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add partial constant propagation pass for Mesa IRIan Romanick2011-08-163-0/+456
| | | | | | | | | | | | This cleans up some code generated by the IR-to-Mesa pass for i915. In particular, some shaders involving arrays of constant matrices result in really bad code. v2: Silence several warnings from merging the gl_constant_value work. Fix DP[23] folding. Add support for a bunch more opcodes that appear in piglit runs on i915. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Emit a MAD(b, -a, b) for !a && bIan Romanick2011-08-161-0/+52
| | | | | | | | !a && b occurs frequently when nexted if-statements have been flattened. It should also be possible use a MAD for (a && b) || c, though that would require a MAD_SAT. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_all_equal using DP4 w/SGEIan Romanick2011-08-161-1/+12
| | | | | | | | | | | | | | | | | | | The operation ir_binop_all_equal is !(a.x != b.x || a.y != b.y || a.z != b.z || a.w != b.w). Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives !bool((int(a.x != b.x) + int(a.y == b.y) + int(a.z == b.z) + int(a.w == b.w)). This can be implemented using a dot-product with a vector of all 1.0. After the dot-product, the value will be an integer on the range [0,4]. Previously a SEQ instruction was used to clamp the resulting logic value to [0,1] and invert the result. Using an SGE instruction on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SEQ instruction. It must be emulated using two SGE instructions and a MUL. On these architectures, the single SGE saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_any_nequal using DP4 w/saturate or DP4 w/SLTIan Romanick2011-08-161-2/+20
| | | | | | | | | The operation ir_binop_any_nequal is (a.x != b.x) || (a.y != b.y) || (a.z != b.z) || (a.w != b.w), and that is the same as any(bvec4(a.x != b.x, a.y != b.y, a.z != b.z, a.w != b.w)). Implement the any() part the same way the regular ir_unop_any is implemented. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_unop_any using DP4 w/saturate or DP4 w/SLTIan Romanick2011-08-161-4/+23
| | | | | | | | | | | | | | | | | | | | | | This is just like the ir_binop_logic_or case. The operation ir_unop_any is (a.x || a.y || a.z || a.w). Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives (a.x + a.y + a.z + a.w). This can be implemented using a dot-product with a vector of all 1.0. Previously a SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the dot-product has the same effect. Adding the saturate to the dot-product is free, so (at least) one instruction is saved. In a vertex shader, using an SLT on the negation of the dot-product result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Make ir_to_mesa_visitor::emit_dp return the instructionIan Romanick2011-08-161-7/+7
| | | | Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_binop_logic_or using an add w/saturate or add w/SLTIan Romanick2011-08-161-4/+21
| | | | | | | | | | | | | | | | | | | Logical-or is implemented using addition (followed by clampling to [0,1]) on values of 0.0 and 1.0. Replacing the logical-or operators with addition gives a + b which has a result on the range [0, 2]. Previously a SNE instruction was used to clamp the resulting logic value to [0,1]. In a fragment shader, using a saturate on the add has the same effect. Adding the saturate to the add is free, so (at least) one instruction is saved. In a vertex shader, using an SLT on the negation of the add result has the same effect. Many older shader architectures do not support the SNE instruction. It must be emulated using two SLT instructions and an ADD. On these architectures, the single SLT saves two instructions. Reviewed-by: Eric Anholt <[email protected]>
* ir_to_mesa: Implement ir_unop_logic_not using 1-xIan Romanick2011-08-161-1/+7
| | | | | | | Since our logic values are 0.0 (false) and 1.0 (true), 1.0 - x accurately implements logical not. Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add a convenience interface for register allocator conflicts setup.Eric Anholt2011-08-102-0/+23
|
* mesa: whitespace changesBrian Paul2011-08-081-5/+8
|
* ir_to_mesa: Replace open-coded swizzle_for_size()Eric Anholt2011-08-051-8/+1
|
* ir_to_mesa: Try to avoid emitting a MOV_SAT to saturate an expression tree.Eric Anholt2011-08-051-4/+24
| | | | | | Fixes a regression in codegen quality for ff_fragment_shader conversion to GLSL -- glean texCombine produces 7.5% fewer Mesa IR instructions.
* prog_optimize: Add support for saturates to _mesa_merge_mov_into_inst.Eric Anholt2011-08-051-3/+5
| | | | | This fixes the remaining regression from ff_fragment_shader in Mesa IR instruction count, to now being a 1.9% win overall.
* mesa: pass correct constant type to _mesa_fetch_state()Brian Paul2011-08-041-1/+1
| | | | Fixes assorted warnings about float vs. gl_constant_value pointers.
* mesa: use gl_constant_value type in ARB program parserBrian Paul2011-08-042-29/+30
|
* Merge branch 'glsl-to-tgsi'Bryan Cain2011-08-049-54/+101
|\ | | | | | | | | | | Conflicts: src/mesa/state_tracker/st_atom_pixeltransfer.c src/mesa/state_tracker/st_program.c
| * mesa, glsl_to_tgsi: add native support for integers in shadersBryan Cain2011-08-012-3/+30
| | | | | | | | | | Disabled by default on all drivers. To enable it, change ctx->GLSLVersion to 130 in st_extensions.c. Currently, softpipe is the only driver with integer support.
| * mesa: support boolean and integer-based parameters in prog_parameterBryan Cain2011-08-019-49/+68
| | | | | | | | | | | | The functionality is not used by anything yet, and the glUniform functions will need to be reworked before this can reach its full usefulness. It is nonetheless a step towards integer support in the state tracker and classic drivers.
| * mesa: fix segfault when no Mesa IR is generatedBryan Cain2011-08-011-2/+3
| |
* | ir_to_mesa: Emit warnings instead of errors for IR that can't be loweredIan Romanick2011-08-021-4/+24
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Rely on the driver to do the right thing. This probably means falling back to software. Page 88 of the OpenGL 2.1 spec specifically says: "A shader should not fail to compile, and a program object should not fail to link due to lack of instruction space or lack of temporary variables. Implementations should ensure that all valid shaders and program objects may be successfully compiled, linked and executed." There is no provision for saying "No" to a valid shader that is difficult for the hardware to handle, so stop doing that. On i915 this causes a large number of piglit tests to change from FAIL to WARN. The warning is because the driver still emits messages to stderr like "i915_program_error: Unsupported opcode: BGNLOOP". It also fixes ES2 conformance CorrectFull_frag and CorrectParse1_frag on i915 (and probably other hardware that can't handle loops). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Use Add linker_error instead of fail_linkIan Romanick2011-08-021-31/+22
| | | | | | | | | | | | | | | | The functions were almost identical. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: Set unused regs to PROGRAM_UNDEFINED after CMP->MOV conversionIan Romanick2011-07-231-0/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Leaving the unused registers with other values caused assertion failures and other problems in places that blindly iterate over all sources. brw_vs_emit.c:1381: get_src_reg: Assertion `c->regs[file][index].nr != 0' failed. Fixes i965 piglit: vs-uniform-array-mat[234]-col-row-rd vs-uniform-array-mat[234]-index-col-row-rd vs-uniform-array-mat[234]-index-row-rd vs-uniform-mat[234]-col-row-rd Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Copy reladdr in src_reg(dst_reg) constructorIan Romanick2011-07-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fixes i965 piglit: vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Fixes swrast piglit: fs-temp-array-mat[234]-col-row-wr fs-temp-array-mat[234]-index-col-row-wr fs-temp-array-mat[234]-index-row-wr fs-temp-mat[234]-col-row-wr vs-temp-array-mat[234]-col-row-wr vs-temp-array-mat[234]-index-col-row-wr vs-temp-array-mat[234]-index-row-wr vs-temp-mat[234]-col-row-wr Reviewed-by: Eric Anholt <[email protected]>
* | ir_to_mesa: Add each relative address to the previousIan Romanick2011-07-231-0/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This fixes many cases of accessing arrays of matrices using non-constant indices at each level. Fixes i965 piglit: vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd Fixes swrast piglit: fs-temp-array-mat[234]-index-col-rd fs-temp-array-mat[234]-index-col-row-rd fs-temp-array-mat[234]-index-col-wr fs-uniform-array-mat[234]-index-col-rd fs-uniform-array-mat[234]-index-col-row-rd fs-varying-array-mat[234]-index-col-rd fs-varying-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-rd vs-temp-array-mat[234]-index-col-row-rd vs-temp-array-mat[234]-index-col-wr vs-uniform-array-mat[234]-index-col-rd vs-uniform-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-rd vs-varying-array-mat[234]-index-col-row-rd vs-varying-array-mat[234]-index-col-wr Reviewed-by: Eric Anholt <[email protected]>
* | prog_optimize: fix a warning that a variable may be uninitializedMarek Olšák2011-07-151-0/+3
| |
* | mesa: split _mesa_reference_program() into hot/cold paths.Dave Airlie2011-07-142-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | inline the hotpath of the reference remaining the same. This shouldn't penalise the slow path at all but improve the hot path so we don't have to jump to the function. It also moves some assert checks under an #ifndef NDEBUG. Minor clean-ups added by Brian. Signed-off-by: Dave Airlie <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* | ir_to_mesa: typo fix in a comment.Eric Anholt2011-07-111-3/+3
| |
* | ir_to_mesa: Allocate temporary instructions on the visitor's ralloc contextIan Romanick2011-07-061-16/+12
| | | | | | | | | | | | | | | | | | | | | | | | And don't delete them. Let ralloc clean them up. Deleting the temporary IR leaves dangling references in the prog_instruction. That results in a bad dereference when printing the IR with MESA_GLSL=dump. NOTE: This is a candidate for the 7.10 and 7.11 branches. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=38584 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* | ir_to_mesa: "Support" u2f, i2u, and u2i operations by doing nothing.Kenneth Graunke2011-06-291-1/+3
|/ | | | | | | | | Mesa IR actually stores all numbers as floating point, so this is totally a farce, but we may as well keep it going. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa: set parameter list StateFlags field in _mesa_layout_parameters()Pierre-Eric Pelloux-Prayer2011-05-271-0/+1
| | | | | | | | | | | | When using _mesa_layout_parameters, all params copied in the 'layout' output in the PASS 1 don't modify StateFlags (because they are simply memcpy'ed). This patch fixes the problem, assuring output gl_prog_param_list StateFlags field is the same as the input one. NOTE: This is a candidate for the 7.10 branch. Signed-off-by: Brian Paul <[email protected]>
* mesa: Include shader target in dumps of GLSL source.Eric Anholt2011-05-271-1/+2
| | | | | | This makes automatic parsing of MESA_GLSL=dump output easier. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Don't append fog code for programs that don't output color.José Fonseca2011-05-111-0/+6
| | | | | | | | | Fixes fdo 36919. NOTE: This is a candidate for the stable branches. It should be cherry-picked to the sames branches that 3aa21f93dc1329c6f956277f2746c2a0bdae5446 was.
* ir_to_mesa: Emit TXD instruction.Kenneth Graunke2011-05-091-2/+11
| | | | | | | Mesa already supports this because of NV_fragment_program. Signed-off-by: Kenneth Graunke <[email protected]> Tested-by: Marek Olšák <[email protected]>
* mesa: document instructions ir_to_mesa emitsMarek Olšák2011-05-091-14/+14
| | | | | | | | | | | | | | | | | | | GLSL stopped using: BRA, EXP, LOG, LRP, NRM3, NRM4, XPD. GLSL started using: KIL, SCS, SSG, SWZ. (omg why SWZ? isn't proc_src_register flexible enough?) GLSL doesn't use these opcodes some Radeons do support: ARR, DP2A, DST, LRP, XPD. These opcodes are now unused: AND, NOT, NRM3, NRM4, OR, XOR. (plus maybe the NV extensions which are unused by Gallium) In addition to that, we don't use two-dimensional indirect addressing, which the Mesa IR can do.
* mesa: replace ONE_DIV_LN2 constant with M_LOG2EMatt Turner2011-05-061-1/+1
| | | | | | | | | | | | | 1/ln(2) is equivalent to log2(e), so define it as such. log2(e) = ln(e)/ln(2) = 1/ln(2) Worst of all, the definitions for M_LOG2E and ONE_DIV_LN2 (right beside each other!) weren't the same. Reviewed-by: Alex Deucher <[email protected]> Signed-off-by: Matt Turner <[email protected]> Signed-off-by: Brian Paul <[email protected]>
* mesa,st/mesa: fix WPOS adjustmentChristoph Bumiller2011-05-032-3/+3
| | | | Tested-by: Marek Olšák <[email protected]>
* ir_to_mesa: remove set-but-unused variablesMarek Olšák2011-05-011-3/+2
|
* ra: Add ra_set_node_reg()Tom Stellard2011-04-302-4/+25
| | | | | | | | This function can be used to avoid creating single register classes for input/payload registers. This makes optimistic coloring less likely to fail. Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add a bunch of documentation to the register allocator.Eric Anholt2011-04-291-3/+65
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* prog_print: Add support for printing the TXD opcode.Kenneth Graunke2011-04-281-0/+7
| | | | Reviewed-by: Brian Paul <[email protected]>
* mesa: emit more info in program parser error messageBrian Paul2011-04-271-1/+5
|
* hash_table: Add an iterator for doing things like cleanup of the HT.Eric Anholt2011-04-262-0/+26
| | | | | | | Without this, consumers often have to keep linked lists of the entries, at additional malloc cost. Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Fix fragment.color (no index) writes with OPTION ARB_draw_buffers.Eric Anholt2011-04-231-3/+8
| | | | | | | | | Fixes a bug in Trine where fragment.color would write FRAG_RESULT_COLOR (which is interpreted by drivers as being the "write this to all color buffers" option) instead of FRAG_RESULT_DATA0 (just the first target). Fixes piglit ATI_draw_buffers/arbfp-no-index.
* mesa: Kill gl_fragment_program::FogOption with fireIan Romanick2011-04-214-26/+30
| | | | | | | | | | | | All drivers expect this to always be GL_NONE. Don't let there be any opportunity for a bad value to leak out and infect some unsuspecting driver. If any driver for hardware that had fixed-function per-fragment fog (i915 and perhaps some r300-ish) was ever going to add support, it would have done it by now. Reviewed-by: Eric Anholt <[email protected]> Acked-by: Corbin Simpson <[email protected]> Acked-by: Alex Deucher <[email protected]>
* prog_optimize: Add simplify CMP optimization passTom Stellard2011-04-161-0/+78
| | | | | | | This pass coverts CMP T0, T1 T2 T0 -> MOV T0, T2 when the CMP instruction is the first instruction to write to register T0. This pass is useful for hardware that requires a lot of lowering passes that generate many CMP instructions.
* prog_optimize: get_src_arg_mask() respect writemask for more opcodesTom Stellard2011-04-161-0/+11
| | | | Reviewed-by: Eric Anholt <[email protected]>
* mesa: Add support for OPTION ATI_draw_buffers to ARB_fp.Eric Anholt2011-04-131-0/+10
| | | | | | Tested by piglit ati_draw_buffers-arbfp. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* mesa: Add support for the ARB_fragment_program part of ARB_draw_buffers.Eric Anholt2011-04-132-0/+30
| | | | | | | | Fixes fbo-drawbuffers-arbfp. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=34321 Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* ir_to_mesa: silence signed/unsigned comparison warningsBrian Paul2011-04-111-2/+2
|