path: root/src/glsl
Commit message | Author | Age | Files | Lines
* nir: Recognize and reduce duplicated fsats. | Eric Anholt | 2015-02-18 | 1 | -0/+2
| No effect on vc4 shader-db.
| v2: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fsat. | Eric Anholt | 2015-02-18 | 2 | -1/+3
| vc4 cse/algebraic-disabled stats:
| total instructions in shared programs: 44356 -> 44354 (-0.00%)
| instructions in affected programs: 55 -> 53 (-3.64%)
| v2: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering ffma. | Eric Anholt | 2015-02-18 | 2 | -1/+3
| vc4 cse/algebraic-disabled stats:
| total uniforms in shared programs: 13966 -> 13791 (-1.25%)
| uniforms in affected programs: 435 -> 260 (-40.23%)
| total instructions in shared programs: 44732 -> 44356 (-0.84%)
| instructions in affected programs: 9599 -> 9223 (-3.92%)
| v2: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fneg/ineg. | Eric Anholt | 2015-02-18 | 2 | -0/+12
| vc4 cse/algebraic-disabled stats:
| total instructions in shared programs: 44911 -> 44732 (-0.40%)
| instructions in affected programs: 11371 -> 11192 (-1.57%)
| v2: Fix broken iabs(isub(0, a)) transformation.
| v3: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add a flag for lowering fsqrt(x) to frcp(frsqrt(x)). | Eric Anholt | 2015-02-18 | 2 | -1/+3
| vc4 cse/algebraic-disabled stats:
| total uniforms in shared programs: 13972 -> 13966 (-0.04%)
| uniforms in affected programs: 408 -> 402 (-1.47%)
| total instructions in shared programs: 44973 -> 44911 (-0.14%)
| instructions in affected programs: 1551 -> 1489 (-4.00%)
| v2: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* nir: Add lowering of POW instructions if the lower flag is set. | Eric Anholt | 2015-02-18 | 1 | -0/+1
| This could be done in a separate pass like we do in GLSL IR, but it seems to me like having the definitions of the transformations in the two directions next to each other makes a lot of sense.
| v2: Reorder the comment about the transformation.
| Reviewed-by: Kenneth Graunke <[email protected]>
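The commit message does not spell out the formula. The usual identity for lowering a power operation is pow(x, y) = exp2(y * log2(x)), and the sketch below only checks that identity numerically. The function name `lowered_fpow` is hypothetical, the code is not taken from Mesa, and the identity only holds for x > 0.

```python
import math

def lowered_fpow(x, y):
    # pow(x, y) expressed as exp2(y * log2(x)); only meaningful for x > 0.
    return 2.0 ** (y * math.log2(x))

def matches_pow(x, y, rel_tol=1e-9):
    # Compare the lowered form against Python's own power operator.
    return math.isclose(lowered_fpow(x, y), x ** y, rel_tol=rel_tol)
```

Keeping both directions of such a rewrite side by side, as the message suggests, makes it easier to see that one rule undoes the other.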
* nir: Conditionalize the POW reconstruction on shader compiler options. | Eric Anholt | 2015-02-18 | 3 | -2/+6
| Mesa has a shader compiler struct flagging whether GLSL IR's opt_algebraic and other passes should try and generate certain types of opcodes or patterns. Extend that to NIR by defining our own struct, which is automatically generated from the Mesa struct in glsl_to_nir and provided directly by the driver in TGSI-to-NIR.
| v2: Split out the previous two prep patches.
| v3: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v2)
* nir: Add an optional expression controlling nir_algebraic xforms. | Eric Anholt | 2015-02-18 | 1 | -7/+32
| This will be used so that we can customize the transforms for the target GPU, so we don't un-lower expressions that had already been lowered (or introduce new lowering transformations that not all GPUs want)
| v2: Drop the complication of having the condition->index dictionary, since we don't actually expect there to be many different conditions (change by Kenneth).
| Reviewed-by: Kenneth Graunke <[email protected]>
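A minimal sketch of a rewrite table whose rules carry an optional condition, in the spirit of the change above. Everything here is illustrative rather than Mesa's generator: the rule layout `(search, replace, condition)`, the `RULES` contents, the `options` dictionary, and the helpers `match`, `build`, and `rewrite` are all assumptions.

```python
def match(pattern, expr, env):
    # A string pattern is a variable: bind it, or check an earlier binding.
    if isinstance(pattern, str):
        if pattern in env:
            return env[pattern] == expr
        env[pattern] = expr
        return True
    # A tuple pattern is (opcode, operand, ...): compare opcodes, then recurse.
    if isinstance(pattern, tuple):
        if (not isinstance(expr, tuple) or len(expr) != len(pattern)
                or expr[0] != pattern[0]):
            return False
        return all(match(p, e, env) for p, e in zip(pattern[1:], expr[1:]))
    # Anything else is a literal constant that must match exactly.
    return pattern == expr

def build(template, env):
    # Instantiate the replacement using the variables bound by match().
    if isinstance(template, str):
        return env[template]
    if isinstance(template, tuple):
        return (template[0],) + tuple(build(t, env) for t in template[1:])
    return template

# (search, replace, condition); condition None means "always enabled".
RULES = [
    (("fsqrt", "a"), ("frcp", ("frsqrt", "a")), "lower_fsqrt"),
    (("fadd", "a", 0.0), "a", None),
]

def rewrite(expr, options):
    for search, replace, condition in RULES:
        # Skip rules whose condition is not enabled for this target.
        if condition is not None and not options.get(condition, False):
            continue
        env = {}
        if match(search, expr, env):
            return build(replace, env)
    return expr
```

With `{"lower_fsqrt": True}` the first rule fires; with an empty options dictionary the same expression is left alone, which is the per-target behavior the message describes.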
* nir: Add a nir_shader_compiler_options struct pointed to by the shaders. | Eric Anholt | 2015-02-18 | 3 | -4/+38
| This will be used to give the optimization passes a chance to customize behavior for the particular target device.
| v2: Rebase to master (no TGSI->NIR present)
| Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* Avoid fighting with Solaris headers over isnormal() | Alan Coopersmith | 2015-02-17 | 1 | -1/+1
| When compiling in C99 or C++11 modes, Solaris defines isnormal() as a macro via <math.h>, which causes the function definition to become too mangled to compile.
| Signed-off-by: Alan Coopersmith <[email protected]>
| Cc: "10.5" <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* Remove extraneous ; after DECL_TYPE usage | Alan Coopersmith | 2015-02-17 | 1 | -33/+33
| The macro is defined to provide a trailing ; so this caused the expansion to end in ";;" which made the Solaris Studio compilers issue warnings for every line of:
|   "builtin_type_macros.h", line 113: Warning: extra ";" ignored.
| for every file that included the header, filling build logs with thousands of useless warnings.
| Signed-off-by: Alan Coopersmith <[email protected]>
| Cc: "10.5" <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* glsl: Reduce memory consumption of copy propagation passes. | Kenneth Graunke | 2015-02-17 | 2 | -6/+25
| opt_copy_propagation and opt_copy_propagation_elements create new ACP and Kill sets each time they enter a new control flow block. For if blocks, they also copy the entire existing ACP set contents into the new set.
|
| When we exit the control flow block, we discard the new sets. However, we weren't freeing them - so they lived on until the pass finished. This can waste a lot of memory (57MB on one pessimal shader).
|
| This patch makes the pass allocate ACP entries using this->acp as the memory context, and Kill entries out of this->kill. It also steals kill entries when moving them from the inner kill list to the parent. It then frees the lists, including their contents.
|
| v2: Move ralloc_free(this->acp) just before this->acp = orig_acp (suggested by Eric Anholt).
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
| Cc: "10.5 10.4" <[email protected]>
* glcpp: Silence GCC warning | Ian Romanick | 2015-02-17 | 1 | -1/+1
| glcpp/glcpp.c:124:1: warning: ‘static’ is not at beginning of declaration [-Wold-style-declaration]
|  const static struct option
|  ^
| Signed-off-by: Ian Romanick <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* glsl/tests: add IMAGE type. | Ilia Mirkin | 2015-02-17 | 1 | -0/+3
| This fixes a warning when running make check.
| Signed-off-by: Ilia Mirkin <[email protected]>
| Reviewed-by: Dave Airlie <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* nir: Make gl_FrontFacing a system_value | Jason Ekstrand | 2015-02-14 | 1 | -2/+8
| GLSL IR labels gl_FrontFacing as an input variable and not a system value. This commit makes NIR silently translate gl_FrontFacing to a system value so that it properly gets translated into a load_system_value intrinsic.
| Reviewed-by: Connor Abbott <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* nir/lower_phis_to_scalar: Fix some logic in is_phi_scalarizable | Jason Ekstrand | 2015-02-14 | 1 | -3/+3
| Reviewed-by: Matt Turner <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add missing header to the sources list | Emil Velikov | 2015-02-12 | 1 | -0/+1
| Cc: "10.5" <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Connor Abbott <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* nir: resolve nir.h dependency list (fix make distcheck) | Emil Velikov | 2015-02-12 | 1 | -1/+1
| Use nir/nir_opcodes.h as is (w/o the absolute path), as it is the target name used to generate the actual file. Otherwise the target is missing, the file won't get generated and the build will fail.
| Cc: "10.5" <[email protected]>
| Signed-off-by: Emil Velikov <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* glsl: Optimize (f2i(trunc x)) into (f2i x). | Matt Turner | 2015-02-11 | 1 | -0/+9
| total instructions in shared programs: 5950326 -> 5949286 (-0.02%)
| instructions in affected programs: 88264 -> 87224 (-1.18%)
| helped: 692
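The rewrite is safe because a float-to-int conversion that rounds toward zero already discards the fraction, so truncating first changes nothing. A small sketch of that reasoning, assuming `f2i` behaves like Python's `int()` on floats (the helper names are hypothetical):

```python
import math

def f2i(x):
    # Float-to-int conversion that rounds toward zero.
    return int(x)

def trunc_is_redundant(values):
    # True if f2i(trunc(v)) equals f2i(v) for every sample value.
    return all(f2i(float(math.trunc(v))) == f2i(v) for v in values)
```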
* glsl: Optimize round-half-up pattern. | Matt Turner | 2015-02-11 | 1 | -0/+33
| Hurts some Psychonauts shaders, but after the next patch (which this enables) they're fewer instructions than before this patch.
* glsl: Add trunc() to ir_builder. | Matt Turner | 2015-02-11 | 2 | -0/+6
* nir: Recognize open-coded fmin/fmax. | Matt Turner | 2015-02-11 | 1 | -0/+2
| And unfortunately other shaders do the same thing but with >=/<= which we can't apply this optimization to because of NaNs.
| instructions in affected programs: 23309 -> 22938 (-1.59%)
| Reviewed-by: Kenneth Graunke <[email protected]>
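The NaN concern can be illustrated with two open-coded minimums. The exact patterns the pass matches are not shown in the message, so the two spellings below are assumptions chosen for illustration: they agree on ordinary numbers but return different operands when one input is NaN, because every ordered comparison with NaN is false.

```python
import math

def min_lt(a, b):
    # Open-coded minimum with a strict comparison: (a < b) ? a : b
    return a if a < b else b

def min_ge(a, b):
    # Open-coded minimum with >=: (a >= b) ? b : a
    return b if a >= b else a

def same_result(a, b):
    # Treat two NaN results as equal so only real disagreements show up.
    x, y = min_lt(a, b), min_ge(a, b)
    return x == y or (math.isnan(x) and math.isnan(y))
```

For example, `same_result(1.0, 2.0)` holds, while `same_result(float("nan"), 1.0)` does not.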
* nir: Add algebraic opt for int comparisons with identical operands. | Eric Anholt | 2015-02-11 | 1 | -0/+9
| No change on shader-db on i965.
| v2: Reword the comment due to feedback from Erik Faye-Lund
| Reviewed-by: Connor Abbott <[email protected]> (v1)
| Reviewed-by: Jason Ekstrand <[email protected]> (v1)
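A sketch of what folding a comparison of a value with itself could look like. The opcode names and the `FOLDED` table are assumptions for illustration, not the rules from the commit. Integers have no NaN, so the result of such a comparison is a known constant.

```python
# Constant result of comparing an integer value with itself (assumed opcodes).
FOLDED = {"ilt": False, "ige": True, "ieq": True, "ine": False}

def fold_self_compare(op, lhs, rhs):
    # lhs and rhs are SSA value names; fold only when they are the same value.
    if lhs == rhs and op in FOLDED:
        return FOLDED[op]
    return None  # not foldable by this rule
```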
* nir: Fix load_const comparisons for CSE. | Eric Anholt | 2015-02-11 | 1 | -1/+1
| We want the size of a float per component, not the size of a whole vec4.
|
| NIR instructions on i965:
| total instructions in shared programs: 1261937 -> 1261929 (-0.00%)
| instructions in affected programs: 114 -> 106 (-7.02%)
|
| Looking at one of these examples (tesseract), it's from vec4 load_consts for a MRT solid fill, which do get CSEed now that we don't memcmp off the end of the const value and into the SSA def. For the 1-component loads that are common in i965, we were only memcmping off into the rest of the usually zero-filled const_value.
| Reviewed-by: Connor Abbott <[email protected]>
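The effect described above can be modeled with byte strings. The layout below (four float slots followed by a made-up SSA index) is a hypothetical stand-in for the real instruction struct, and `equal_buggy` and `equal_fixed` are illustrative names. The point is only how many bytes get compared.

```python
import struct

FLOAT_SIZE = struct.calcsize("f")  # bytes per float component

def make_load_const(values, ssa_index):
    # Four float slots (zero-filled past the used components), followed by
    # unrelated per-instruction data standing in for the SSA def.
    padded = list(values) + [0.0] * (4 - len(values))
    return struct.pack("4f", *padded) + struct.pack("I", ssa_index)

def equal_fixed(a, b, num_components):
    # Compare one float per component.
    size = num_components * FLOAT_SIZE
    return a[:size] == b[:size]

def equal_buggy(a, b, num_components):
    # Compare a whole vec4 per component, running past the constant value.
    size = num_components * 4 * FLOAT_SIZE
    return a[:size] == b[:size]
```

In this model two identical vec4 constants with different SSA indices compare unequal under `equal_buggy`, while single-component constants still compare equal because the extra bytes are zero-filled.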
* glsl: Optimize 1/exp(x) into exp(-x). | Matt Turner | 2015-02-10 | 1 | -0/+6
| Lots of shaders divide by exp2(...) which we turn into a multiplication by the reciprocal. We can avoid the reciprocal by simply negating exp2's argument.
| total instructions in shared programs: 5947154 -> 5946695 (-0.01%)
| instructions in affected programs: 118661 -> 118202 (-0.39%)
| helped: 380
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Ian Romanick <[email protected]>
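The identity 1 / exp2(x) = exp2(-x) is easy to check numerically. A short sketch with hypothetical helper names, not code from the patch:

```python
import math

def rcp_exp2(x):
    # Original form: reciprocal of exp2(x).
    return 1.0 / (2.0 ** x)

def exp2_neg(x):
    # Optimized form: exp2 of the negated argument, no reciprocal needed.
    return 2.0 ** (-x)

def forms_agree(x, rel_tol=1e-12):
    return math.isclose(rcp_exp2(x), exp2_neg(x), rel_tol=rel_tol)
```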
* nir: Remove casts from void*. | Matt Turner | 2015-02-10 | 4 | -14/+13
| Reviewed-by: Connor Abbott <[email protected]>
* nir: Replace assert(0) with unreachable(). | Matt Turner | 2015-02-10 | 1 | -7/+7
| Reviewed-by: Connor Abbott <[email protected]>
* nir: Remove unused has_indirect variable. | Matt Turner | 2015-02-10 | 1 | -4/+0
| Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Forbid calling the constructor of any opaque type. | Francisco Jerez | 2015-02-10 | 1 | -3/+3
| The spec doesn't define any opaque type constructors.
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: Return correct number of coordinate components for cubemap array images. | Francisco Jerez | 2015-02-10 | 1 | -2/+7
| Cubemap array images are unlike cubemap array samplers in that they don't need an additional coordinate to index individual cubemaps in the array, instead they behave like a 2D array of 6n layers, with n the number of cubemaps in the array. Take this exception into account.
| Reviewed-by: Ian Romanick <[email protected]>
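The exception can be written down as a small function. This is a sketch under assumed names (`BASE_COORDS`, `coordinate_components`, and the string dimension labels are illustrative), not the Mesa implementation:

```python
# Coordinates needed to address one non-array texture of each dimensionality.
BASE_COORDS = {"1D": 1, "2D": 2, "3D": 3, "Cube": 3}

def coordinate_components(dim, is_array, is_image):
    size = BASE_COORDS[dim]
    # Array types take one extra coordinate to select the layer, except
    # cubemap array images, which are addressed as a 2D array of 6n layers.
    if is_array and not (is_image and dim == "Cube"):
        size += 1
    return size
```

Under this model a cubemap array sampler needs four coordinates, while a cubemap array image needs three.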
* nir: Mark nir_print_instr's instr pointer as const. | Kenneth Graunke | 2015-02-10 | 2 | -3/+3
| Printing instructions doesn't modify them, so we can mark the parameter const.
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Fix broken fsat recognizer. | Eric Anholt | 2015-02-06 | 1 | -1/+1
| We've probably never seen this ridiculous pattern in the wild, so it didn't matter.
| Reviewed-by: Connor Abbott <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Slightly simplify algebraic code generation by reusing a struct. | Eric Anholt | 2015-02-06 | 1 | -6/+3
| Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: GLSL ES identifiers cannot exceed 1024 characters | Iago Toral Quiroga | 2015-02-06 | 1 | -1/+7
| v2 (Ian Romanick)
|  - Move the check to the lexer before rallocing a copy of the large string.
| Fixes the following 2 dEQP tests:
|  dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_vertex
|  dEQP-GLES3.functional.shaders.keywords.invalid_identifiers.max_length_fragment
| Reviewed-by: Ian Romanick <[email protected]>
* nir: add an optimization to remove useless phi nodes | Connor Abbott | 2015-02-03 | 3 | -0/+112
| This removes phi nodes whose sources all point to the same thing.
|
| Shader-db results:
| total NIR instructions in shared programs: 2045293 -> 2041209 (-0.20%)
| NIR instructions in affected programs: 126564 -> 122480 (-3.23%)
| helped: 615
| HURT: 0
|
| total FS instructions in shared programs: 4321840 -> 4320392 (-0.03%)
| FS instructions in affected programs: 24622 -> 23174 (-5.88%)
| helped: 138
| HURT: 0
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Tested-by: Jason Ekstrand <[email protected]>
| Signed-off-by: Connor Abbott <[email protected]>
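The core idea can be sketched on a toy representation where each phi is a name mapped to a list of source names. This is an illustrative model only; the real pass works on NIR instructions and may handle cases (such as phis that reference themselves) that this sketch ignores.

```python
def remove_useless_phis(phis):
    """Split phis into those to keep and those replaceable by one source.

    phis maps a phi's name to the list of its source value names.
    Returns (remaining, replacements), where replacements maps each
    removed phi to the single value all of its sources pointed to.
    """
    remaining = {}
    replacements = {}
    for name, sources in phis.items():
        unique = set(sources)
        if len(unique) == 1:
            # Every predecessor supplies the same value: the phi is useless.
            replacements[name] = unique.pop()
        else:
            remaining[name] = sources
    return remaining, replacements
```

Uses of a removed phi would then be rewritten to the value recorded in `replacements`.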
* nir/validate: Ensure that phi sources are SSA-only | Jason Ekstrand | 2015-02-03 | 1 | -10/+3
| Reviewed-by: Connor Abbott <[email protected]>
* nir/validate: Validate that only float ALU outputs are saturated | Jason Ekstrand | 2015-02-03 | 1 | -0/+8
| Reviewed-by: Connor Abbott <[email protected]>
* nir/lower_source_mods: Don't lower saturate for non-float outputs | Jason Ekstrand | 2015-02-03 | 1 | -0/+4
| Reviewed-by: Connor Abbott <[email protected]>
* nir: Add a pass to lower vector phi nodes to scalar phi nodes | Jason Ekstrand | 2015-02-03 | 3 | -0/+293
| v2 Jason Ekstrand <[email protected]>:
|  - Add better comments
|  - Use nir_ssa_dest_init and nir_src_for_ssa in more places
|  - Fix some void * casts
| v3 Jason Ekstrand <[email protected]>:
|  - Rework the way we determine whether or not to scalarize a phi node to make the recursion non-bogus
|  - Treat load_const instructions as scalarizable
| v4 Jason Ekstrand <[email protected]>:
|  - Allow uniform and input loads to be scalarizable
| v5 Jason Ekstrand <[email protected]>:
|  - Also consider loads of inputs (varying, uniform, or ubo) to be scalarizable. We were already doing this for load_var on uniforms and inputs.
| Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/list: Note that exec_lists may not be realloc'd. | Matt Turner | 2015-02-03 | 1 | -0/+4
| Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Improve precision of mod(x,y) | Iago Toral Quiroga | 2015-02-03 | 3 | -28/+41
| Currently, Mesa uses the lowering pass MOD_TO_FRACT to implement mod(x,y) as y * fract(x/y). This implementation has a down side though: it introduces precision errors due to the fract() operation. Even worse, since the result of fract() is multiplied by y, the larger y gets the larger the precision error we produce, so for large enough numbers the precision loss is significant. Some examples on i965:
|
| Operation                          Precision error
| -----------------------------------------------------
| mod(-1.951171875, 1.9980468750)    0.0000000447
| mod(121.57, 13.29)                 0.0000023842
| mod(3769.12, 321.99)               0.0000762939
| mod(3769.12, 1321.99)              0.0001220703
| mod(-987654.125, 123456.984375)    0.0160663128
| mod( 987654.125, 123456.984375)    0.0312500000
|
| This patch replaces the current lowering pass with a different one (MOD_TO_FLOOR) that follows the recommended implementation in the GLSL man pages:
|
|   mod(x,y) = x - y * floor(x/y)
|
| This implementation eliminates the precision errors at the expense of an additional add instruction on some systems. On systems that can do negate with multiply-add in a single operation this new implementation would come at no additional cost.
|
| v2 (Ian Romanick)
|  - Do not clone operands because when they are expressions we would be duplicating them and that can lead to suboptimal code.
|
| Fixes the following 16 dEQP tests:
|  dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.mediump_*
|  dEQP-GLES3.functional.shaders.builtin_functions.precision.mod.highp_*
| Reviewed-by: Ian Romanick <[email protected]>
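The two lowerings can be compared on the CPU by rounding to 32-bit floats after every operation. This is a rough emulation under stated assumptions: the helper `f32`, the exact-arithmetic reference `exact_mod`, and the use of a true division are all choices made for this sketch. GPU hardware may compute the division differently, so the i965 error figures quoted above are not expected to reproduce exactly.

```python
import math
import struct
from fractions import Fraction

def f32(v):
    # Round a Python float to the nearest 32-bit float.
    return struct.unpack("f", struct.pack("f", v))[0]

def mod_to_fract(x, y):
    # y * fract(x / y), rounding to float32 after each step.
    q = f32(x / y)
    fract = f32(q - math.floor(q))
    return f32(y * fract)

def mod_to_floor(x, y):
    # x - y * floor(x / y), rounding to float32 after each step.
    q = f32(x / y)
    return f32(x - f32(y * math.floor(q)))

def exact_mod(x, y):
    # Reference value computed with exact rational arithmetic.
    fx, fy = Fraction(x), Fraction(y)
    return fx - fy * math.floor(fx / fy)

def errors(x, y):
    # Absolute error of each lowering for float32 inputs x and y.
    x, y = f32(x), f32(y)
    ref = float(exact_mod(x, y))
    return abs(mod_to_fract(x, y) - ref), abs(mod_to_floor(x, y) - ref)
```

Calling `errors(3769.12, 321.99)` returns the pair (fract-based error, floor-based error), which makes it easy to try the operand pairs from the table.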
* glsl: can't have 'const' qualifier used with struct or interface block members | Iago Toral Quiroga | 2015-02-03 | 1 | -0/+7
| Fixes the following 2 dEQP tests:
|  dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_vertex
|  dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_const_fragment
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: interface blocks must be declared at global scope | Iago Toral Quiroga | 2015-02-03 | 1 | -0/+8
| Fixes the following 2 dEQP tests:
|  dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_vertex
|  dEQP-GLES3.functional.shaders.declarations.invalid_declarations.uniform_block_in_main_fragment
| Reviewed-by: Ian Romanick <[email protected]>
* glsl: Pick ast_conditional branch regardless of op1/2 being constant. | Kenneth Graunke | 2015-02-02 | 1 | -4/+2
| If the ?: operator's condition is a constant value, and both branches were pure expressions, we can just make the resulting value one or the other.
|
| Previously, we only did this if op[1] and op[2] were also constant values - but there's no actual reason for that restriction.
|
| No changes in shader-db, probably because we usually optimize this later anyway. But it does make us generate less stupid code up front.
| Signed-off-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
* nir/opt_algebraic: Add some constant bcsel reductions | Jason Ekstrand | 2015-01-29 | 1 | -2/+28
| total instructions in shared programs: 5998190 -> 5997603 (-0.01%)
| instructions in affected programs: 54276 -> 53689 (-1.08%)
| helped: 293
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir/opt_algebraic: Add some boolean simplifications | Jason Ekstrand | 2015-01-29 | 1 | -4/+5
| total instructions in shared programs: 5998321 -> 5998287 (-0.00%)
| instructions in affected programs: 4520 -> 4486 (-0.75%)
| helped: 8
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir/algebraic: Support specifying variable as constant or by type | Jason Ekstrand | 2015-01-29 | 2 | -6/+26
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir/algebraic: Fail to compile if a variable is used in a replace but not the search | Jason Ekstrand | 2015-01-29 | 1 | -0/+7
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir/search: Allow for matching variables based on types | Jason Ekstrand | 2015-01-29 | 2 | -0/+23
| This allows you to match on an unknown value but only if it is of a given type. 90% of the uses of this are for matching only booleans, but adding the generality of arbitrary types is no more complex.
|
| nir_algebraic.py doesn't handle this yet but that's ok because the C language will ensure that the default type on all variables is void.
| Reviewed-by: Kenneth Graunke <[email protected]>
* nir/search: Add support for matching unknown constants | Jason Ekstrand | 2015-01-29 | 2 | -0/+13
| There are some algebraic transformations that we want to do but only if certain things are constants. For instance, we may want to replace a * (b + c) with (a * b) + (a * c) as long as a and either b or c is constant. While this generates more instructions, some of it will get constant folded.
|
| nir_algebraic.py doesn't handle this yet, but that's ok because the C language will make sure that false is the default for now.
| Reviewed-by: Kenneth Graunke <[email protected]>
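The distribution example from the message can be sketched on a toy expression tree. The tuple encoding and the helpers `is_const`, `fold`, and `distribute` are assumptions made for illustration and are unrelated to the nir_search data structures.

```python
def is_const(e):
    # In this toy model, Python numbers are constants; strings are unknowns.
    return isinstance(e, (int, float))

def fold(expr):
    # Constant-fold a multiplication whose operands are both constants.
    op, lhs, rhs = expr
    if op == "mul" and is_const(lhs) and is_const(rhs):
        return lhs * rhs
    return expr

def distribute(expr):
    # Rewrite a * (b + c) as (a * b) + (a * c), but only when a is constant
    # and at least one of b or c is constant, so part of the result folds.
    if (isinstance(expr, tuple) and len(expr) == 3 and expr[0] == "mul"
            and isinstance(expr[2], tuple) and len(expr[2]) == 3
            and expr[2][0] == "add"):
        a, (_, b, c) = expr[1], expr[2]
        if is_const(a) and (is_const(b) or is_const(c)):
            return ("add", fold(("mul", a, b)), fold(("mul", a, c)))
    return expr
```

For `("mul", 2.0, ("add", "x", 3.0))` the second product folds to a constant, so the rewritten expression still contains a single multiplication; when `a` is not constant the expression is returned unchanged.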