aboutsummaryrefslogtreecommitdiffstats
path: root/src/compiler/glsl
Commit message (Collapse)AuthorAgeFilesLines
...
* glsl: add support for caching shaders with xfb qualifiersTimothy Arceri2017-02-172-1/+121
| | | | | | | | For now this disables the shader cache when transform feedback is enabled via the GL API as we don't currently allow for it when generating the sha for the shader. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: add shader cache support for samplersTimothy Arceri2017-02-171-0/+18
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: add basic support for resource list to shader cacheTimothy Arceri2017-02-171-0/+121
| | | | | | This initially adds support for simple uniforms and varyings. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: fix uniform remap table cache when explicit locations usedTimothy Arceri2017-02-171-7/+49
| | | | | | | | V2: don't store pointers use an enum instead to flag what should be restored. Also do the work in a helper that we will later use for the subroutine remap table. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: Serialize three additional hash tables with program metadataCarl Worth2017-02-171-0/+74
| | | | | | | | | | | | | | | | The three additional tables are AttributeBindings, FragDataBindings, and FragDataIndexBindings. The first table (AttributeBindings) was identified as missing by trying to test the shader cache with a program that called glGetAttribLocation. Many thanks to Tapani Pälli <[email protected]>, as it was review of related work that he had done previously that pointed me to the necessity to also save and restore FragDataBindings and FragDataIndexBindings. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use correct shader source in case of cache fallbackTimothy Arceri2017-02-171-1/+10
| | | | | | | | | | | | | | | | | The scenario is: glShaderSource glCompileShader <-- deferred due to cache hit of shader glShaderSource <-- with new source code glAttachShader glLinkProgram <-- no cache hit for program At this point we need to compile the original source when we fallback. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: make use of on disk shader cacheTimothy Arceri2017-02-172-0/+21
| | | | | | | | | | | | | | | The hash key for glsl metadata is a hash of the hashes of each GLSL source string. This commit uses the put_key/get_key support in the cache put the SHA-1 hash of the source string for each successfully compiled shader into the cache. This allows for early, optimistic returns from glCompileShader (if the identical source string had been successfully compiled in the past), in the hope that the final, linked shader will be found in the cache. This is based on the intial patch by Carl. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: add initial implementation of shader cacheTimothy Arceri2017-02-172-0/+627
| | | | | | | | | | | | | | | | | | | | | | | | | | | This uses disk_cache.c to write out a serialization of various state that's required in order to successfully load and use a binary written out by a drivers backend, this state is referred to as "metadata" throughout the implementation. This initial version is intended to work with all stages beside compute. This patch is based on the initial work done by Carl. V2: extend the file's doxygen comment to cover some of the design decisions. V3: - skip cache for fixed function shaders - add int64 support - fix glsl IR program parameter caching/restore and cache the parameter values which are used by gallium backends. - use new link status enum V4: - add compute program support Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: Handle packed_type == ivec4[] in lower_packed_varyings().Kenneth Graunke2017-02-141-1/+2
| | | | | | | | | | For GS input arrays, we may turn a packed_type of ivec4 into an array of ivec4s. We still want flat qualification. Found by inspection. Not known to help anything. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* mesa: Add EXT_frag_depth bits and enable it on all driversAnuj Phogat2017-02-133-0/+6
| | | | | | | | | Passes the newly added piglit test for this extension on i965. V2: Fix comments by Ilia. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Drop resize-to-MaxPatchVertices hack.Kenneth Graunke2017-02-124-43/+0
| | | | | | | | | | | | | | TCS and TES inputs without an array size are implicitly sized to gl_MaxPatchVertices. But TCS outputs are apparently not: "If no size is specified, it will be taken from the output patch size (gl_VerticesOut) declared in the shader." Fixes dEQP-GLES31.functional.program_interface_query.program_output. array_size.separable_tess_ctrl.var. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: Update a comment about link errors for TCS && !TES.Kenneth Graunke2017-02-121-1/+9
| | | | | | | | OpenGL ES actually has spec text to prohibit this. It's just OpenGL that's confusing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1Jose Maria Casanova Crespo2017-02-101-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | From GLSL ES 3.10 spec, section 4.1.9 "Arrays": "If an array is declared as the last member of a shader storage block and the size is not specified at compile-time, it is sized at run-time. In all other cases, arrays are sized only at compile-time." In desktop GLSL it is allowed to have unsized-arrays that are not last, as long as we can determine that they are implicitly sized, which is detected at link-time. With this patch Mesa reports a compilation error as glslang does with the following shader: buffer SSBO { vec4 data[]; vec4 moreData;}; void main (void) { } Fixes: dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader Cc: "17.0" <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...Matt Turner2017-02-094-4/+14
| | | | | | | | | | | | | | | | | | Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed an OpenGL 3.3 compatibility profile context (with various unimplemented features and bugs), but still refused to compile shaders with #version 330 compatibility This patch simply adds a small bit of plumbing to let that through. Of course the same caveats apply: compatibility profile is still not supported (and will not be supported), so there are no guarantees that anything will work. Tested-by: Dylan Baker <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: add opcode to perform int64 to bool conversionsSamuel Iglesias Gonsálvez2017-02-091-0/+1
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* glsl: add param to force shader recompileTimothy Arceri2017-02-093-3/+4
| | | | | | This will be used to skip checking the cache and force a recompile. Reviewed-by: Anuj Phogat <[email protected]>
* st/mesa/i965: create link status enumTimothy Arceri2017-02-092-4/+4
| | | | | | | | | | | | For the on-disk shader cache we want to be able to differentiate between a program that was linked and one that was loaded from cache. V2: - don't return the new enum directly to the application when queried, instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome corruptions when using cache. Reviewed-by: Anuj Phogat <[email protected]>
* glsl: correct compute shader checks for memoryBarrier functionsMarc Di Luzio2017-02-061-6/+12
| | | | | | | | | | | | | | | | As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).Francisco Jerez2017-01-311-1/+21
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Rewrite atan2 implementation to fix accuracy and handling of ↵Francisco Jerez2017-01-311-36/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% less instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <[email protected]>
* glsl/ir_builder: Add rcp builder.Francisco Jerez2017-01-312-0/+7
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Fix constant evaluation of the rcp op.Francisco Jerez2017-01-311-1/+1
| | | | | | | | | | Will avoid a regression in a future commit that introduces some additional rcp operations. According to the GLSL 4.10 specification: "Dividing by 0 results in the appropriately signed IEEE Inf." Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: fix heap-buffer-overflowBartosz Tomczyk2017-01-311-1/+1
| | | | | | | | | The `end+1` skips the ']', whereas the `strlen+1` includes the final '\0' in the move to terminate the string. Cc: [email protected] Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: add new uniform fields to be used to restore state from cacheCarl Worth2017-01-311-0/+4
| | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: Switch to disable-by-default for the GLSL shader cacheCarl Worth2017-01-311-0/+5
| | | | | | | | | | | | | | | | | The shader cache is expected to be developed incrementally over a fairly long series of commits. For that period of instability, we require users to opt into the shader cache by setting: MESA_GLSL_CACHE_ENABLE=1 In the future, when the shader cache is complete, we can revert this commit so that the cache will be on by default. The user can always disable the cache with MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this commit, (nor will it be affected by the future revert). Reviewed-by: Eric Anholt <[email protected]>
* glsl: remove explicit __STDC_FORMAT_MACROS defineEmil Velikov2017-01-273-3/+0
| | | | | | | | Correctly handled by all the build systems. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* glsl: lower constant arrays to uniform arrays before optimisation loopTimothy Arceri2017-01-251-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the constant array would not get copy propagated until the backend did its GLSL IR opt loop. I plan on removing that from i965 shortly which caused huge regressions in Deus-ex and Tomb Raider which have large constant arrays. Moving lowering before the opt loop in the GLSL linker fixes this and unexpectedly improves some compute shaders also. shader-db results BDW: instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%) total instructions in shared programs: 13060313 -> 13059877 (-0.00%) instructions in affected programs: 1756 -> 1320 (-24.83%) helped: 3 HURT: 0 total cycles in shared programs: 256586698 -> 256561910 (-0.01%) cycles in affected programs: 2066104 -> 2041316 (-1.20%) helped: 3 HURT: 0 V3: only call the opt loop if lowering progressed (Suggested by Eric) V2: call opts before and after lowering (Suggested by Ken) Reviewed-by: Eric Anholt <[email protected]>
* glsl: fix compile errors with mingw due to missing PRIx64 definitionsRoland Scheidegger2017-01-242-0/+4
| | | | | | | | | | | | | | | | | | | | | | define __STDC_FORMAT_MACROS and include <inttypes.h> (same as ir_builder_print_visitor.cpp already does). Otherwise, some mingw build errors out (since 8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with: src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’ case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break; (Note even with that fix I get other format specifier warnings: src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: unknown conversion type character ‘a’ in format [-Wformat=] fprintf(f, "%a", ir->value.f[i]); ^ src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: too many arguments for format [-Wformat-extra-args] but it still compiles at least) Reviewed-by: Jose Fonseca <[email protected]>
* glsl: split DIV_TO_MUL_RCP into single- and double-precision flagsNicolai Hähnle2017-01-232-9/+14
| | | | | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* glsl: fix tes linking regressionTimothy Arceri2017-01-231-2/+2
| | | | | Fixes regression caused by cbeba6bd48da2c. I accidentally pushed the wrong version of the patch.
* mesa/glsl: set and get cs layouts to and from shader_infoTimothy Arceri2017-01-231-20/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get gs layouts directly to and from shader_infoTimothy Arceri2017-01-231-33/+37
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl/i965: set and get tes layouts directly to and from shader_infoTimothy Arceri2017-01-231-31/+33
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use last_vert_prog to get last {clip,cull}_distance_array_sizeTimothy Arceri2017-01-232-16/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set {clip,cull}_distance_array_size directly in gl_programTimothy Arceri2017-01-232-18/+16
| | | | | | | There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa/glsl: change xfb_program field to last_vert_progTimothy Arceri2017-01-232-25/+27
| | | | | | | | | | Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: Rename [u]int64_t tokens.Kenneth Graunke2017-01-202-5/+5
| | | | | | | | | | basetsd.h on Windows defines INT64 and UINT64 typedefs which conflict with these. Append "_TOK" to avoid conflicts. Should fix the Windows build. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* nir: Enable 64-bit integer support for almost all unary and binary operationsIan Romanick2017-01-201-9/+15
| | | | | | | | | | v2: Don't up-convert the shift count parameter if shift instructions. Suggested by Connor. Add type_is_singed() function. This will make adding 8- and 16-bit types easier. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]> Cc: Jason Ekstrand <[email protected]>
* nir: Add 64-bit integer support for conversions and bitcastsIan Romanick2017-01-201-0/+37
| | | | | | | | | | | | | | v2 (idr): "cut them down later" => Remove ir_unop_b2u64 and ir_unop_u642b. Handle these with extra i2u or u2i casts just like uint(bool) and bool(uint) conversion is done. v3 (idr): Make the "from" type in a cast unsized. This reduces the number of required cast operations at the expensive slightly more complex code. However, this will be a dramatic improvement when other sized integer types are added. Suggested by Connor. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]>
* nir: Add 64-bit integer constant supportIan Romanick2017-01-201-0/+16
| | | | | | | v2: Rebase on 19a541f (nir: Get rid of nir_constant_data) Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Connor Abbott <[email protected]> [v1]
* glsl: Optimize redundant pack(unpack()) and unpack(pack()) combinationsIan Romanick2017-01-201-0/+28
| | | | | | | | | | | The lowering passes 64-bit integer operations will generate a lot of these. v2: Modify the HANDLE_PACK_UNPACK_INVERSE so that the breaks apply to the switch instead of the 'do { } while(true)' loop. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a lowering pass for 64-bit integer modulusIan Romanick2017-01-202-0/+12
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64%64 => 64 modulusIan Romanick2017-01-205-0/+508
| | | | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a lowering pass for 64-bit integer divisionIan Romanick2017-01-202-0/+12
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64/64 => 64 divisionIan Romanick2017-01-205-2/+763
| | | | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! v2: Use function inlining. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a lowering pass for 64-bit integer sign()Ian Romanick2017-01-202-0/+8
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" function for 64-bit integer sign()Ian Romanick2017-01-206-0/+251
| | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add a lowering pass for 64-bit integer multiplicationIan Romanick2017-01-203-0/+820
| | | | | | | | v2: Rename lower_64bit.cpp and lower_64bit_test.cpp to lower_int64. Suggested by Matt. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Add "built-in" functions to do 64x64 => 64 multiplicationIan Romanick2017-01-208-3/+115
| | | | | | | | | | | These functions are directly available in shaders. A #define is added to detect the presence. This allows these functions to be tested using piglit regardless of whether the driver uses them for lowering. The GLSL spec says that functions and macros beginning with __ are reserved for use by the implementation... hey, that's us! Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* glsl: Move builtin_function related prototypes to a separate fileIan Romanick2017-01-2011-19/+55
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Matt Turner <[email protected]>