summaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* spirv: handle SpvOpUConvert in proper place.Dave Airlie2017-02-161-1/+1
| | | | | | | | This was falling into the quantizetof16 path. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv: add support for Int64 capabilityDave Airlie2017-02-162-1/+4
| | | | | | | | This just adds the support at the spirv->nir level for the Int64 cap. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv/nir: add support for int64Dave Airlie2017-02-162-2/+32
| | | | | | | This adds the spirv->nir conversion for int64 types. Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nir/types: add C accessors for 64-bit integer types.Dave Airlie2017-02-162-0/+14
| | | | | Reviewed-by: Jason Ekstrand <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* spirv: Add support for SpvCapabilityStorageImageReadWithoutFormat.Bas Nieuwenhuizen2017-02-152-1/+5
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* glsl: Handle packed_type == ivec4[] in lower_packed_varyings().Kenneth Graunke2017-02-141-1/+2
| | | | | | | | | | For GS input arrays, we may turn a packed_type of ivec4 into an array of ivec4s. We still want flat qualification. Found by inspection. Not known to help anything. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]>
* spirv: Add support for SpvCapabilityStorageImageWriteWithoutFormatAlex Smith2017-02-143-2/+9
| | | | | | | | | | | Allow that capability if the driver indicates that it is supported, and flag whether images are read-only/write-only in the nir_variable (based on the NonReadable and NonWritable decorations), which drivers may need to implement this. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/spirv: do not require a format with images that are not sampledIago Toral Quiroga2017-02-141-2/+0
| | | | | | | As soon as we support shaderStorageImageWriteWithoutFormat we can see write-only images (sampled == 2) that don't have a format specified. Reviewed-by: Jason Ekstrand <[email protected]>
* mesa: Add EXT_frag_depth bits and enable it on all driversAnuj Phogat2017-02-133-0/+6
| | | | | | | | | Passes the newly added piglit test for this extension on i965. V2: Fix comments by Ilia. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* glsl: Drop resize-to-MaxPatchVertices hack.Kenneth Graunke2017-02-124-43/+0
| | | | | | | | | | | | | | TCS and TES inputs without an array size are implicitly sized to gl_MaxPatchVertices. But TCS outputs are apparently not: "If no size is specified, it will be taken from the output patch size (gl_VerticesOut) declared in the shader." Fixes dEQP-GLES31.functional.program_interface_query.program_output. array_size.separable_tess_ctrl.var. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: Update a comment about link errors for TCS && !TES.Kenneth Graunke2017-02-121-1/+9
| | | | | | | | OpenGL ES actually has spec text to prohibit this. It's just OpenGL that's confusing. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* glsl: non-last member unsized array on SSBO must fail compilation on GLSL ES 3.1Jose Maria Casanova Crespo2017-02-101-4/+7
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | From GLSL ES 3.10 spec, section 4.1.9 "Arrays": "If an array is declared as the last member of a shader storage block and the size is not specified at compile-time, it is sized at run-time. In all other cases, arrays are sized only at compile-time." In desktop GLSL it is allowed to have unsized-arrays that are not last, as long as we can determine that they are implicitly sized, which is detected at link-time. With this patch Mesa reports a compilation error as glslang does with the following shader: buffer SSBO { vec4 data[]; vec4 moreData;}; void main (void) { } Fixes: dEQP-GLES31.functional.debug.negative_coverage.log.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.callbacks.shader.compile_compute_shader dEQP-GLES31.functional.debug.negative_coverage.get_error.shader.compile_compute_shader Cc: "17.0" <[email protected]> Signed-off-by: Jose Maria Casanova Crespo <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl: Allow compatibility shaders with MESA_GL_VERSION_OVERRIDE=...Matt Turner2017-02-094-4/+14
| | | | | | | | | | | | | | | | | | Previously if you used MESA_GL_VERSION_OVERRIDE=3.3COMPAT, Mesa exposed an OpenGL 3.3 compatibility profile context (with various unimplemented features and bugs), but still refused to compile shaders with #version 330 compatibility This patch simply adds a small bit of plumbing to let that through. Of course the same caveats apply: compatibility profile is still not supported (and will not be supported), so there are no guarantees that anything will work. Tested-by: Dylan Baker <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* nir: add opcode to perform int64 to bool conversionsSamuel Iglesias Gonsálvez2017-02-092-0/+2
| | | | | | Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660 Reviewed-by: Lionel Landwerlin <[email protected]>
* glsl: add param to force shader recompileTimothy Arceri2017-02-093-3/+4
| | | | | | This will be used to skip checking the cache and force a recompile. Reviewed-by: Anuj Phogat <[email protected]>
* st/mesa/i965: create link status enumTimothy Arceri2017-02-092-4/+4
| | | | | | | | | | | | For the on-disk shader cache we want to be able to differentiate between a program that was linked and one that was loaded from cache. V2: - don't return the new enum directly to the application when queried, instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome corruptions when using cache. Reviewed-by: Anuj Phogat <[email protected]>
* spirv: Add more asserts in vtn_vector_constructJason Ekstrand2017-02-071-0/+15
| | | | | Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99465
* glsl: correct compute shader checks for memoryBarrier functionsMarc Di Luzio2017-02-061-6/+12
| | | | | | | | | | | | | | | | As per the spec - "The functions memoryBarrierShared() and groupMemoryBarrier() are available only in compute shaders; the other functions are available in all shader types." Conform to this by adding another delegate to check for compute shader support instead of only whether the current stage is compute This allows some fragment shaders in Dirt Rally to compile Cc: "17.0" <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* spirv: add SPV_KHR_shader_draw_parameters supportLionel Landwerlin2017-02-013-0/+17
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* compiler: add missing enums for debugLionel Landwerlin2017-02-011-0/+2
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* nir/spirv/glsl450: Implement IEEE-compliant handling of atan2(±∞, ±∞).Francisco Jerez2017-01-311-1/+21
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Implement IEEE-compliant handling of atan2(±∞, ±∞).Francisco Jerez2017-01-311-1/+21
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* nir/spirv/glsl450: Rewrite atan2 implementation to fix accuracy and handling ↵Francisco Jerez2017-01-311-22/+55
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | of zero/infinity. See "glsl: Rewrite atan2 implementation to fix accuracy and handling of zero/infinity." for the rationale, but note that the instruction count benefit discussed there is somewhat less important for the SPIRV implementation, because the current code already emitted no control flow instructions -- Still this saves us one hardware instruction per scalar component on Intel SKL hardware. Fixes the following Vulkan CTS tests on Intel hardware: dEQP-VK.glsl.builtin.precision.atan2.highp_compute.scalar dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec3 dEQP-VK.glsl.builtin.precision.atan2.highp_compute.vec4 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec2 dEQP-VK.glsl.builtin.precision.atan2.mediump_compute.vec4 Note that most of the test-cases above expect IEEE-compliant handling of atan2(±∞, ±∞), which this patch doesn't explicitly handle, so except for the last two the test-cases above weren't expected to pass yet. The reason they do is that the i965 back-end implementation of the NIR fmin and fmax instructions is not quite GLSL-compliant (it complies with IEEE 754 recommendations though), because fmin/fmax of a NaN and a non-NaN argument currently always return the non-NaN argument, which causes atan() to flush NaN to one and return the expected value. The front-end should probably not be relying on this behavior for correctness though because other back-ends are likely to behave differently -- A follow-up patch will handle the atan2(±∞, ±∞) corner cases explicitly. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <[email protected]>
* glsl: Rewrite atan2 implementation to fix accuracy and handling of ↵Francisco Jerez2017-01-311-36/+60
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | zero/infinity. This addresses several issues of the current atan2 implementation: - Negative zero (and negative denorms which end up getting flushed to zero) isn't handled correctly by the current implementation. The reason is that it does 'y >= 0' and 'x < 0' comparisons to decide on which side of the branch cut the argument is, which causes us to return incorrect results (off by up to 2π) for very small negative values. - There is a serious precision problem for x values of large enough magnitude introduced by the floating point division operation being implemented as a mul+rcp sequence. This can lead to the quotient getting flushed to zero in some cases introducing an error of over 8e6 ULP in the result -- Or in the most catastrophic case will cause us to return NaN instead of the correct value ±π/2 for y=±∞ and x very large. We can fix this easily by scaling down both arguments when the absolute value of the denominator goes above certain threshold. The error of this atan2 implementation remains below 25 ULP in most of its domain except for a neighborhood of y=0 where it reaches a maximum error of about 180 ULP. - It emits a bunch of instructions including no less than three if-else branches per scalar component that don't seem to get optimized out later on. This implementation uses about 13% less instructions on Intel SKL hardware and doesn't emit any control flow instructions. v2: Fix up argument scaling to take into account the range and precision of exotic FP24 hardware. Flip coordinate system for arguments along the vertical line as if they were on the left half-plane in order to avoid division by zero which may give unspecified results on non-GLSL 4.1-capable hardware. Sprinkle in some more comments. Reviewed-by: Ian Romanick <[email protected]>
* glsl/ir_builder: Add rcp builder.Francisco Jerez2017-01-312-0/+7
| | | | | Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: Fix constant evaluation of the rcp op.Francisco Jerez2017-01-311-1/+1
| | | | | | | | | | Will avoid a regression in a future commit that introduces some additional rcp operations. According to the GLSL 4.10 specification: "Dividing by 0 results in the appropriately signed IEEE Inf." Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Juan A. Suarez Romero <[email protected]>
* glsl: fix heap-buffer-overflowBartosz Tomczyk2017-01-311-1/+1
| | | | | | | | | The `end+1` skips the ']', whereas the `strlen+1` includes the final '\0' in the move to terminate the string. Cc: [email protected] Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: add new uniform fields to be used to restore state from cacheCarl Worth2017-01-311-0/+4
| | | | | Signed-off-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* glsl: Switch to disable-by-default for the GLSL shader cacheCarl Worth2017-01-311-0/+5
| | | | | | | | | | | | | | | | | The shader cache is expected to be developed incrementally over a fairly long series of commits. For that period of instability, we require users to opt into the shader cache by setting: MESA_GLSL_CACHE_ENABLE=1 In the future, when the shader cache is complete, we can revert this commit so that the cache will be on by default. The user can always disable the cache with MESA_GLSL_CACHE_DISABLE=1. That functionality is not affected by this commit, (nor will it be affected by the future revert). Reviewed-by: Eric Anholt <[email protected]>
* glsl: remove explicit __STDC_FORMAT_MACROS defineEmil Velikov2017-01-273-3/+0
| | | | | | | | Correctly handled by all the build systems. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* nir: add extra const notation in compare_blocks()Emil Velikov2017-01-271-2/+2
| | | | | | | | | | | MSVC warns about different const qualifiers. Add the extra const to silence it. nir_phi_builder.c(244) : warning C4090: 'initializing' : different 'const' qualifiers nir_phi_builder.c(245) : warning C4090: 'initializing' : different 'const' qualifiers Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: silence implicit conversion to 64bitEmil Velikov2017-01-271-1/+1
| | | | | | | | | | | | MSVC warns about implicit conversion as below. Annotate the literal appropriately to silence the warning. nir_gather_info.c(249) : warning C4334: '<<' : result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: handle undefined components for OpVectorShuffleLionel Landwerlin2017-01-261-15/+38
| | | | | | | | | | Fixes: dEQP-VK.spirv_assembly.instruction.compute.opspecconstantop.vector_related dEQP-VK.spirv_assembly.instruction.graphics.opspecconstantop.vector_related* Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0 13.0" <[email protected]>
* spirv: handle OpUndef as part of the variable parsing passLionel Landwerlin2017-01-262-0/+7
| | | | | | | | | | | | | | | | | | | | | Looking at the following bit of SPIRV shader : ... %zero = OpConstant %i32 0 %ivec3_0 = OpConstantComposite %ivec3 %zero %zero %zero %vec3_undef = OpUndef %ivec3 %sc_0 = OpSpecConstant %i32 0 %sc_1 = OpSpecConstant %i32 0 %sc_2 = OpSpecConstant %i32 0 ... Our compiler currently stops parsing variables & types on the OpUndef and switches to instructions, leaving the following sc_[0-2] variables untreated. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: "17.0 13.0" <[email protected]>
* spirv: bump headers to SPIRV 1.1Lionel Landwerlin2017-01-253-9/+86
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: add default handler for new enumsLionel Landwerlin2017-01-252-0/+15
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix typosLionel Landwerlin2017-01-251-3/+3
| | | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: handle gl_SampleMaskIago Toral Quiroga2017-01-251-2/+6
| | | | | | | | | | | | | SPIR-V maps both gl_SampleMask and gl_SampleMaskIn to the same builtin (SampleMask). The only way to tell which one we are dealing with is to check if it is an input or an output. Fixes: dEQP-VK.pipeline.multisample_shader_builtin.sample_mask.write.* Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: acknowledge multisampled input attachmentsIago Toral Quiroga2017-01-251-3/+8
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: bump loop max unroll limitTimothy Arceri2017-01-251-1/+1
| | | | | | | | | | | | | | | | | | The original number was chosen in an attempt to match the limits applied to GLSL IR. A look at the git history of the why these limits were chosen for GLSL IR shows it was more to do with the slow speed of unrolling large loops in GLSL IR than anything else. The speed of loop unrolling in NIR is not a problem so we may wish to bump this even higher in future. No shader-db change, however a furture change will disbale the GLSL IR optimisation loop in the i965 backend results in 4 loops from The Talos Principle failing to unroll. Bumping the limit allows them to unroll which results in the instruction count matching the previous output from when the GLSL IR opts were still enabled. Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: lower constant arrays to uniform arrays before optimisation loopTimothy Arceri2017-01-251-13/+26
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the constant array would not get copy propagated until the backend did its GLSL IR opt loop. I plan on removing that from i965 shortly which caused huge regressions in Deus-ex and Tomb Raider which have large constant arrays. Moving lowering before the opt loop in the GLSL linker fixes this and unexpectedly improves some compute shaders also. shader-db results BDW: instructions helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 204 -> 194 (-4.90%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1010 -> 741 (-26.63%) instructions helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 542 -> 385 (-28.97%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/318.shader_test CS SIMD8: 1831382 -> 1818492 (-0.70%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/144.shader_test CS SIMD8: 216238 -> 206180 (-4.65%) cycles helped: shaders/closed/steam/deus-ex-mankind-divided/374.shader_test CS SIMD16: 18484 -> 16644 (-9.95%) total instructions in shared programs: 13060313 -> 13059877 (-0.00%) instructions in affected programs: 1756 -> 1320 (-24.83%) helped: 3 HURT: 0 total cycles in shared programs: 256586698 -> 256561910 (-0.01%) cycles in affected programs: 2066104 -> 2041316 (-1.20%) helped: 3 HURT: 0 V3: only call the opt loop if lowering progressed (Suggested by Eric) V2: call opts before and after lowering (Suggested by Ken) Reviewed-by: Eric Anholt <[email protected]>
* glsl: fix compile errors with mingw due to missing PRIx64 definitionsRoland Scheidegger2017-01-242-0/+4
| | | | | | | | | | | | | | | | | | | | | | define __STDC_FORMAT_MACROS and include <inttypes.h> (same as ir_builder_print_visitor.cpp already does). Otherwise, some mingw build errors out (since 8e7e1ae0365ddc7edb0d4d98250ab46728e6c14a and bbce1c538dc0cb8bf3769510283d11847dc07540 presumably) with: src/compiler/glsl/ir_print_visitor.cpp:479:40: error: expected ‘)’ before ‘PRIu64’ case GLSL_TYPE_UINT64:fprintf(f, "%" PRIu64, ir->value.u64[i]); break; (Note even with that fix I get other format specifier warnings: src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: unknown conversion type character ‘a’ in format [-Wformat=] fprintf(f, "%a", ir->value.f[i]); ^ src/compiler/glsl/ir_print_visitor.cpp:473:47: warning: too many arguments for format [-Wformat-extra-args] but it still compiles at least) Reviewed-by: Jose Fonseca <[email protected]>
* glsl: split DIV_TO_MUL_RCP into single- and double-precision flagsNicolai Hähnle2017-01-232-9/+14
| | | | | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> Tested-by: Glenn Kennard <[email protected]> Tested-by: James Harvey <[email protected]> Cc: 17.0 <[email protected]>
* glsl: fix tes linking regressionTimothy Arceri2017-01-231-2/+2
| | | | | Fixes regression caused by cbeba6bd48da2c. I accidentally pushed the wrong version of the patch.
* mesa/glsl: set and get cs layouts to and from shader_infoTimothy Arceri2017-01-231-20/+15
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set and get gs layouts directly to and from shader_infoTimothy Arceri2017-01-231-33/+37
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl/i965: set and get tes layouts directly to and from shader_infoTimothy Arceri2017-01-231-31/+33
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* glsl: use last_vert_prog to get last {clip,cull}_distance_array_sizeTimothy Arceri2017-01-232-16/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa/glsl: set {clip,cull}_distance_array_size directly in gl_programTimothy Arceri2017-01-232-18/+16
| | | | | | | There are some line wrapping violations here but those lines will get deleted in the following patch. Reviewed-by: Nicolai Hähnle <[email protected]>
* st/mesa/glsl: change xfb_program field to last_vert_progTimothy Arceri2017-01-232-25/+27
| | | | | | | | | | Now that the i965 backend doesn't depend on this field we can make it more generic and short circuit a bunch of code paths. The new field will be used in a following patch for another clean-up. Reviewed-by: Nicolai Hähnle <[email protected]>