summaryrefslogtreecommitdiffstats
path: root/src/compiler
Commit message (Collapse)AuthorAgeFilesLines
* glsl: use remap location when serialising uniform program resource dataTimothy Arceri2019-01-291-7/+26
| | | | | | | | | | | | | This allows us to avoid expensive string compares since we already have a map to the pointers. These compares were taking ~30 seconds for a single shader compile in Godot due to it using 64,000+ uniforms. Fixes: c4cff5f40254 ("glsl: add basic support for resource list to shader cache") Reviewed-by: Tapani Pälli <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109229
* spirv: Don't use special semantics when counting vertex attribute sizeNeil Roberts2019-01-281-6/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Under Vulkan, the double vertex attributes take up the same size regardless of whether they are vertex inputs or any other stage interface. Under OpenGL (ARB_gl_spirv), from GLSL 4.60 spec, section 4.3.9 Interface Blocks: "It is a compile-time error to have an input block in a vertex shader or an output block in a fragment shader. These uses are reserved for future use." So we also don't need to check if it is an vertex input or not, and use false in any case. v2: (changes made by Alejandro Piñeiro) * Update required after "spirv: Handle location decorations on block interface members" own updates (original patch was sent several months ago) * After Neil suggesting it, confirm that this change can be also done for OpenGL (ARB_gl_spirv). Expand commit message. v3: update after changing name of main method on a previous patch Signed-off-by: Neil Roberts <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl_types: Rename parameter of glsl_count_attribute_slotsNeil Roberts2019-01-284-10/+17
| | | | | | | | | | | | glsl_count_attribute_slots takes a parameter to specify whether the type is being used as a vertex input because on GL double attributes only take up one slot. Vulkan doesn’t make this distinction so this patch renames the argument to is_gl_vertex_input in order to make it more clear that it should always be false on Vulkan. v2: minor variable renaming (s/member/member_type) (Tapani) Reviewed-by: Tapani Pälli <[email protected]>
* spirv/nir: handle location decorations on block interface membersNeil Roberts2019-01-282-9/+66
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously the code was taking any location decoration on the block and using that to calculate the member locations for all of the members. I think this was assuming that there would only be one location decoration for the entire block. According to the Vulkan spec it is possible to add location decorations to individual members: “If the structure type is a Block but without a Location, then each of its members must have a Location decoration. If it is a Block with a Location decoration, then its members are assigned consecutive locations in declaration order, starting from the first member which is initially the Block. Any member with its own Location decoration is assigned that location. Each remaining member is assigned the location after the immediately preceding member in declaration order.” This patch makes it instead keep track of which members have been assigned an explicit location. It also has a space to store the location for the struct as a whole. Once all the decorations have been processed it iterates over each member to fill in the missing locations using the rules described above. So, this commit is needed to get working a case like this, on both Vulkan and OpenGL using SPIR-V (ARB_gl_spirv): out block { layout(location = 2) vec4 c; layout(location = 3) vec4 d; layout(location = 0) vec4 a; layout(location = 1) vec4 b; } name; v2: (changes made by Alejandro Piñeiro) * Update after introducing struct member splitting (See commit b0c643d) * Update after only exposing interface_type for blocks, not to any struct * Update after last changes done for xfb support v3: use "assign" instead of "add" on the new method added (Tapani) Signed-off-by: Neil Roberts <[email protected]> Signed-off-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl: fix block member alignment validation for vec3Niklas Haas2019-01-271-4/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Section 7.6.2.2 (Standard Uniform Block Layout) of the GL spec says: The base offset of the first member of a structure is taken from the aligned offset of the structure itself. The base offset of all other structure members is derived by taking the offset of the last basic machine unit consumed by the previous member and adding one. The current code does not reflect this last sentence - it effectively instead aligns up the next offset up to the alignment of the previous member. This causes an issue in exactly one case: layout(std140) uniform block { layout(offset=0) vec3 var1; layout(offset=12) float var2; }; As per section 7.6.2.1 (Uniform Buffer Object Storage) and elsewhere, a vec3 consumes 3 floats, i.e. 12 basic machine units. Therefore, `var1` in the example above consumes units 0-11, with 12 being the first available offset afterwards. However, before this commit, mesa incorrectly assumes `var2` must start at offset=16 when using explicit offsets, which results in a compile-time error. Without explicit offsets, the shaders actually work fine, indicating that mesa is already correctly aligning these fields internally. (Just not in the code that handles explicit buffer offset parsing) This patch should fix piglit tests: ssbo-explicit-offset-vec3.vert ubo-explicit-offset-vec3.vert Signed-off-by: Niklas Haas <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* spirv: Add support for SPV_EXT_physical_storage_bufferJason Ekstrand2019-01-265-3/+55
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Implement OpConvertPtrToU and OpConvertUToPtrJason Ekstrand2019-01-262-2/+75
| | | | | | | This only implements the actual opcodes and does not implement support for using them with specialization constants. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Handle OpTypeForwardPointerJason Ekstrand2019-01-261-33/+66
| | | | | | | | | | | | We handle forward declarations by creating the pointer type with it's storage type based on storage class and just waiting to fill out the actual deref type until we get the OpTypePointer. Because any composites using the forward declared type only care about the storage type (i.e. uint64_t, uvec2, etc.) when creating their glsl_type, this works fine and we can defer the actual deref_type as far as we need. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* spirv: Drop a bogus assertJason Ekstrand2019-01-261-1/+0
| | | | | | | | | | This was valid back when the only valid types of pointers were uint32 and uvec2. Now that we're allowing more variety, it could be just about anything so we'll just drop the assert. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nir: Allow SSBOs and global to aliasJason Ekstrand2019-01-261-1/+6
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/validate: Allow array derefs of vectors for nir_var_mem_globalJason Ekstrand2019-01-261-1/+2
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nir/lower_io: Add support for nir_var_mem_globalJason Ekstrand2019-01-261-0/+12
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nir/lower_io: Add a 32 and 64-bit global address formatsJason Ekstrand2019-01-262-30/+123
| | | | | | These are simple scalar addresses. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Add load/store/atomic global intrinsicsJason Ekstrand2019-01-263-1/+39
| | | | | | | | | These correspond roughly to reading/writing OpenCL global pointers. The idea is that they just take a bare address and load/store from it. Of course, exactly what this address means is driver-dependent. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nir: Length of boolean vtn_value now is 1Sergii Romantsov2019-01-231-3/+6
| | | | | | | | | | | | | | During conversion type-length was lost due to math. v2 (Jason Ekstrand): - Use a size/offset of 4 bytes Fixes: 44227453ec03 (nir: Switch to using 1-bit Booleans for almost everything) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109353 Signed-off-by: Sergii Romantsov <[email protected]> Tested-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Add pipeline cache support for xfb_infoJason Ekstrand2019-01-221-1/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/xfb: distinguish array of structs vs array of blocksAlejandro Piñeiro2019-01-221-7/+17
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* nir/xfb: Properly handle arrays of blocksJason Ekstrand2019-01-221-20/+41
| | | | Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir/xfb: don't assert when xfb_buffer/stride is present but not xfb_offsetAlejandro Piñeiro2019-01-221-7/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | In order to allow nir_gather_xfb_info to be used on OpenGL, specifically ARB_gl_spirv. So, from OpenGL 4.6 spec, section 11.1.2.1, "Output Variables": "outputs specifying both an *XfbBuffer* and an *Offset* are captured, while outputs not specifying both of these are not captured. Values are captured each time the shader writes to such a decorated object." This implies that are captured if both are present, and not if one of those are lacking. Technically, it doesn't explicitly point that having just one or the other is a mistake. In some cases, glslang is adding some extra XfbBuffer without XfbOffset around, and mentioning that technically that is not a bug (see issue#1526) And for the case of Vulkan, as the same glslang issue mentions, it is not clear if that should be a mistake or not. But even if it is a mistake, it is not really needed to be checked on the driver, and we can let the validation layers to check that. v2: simplify explicit_xfb_buffer and explicit_offset checks (Jason). Reviewed-by: Jason Ekstrand <[email protected]>
* nir/xfb: Fix offset accounting for dvec3/4Jason Ekstrand2019-01-221-2/+2
| | | | | | | | | | Before, we were double-counting the component slots when we had a dvec3 or dvec4. Instead, just add them in once and manually offset the recorded output offset. Fixes: 19064b8c "nir: Add a pass for gathering transform feedback info" Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: Preserve offsets in lower_io_to_scalar_earlyJason Ekstrand2019-01-221-0/+8
| | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: fix lowering arrays to elements for XFB outputsSamuel Pitoiset2019-01-221-2/+11
| | | | | | | | | | | | | | | | | | | | If we have a transform feedback output like: float[2] x2_out (VARYING_SLOT_VAR1.x, 0, 0) which is lowered by nir_lower_io_arrays_to_elements to, float x2_out (VARYING_SLOT_VAR1.x, 0, 0) float x2_out@5 (VARYING_SLOT_VAR2.x, 0, 0) We have to update the destination offset to avoid overwriting the same value. v2 (Jason Ekstrand): - Compute the correct offsets for arrays of vectors and/or doubles Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Alejandro Piñeiro <[email protected]>
* nir: do not remove varyings used for transform feedbackSamuel Pitoiset2019-01-221-0/+3
| | | | | | | | When a xfb buffer is explicitely declared on a varying variable, we shouldn't remove it at link time. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Only set interface_type on blocksJason Ekstrand2019-01-221-9/+25
| | | | | | | | Instead of setting interface_type to whatever the per-vertex type is, we only set it on blocks. This allows later passes to tell the difference between variables that are in blocks and those that aren't. Reviewed-by: Alejandro Piñeiro <[email protected]>
* spirv: Only split blocksJason Ekstrand2019-01-221-3/+8
| | | | | | | | | Instead of splitting every per-vertex struct, just split the ones that are actually blocks. The reason for the split is so that we have separate variables for separate locations, qualifiers, and builtin decorations. The vulkan spec only allows these on members of blocks. Reviewed-by: Alejandro Piñeiro <[email protected]>
* spirv: Initialize struct member offsets to -1Jason Ekstrand2019-01-221-0/+1
| | | | | | | This is the "no offset specified" value. Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: cleanup glsl_get_struct_field_offset, glsl_get_explicit_strideTapani Pälli2019-01-222-5/+5
| | | | | | | | | Take away const qualifier from return type of these functions as -Wignored-qualifiers points out it is ignored for these cases. Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* spirv: Update the JSON and headers from Khronos masterJason Ekstrand2019-01-212-124/+336
| | | | | | This corresponds to commit 79b6681aadcb53c27d1052e on GitHub. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Mark deref UBO and SSBO access as non-scalarJason Ekstrand2019-01-211-1/+3
| | | | | Fixes: 63b9aa2e2574 "spirv: Add support for using derefs for..." Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/spirv: handle ContractionOff execution modeKarol Herbst2019-01-216-3/+17
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/vtn: add caps for some cl related capabilitiesRob Clark2019-01-213-5/+20
| | | | | | | | | | | vtn supports these, so don't squalk if user is happy with enabling these. v2: add new members sorted Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vtn: handle SpvExecutionModelKernelKarol Herbst2019-01-211-0/+2
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* mesa: add MESA_SHADER_KERNELKarol Herbst2019-01-215-8/+26
| | | | | | | | used for CL kernels Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add bit_size parameter to system values with multiple allowed bit sizesKarol Herbst2019-01-213-6/+16
| | | | | | | | | v2: add assert to verify we have at least one valid bit_size v3: fix use of load_front_face in nir_lower_two_sided_color and tgsi_to_nir Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: add legal bit_sizes to intrinsicsKarol Herbst2019-01-214-13/+30
| | | | | | | | | | | | | | | | | | | With OpenCL some system values match the address bits, but in GLSL we also have some system values being 64 bit like subgroup masks. With this it is possible to adjust the builder functions so that depending on the bit_sizes the correct bit_size is used or an additional argument is added in case of multiple possible values. v2: validate dest bit_size v3: generate hex values in python code remove useless imports rename and move bit_sizes v4: add 1 to legal bit_sizes for front_face Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/validate: allow to check against a bitmask of bit_sizesKarol Herbst2019-01-211-17/+17
| | | | | | Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl/lower_output_reads: set invariant and precise flags on temporariesKarol Herbst2019-01-211-0/+4
| | | | | | | | | | | | | | | | | fixes a couple of deqp tests (on nvc0 and potential other drivers): dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.highp.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.mediump.common_subexpression_3 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_1 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_2 dEQP-GLES3.functional.shaders.invariance.lowp.common_subexpression_3 CC: <[email protected]> Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nir/spirv: handle SpvStorageClassCrossWorkgroupKarol Herbst2019-01-195-0/+12
| | | | | | | | | | v2: rename nir_var_global to nir_var_mem_global Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_shared to nir_var_mem_sharedKarol Herbst2019-01-1910-19/+19
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_ssbo to nir_var_mem_ssboKarol Herbst2019-01-1912-34/+34
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_ubo to nir_var_mem_uboKarol Herbst2019-01-197-11/+11
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_function to nir_var_function_tempKarol Herbst2019-01-1919-59/+59
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: rename nir_var_private to nir_var_shader_tempKarol Herbst2019-01-1915-27/+27
| | | | | | | | Signed-off-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: be much more aggressive when skipping shader compilationTimothy Arceri2019-01-192-6/+10
| | | | | | | | | | | | | | | | | | | | Currently we only add a cache key for a shader once it is linked. However games like Team Fortress 2 compile a whole bunch of shaders which are never actually linked. These compiled shaders can take up a bunch of memory. This patch changes things so that we add the key for the shader to the cache as soon as it is compiled. This means on a warm cache we can avoid the wasted memory from these shaders. Worst case scenario is we need to compile the shaders at link time but this can happen anyway if the shader has been evicted from the cache. Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a warm cache from start up to the game menu. V2: only add key to cache when compilation is successful. Acked-by: Marek Olšák <[email protected]>
* Revert "glsl: be much more aggressive when skipping shader compilation"Timothy Arceri2019-01-192-10/+6
| | | | | | This reverts commit 64b8c86d37ebb1e1d286c69d642d52b7bcf051d3. Reverting for now as it was causing some segfaults.
* glsl: be much more aggressive when skipping shader compilationTimothy Arceri2019-01-192-6/+10
| | | | | | | | | | | | | | | | | | Currently we only add a cache key for a shader once it is linked. However games like Team Fortress 2 compile a whole bunch of shaders which are never actually linked. These compiled shaders can take up a bunch of memory. This patch changes things so that we add the key for the shader to the cache as soon as it is compiled. This means on a warm cache we can avoid the wasted memory from these shaders. Worst case scenario is we need to compile the shaders at link time but this can happen anyway if the shader has been evicted from the cache. Reduces memory use in Team Fortress 2 from 1.3GB -> 770MB on a warm cache from start up to the game menu. Acked-by: Marek Olšák <[email protected]>
* glsl: don't skip GLSL IR opts on first-time compilesTimothy Arceri2019-01-192-32/+1
| | | | | | | | | | | | | | This basically reverts c2bc0aa7b188. By running the opts we reduce memory using in Team Fortress 2 from 1.5GB -> 1.3GB from start-up to game menu. This will likely increase Deus Ex start up times as per commit c2bc0aa7b188. However currently 32bit games like Team Fortress 2 can run out of memory on low memory systems, so that seems more important. Reviewed-by: Marek Olšák <[email protected]>
* nir: check NIR_SKIP to skip passes by nameCaio Marcelo de Oliveira Filho2019-01-181-0/+24
| | | | | | | | | | | | | | Passes' function names, separated by comma, listed in NIR_SKIP environment variable will be skipped in debug mode. The mechanism is hooked into the _PASS macro, like NIR_PRINT. The extra macro NIR_SKIP is available as a developer convenience, to skip at pointer other than the passes entry points. v2: Fix typo in NIR_SKIP macro. (Bas) Reviewed-by: Alejandro Piñeiro <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir: Account for atomics in copy propagation.Bas Nieuwenhuizen2019-01-181-1/+24
| | | | | | | | | | | Otherwise writes get propagated across atomics if no barrier is used. Without barrier writes should still be visible in the same invocation, so an atomic has to be considered a write. CC: <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Fixes: b3c61469255 "nir: Copy propagation between blocks" Fixes: 62332d139c8 "nir: Add a local variable-based copy propagation pass"
* nir: Add a bool to float32 lowering passJason Ekstrand2019-01-144-0/+181
| | | | | | | | | | | From @jekstrand's nir-1-bit-bool branch, with improved ior/inot lowering. ior: fmax instead of fadd allows removing the fsat. inot: seq(x, 0) can be better than fsub(1, x). On a2xx, it works better with the scalar instruction set. Reviewed-by: Jonathan Marek <[email protected]>