path: root/src/glsl
Commit message (author, date, files changed, lines -removed/+added)
* nir/cf: add split_block_before_instr() (Connor Abbott, 2015-08-24, 1 file, -0/+18)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: add a cursor structure (Connor Abbott, 2015-08-24, 1 file, -0/+91)
  For now, it allows us to refactor the control flow insertion APIs so that there's a single entrypoint (with some wrappers). More importantly, it will allow us to reduce the combinatorial explosion in the extract function. There, we need to specify two points to extract, each of which may be at the beginning of a block, the end of a block, or in the middle of a block. And then there are various wrappers based off of that (before a control flow node, before a control flow list, etc.). Rather than having 9 different functions, we can have one function and push the actual logic of determining which variant to use down to the split function, which will be shared with nir_cf_node_insert().
  In the future, we may want to make the instruction insertion APIs as well as the builder use this, but that's a future cleanup.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
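  The cursor idea described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the code added by the commit: the toy_* names, the array-index representation of an instruction, and the split-index helper are all hypothetical. It only shows how one "position" value can replace separate before/after/middle entrypoints.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a basic block: only its instruction count. */
struct toy_block {
   size_t num_instrs;
};

/* The four kinds of position a cursor can describe. */
enum toy_cursor_option {
   TOY_CURSOR_BEFORE_BLOCK,
   TOY_CURSOR_AFTER_BLOCK,
   TOY_CURSOR_BEFORE_INSTR,
   TOY_CURSOR_AFTER_INSTR,
};

struct toy_cursor {
   enum toy_cursor_option option;
   struct toy_block *block;
   size_t instr_index; /* only meaningful for the *_INSTR options */
};

/* A single split routine: returns how many instructions stay in the
 * first half of the block when it is split at the cursor. */
static size_t
toy_cursor_split_index(struct toy_cursor cursor)
{
   switch (cursor.option) {
   case TOY_CURSOR_BEFORE_BLOCK:
      return 0;
   case TOY_CURSOR_AFTER_BLOCK:
      return cursor.block->num_instrs;
   case TOY_CURSOR_BEFORE_INSTR:
      assert(cursor.instr_index < cursor.block->num_instrs);
      return cursor.instr_index;
   case TOY_CURSOR_AFTER_INSTR:
      assert(cursor.instr_index < cursor.block->num_instrs);
      return cursor.instr_index + 1;
   }
   return 0; /* not reached for valid options */
}
```

  With this shape, an extract function would take two cursors and call the same split routine for each, instead of needing one variant per combination of start and end position.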
* nir/cf: fix link_blocks() when there are no successors (Connor Abbott, 2015-08-24, 1 file, -1/+2)
  When we insert a single basic block A into another basic block B, we will split B into C and D, insert A in the middle, and then splice together C, A, and D. When we splice together C and A, we need to move the successors of A into C -- except A has no successors, since it hasn't been inserted yet. So in move_successors(), we need to handle the case where the block whose successors are to be moved doesn't have any successors. Fixing link_blocks() here prevents a segfault and makes it work correctly.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
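  A minimal sketch of the failure mode and the fix, assuming a toy block type with two optional successor pointers and a predecessor count. The toy_* names and this representation are hypothetical and much simpler than the real data structures; the point is only that the linking helper must tolerate NULL successors.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical block: up to two successors and a predecessor count. */
struct toy_block {
   struct toy_block *successors[2];
   int num_preds;
};

/* Link pred to its successors. The NULL checks are the point of the
 * sketch: without them, a block with no successors would crash here. */
static void
toy_link_blocks(struct toy_block *pred, struct toy_block *succ1,
                struct toy_block *succ2)
{
   pred->successors[0] = succ1;
   if (succ1 != NULL)
      succ1->num_preds++;
   pred->successors[1] = succ2;
   if (succ2 != NULL)
      succ2->num_preds++;
}

/* Drop all outgoing edges of a block. */
static void
toy_unlink_successors(struct toy_block *block)
{
   for (int i = 0; i < 2; i++) {
      if (block->successors[i] != NULL) {
         block->successors[i]->num_preds--;
         block->successors[i] = NULL;
      }
   }
}

/* Move the successors of source onto dest; source may have none. */
static void
toy_move_successors(struct toy_block *source, struct toy_block *dest)
{
   struct toy_block *succ1 = source->successors[0];
   struct toy_block *succ2 = source->successors[1];
   toy_unlink_successors(source);
   toy_unlink_successors(dest);
   toy_link_blocks(dest, succ1, succ2);
}
```

  In the scenario from the message, A corresponds to a source block whose successors are both NULL, so the move degenerates to clearing the outgoing edges of C.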
* nir/cf: clean up jumps when cleaning up CF nodes (Connor Abbott, 2015-08-24, 1 file, -1/+4)
  We may delete a control flow node which contains structured jumps to other parts of the program. We need to remove the jump as a predecessor, as well as remove any phi node sources which reference it. Right now, the same problem exists for blocks that don't end in a jump instruction, but with the new API it shouldn't be an issue, since blocks that don't end in a jump must either point to another block in the same extracted CF list or not point to anything at all.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: remove uses of SSA definitions that are being deleted (Connor Abbott, 2015-08-24, 1 file, -8/+24)
  Unlike calling nir_instr_remove(), calling nir_cf_node_remove() (and, later in the series, nir_cf_list_delete()) implies that you're removing instructions that may still have uses, except those instructions are never executed, so any uses will be undefined. When cleaning up a CF node for deletion, we must clean up any uses of the deleted instructions by making them point to undef instructions instead.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
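  The rewrite described above can be sketched with a toy use list. The types and the toy_rewrite_uses_to_undef() helper are hypothetical; real SSA use tracking is more involved. The sketch only shows the invariant: after cleanup, no use may still point at a deleted definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical SSA definition and use records. */
struct toy_def {
   bool is_undef;
};

struct toy_use {
   struct toy_def *def;
};

/* Point every use of the dead definition at an undef definition instead.
 * Returns the number of uses that were rewritten. */
static size_t
toy_rewrite_uses_to_undef(struct toy_use *uses, size_t num_uses,
                          const struct toy_def *dead, struct toy_def *undef)
{
   assert(undef->is_undef);

   size_t rewritten = 0;
   for (size_t i = 0; i < num_uses; i++) {
      if (uses[i].def == dead) {
         uses[i].def = undef;
         rewritten++;
      }
   }
   return rewritten;
}
```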
* nir/cf: handle jumps better in stitch_blocks() (Connor Abbott, 2015-08-24, 1 file, -6/+16)
  In particular, handle the case where the earlier block ends in a jump and the later block is empty. In that case, we want to preserve the jump and remove any traces of the later block. Before, we would only hit this case when removing a control flow node after a jump, which wasn't a common occurrence, but we'll need it to handle inserting a control flow list which ends in a jump, which should be more common/useful.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: handle jumps in split_block_end() (Connor Abbott, 2015-08-24, 1 file, -1/+8)
  Before, we would only split a block with a jump at the end if we were inserting something after a block with a jump, which never happened in practice. But now, we want to use this to extract control flow lists which may end in a jump, in which case we really need to do the correct patching up. As a side effect, when removing jumps we now correctly insert undef phi sources in some corner cases, which can't hurt.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: add block_ends_in_jump() (Connor Abbott, 2015-08-24, 1 file, -0/+8)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: handle phi nodes better in split_block_beginning() (Connor Abbott, 2015-08-24, 1 file, -0/+13)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: split up and improve nir_handle_remove_jumps() (Connor Abbott, 2015-08-24, 1 file, -54/+81)
  Before, the process of removing a jump and wiring up the remaining block correctly was atomic, but with the new control flow modification it's split into two parts: first, we extract the jump, which creates a new block with re-wired successors as well as a free-floating jump, and then we delete the control flow containing the jump, which removes the entry in the predecessors and any phi node sources. Split up nir_handle_remove_jumps() to accommodate this, and add the missing support for removing phi node sources.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: add remove_phi_src() helper (Connor Abbott, 2015-08-24, 1 file, -0/+17)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: add nir_foreach_phi_src_safe() (Connor Abbott, 2015-08-24, 1 file, -0/+2)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/cf: add insert_phi_undef() helper (Connor Abbott, 2015-08-24, 1 file, -0/+25)
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: move control flow modification to its own file (Connor Abbott, 2015-08-24, 8 files, -687/+799)
  We want to start reworking and expanding this code, but it'll be a lot easier to do once we disentangle it from the rest of the stuff in nir.c. Unfortunately, there are a few unavoidable dependencies in nir.c on methods we'd rather not expose publicly, since if not used in very specific situations they can cause Bad Things (tm) to happen. Namely, we need to do some magical control flow munging when adding/removing jumps. In the future, we may disallow adding/removing jumps in nir_instr_insert_*() and nir_instr_remove(), and use separate functions that are part of the control flow modification code, but for now we expose them and put them in a separate, private header.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: make cleanup_cf_node() not use remove_defs_uses() (Connor Abbott, 2015-08-24, 1 file, -2/+4)
  cleanup_cf_node() is part of the control flow modification code, which we're going to split into its own file, but remove_defs_uses() is an internal function used by nir_instr_remove(). Break the dependency by making cleanup_cf_node() use nir_instr_remove() instead, which simply calls remove_defs_uses() and then removes the instruction from the list. nir_instr_remove() does do extra things for jumps, though, so we avoid calling it on jumps, which matches the previous behavior (this will be fixed later in the series).
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: inline block_add_pred() a few places (Connor Abbott, 2015-08-24, 1 file, -3/+2)
  It was being used to initialize function impls and loops, even though it's really a control flow modification helper. It's pretty trivial, so just inline it to avoid the dependency.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir/validate: check successors/predecessors more carefully (Connor Abbott, 2015-08-24, 1 file, -11/+84)
  We should be checking almost everything now.
  Signed-off-by: Connor Abbott <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Delete the nir_function_impl::start_block field. (Kenneth Graunke, 2015-08-24, 6 files, -9/+15)
  It's simply the first nir_cf_node in the nir_function_impl::body list, which is easy enough to access - we don't need to store a pointer to it explicitly. Removing it means we don't need to maintain the pointer when, say, splitting the start block when modifying control flow.
  Thanks to Connor Abbott for suggesting this.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Connor Abbott <[email protected]>
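  A small sketch of why deriving the start block from the body list removes a maintenance burden. The singly linked list and the toy_* names are assumptions made for illustration and do not mirror the real list type; the accessor simply returns the list head, so changing the head needs no extra bookkeeping.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical control flow node in a singly linked body list. */
struct toy_cf_node {
   struct toy_cf_node *next;
};

/* Hypothetical function implementation: only the body list, with no
 * separately cached start-block pointer. */
struct toy_function_impl {
   struct toy_cf_node *body;
};

/* The start block is just the first node of the body list. */
static struct toy_cf_node *
toy_start_block(const struct toy_function_impl *impl)
{
   assert(impl->body != NULL);
   return impl->body;
}

/* Put a new node at the front of the body, as a split of the start
 * block might. There is no cached pointer to fix up afterwards. */
static void
toy_push_front(struct toy_function_impl *impl, struct toy_cf_node *node)
{
   node->next = impl->body;
   impl->body = node;
}
```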
* glsl: fix error message when validating tcs output decls (Ilia Mirkin, 2015-08-21, 1 file, -1/+1)
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* i965: allow image_size on float images (Martin Peres, 2015-08-21, 1 file, -1/+2)
  This got missed because the piglit test only tested int images to avoid a combinatorial explosion of formats, targets, stages and sizes, which takes more than 5 minutes to test on nvidia's driver.
  This patch also drops the IMAGE_FUNCTION_AVAIL_ATOMIC, which is not applicable to the image_size codepath but was not hurting in any way.
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* glsl: fix binding validation for interface blocks (Timothy Arceri, 2015-08-21, 1 file, -12/+18)
  V2: rebase on SSBO changes
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: interleave constant propagation and folding (Timothy Arceri, 2015-08-21, 1 file, -2/+43)
  The constant folding pass can take a long time to complete, so rather than running through the entire pass each time a new constant is propagated (and vice versa), interleave them.
  This change helps ES31-CTS.arrays_of_arrays.InteractionFunctionCalls1 go from around 2 min -> 23 sec.
  Reviewed-by: Ian Romanick <[email protected]>
* glsl: expose textureQueryLod in GLSL 4.00+ fragment shaders (Ilia Mirkin, 2015-08-20, 1 file, -37/+82)
  See issue from the ARB_texture_query_lod spec for LOD vs Lod confusion:
  (3) The core specification uses the "Lod" spelling, not "LOD". Should this extension be modified to use "Lod"?
  RESOLVED: The "Lod" spelling is the correct spelling for the core specification and the preferred spelling for use. However, use of "LOD" also exists, as the extension predated the core specification, so this extension won't remove use of "LOD".
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: check if return_deref in lower_subroutine_visitor::visit_leave isn't NULL (Kai Wasserbäch, 2015-08-21, 1 file, -1/+1)
  Fixes a crash in Piglit's spec@arb_shader_subroutine@[email protected] for me.
  Signed-off-by: Kai Wasserbäch <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
* nir: convert the glsl intrinsic image_size to nir_intrinsic_image_size (Martin Peres, 2015-08-20, 2 files, -6/+17)
  v2, review from Francisco Jerez:
  - make the destination variable as large as what the nir intrinsic defines (4) instead of the size of the return variable of glsl. This is still safe for the already existing code because all the intrinsics affected returned the same number of components as expected by glsl IR. In the case of image_size, it is not possible to do so because the returned number of components depends on the image type and this case is not well handled by nir.
  v3:
  - Style fix
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* glsl: add support for the imageSize builtin (Martin Peres, 2015-08-20, 1 file, -16/+92)
  The code is heavily inspired by Francisco Jerez's code supporting the image_load_store extension. Backends willing to support this builtin should handle __intrinsic_image_size.
  v2: Based on the review of Ilia Mirkin
  - Enable the extension for GLES 3.1
  - Fix indentation
  - Fix the return type (float to int, number of components for CubeImages)
  - Add a warning related to GLES 3.1
  v3: Based on the review of Francisco Jerez
  - Refactor the code to share both add_image_function and _image with the other image-related functions
  v4: Based on Topi Pohjolainen's comments
  - Do not add parentheses for the return value
  v5: Based on Francisco Jerez's comments
  - Fix a few indent issues
  - Reduce the size of a condition by testing the dimension and array properties instead of enumerating all the formats.
  Signed-off-by: Martin Peres <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* main: add extension GL_ARB_shader_image_size (Martin Peres, 2015-08-20, 3 files, -0/+6)
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  Signed-off-by: Martin Peres <[email protected]>
* glsl: Parse the allowed image format qualifiers in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -41/+50)
  This includes the minimum required desktop/ES GLSL version in the format qualifier table in anticipation of new GLSL versions extending the set of supported image formats. According to section 4.4.7 of the GLSL ES 3.1 spec:
  "The format layout qualifier identifiers for image variable declarations are: [...] rgba32f rgba16f r32f rgba8 rgba8_snorm [...] rgba32i rgba16i rgba8i r32i [...] rgba32ui rgba16ui rgba8ui r32ui"
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Recognise image memory qualifiers in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -5/+5)
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Define image-related built-in constants required by GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -7/+15)
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Remove duplicate definition of gl_MaxTess*ImageUniforms built-in constants. (Francisco Jerez, 2015-08-20, 1 file, -2/+0)
  These seem to have been re-added at some point during the ARB_tessellation_shader implementation work. AFAICT the second (correct) definition of each constant would have had no effect because the symbols were already defined.
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Accept atomic_uint type in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -1/+1)
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Accept supported image types in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 2 files, -24/+24)
  These are a subset of the image types supported by desktop GL, excluding 1D, 1D array, rectangle, buffer, cube array, 2D MS and 2D MS array texture targets.
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Expose image load and store built-ins in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -1/+1)
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Use a separate availability class for image atomic built-ins. (Francisco Jerez, 2015-08-20, 1 file, -11/+23)
  These are not part of unextended GLSL ES 3.1.
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Allow precision qualifiers on general opaque types. (Francisco Jerez, 2015-08-20, 1 file, -4/+6)
  From the GLSL ES 3.1 spec, section 4.7.3:
  "Any floating point, integer, opaque type declaration can have the type preceded by one of these precision qualifiers: [...] highp [...], mediump [...], lowp [...]."
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Implement GLSL ES restriction on images being either readonly or writeonly. (Francisco Jerez, 2015-08-20, 1 file, -0/+18)
  Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Require that all image uniforms have a format qualifier in GLSL ES. (Francisco Jerez, 2015-08-20, 1 file, -4/+10)
  Note that this is slightly more permissive than the spec language requires:
  "Any image variable must specify a format layout qualifier."
  The GLSL ES spec seems really sketchy regarding format layout qualifiers on function formal parameters -- On the one hand they are required, but on the other hand it doesn't provide any syntax to specify them (see section 6.1.1), they don't participate in parameter type matching for overload resolution, and are in fact explicitly forbidden ("Layout qualifiers cannot be used on formal function parameters"). Of course none of the image built-in functions defined by the spec specify format layout qualifiers (and they probably couldn't sensibly), to contradict its own requirement.
  This probably qualifies as a spec bug, but in the meantime do the sensible thing and require layout qualifiers on uniforms *only*.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Add support for image binding qualifiers. (Francisco Jerez, 2015-08-20, 2 files, -8/+33)
  Support for binding an image to an image unit explicitly in the shader source is required by both GLSL 4.2 and GLSL ES 3.1, but not by the original ARB_shader_image_load_store extension.
  Reviewed-by: Timothy Arceri <[email protected]>
* glsl: Forbid non-constant image array indexing in GLSL ES 3.1. (Francisco Jerez, 2015-08-20, 1 file, -0/+15)
  Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Rename MaxCombinedImageUnitsAndFragmentOutputs to MaxCombinedShaderOutputResources. (Francisco Jerez, 2015-08-20, 4 files, -4/+4)
  The name of both the GLSL built-in variable and the glGetInteger param with the same value changed in GLSL ES 3.1 and GL 4.5. Its semantics also changed slightly, since the limit now also takes into account the number of SSBs in use. Switch our internal data structures to the up-to-date name.
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
* nir: Use nir_builder in nir_lower_io's get_io_offset(). (Kenneth Graunke, 2015-08-19, 1 file, -28/+14)
  Much more readable.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Pull nir_lower_io's load_op selection into a helper function. (Kenneth Graunke, 2015-08-19, 1 file, -17/+22)
  Makes the function a bit smaller.
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Fix up GL_ARB_compute_shader for GLSL ES 3.1 (Marta Lofstedt, 2015-08-19, 2 files, -3/+7)
  GL_ARB_compute_shader is limited to GLSL version 430. This enables it for GLSL ES version 310.
  V2: Updated error string to also include GLSL 3.10
  Signed-off-by: Marta Lofstedt <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* glsl: enable textureSize and texelFetch on GLSL ES 3.10 with MS samplers (Tapani Pälli, 2015-08-19, 1 file, -6/+13)
  Patch separates array samplers from the texture_multisample check so that we can enable only [iu]sampler2DMS; [iu]sampler2DMSArray are not supported.
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* mesa: Move varying slots and FS output names to shader_enums.h (Eric Anholt, 2015-08-18, 1 file, -0/+98)
  They're used by glsl_to_nir.cpp, and I want to use them in TGSI-to-NIR as well (our use of the var->index slot to store slot properties no longer works since it got truncated).
  The *_MAX defines are left in mtypes.h, because they depend on config.h.
  Acked-by: Kenneth Graunke <[email protected]>
* nir: Simplify feq(fneg(a), a) -> feq(a, 0.0) (Thomas Helland, 2015-08-18, 1 file, -0/+1)
  The positive and negative values of a float can only be equal to each other if they are -0.0f and 0.0f. This is safe for NaN and Inf, as -NaN != NaN, and -Inf != Inf.
  This gives no changes in my shader-db.
  Signed-off-by: Thomas Helland <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: Simplify fne(fneg(a), a) -> fne(a, 0.0) (Thomas Helland, 2015-08-18, 1 file, -0/+1)
  -NaN != NaN, and -Inf != Inf, so this should be safe. Found while working on my VRP pass.
  Shader-db results on my IVB:
  total instructions in shared programs: 1698267 -> 1698067 (-0.01%)
  instructions in affected programs: 15785 -> 15585 (-1.27%)
  helped: 36
  HURT: 0
  GAINED: 0
  LOST: 0
  Some shaders were found to have the following pattern in NIR:
     vec1 ssa_26 = fneg ssa_21
     vec1 ssa_27 = fne ssa_21, ssa_26
  Make that:
     vec1 ssa_27 = fne ssa_21, 0.0f
  This is found in Dota2 and Brutal Legend. One shader is cut by 8%, from 323 -> 296 instructions in SIMD8.
  Signed-off-by: Thomas Helland <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
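  The NaN and Inf reasoning behind the fne rewrite (and the matching feq rewrite) can be checked directly in C. This is a small self-check written for illustration, assuming IEEE 754 float semantics with no fast-math style relaxations; the function names are made up for the sketch and are not NIR opcodes or APIs.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stddef.h>

/* Original and rewritten forms of the two comparisons. */
static bool feq_original(float a)  { return -a == a; }
static bool feq_rewritten(float a) { return a == 0.0f; }
static bool fne_original(float a)  { return -a != a; }
static bool fne_rewritten(float a) { return a != 0.0f; }

/* True when both rewrites give the same answer as the originals. */
static bool
rewrite_agrees(float a)
{
   return feq_original(a) == feq_rewritten(a) &&
          fne_original(a) == fne_rewritten(a);
}

/* Check the special values called out in the commit messages. */
static void
check_rewrites(void)
{
   const float samples[] = {
      0.0f, -0.0f, 1.5f, -2.0f, INFINITY, -INFINITY, NAN,
   };

   for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++)
      assert(rewrite_agrees(samples[i]));
}
```

  For NaN, both equality forms are false and both inequality forms are true, because any ordered comparison involving NaN fails; for the two zeros, -0.0f == 0.0f holds, so both forms agree there as well.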
* glsl: add missing MS sampler builtin types for GLSL ES 3.10 (Tapani Pälli, 2015-08-17, 2 files, -6/+7)
  Signed-off-by: Tapani Pälli <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add a glsl_uint_type() wrapper. (Kenneth Graunke, 2015-08-16, 2 files, -0/+7)
  Signed-off-by: Kenneth Graunke <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>