path: root/src/compiler/nir/nir_intrinsics.py
Commit message (Author, Age, Files, Lines)
* nir: Add gl_PointCoord system value (Andreas Baierl, 2019-07-18, 1 file, -0/+1)
  gl_PointCoord handling needs some special bits set in lima/ppir code generation. Treating gl_PointCoord as a system value makes it easier to distinguish from a regular varying.
  Signed-off-by: Andreas Baierl <[email protected]>
  Reviewed-by: Qiang Yu <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: add a V3D-specific intrinsic for per-sample color writes (Iago Toral Quiroga, 2019-07-18, 1 file, -0/+9)
  For per-sample color writes we need the output intrinsic to pack the sample index, which is not provided with regular store_output intrinsics unless we figure out a way to encode it into the base or the offset.
  v2: - Drop the writemask (Eric)
  Reviewed-by: Eric Anholt <[email protected]>
* nir: add a new v3d-specific intrinsic for tile buffer color reads (Iago Toral Quiroga, 2019-07-12, 1 file, -0/+9)
  This is intended to be used, for example, with OpenGL logic operations. It takes a render target as source and a sample index in the base index for MSAA color reads.
  v2: drop the CAN_ELIMINATE and CAN_REORDER flags (Eric).
  Reviewed-by: Eric Anholt <[email protected]>
* nir: Add Panfrost-specific blending intrinsic (Alyssa Rosenzweig, 2019-07-09, 1 file, -0/+16)
  This gives more flexibility than the normal store_deref/store_output versions (particularly, it allows us to abuse the type system in awful ways, which is necessary for efficient format conversion in blend shaders.)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Acked-by: Karol Herbst <[email protected]>
* nir: Add demote and is_helper_invocation intrinsics (Caio Marcelo de Oliveira Filho, 2019-07-08, 1 file, -0/+10)
  From SPV_EXT_demote_to_helper_invocation. Demote will be implemented as a variant of discard, so mark uses_discard if it is used.
  v2: Add CAN_ELIMINATE flag to the new intrinsic. (Jason)
  Reviewed-by: Jason Ekstrand <[email protected]>
* compiler: Add color system value (Connor Abbott, 2019-07-08, 1 file, -0/+6)
  This is nice to have with radeonsi, where color varyings are handled specially to avoid recompiles.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* nir: add pass to lower load_interpolated_input (Rob Clark, 2019-07-02, 1 file, -0/+13)
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* nir: Allow qualifiers on copy_deref and image instructions (Connor Abbott, 2019-06-19, 1 file, -2/+5)
  In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier.
  Reviewed-by: Timothy Arceri <[email protected]>
* nir: add intrinsics for AMD_shader_ballot (Daniel Schürmann, 2019-06-13, 1 file, -0/+10)
  Reviewed-by: Connor Abbott <[email protected]>
* nir: add type information to load uniform/input and store output intrinsics (Jonathan Marek, 2019-05-31, 1 file, -3/+5)
  This type information will be used by gather_ssa_types to get usable results.
  Signed-off-by: Jonathan Marek <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add blend_const_color_rgba sysval (Alyssa Rosenzweig, 2019-05-10, 1 file, -1/+4)
  This represents a float vec4 constant color, as passed to glBlendColor. While the existing 4 shader sysvals are retained to minimize code churn, a single vectorized intrinsic is required for efficient blending on vector architectures. (This may also apply to architectures like Bifrost where ALU is scalar but load/store is vector; it largely depends on how blending is implemented per-driver.)
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Ian Romanick <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* freedreno/ir3: lower load_barycentric_at_offset (Rob Clark, 2019-04-25, 1 file, -0/+3)
  Calculates i,j at specified offset within a pixel. A new load_size_ir3 intrinsic is used in conjunction with fddx/fddy to translate the offset into primitive space and adjust the i,j from load_barycentric_pixel accordingly.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower load_barycentric_at_sample (Rob Clark, 2019-04-25, 1 file, -0/+7)
  This lowers load_barycentric_at_sample to load_sample_pos_from_id plus load_barycentric_at_offset.
  Signed-off-by: Rob Clark <[email protected]>
* nir: Add a pass for selectively lowering variables to scratch space (Jason Ekstrand, 2019-04-12, 1 file, -1/+4)
  This commit adds new nir_load/store_scratch opcodes which read and write a virtual scratch space. It's up to the back-end to figure out what to do with it and where to put the actual scratch data.
  v2: Drop const_index comments (by anholt)
  Reviewed-by: Eric Anholt <[email protected]>
* nir: Add a comment about how intrinsic definitions work. (Eric Anholt, 2019-04-12, 1 file, -0/+11)
  I was thinking about a refactor, and needed to read this first.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Drop remaining references to const_index in favor of the call to use. (Eric Anholt, 2019-04-12, 1 file, -5/+5)
  Please don't make me read a const_index[] expression ever again.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Drop comments about the constant_index slots for load/stores. (Eric Anholt, 2019-04-12, 1 file, -21/+15)
  The constant_index slots are named right there in the intrinsic definition, and the comment is just a chance to get out of sync. Noticed while reviewing the lower_to_scratch changes, which copy-and-pasted wrong comments; load_ubo and load_per_vertex_output also had incorrect comments.
  Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add access qualifiers on load_ubo intrinsic. (Bas Nieuwenhuizen, 2019-04-10, 1 file, -1/+1)
  Otherwise nir_lower_non_uniform_access crashes when it tries to get the access of a load_ubo.
  Fixes: 8ed583fe523 "spirv: Handle the NonUniformEXT decoration"
  Fixes: e50ab2c0f23 "nir: Add access flags to deref and SSBO atomics"
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nir: Add "viewport vector" system values (Alyssa Rosenzweig, 2019-04-04, 1 file, -0/+5)
  While a partial set of viewport system values exists, these are scalar values, which is a poor fit for viewport transformations on vector ISAs like Midgard (where the vec3 values for scale and offset each need to be coherent in a vec4 uniform slot to take advantage of vectorized transform math). This patch adds vec3 scale/offset fields corresponding to the 3D Gallium viewport / glViewport+depth.
  Signed-off-by: Alyssa Rosenzweig <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* nir: Add access flags to deref and SSBO atomics (Jason Ekstrand, 2019-03-25, 1 file, -28/+28)
  We will need them for a new ACCESS_NON_UNIFORM flag that's about to be added in the next commit.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Add texture sources and intrinsics for bindless (Jason Ekstrand, 2019-03-25, 1 file, -6/+9)
  On Intel, we have both bindless and bindful and we'd like to use them at the same time if we can, so we need to be able to distinguish at the NIR level between the two. This also fixes nir_lower_tex to properly handle bindless in its tex_texture_size and get_texture_lod helpers.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir/spirv: support physical pointers (Karol Herbst, 2019-03-19, 1 file, -0/+2)
  v2: add load_kernel_input
  Signed-off-by: Karol Herbst <[email protected]>
  squash! nir/spirv: support physical pointers
* nir/lower_io: Add a new buffer_array_length intrinsic and lowering (Jason Ekstrand, 2019-03-15, 1 file, -0/+4)
  Reviewed-by: Kristian H. Kristensen <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add ir3-specific version of most SSBO intrinsics (Eduardo Lima Mitev, 2019-03-13, 1 file, -0/+27)
  These are ir3 specific versions of SSBO intrinsics that add an extra source to hold the element offset (dword), which is what the backend instructions need. The original byte-offset source provided by NIR is not replaced because on a4xx and a5xx the backend still needs it.
  Reviewed-by: Rob Clark <[email protected]>
* nir/vtn: add support for SpvBuiltInGlobalLinearId (Karol Herbst, 2019-03-05, 1 file, -0/+1)
  v2: use formula with fewer operations
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
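The commit message does not quote the formula it settled on. As a rough illustration only, assuming the usual row-major definition of a linear invocation index, the "fewer operations" idea can be sketched by comparing the expanded form with its nested form. The function and parameter names below are hypothetical and are not Mesa API.

```python
# Hypothetical sketch, not taken from Mesa: two equivalent ways to flatten
# a 3D global invocation id into a linear index (row-major, x fastest).

def global_linear_id_expanded(gid, size):
    """Expanded form: z*sx*sy + y*sx + x (three multiplies, two adds)."""
    x, y, z = gid
    sx, sy, _sz = size
    return z * sx * sy + y * sx + x


def global_linear_id_nested(gid, size):
    """Nested form: (z*sy + y)*sx + x (two multiplies, two adds)."""
    x, y, z = gid
    sx, sy, _sz = size
    return (z * sy + y) * sx + x
```

Both functions return the same index for any in-range id; the nested form simply saves one multiply.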
* nir: add support for address bit sized system values (Karol Herbst, 2019-03-05, 1 file, -1/+1)
  v2: add assert in else clause
      make local group intrinsics 32 bit wide
  v3: always use 32 bit constant for local_size
  v4: add comment by Jason
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Move the stores for fixed function VS output reads into NIR. (Eric Anholt, 2019-03-05, 1 file, -0/+9)
  This lets us emit the VPM_WRITEs directly from nir_intrinsic_store_output() (useful once NIR scheduling is in place so that we can reduce register pressure), and lets future NIR scheduling schedule the math to generate them. Even in the meantime, it looks like this lets NIR DCE some more code and make better decisions.
  total instructions in shared programs: 6429246 -> 6412976 (-0.25%)
  total threads in shared programs: 153924 -> 153934 (<.01%)
  total loops in shared programs: 486 -> 483 (-0.62%)
  total uniforms in shared programs: 2385436 -> 2388195 (0.12%)
  Acked-by: Ian Romanick <[email protected]> (nir)
* spirv: Use the same types for resource indices as pointers (Jason Ekstrand, 2019-03-05, 1 file, -4/+4)
  We need more space than just a 32-bit scalar and we have to burn all that space anyway, so we may as well expose it to the driver. This also fixes a subtle bug when UBOs and SSBOs have different pointer types.
  Reviewed-by: Lionel Landwerlin <[email protected]>
* nir: Add load/store/atomic global intrinsics (Jason Ekstrand, 2019-01-26, 1 file, -0/+34)
  These correspond roughly to reading/writing OpenCL global pointers. The idea is that they just take a bare address and load/store from it. Of course, exactly what this address means is driver-dependent.
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
* nir: add legal bit_sizes to intrinsics (Karol Herbst, 2019-01-21, 1 file, -12/+17)
  With OpenCL some system values match the address bits, but in GLSL we also have some system values that are 64 bit, like subgroup masks. With this it is possible to adjust the builder functions so that, depending on the bit_sizes, the correct bit_size is used or an additional argument is added in case of multiple possible values.
  v2: validate dest bit_size
  v3: generate hex values in python code
      remove useless imports
      rename and move bit_sizes
  v4: add 1 to legal bit_sizes for front_face
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Bas Nieuwenhuizen <[email protected]>
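One way to picture the "generate hex values in python code" note is to fold the legal bit sizes into a single mask, which works because every legal size (1, 8, 16, 32, 64) is a distinct power of two. This is a minimal sketch under that assumption; the helper names are hypothetical, and treating an empty list as "any size is legal" is also an assumption, not something the commit message states.

```python
from functools import reduce
import operator

# Hypothetical helpers, not copied from Mesa.

def bit_sizes_mask(bit_sizes):
    """OR the legal bit sizes together; each size is its own bit in the mask."""
    return reduce(operator.or_, bit_sizes, 0)


def is_legal_bit_size(mask, bit_size):
    """Assumption: a mask of 0 means the intrinsic places no restriction."""
    return mask == 0 or (mask & bit_size) != 0


# A system value that may be 32 or 64 bits wide.
print(hex(bit_sizes_mask([32, 64])))
```

With a mask like this, a builder can pick the bit size automatically when only one bit is set and ask the caller for an explicit size otherwise.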
* spirv: Add support for using derefs for UBO/SSBO access (Jason Ekstrand, 2019-01-08, 1 file, -0/+2)
  For now, it's hidden behind a cap. Hopefully, we can eventually drop that along with all the manual offset code in spirv_to_nir.
  Reviewed-by: Alejandro Piñeiro <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
  Tested-by: Bas Nieuwenhuizen <[email protected]>
* nir/vulkan: Add a descriptor type to vulkan resource intrinsics (Jason Ekstrand, 2019-01-08, 1 file, -2/+5)
  Reviewed-by: Alejandro Piñeiro <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/intrinsics: Add access flags to load/store_deref (Jason Ekstrand, 2019-01-08, 1 file, -3/+4)
  Reviewed-by: Alejandro Piñeiro <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/intrinsics: Allow deref sources to consume anything (Jason Ekstrand, 2019-01-08, 1 file, -20/+20)
  This commit adds a new num_components value of -1 for intrinsic sources, which means that the source consumes everything and the number of components effectively isn't validated. This is useful for deref sources, which just take the result of the deref; we leave it up to the driver to decide what that size should be.
  Reviewed-by: Alejandro Piñeiro <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
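The effect of the -1 value on validation can be sketched roughly as follows. This is an illustration, not the validator Mesa ships: the function names are hypothetical, and the convention that a declared value of 0 means "use the intrinsic's own num_components" is stated here as an assumption.

```python
# Hypothetical sketch of per-source component validation.

def expected_src_components(declared, intrinsic_num_components):
    """Return the required component count, or None when anything is accepted."""
    if declared == -1:
        # New in this commit: the source consumes whatever it is given.
        return None
    if declared == 0:
        # Assumption: 0 defers to the intrinsic's own num_components.
        return intrinsic_num_components
    return declared


def validate_src(declared, actual, intrinsic_num_components):
    expected = expected_src_components(declared, intrinsic_num_components)
    return expected is None or expected == actual
```

A deref source declared with -1 therefore passes validation whatever size the driver chose for its pointers.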
* mesa: Revert INTEL_fragment_shader_ordering support (Matt Turner, 2018-12-03, 1 file, -1/+0)
  This extension is not properly tested (testing for GL_ARB_fragment_shader_interlock is not sufficient), and since this was noted in review on August 28th no tests have been sent.
  Revert "i965: Add INTEL_fragment_shader_ordering support."
  Revert "mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering"
  This reverts commit 03ecec9ed2099f6e2b62994b33dc948dc731e7b8.
  This reverts commit 119435c8778dd26cb7c8bcde9f04b3982239fe60.
  Cc: [email protected]
  Acked-by: Jason Ekstrand <[email protected]>
  Acked-by: Eric Anholt <[email protected]>
* nir: Add alignment parameters to SSBO, UBO, and shared access (Jason Ekstrand, 2018-11-15, 1 file, -10/+16)
  This also changes spirv_to_nir and glsl_to_nir to set them. The one place that doesn't set them is shared memory access lowering in nir_lower_io. That will have to be updated before any consumers of it can effectively use these new alignments.
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
  Acked-by: Karol Herbst <[email protected]>
* spirv/nir: handle memory access qualifiers for SSBO loads/stores (Samuel Pitoiset, 2018-10-12, 1 file, -2/+2)
  v2: - change how the access qualifiers are accumulated
  v3: - duplicate members in struct_member_decoration_cb()
      - handle access qualifiers on variables
      - remove access qualifiers handling in _vtn_variable_load_store()
      - fix setting access qualifiers on type->array_element
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv,i965: Lower away image derefs in the driver (Jason Ekstrand, 2018-08-29, 1 file, -3/+3)
  Previously, the back-end compiler turned image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components.
  This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices.
  Shader-db results on Kaby Lake:
  total instructions in shared programs: 15166872 -> 15164293 (-0.02%)
  instructions in affected programs: 115834 -> 113255 (-2.23%)
  helped: 191
  HURT: 0
  total cycles in shared programs: 571311495 -> 571196465 (-0.02%)
  cycles in affected programs: 4757115 -> 4642085 (-2.42%)
  helped: 73
  HURT: 67
  total spills in shared programs: 10951 -> 10926 (-0.23%)
  spills in affected programs: 742 -> 717 (-3.37%)
  helped: 7
  HURT: 0
  total fills in shared programs: 22226 -> 22201 (-0.11%)
  fills in affected programs: 1146 -> 1121 (-2.18%)
  helped: 7
  HURT: 0
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add handle/index-based image intrinsics (Jason Ekstrand, 2018-08-29, 1 file, -18/+32)
  Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Do image load/store lowering to NIR (Jason Ekstrand, 2018-08-29, 1 file, -0/+9)
  This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs.
  Shader-db results on Kaby Lake:
  total instructions in shared programs: 15166910 -> 15166872 (<.01%)
  instructions in affected programs: 5895 -> 5857 (-0.64%)
  helped: 15
  HURT: 0
  Clearly, we don't have that much image_load_store happening in the shaders in shader-db....
  Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Make image load/store intrinsics variable-width (Jason Ekstrand, 2018-08-29, 1 file, -2/+2)
  Instead of requiring 4 components, this allows them to potentially use fewer. Both the SPIR-V and GLSL paths still generate vec4 intrinsics so drivers which assume 4 components should be safe. However, we want to be able to shrink them for i965.
  Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: Add GL/GLSL plumbing for INTEL_fragment_shader_ordering (Kevin Rogovin, 2018-08-28, 1 file, -0/+1)
  This extension provides a new GLSL built-in function, beginFragmentShaderOrderingINTEL(), that guarantees (taking the wording of the GL_INTEL_fragment_shader_ordering extension) that any memory transactions issued by shader invocations from previous primitives mapped to the same xy window coordinates (and the same sample when per-sample shading is active) complete and are visible to the shader invocation that called beginFragmentShaderOrderingINTEL().
  One advantage of INTEL_fragment_shader_ordering over ARB_fragment_shader_interlock is that it provides a function that operates as a memory barrier (instead of defining a critical section) and that can be called under arbitrary control flow from any function (in contrast, the begin/end of ARB_fragment_shader_interlock may only be called once, from main(), under no control flow).
  Signed-off-by: Kevin Rogovin <[email protected]>
  Reviewed-by: Plamena Manolova <[email protected]>
* nir: Add floating point atomic min, max, and compare-swap intrinsics (Ian Romanick, 2018-08-22, 1 file, -1/+10)
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add floating point atomic add intrinsics (Ian Romanick, 2018-08-22, 1 file, -0/+4)
  Signed-off-by: Ian Romanick <[email protected]>
  Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: add lowering for gl_HelperInvocation (Rob Clark, 2018-07-18, 1 file, -0/+3)
  v2: reword comment about lower_helper_invocations to be more clear that it might not work on all hardware
  v3: add special variant of load_sample_id which does not imply per-sample shading
  Signed-off-by: Rob Clark <[email protected]>
* nir: fixup intrinsic comment (Rob Clark, 2018-07-18, 1 file, -1/+1)
  Now the deref is the first src.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* nir/spirv: implement BuiltInWorkDim (Rob Clark, 2018-07-15, 1 file, -0/+1)
  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Karol Herbst <[email protected]>
* nir: Fix OpAtomicCounterIDecrement for uniform atomic counters (Antia Puentes, 2018-07-03, 1 file, -1/+2)
  From the SPIR-V 1.0 specification, section 3.32.18, "Atomic Instructions":
    "OpAtomicIDecrement: <skip> The instruction's result is the Original Value."
  However, we were implementing it, for uniform atomic counters, as a pre-decrement operation, as that was the one available from GLSL.
  Renamed the former nir intrinsic 'atomic_counter_dec*' to 'atomic_counter_pre_dec*' for clarification purposes, as it implements a pre-decrement operation as specified for GLSL. From GLSL 4.50 spec, section 8.10, "Atomic Counter Functions":
    "uint atomicCounterDecrement (atomic_uint c) Atomically 1. decrements the counter for c, and 2. returns the value resulting from the decrement operation. These two steps are done atomically with respect to the atomic counter functions in this table."
  Added a new nir intrinsic 'atomic_counter_post_dec*' which implements a post-decrement operation as required by SPIR-V.
  v2: (Timothy Arceri)
      * Add extra spec quotes on commit message
      * Use "post" instead of "pos" to avoid confusion with "position"
  Signed-off-by: Antia Puentes <[email protected]>
  Signed-off-by: Alejandro Piñeiro <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
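The difference between the two intrinsics comes down to which value is returned. The toy model below only illustrates the semantics quoted from the two specifications; the class and method names are hypothetical, it ignores real atomicity, and the 32-bit wraparound is an assumption based on atomic counters being unsigned integers.

```python
# Toy, single-threaded model of the two decrement flavours.

MASK32 = 0xFFFFFFFF  # assumption: counters behave as 32-bit unsigned values


class AtomicCounter:
    def __init__(self, value):
        self.value = value & MASK32

    def pre_dec(self):
        """GLSL atomicCounterDecrement: returns the value after decrementing."""
        self.value = (self.value - 1) & MASK32
        return self.value

    def post_dec(self):
        """SPIR-V OpAtomicIDecrement: returns the original value."""
        original = self.value
        self.value = (self.value - 1) & MASK32
        return original


glsl_counter = AtomicCounter(5)
spirv_counter = AtomicCounter(5)
print(glsl_counter.pre_dec(), spirv_counter.post_dec())
```

Both counters end up holding 4, but the two calls report different results, which is why treating one as the other produces off-by-one values.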
* nir: Add a concept of constant data associated with a shader (Jason Ekstrand, 2018-07-02, 1 file, -0/+2)
  This commit adds a concept to NIR of having a blob of constant data associated with a shader. Instead of being a UBO or uniform that can be manipulated by the client, this constant data is considered part of the shader and remains constant across all invocations of the given shader until the end of time. To access this constant data from the shader, we add a new load_constant intrinsic. The intention is that drivers will eventually lower load_constant intrinsics to load_ubo, load_uniform, or something similar. Constant data will be used by the optimization pass in the next commit but this concept may also be useful for OpenCL.
  v2 (Jason Ekstrand):
      - Rename num_constants to constant_data_size (anholt)
  Reviewed-by: Timothy Arceri <[email protected]>
  Reviewed-by: Iago Toral Quiroga <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
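Conceptually, a load from shader constant data is a read at an offset into an immutable byte blob. The sketch below models only that idea; the function name, the byte-offset addressing, and the little-endian 32-bit layout are illustrative assumptions, not the intrinsic's actual definition.

```python
import struct

# Hypothetical model: constant data is an immutable bytes object owned by
# the shader, and a load reads N 32-bit words starting at a byte offset.

def load_constant(constant_data, offset, num_components=1):
    """Read num_components little-endian uint32 values at the given offset."""
    return struct.unpack_from(f"<{num_components}I", constant_data, offset)


# A 16-byte blob holding four 32-bit values.
blob = struct.pack("<4I", 10, 20, 30, 40)
print(load_constant(blob, 8, 2))
```

Because the blob never changes after compilation, a driver is free to back it with a UBO, uniforms, or any other read-only storage.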
* nir: Remove old-school deref chain support (Jason Ekstrand, 2018-06-22, 1 file, -79/+4)
  Acked-by: Rob Clark <[email protected]>
  Acked-by: Bas Nieuwenhuizen <[email protected]>
  Acked-by: Dave Airlie <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>