summaryrefslogtreecommitdiffstats
path: root/src/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3: fixes for half reg in/outRob Clark2019-04-304-13/+39
| | | | | | | Needs to update max_half_reg, or be remapped to full reg and update max_reg accordingly, depending on generation.. Signed-off-by: Rob Clark <[email protected]>
* turnip: update to use the new features struct namesEric Engestrom2019-04-301-5/+5
| | | | | | | | | These were updated in version 1.1.106 of vulkan.h to make more sense with the extension names. We may as well keep with the times. See also: 90108deb277d33d19233 "anv: Update to use the new features struct names" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* freedreno/ir3: switch fragcoord to sysvalRob Clark2019-04-291-48/+44
| | | | | | Because who are we kidding... it is a sysval. Signed-off-by: Rob Clark <[email protected]>
* freedreno/drm: Quiet pointer to u64 conversion warningKristian H. Kristensen2019-04-261-1/+1
|
* turnip: drop dead close(master_fd)Emil Velikov2019-04-261-2/+0
| | | | | | | | The fd is -1, thus the block of if (fd != -1) close(fd) is dead code. Cc: Chad Versace <[email protected]> Reviewed-by: Chia-I Wu <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* freedreno/ir3: sample-shading supportRob Clark2019-04-255-8/+113
| | | | | | | | | | The compiler support for: OES_sample_shading OES_sample_variables OES_shader_multisample_interpolation Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix load_interpolated_input slotRob Clark2019-04-251-1/+1
| | | | | | | The so->inputs[] table is in units of vec4 Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rename frag_vcoord -> ij_pixelRob Clark2019-04-253-10/+16
| | | | | | | Since this is what the value actually is. Cleanup the name before adding more different i,j related values for sample-shading. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove bogus assertRob Clark2019-04-251-2/+0
| | | | | | tex instruction can actually return 16b values. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower load_barycentric_at_offsetRob Clark2019-04-253-0/+126
| | | | | | | | | Calculates i,j at specified offset within a pixel. A new load_size_ir3 intrinsic is used in conjunction with fddx/fddy to translate the offset into primitive space and adjust the i,j from load_barycentric_pixel accordingly. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: lower load_barycentric_at_sampleRob Clark2019-04-253-0/+132
| | | | | | | This lowers load_barycentric_at_sample to load_sample_pos_from_id plus load_barycentric_at_offset. Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2019-04-259-95/+133
| | | | | | Pull in updates for sample shading. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: cleanup instruction builder macrosRob Clark2019-04-251-46/+27
| | | | | | | | De-duplicate the "normal" and "flags" versions of the macros, and while at it go ahead and add "flags" versions for all the remaining macros, since we'll at least need INSTR1F in a following commit. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: more emit-cat5 fixesRob Clark2019-04-251-0/+2
| | | | | | Couple more opcodes which don't take a sampler id as first arg. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix rgetpos decodingRob Clark2019-04-251-1/+1
| | | | | | It takes an argument. Signed-off-by: Rob Clark <[email protected]>
* compiler: rename SYSTEM_VALUE_VARYING_COORDRob Clark2019-04-253-3/+3
| | | | | | | And add corresponding enums for different sorts of varying interpolation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/drm: update for robustnessRob Clark2019-04-253-0/+44
| | | | | | | | | Update UABI header and add FD_PP_PGTABLE and FD_NR_FAULTS params. Robustness can be supported by a kernel which provides the new ABI if it also indicates that per-process pagetables are in use. Signed-off-by: Rob Clark <[email protected]>
* vulkan/wsi: Add X11 adaptive sync support based on dri options.Bas Nieuwenhuizen2019-04-231-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | The dri options are optional. When the dri options are not provided the WSI will not use adaptive sync. FWIW I think for xf86-video-amdgpu this still requires an X11 config option, so only people who opt in can get possible regressions from this. So then the remaining question is: why do this in the WSI? It has been suggested in another MR that the application sets this. However, I disagree with that as I don't think we'll ever get a reasonable set of applications setting it. The next questions is whether this can be a layer. It definitely can be as implemented now. However, I think this generally fits well with the function of the WSI. Furthemore, for e.g. the DISPLAY WSI this is much harder to do in a layer. Of course, most of the WSI could almost be a layer, but I think this still fits best in the WSI. Acked-by: Jason Ekstrand <[email protected]>
* freedreno/ir3: fix const assertRob Clark2019-04-191-1/+0
| | | | | Fixes: fe8c57e859d freedreno/ir3: use nir_src_as_uint in a few places Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Mark ir3_context_error() as NORETURNKristian H. Kristensen2019-04-182-3/+3
| | | | | | Fixes a few warnings. Signed-off-by: Kristian H. Kristensen <[email protected]>
* Delete autotoolsDylan Baker2019-04-151-74/+0
| | | | | | | | | | Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Marek Olšák <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Matt Turner <[email protected]>
* nir: make nir_const_value scalarKarol Herbst2019-04-142-2/+2
| | | | | | | | | v2: remove & operator in a couple of memsets add some memsets v3: fixup lima Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> (v2)
* freedreno/ir3: use nir_src_as_uint in a few placesKarol Herbst2019-04-145-51/+20
| | | | | | | | v2 (Jason Ekstrand): - Add even more places Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir/i965/freedreno/vc4: add a bindless bool to type size functionsTimothy Arceri2019-04-122-2/+2
| | | | | | | This required to calculate sizes correctly when we have bindless samplers/images. Reviewed-by: Marek Olšák <[email protected]>
* nir: Get rid of global registersJason Ekstrand2019-04-092-6/+0
| | | | | | | | | We have a pass to lower global registers to locals and many drivers dutifully call it. However, no one ever creates a global register ever so it's all dead code. It's time we bury it. Acked-by: Karol Herbst <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/radv: remove restrictions on opt_if_loop_last_continue()Timothy Arceri2019-04-091-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When I implemented opt_if_loop_last_continue() I had restricted this pass from moving other if-statements inside the branch opposite the continue. At the time it was causing a bunch of spilling in shader-db for i965. However Samuel Pitoiset noticed that making this pass more aggressive significantly improved the performance of Doom on RADV. Below are the statistics he gathered. 28717 shaders in 14931 tests Totals: SGPRS: 1267317 -> 1267549 (0.02 %) VGPRS: 896876 -> 895920 (-0.11 %) Spilled SGPRs: 24701 -> 26367 (6.74 %) Code Size: 48379452 -> 48507880 (0.27 %) bytes Max Waves: 241159 -> 241190 (0.01 %) Totals from affected shaders: SGPRS: 23584 -> 23816 (0.98 %) VGPRS: 25908 -> 24952 (-3.69 %) Spilled SGPRs: 503 -> 2169 (331.21 %) Code Size: 2471392 -> 2599820 (5.20 %) bytes Max Waves: 586 -> 617 (5.29 %) The codesize increases is related to Wolfenstein II it seems largely due to an increase in phis rather than the existing jumps. This gives +10% FPS with Doom on my Vega56. Rhys Perry also benchmarked Doom on his VEGA64: Before: 72.53 FPS After: 80.77 FPS v2: disable pass on non-AMD drivers Reviewed-by: Ian Romanick <[email protected]> (v1) Acked-by: Samuel Pitoiset <[email protected]>
* freedreno/ir3: convert to "new style" frag inputsRob Clark2019-03-302-2/+33
| | | | | | | | | | Add support for load_barycentric_pixel, load_interpolated_input, and friends. For now, this retains support for old-style inputs, which can probably be dropped with some ttn work. Prep work for sample-shading support. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add pass to move varying loadsRob Clark2019-03-305-0/+151
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rework varying packingRob Clark2019-03-301-30/+98
| | | | | | | | | Originally we kept track of a table of inputs. But with new-style frag inputs this becomes awkward. Re-work it so that initially we assigned un-packed varying locations, and then after the shader is compiled scan to find actual used inputs, and re-pack. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: re-indent commentRob Clark2019-03-301-4/+4
| | | | | | | Make it more clear that it applies to the following 'case' statements, rather than the previous one. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: reads/writes to unrelated arrays are not dependentRob Clark2019-03-281-1/+30
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: sched fixRob Clark2019-03-281-1/+1
| | | | | | | Not sure why new-style frag inputs start triggering this. But we probably shouldn't consider src's from other blocks. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: Add workaround for VS samgqKristian H. Kristensen2019-03-286-4/+29
| | | | | | | | | | | | | | | | | This instruction needs a workaround when used from vertex shaders. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.texturegradoffset.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler2dshadow_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_fixed_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgradoffset.sampler3d_float_vertex dEQP-GLES3.functional.shaders.texture_functions.textureprojgrad.sampler2dshadow_vertex Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Don't access beyond available regsKristian H. Kristensen2019-03-281-4/+7
| | | | | | | | emit_cat5() needs to check if the last optional reg is there before it accesses it. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Push UBOs to constant fileKristian H. Kristensen2019-03-273-12/+118
| | | | | | | | We have a rather big constant file and it seems that the best way to use it is to upload all UBOs and lower UBO access the load_uniform. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMSKristian H. Kristensen2019-03-277-13/+119
| | | | | | | | | | | | | | | | | | | This commit turns on the gallium cap and adds a pass to lower the load_ubo intrinsics for block 0 back to load_uniform intrinsics and adjust the backend where the cap switches units from vec4s to dwords. As we stop using ir3_glsl_type_size() for uniform layout, this also corrects an issue where we would allocate a vec4 slot for samplers in uniforms, fixing: dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_fragment dEQP-GLES3.functional.shaders.struct.uniform.sampler_array_vertex dEQP-GLES3.functional.shaders.struct.uniform.sampler_nested_fragment dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_vertex dEQP-GLES2.functional.shaders.struct.uniform.sampler_nested_fragment Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Fix operand order for DSX/DSYKristian H. Kristensen2019-03-251-0/+15
| | | | | | | | | | Most cat5 instructions are constructed using ir3_SAM, which uses regs[1] for the (sampler, tex) src. Not DSX/DSY though, so we look up src1 and src2 differently for those two. Fixes: 1dffb089 ("freedreno/ir3: fix sam.s2en encoding") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Track whether shader needs derivativesKristian H. Kristensen2019-03-254-3/+9
| | | | | | | | | | | | | In 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") we started counting number of samplers based on the uniform vars instead of number of cat5 instructions. We used the number of samplers to determine whether to enable derivatives, but when we only use derivatives and no samplers, that now breaks. Track whether we need derivatives explicitly and use that to enable the state. Fixes: 1088b788 ("freedreno/ir3: find # of samplers from uniform vars") Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* spirv,nir: lower frexp_exp/frexp_sig inside a new NIR passSamuel Pitoiset2019-03-221-0/+1
| | | | | | | | | | This lowering isn't needed for RADV because AMDGCN has two instructions. It will be disabled for RADV in an upcoming series. While we are at it, factorize a little bit. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* freedreno/ir3: disable early-z for SSBO/image writesRob Clark2019-03-221-0/+12
| | | | | | | | | | | Fixes: dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_depth_fbo dEQP-GLES31.functional.image_load_store.early_fragment_tests.no_early_fragment_tests_stencil_fbo Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: rename has_kill to no_earlyzRob Clark2019-03-223-4/+4
| | | | | | | There are other cases where we need to disable early-z, like image writes. So rename to something more generic. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: dynamic UBO indexing vs 64b pointersRob Clark2019-03-211-2/+2
| | | | | | | Fixes dEQP-GLES31.functional.shaders.opaque_type_indexing.ubo.uniform_fragment and similar things with multiple UBOs Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix bit_countRob Clark2019-03-211-2/+23
| | | | | | | | | Seems like it can only work 16b at a time. Fixes dEQP-GLES31.functional.shaders.builtin_functions.integer.bitcount.* TODO need to check if this limitation applies to a3xx as well. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: additional loweringRob Clark2019-03-211-0/+6
| | | | | | | | | For some things that show up when we expose higher glsl TODO check blob traces to see if we have instructions for some of this? I guess we don't but worth a check.. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: optimize sam.s2en to samRob Clark2019-03-213-6/+36
| | | | | | | Detect when sampler/texture idx are immediate and switch to non s2en encoding. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: enable indirect tex/samp (sam.s2en)Rob Clark2019-03-212-22/+73
| | | | | | | | | | For now it uses indirect for everything. The next step is for the ir3_cp pass to detect the case that tex and samp idx are immediate and convert the sam instruction back to the non .s2en variant. But doing that in a following patch so we can shake out the bugs with .s2en more easily. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: find # of samplers from uniform varsRob Clark2019-03-213-13/+13
| | | | | | | When we have indirect samplers, we cannot tell the max sampler referenced. Instead just refer to the number of sampler uniforms. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix regmask for merged regsRob Clark2019-03-212-3/+13
| | | | | | | | On a6xx+ with half-regs conflicting with full-regs, the legalize pass needs to set appropriate sync bits, such as (sy), on writes to full regs that conflict with half regs, and visa-versa. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix sam.s2en encodingRob Clark2019-03-212-9/+12
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix sam.s2en decodingRob Clark2019-03-211-3/+5
| | | | Signed-off-by: Rob Clark <[email protected]>