| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Seems like no one ever depended on it to actually return false when fifo
is empty.
Fixes: 6e61d062093 ("util: Add super simple fifo")
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4366>
|
|
|
|
|
|
|
|
|
|
| |
'fraghalf' is unused (superceeded by actually lowering output based on
the precision information in nir). And glsl140 support in ir3 is long
past the experimental stage, so the glsl120 option is no longer needed.
So remove them and free up some bits for new things.
Signed-off-by: Rob Clark <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4366>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 87839680c0a48, a very subtle mistake was made with the CFG walking
recursion. Instead of setting the local has_nested_loop variable when
process child loops, has_nested_loop_out was passed directly into the
process_loop_in_block call. This broke nested loop detection heuristics
and caused loop unrolling to run massively out of control. In
particular, it makes the following CTS test compile virtually forever:
dEQP-VK.spirv_assembly.instruction.graphics.16bit_storage.struct_mixed_types.uniform_buffer_block_geom
Fixes: 87839680c0 "nir: Fix breakage of foreach_list_typed_safe..."
Closes: #2710
Reviewed-by: Danylo Piliaiev <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4380>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4380>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In trying to track down the new failure in #2670, I found that I could get
the flaky test set down to 4 tests, and dropping any remaining test
wouldn't trigger the failure (a bad 8x4 block in the middle of
dEQP-GLES3.functional.fbo.msaa.4_samples.r16f's render target). Disabling
gmem or bypass didn't help, and adding lots of CCU flushing didn't help.
What did help was disabling blitting, or this memset to initialize the
UBWC area after we (presumably) pull a BO out of the BO cache. My guess
is that the 2D blitter can't handle some rare set of state in the flags
buffer and emits some garbage.
I've run 8 gles3 and 7 gles31 runs with this branch now so hopefully I've got the4 right set of flakes marked for removal.
Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2670
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4290>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The batch might not have stage == FD_STAGE_BLIT set because
fd_blitter_pipe_begin was sticking the stage on some random batch (or none
at all) rather than the one that would be used in the meta operation.
What we actually wanted to be looking at was set_active_query_state(),
which is already called by util_blitter and whose state we just needed to
track.
Fixes piglit occlusion_query_meta_no_fragments. I haven't changed
query_hw.c's stage handling to clean the rest up because I don't have a
db410c/db820c at home to iterate over the piglit tests.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
| |
It's about the special case of an overwrite of a level meaning we can
discard old batch contents.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were returning the same kind of result as time_elapsed (an end - start
time in ns), which on a timestamp query is approximately zero since
begin/end are at the same point in time. What we're supposed to return is
a converted-to-ns timestamp based on the GPU clock. Remove the _pause()
function for time_elapsed to reduce the command stream overhead, and just
capture start (which is, unfortunately, going to happen on each tile and
thus the final start value we ready will be the last tile of the frame,
not the first).
Fixes piglit spec/arb_timer_query/query gl_timestamp
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
| |
Fixes 0 gpu time reported for glBlitFramebuffer in apitrace replay --pgpu.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
| |
Otherwise, a result query with wait won't trigger flushing the batch, and
we can end up with zeroed results.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
|
|
|
| |
When we switch batches and start a new draw, we need to cap the queries in
the previous batch and start queries again in the new one.
FD_STAGE_NULL got renamed to 0 so that it would naturally return
!is_active and end the queries at the end of the batch.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
| |
The state tracker only gets to begin/query/destroy when !active and end
when active, so we have no need to try to track this ourselves.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
| |
You should do failure-prone allocation in create_query, not begin, anyway.
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4356>
|
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
CC: <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
|
|
|
|
|
|
|
|
|
|
|
| |
Insertions can modify entry->data. Seems to fix random Fossilize crashes.
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
CC: <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4335>
|
|
|
|
|
|
|
| |
Fixes: fb64954d9dd "nir: Validate that memory load/store ops work on..."
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4378>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4378>
|
|
|
|
|
|
|
| |
Signed-off-by: Rhys Perry <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4333>
|
|
|
|
|
|
|
|
|
|
|
| |
The failures are fixed, but I didn't notice this had been silently
disabled in !4272.
Re-enable the VS2019 build.
Signed-off-by: Daniel Stone <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4374>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4374>
|
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
By inserting a b2b1 around the load_ubo, load_input, etc. intrinsics
generated by nir_lower_io, we can ensure that the intrinsic has the
correct destination bit size. Not having the right size can mess up
passes which try to optimize access. In particular, it was causing
brw_nir_analyze_ubo_ranges to ignore load_ubo of booleans which meant
that booleans uniforms weren't getting pushed as push constants. I
don't think this is an actual functional bug anywhere hence no CC to
stable but it may improve perf somewhere.
Shader-db results on ICL with iris:
total instructions in shared programs: 16076707 -> 16075246 (<.01%)
instructions in affected programs: 129034 -> 127573 (-1.13%)
helped: 487
HURT: 0
helped stats (abs) min: 3 max: 3 x̄: 3.00 x̃: 3
helped stats (rel) min: 0.45% max: 3.00% x̄: 1.33% x̃: 1.36%
95% mean confidence interval for instructions value: -3.00 -3.00
95% mean confidence interval for instructions %-change: -1.37% -1.29%
Instructions are helped.
total cycles in shared programs: 338015639 -> 337983311 (<.01%)
cycles in affected programs: 971986 -> 939658 (-3.33%)
helped: 362
HURT: 110
helped stats (abs) min: 1 max: 1664 x̄: 97.37 x̃: 43
helped stats (rel) min: 0.03% max: 36.22% x̄: 5.58% x̃: 2.60%
HURT stats (abs) min: 1 max: 554 x̄: 26.55 x̃: 18
HURT stats (rel) min: 0.03% max: 10.99% x̄: 1.04% x̃: 0.96%
95% mean confidence interval for cycles value: -79.97 -57.01
95% mean confidence interval for cycles %-change: -4.60% -3.47%
Cycles are helped.
total sends in shared programs: 815037 -> 814550 (-0.06%)
sends in affected programs: 5701 -> 5214 (-8.54%)
helped: 487
HURT: 0
LOST: 2
GAINED: 0
The two lost programs were SIMD16 shaders in CS:GO. However, CS:GO was
also one of the most helped programs where it shaves sends off of 134
programs. This seems to reduce GPU core clocks by about 4% on the first
1000 frames of the PTS benchmark.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
|
| |
No shader-db changes on ICL with iris
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
|
|
|
|
| |
The implementations here just clone i2b32 and i2b1. This means that
b2b32 doesn't technically generate true NIR 0/-1 booleans but it should
be fine as it's only ever generated for shared variable writes which
will always be consumed by something which will then run it through an
i2b again.
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These exist to convert between different types of boolean values. In
particular, we want to use these for uniform and shared memory
operations where we need to convert to a reasonably sized boolean but we
don't care what its format is so we don't want to make the back-end
insert an actual i2b/b2i. In the case of uniforms, Mesa can tweak the
format of the uniform boolean to whatever the driver wants. In the case
of shared, every value in a shared variable comes from the shader so
it's already in the right boolean format.
The new boolean conversion opcodes get replaced with mov in
lower_bool_to_int/float32 so the back-end will hopefully never see them.
However, while we're in the middle of optimizing our NIR, they let us
have sensible load_uniform/ubo intrinsics and also have the bit size
conversion.
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
| |
No shader-db impact on ICL with iris.
Reviewed-by: Kenneth Graunke <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4338>
|
|
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
|
|
| |
We can reuse pipe_scissor_state.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
|
|
| |
Also round up the max bounds.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
|
|
| |
This moves the whole clipping calculation out of the emit function.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
|
|
|
|
| |
The only difference between e.g. SE_SCISSOR_RIGHT and SE_CLIP_RIGHT
is the used margin value. With that information we can remove
SE_CLIP_* and apply the different margins during emit time.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jonathan Marek <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4278>
|
|
|
|
|
|
|
|
| |
Based on the discussion in https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4352
Reviewed-by: Daniel Stone <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4363>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4363>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Helps some Wolfenstein II and Wolfenstein Youngblood shaders.
pipeline-db (VEGA10/ACO):
Totals from affected shaders:
SGPRS: 17904 -> 17904 (0.00 %)
VGPRS: 14492 -> 14492 (0.00 %)
Spilled SGPRs: 20 -> 20 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 1753152 -> 1749708 (-0.20 %) bytes
Max Waves: 2581 -> 2581 (0.00 %)
pipeline-db (VEGA10/LLVM):
Totals from affected shaders:
SGPRS: 26656 -> 26656 (0.00 %)
VGPRS: 23780 -> 23780 (0.00 %)
Spilled SGPRs: 2112 -> 2112 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Code Size: 2552712 -> 2549236 (-0.14 %) bytes
Max Waves: 3359 -> 3359 (0.00 %)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Alyssa Rosenzweig <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4353>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This prunes out all targets except libgl-gdi, libgl-xlib, and svga, as
suggested by Marek Olšák.
libgl-xlib will be remove once I have had time to confirm no automated
tests we have rely upon it.
There are also a bunch of Makefile.sources which become orphaned as
result, that are not taken care of in this change.
v2: Prune remainders of swr support.
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4348>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4348>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Totals:
Code Size: 254764624 -> 254745104 (-0.01 %) bytes
Totals from affected shaders:
VGPRS: 12132 -> 12112 (-0.16 %)
Code Size: 573364 -> 553844 (-3.40 %) bytes
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We know that in this case, the LS and HS invocations are working
on the exact same vertex, so it's safe to skip the LDS.
Totals:
VGPRS: 3960744 -> 3961844 (0.03 %)
Code Size: 254824300 -> 254764624 (-0.02 %) bytes
Max Waves: 1053748 -> 1053574 (-0.02 %)
Totals from affected shaders:
VGPRS: 26152 -> 27252 (4.21 %)
Code Size: 1496600 -> 1436924 (-3.99 %) bytes
Max Waves: 4860 -> 4686 (-3.58 %)
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
| |
Will be used by LS output stores.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Clear the workgroup size for all supported shader stages.
Also, unify the workgroup size calculation accross various places.
As a result, insert_waitcnt can use the proper workgroup size
which means that some waits can be dropped from tessellation
shaders. Also, in cases where the previous calculation was wrong,
we now insert s_barrier instructions.
Totals from affected shaders (GFX10):
Code Size: 340116 -> 338484 (-0.48 %) bytes
Fixes: a8d15ab6daf0a07476e9dfabe513c0f1e0f3bf82
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
| |
Will be required by the workgroup size calculation.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
| |
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The following new fields are added to tess shader info:
* `tcs_cross_invocation_inputs_read`
* `tcs_cross_invocation_outputs_read`
These are I/O masks that are a subset of inputs_read and outputs_read
and they contain which per-vertex inputs and outputs are read
cross-invocation.
Additionall, the following new fields are added to shader_info:
* `inputs_read_indirectly`
* `outputs_accessed_indirectly`
* `patch_inputs_read_indirectly`
* `patch_outputs_accessed_indirectly`
These new fields can be used for optimizing TCS in a back-end compiler.
If you can be sure that the TCS doesn't use cross-invocation inputs
or outputs, you can choose a different strategy for storing VS and TCS
outputs. However, such optimizations might need to be disabled when
the inputs/outputs are accessed indirectly due to backend limitations,
so this information is also collected.
Example: RADV currently has to store all VS and TCS outputs in LDS, but
for shaders when only inputs and/or outputs belonging to the current
invocation ID are used, it could skip storing these in LDS entirely.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It can be further optimized in the future, but
the new sequence already has a few advantages:
* Uses fewer instructions
* Uses even fewer instructions in wave32 mode
* Doesn't use the VALU at all
Totals from affected shaders (GFX10):
VGPRS: 43504 -> 43496 (-0.02 %)
Code Size: 2436000 -> 2423688 (-0.51 %) bytes
Max Waves: 8704 -> 8705 (0.01 %)
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Daniel Schürmann <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When TCS has an equal number of input and output, it means that the
number of VS and TCS invocations (LS and HS) are the same; and that
the HS invocations operate on the same vertices as the LS.
When this is the case, this commit removes the else-if between
the merged VS and TCS halves, making it possible to schedule
and optimize the code accross the two halves.
Totals:
SGPRS: 5577367 -> 5581735 (0.08 %)
VGPRS: 3958592 -> 3960752 (0.05 %)
Code Size: 254867144 -> 254838244 (-0.01 %) bytes
Max Waves: 1053887 -> 1053747 (-0.01 %)
Totals from affected shaders:
SGPRS: 29032 -> 33400 (15.05 %)
VGPRS: 35664 -> 37824 (6.06 %)
Code Size: 1979028 -> 1950128 (-1.46 %) bytes
Max Waves: 7310 -> 7170 (-1.92 %)
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the tess factors were loaded individually, but now they can
be loaded using a single LDS load instruction.
Note that the inner and outer tess factors are not yet combined.
Totals (GFX10):
Code Size: 254896008 -> 254879212 (-0.01 %) bytes
Totals from affected shaders (GFX10):
Code Size: 2028352 -> 2011556 (-0.83 %) bytes
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some copypasta may have stuck in the code.
This was left on false by mistake.
Totals (GFX10):
Code Size: 254939248 -> 254896008 (-0.02 %) bytes
Totals from affected shaders (GFX10):
VGPRS: 16196 -> 16212 (0.10 %)
Code Size: 1126332 -> 1083092 (-3.84 %) bytes
Max Waves: 2336 -> 2334 (-0.09 %)
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no need to check whether they are written using indirect
indices, because all tess factors should be written to VMEM only
at the end of the shader.
No pipeline db changes.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
| |
Also clear up should_write_tcs_output_to_lds a little bit.
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows the passes after isel to assume that the exports are
always correct, and also allows to schedule these null exports later.
Additionally, it ensures that the correct exec mask is used for
these exports.
Totals from affected shaders (GFX10):
SGPRS: 84224 -> 84344 (0.14 %)
VGPRS: 23088 -> 23076 (-0.05 %)
Code Size: 882892 -> 894368 (1.30 %) bytes
Signed-off-by: Timur Kristóf <[email protected]>
Reviewed-by: Rhys Perry <[email protected]>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4165>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
foreach_list_typed_safe works with assumption that even if current node
becomes invalid, the next will be still valid.
However process_loops broke this assumption, because during iteration
when immediate child is unrolled - not only current node could be removed
but also the one after it.
This doesn't cause issues now but it will cause issues when undefined
behaviour in foreach* macros is fixed.
Signed-off-by: Danylo Piliaiev <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Tested-by: Marge Bot <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4189>
Part-of: <https://gitlab.freedesktop.org/mesa/mesa/-/merge_requests/4189>
|