| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We add a lowering pass for nir intrinsics. This pass can replace nir
intrinsics with driver specific nir lower code.
We lower the gl_LocalInvocationIndex intrinsic based on a uniform
which is loaded with a thread specific ID.
We also lower the gl_LocalInvocationID based on
gl_LocalInvocationIndex.
v2:
* Create variable during lowering pass. (Ken)
v3:
* Don't create a variable, but instead just insert an intrisic call
to load a uniform from the allocated location. (Jason)
v4:
* Don't run this pass if thread_local_id_index < 0
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This thread ID uniform will be used to compute the
gl_LocalInvocationIndex and gl_LocalInvocationID values.
It is important for this uniform to be added in the last push constant
register. fs_visitor::assign_constant_locations is updated to make
sure this happens.
The reason this is important is that the cross-thread push constant
registers are loaded first, and the per-thread push constant registers
are loaded after that. (Broadwell adds another push constant upload
mechanism which reverses this order, but we are ignoring this for
now.)
v2:
* Add variable in intrinsics lowering pass
* Make sure the ID is pushed last in assign_constant_locations, and
that we save a spot for the ID in the push constants
v3:
* Simplify code based with Jason's suggestions.
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v4:
* Force thread_local_id_index to -1 for now, and have
fs_visitor::setup_cs_payload look at thread_local_id_index. This
enables us to more easily cut over from the old local ID layout to
the new layout, as suggested by Jason.
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2:
* simd16/32 fixes (curro)
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2:
* Move lower flag to context constants. (Ken)
Cc: "12.0" <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes 64 Vulkan CTS tests per gen
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96299
Reviewed-by: Francisco Jerez <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
When lowering, we always want to use the clip dist varying.
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
So that memory use will eventually decrease again after a temporary peak.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
As a consequence, CE IB size never triggers a flush anymore.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
We will chain multiple chunks together and will keep pointers to the older
chunks to support IB dumping.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
While we're at it, add packet printing in si_debug.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids allocating giant IBs from the outset, especially for CE and DMA.
Since we now limit max_dw only by the size that the buffer happens to be
(which, due to the buffer cache, can be even larger than the rounded-up size
we request), the new function amdgpu_ib_max_submit_dwords controls when we
submit an IB.
With this change, we effectively never flush prematurely due to the CE IB,
after an initial warm-up phase.
v2:
- clean up buffer_size calculation
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
The latter function allows getting the containing amdgpu_cs from any IB
(including non-main ones).
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Adding the buffer when we start using it for the IB makes the logic for
chaining a bit simpler.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We'll want to have an amdgpu_cs pointer for future changes.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
v2: style change
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Isolines aren't reversed. commit 5b2d8c2273c6f fixed this for the vec4
TES backend, but not the scalar one.
Found while debugging GL45-CTS.tessellation_shader.
tessellation_control_to_tessellation_evaluation.gl_tessLevel.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
v2: require PIPE_CAP_SAMPLER_VIEW_TARGET; technically only needed for some of
the texture targets, but all hardware that has shader images should also
have this cap.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
For better bisectability given that the order of some of the fallback tests
in the blit path are rearranged.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will be used to select a slice of a 3D texture.
v2: fix a comment (Marek)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For downloads, the fragment shader must know the source texture target, hence
we may cache multiple fragment shaders.
v2: break long line (Marek)
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
At the same time, rename members that are upload-specific to say so.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Because apparently layout(max_vertices=0) is a thing.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
This can occur when max_vertices=0 is explicitly specified.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The string "[0]\0" is the same as "[0]" as far as the C string datatype
is concerned. That string has length 3. strncmp(s, length_3_string, 4)
is the same as strcmp(s, length_3_string), so make it be strcmp.
v2: Not the same as strncmp(..., 3). Noticed by Ilia.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
| |
Sadly, this doesn't affect SI and VI in any way.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
This mimics Vulkan. It also documents how to fix stencil texturing.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This improves throughput by keeping TTM overhead down.
Some piglit tests such as texelFetch and streaming-texture-leak will
use less memory now.
v2: use gart_size / 4 as the threshold
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Next commits will add other things around this.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
to allow reallocating the texture storage with different parameters
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
it will be used by texture reallocation
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
v2: fix set_shader_images(..., NULL). Found by Christoph Haag.
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
| |
mainly the fields that can change by reallocating a texture and changing
the tile mode
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|