| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The parameter lists were not being created nor filled since i965
doesn't use them. In Gallium they are used for uniform handling, so
add a way to fill them.
The gl_uniform_storage struct got two new fields that let us go
- from a Parameter to the matching UniformStorage and,
- from the variable to the *first* UniformStorage
without relying on names -- since they are optional for ARB_gl_spirv.
Later patches will make use of them.
v2: Do not fill parameters for i965. (Timothy)
Use uint32_t for the new attributes. (Marek)
v3: Serialize the new fields. (Timothy)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The gl_register_file doesn't need 16 bits, so shorten it and use the
extra room for 'Padded' (also mark it as a single bit). This shrinks
the struct size from 32 bytes to 24 bytes.
See also 4794fbc86e3 ("mesa: reduce the size of gl_program_parameter")
that shrinked from 40 to 24 and later 7536af670b7 ("glsl: fix shader
cache for packed param list") that added `Padded`.
v2: Use just 5 bits for gl_register_file. (Timothy)
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Every uniform that have the "gl_" name also have some state slots. So
use the state_slots like we did in 57b61849310 ("i965: account for NIR
uniforms without name").
This removes the dependency on names, which are optional when using
ARB_gl_spirv.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Don't use the UNMAPPED_UNIFORM_LOC (-1) to set the unsigned
max_uniform_location. Those unmapped uniforms don't have to be
accounted at this point.
Fixes: 7a9e5cdfbb9 ("nir/linker: Add gl_nir_link_uniforms()")
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
| |
v4: - Don't wrap a single file in a list to match mesa style
- Use null_dep instead of empty list
Reviewed-by: Eric Anholt <[email protected]> (v3)
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Which will allow meson to build a shared glapi build with mingw.
v2: - Add symbol to symbol check test
Reviewed-by: Eric Anholt <[email protected]> (v1)
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
It doesn't compile due to undefined symbols, which are in
libglapi_static, so I don't understand the problem.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Currently the praser for s expressions assumes that newlines will be \n,
resulting in incorrect parsing on windows, where the newline is \r\n.
This patch just adds \r? to the regular expression used to parse the s
expressions, which fixes at 1 test on windows.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since the system value refactor, we've accidentally only been setting
cbuf->buffer_size in the UBO case, and not in the uploaded-constants
case. We use cbuf->buffer_size to fill out the SURFACE_STATE entry,
so it needs to be initialized in both cases.
Fixes: 3b6d787e404 ("iris: move sysvals to their own constant buffer")
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes some interactions when NGG GS is enabled. It fixes:
- dEQP-VK.clipping.user_defined.clip_cull_distance_dynamic_index.*geom*
- dEQP-VK.tessellation.geometry_interaction.passthrough.*
For some reasons, using the computed ESGS ring size randomly hangs
with CTS. For now, just use the maximum LDS size for ESGS.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This shouldn't be in NIR->LLVM because ACO also needs the shader
info. This will also help for computing some NGG values that are
necessary for declaring LDS symbols.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
|
|
|
| |
Only the pipeline layout and the shader keys are needed.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
possible code sharing with radv
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
| |
ac_surface computes it for amdgpu.
radeon_drm_surface computes it for radeon.
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
This controls FMASK and CMASK computation for MSAA.
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
| |
Cc: 19.2 <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
LLVM 10 won't support 2048.
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
| |
and unify the code.
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
|
|
|
| |
This fixes a crash.
Cc: 19.2 <[email protected]>
Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
|
|
|
| |
Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
Fixes: 900a80f9e4f ("virgl: virgl_transfer should own its virgl_resource")
Signed-off-by: Lepton Wu <[email protected]>
Reviewed-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, and keeps
appending data, and calling glFlushMappedBufferRange(). We were
invalidating the VF cache each time it flushed a new range, which
results in a ton of VF flushes.
If the contents of the destination in the target range are undefined
(never even possibly written), this patch makes us assume that it's
likely not in the cache and so cache invalidations are required. If
the destination range is defined, we continue cache flushing as we may
need to expunge stale data.
This eliminates 88% of the VF cache invalidates on Manhattan 3.0.
Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU
frequency locked to 700Mhz by 0.376724% +/- 0.0989183% (n=10).
|
|
|
|
|
|
| |
This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in
the J2DBench images test. For some reason, the state tracker is calling
bind_sampler_state with the same sampler state in a bunch of cases.
|
|
|
|
| |
This can be useful for debugging missing flushes.
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
x == !GLX_DIRECT_COLOR is a fancy way of writing x == 0, which is
clearly not what was meant.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The line stipple pattern and factor only matter if line stippling is
actually enabled. Otherwise, we can safely ignore it.
PBO upload may give us zero for line stipple information, while normal
drawing tends to give us an actual stipple pattern such as 0xffff. This
was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to
useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus
very expensive.
Improves performance in Manhattan 3.0 on Skylake GT4e by
0.149261% +/- 0.0380796% (n=210). On an Icelake 8x8 with the GPU
frequency locked at 700Mhz, improves by 0.423756% +/- 0.222843% (n=3).
|
|
|
|
|
|
|
|
|
| |
These are supposed to be lowered into sge/slt/seq/sne equivalents.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
int_to_float emits ftrunc and ftrunc lowering generates bool ops.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes following warning:
../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’:
../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds]
241 | "xyzw"[src - gpir_codegen_src_attrib_x]);
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
GP doesn't support fceil so we need to lower it.
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Erico Nunes <[email protected]>
Reviewed-by: Andreas Baierl <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The entire point of schedule_first is that the node has to be scheduled
as soon as possible without any moves because it doesn't produce a
proper floating-point value, or its value changes depending on where you
read it. We were still introducing a move for preexp2 in some cases
though, even if it got scheduled as soon as possible, which broke some
exp() tests. Fix that.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The whole point of schedule_first nodes is that they need to be
scheduled as soon as possible, so if a schedule_first node is the
successor in a fake dependency that prevents it from being scheduled
after its parent, that can cause problems. We need to add these fake
dependencies to the parent as well, and we need to guarantee that the
pre-RA scheduler puts schedule_first nodes right before their parents in
order to prevent this from adding cycles to the dependency graph.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The idea was to make sure schedule_first nodes were always first in the
ready list. I made sure they were inserted first, but not that other
nodes wouldn't later be scheduled ahead of them. Fixes
[email protected]@execution@built-in-functions@vs-exp-float and probably
others.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The point of the function is to avoid creating a complex move which is
used by certain slots in the next instruction, but unscheduled
successors will never be in the next instruction. Found while debugging
a crash that the previous commit fixed.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The scheduler assumes that load nodes are always duplicated so that they
can always be scheduled eventually and therefore they never need to be
spilled. But some lowerings were running after the pre-RA scheduler,
whereas duplication has to happen before then since it's needed for the
scheduler to do a better job reducing register pressure. This meant
that lowerings were introducing multiple uses of a load instruction,
which broke the scheduler's expectation and resulted in infinite loops
in situations where the only nodes available to spill were load nodes.
Spilling load nodes would be silly, so we want to fix the lowerings
rather than the scheduler. Just do all lowerings before the pre-RA
scheduler, which also helps with reducing pressure since the scheduler
can more accurately compute the pressure.
Fixes lima/mesa#104.
Reviewed-by: Vasily Khoruzhick <[email protected]>
Tested-by: Vasily Khoruzhick <[email protected]>
|