| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like floats, we should use the round toward 0 mode instead of the
nearest one (which is the default) for doubles to integers.
This fixes all arb_gpu_shader_fp64 piglits which convert doubles to
integers (16 tests).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
This is mostly academic as this is not reachable from GL, which only has
the packed RGB10_A2UI vertex format.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We should not call nouveau_bufctx_reset() inside a validate function.
This only affects Fermi where images are aliased between 3D and CP.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Images invalidation is a bit weird on Fermi and there is already a hack
which forces invalidating all images when launching a computer shader
to help in fixing 3D<->CP interaction.
However, we need to re-validate images for compute because
nvc0_compute_invalidate_surfaces() will destroy the previous binding.
This is not really good for performance purposes but this might be
improved later.
This fixes the following piglits:
- spec/arb_compute_shader/execution/basic-uniform-access
- spec/arb_compute_shader/execution/mutiple-texture-reading
- spec/arb_compute_shader/execution/multiple-workgroups
- spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests)
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We would revalidate images when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the images.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We would revalidate buffers when anything was touched at all. Which is
unfortunate, since the state tracker does not use CSO's to reduce the
workload. So instead implement a protocol to ensure that something has
changed before revalidating all the SSBOs.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
Loosely based on radeonsi (Thanks to Nicolai).
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: 12.0 <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This really only hits for bitsets with a size of a multiple of 32. We
can end up with pos = -1 as a result of the ffs, which we in turn decide
is a valid position (since we fall through the loop and i == 1, we end
up adding 32 to it, so end up returning 31 again).
Up until recently this was largely unreachable, as the register file
sizes were all 63 or 255. However with the advent of compute shaders
which can restrict the number of registers, this can now happen.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This effectively limits registers to 32 and 64 for fermi and kepler when
1024 threads are used, but allows the full amount to be used with
smaller thread sizes.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Pierre Moreau <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
The lane mask only applies to real mov's, while here we're using PSET.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
(a) Make sure to update the TIC in case of an updated buffer address
(b) Mark newly-inactive textures dirty so that we update the handle in
set_tex_handles.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Push offset down to drivers when importing dmabuf. This is needed
to more fully support EGL_EXT_image_dma_buf_import when a non-zero
offset is specified.
Tesing has been done for freedreno, and compile tested following
gallium drivers:
nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo
Signed-off-by: Stanimir Varbanov <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This is apparently legal. Drop any emit/restarts, and pass a 1 to the
hardware.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.2 11.1" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This code was used for validating surfaces with compute but now we use
pipe_image_view instead. Anyway, surfaces support should be
re-introduced properly once OpenCL happens.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Constant buffers are aliased between 3D and CP on Fermi, but we should
only invalidate them when a compute shader actually uses CBs and not
all the time after a lauching grid.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a load locked, we might not use the first result but the second
result is the predicate result of the locking. In that case the load
splitting logic doesn't apply (which is designed for splitting 128-bit
loads). Instead we take the predicate and move it into the first
position (as having a dead result in first def's position upsets all
sorts of things including RA). Update the emitters to deal with this as
well.
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For user-supplied constbufs, fileIndex is 0. In that case, when we
subtract 1, we'll end up loading from constbuf offset -16. This is
illegal, and there are asserts to avoid it. Normally we'd just DCE it,
but no point in generating the instructions if they're not going to be
used.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Hans de Goede <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Like constant buffers, samplers and textures are aliased on Fermi and
we need to invalidate the state when switching from 3D to CP and vice
versa.
This fixes rendering issues in the UE4 demos.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To read out MP perf counters we use a compute shader and need to upload
input data like a 64-bits addr used to store the values and a sequence
ID for synchronization. Currently, this input data is uploaded as user
uniforms which means that it's sticked to c0[], but if a compute shader
from a real application is used, monitoring those performance counters
will just overwrite some data and miserably crash.
Instead, sticking the 64-bits addr and the sequence into the driver
constant buffer seems like much better and will allow to monitor
counters with GL 4.3 apps.
Tested on GF119 and GK110, but should not hurt anything on GK104.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Not piping this all the way through yet, but no better place to note
this down. This will can be used with NV_viewport_array2.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For fermi, this likely will require use of linked tsc mode. However on
bindless architectures, we can have as many as we want. As it stands,
the AUX_TEX_INFO has 32 teture handles reserved.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The GALLIUM_HUD does not yet expose a description for each events, but
this might be useful for developers who want to have a long description
of hw perf counters directly in the source code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
We apparently pass all the relevant CTS tests. There are probably some
shortcomings, but they can be addressed down the line.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some hardware supports primitive restart on patch primitives, and other
hardware does not. Modern GL and ES include a query for this feature;
adding a capability bit will allow us to answer it.
As far as I know, AMD hardware does not support this feature, while
NVIDIA and Intel hardware does. However, most Gallium drivers do not
appear to support tessellation shaders yet. So, I've enabled it for
nvc0 and disabled it everywhere else.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Constbufs are only aliased on Fermi and this will reduce the number of
flushes when we switch between 3d and compute.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This happens with dEQP tests. The code doesn't at all protect against
this condition, so while unhandled, this is an expected situation.
Also avoid using more than the first 16 registers for nv3x vertex
programs.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
On nv30, for example, there is no hardware index buffer support. So all
of those will be created entirely in user memory.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the array doesn't start at 0 we need to account for su->tex.r.
While we are at it, make sure to avoid out of bounds access by masking
the index.
This fixes GL45-CTS.shading_language_420pack.binding_image_array.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reported-by: Dave Airlie <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Apparently the stencil mask applies to clears on nv30/nv40. Reset it to
0xff before doing a stencil clear. This fixes gl-1.0-readpixsanity and
a number of other piglit tests.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
The mesa state tracker has recently started to query this.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a lot of INVALID_VALUE errors reported by the card when
running dEQP tests.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Left over from the pre-mainline tess support. Adapt to use the new
defines.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This makes sure that rIndirectSrc and other things stay updated.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|