| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Pierre Moreau <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't need to support all the color buffers for advanced blend, just
cb0. For Fermi, we use the special binding slots so that we don't
overlap with user textures, while Kepler+ gets a dedicated position for
the fb handle in the driver constbuf.
This logic is only triggered when a FBFETCH is actually present so it
should be a no-op most of the time.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The state tracker tries to attach the info to the wrong shader. This is
easy enough to protect against.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Cc: 12.0 13.0 <[email protected]>
|
|
|
|
|
|
|
|
| |
When the chipset is forced with NV50_PROG_CHIPSET, we actually
only want to output the binary if NV50_PROG_DEBUG is also
enabled. Otherwise, this pollutes the shader-db output.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
| |
Currently, program binaries are only dumped at upload time, but
when the chipset has been forced via NV50_PROG_CHIPSET we might
want to show the generated code, especially with shaderdb.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds a new envvar called NV50_PROG_CHIPSET which allows to
compile shaders with a different target, especially useful for
shader-db.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When an application uses a ton of shaders, we need to evict them
when the code segment is full but this is not really a good solution
if monster shaders are used because code eviction will happen a lot.
To avoid this, it seems better to dynamically resize the code
segment area after each eviction. The maximum size is arbitrary
fixed to 8MB which should be enough.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a very old issue which happens when the code segment
size is full. A bunch of real applications like Tomb Raider,
F1 2015, Elemental, hit that issue because they use a ton of shaders.
In this case, all shaders are evicted (for freeing space) but all
currently bound shaders also need to be re-uploaded and SP_START_ID
have to be updated accordingly.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This refactoring will help for fixing the "out of code space"
eviction issue because we will need to reupload the code for
all currently bound shaders but it's slightly different than
uploading a new fresh code.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This has never been used because info->immd.bufSize is always 0
and anyways this is an experimental code which has never been
completed.
This gets rid of some unused code in the program validation process.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The number of outputs patch (limited to 255) has moved in the TCP
header, but blob seems to also set the old position. Also, the high
8-bits are now located inbetween the min/max parallel output read
address at position 20.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware can only do alphatest when using a blendable format. This
means that the various *16 norm formats didn't work with alphatest. It
appears that Talos Principle uses such formats, as well as alpha tests,
for some internal renders, which made them be incorrect. However this
does not appear to affect the final renders, but in a different game it
easily could.
The approach we take is that when alphatests are enabled and a suitable
format is used (which we anticipate is the vast minority of the time),
we insert code into the shader to perform the comparison and discard.
Once inserted, that code lives in the shader forever, and we re-upload
it each time the function changes with a fixed-up compare. To avoid
re-uploading too often, if we switch back to a blendable format, the
test is (effectively) disabled and the hw alphatest functionality is
used.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Add support for SV_WORK_DIM for nvc0 and nve4.
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO
and NVC0_CB_AUX_TEX_INFO.
Signed-off-by: Hans de Goede <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Basically we just have to scale up the coordinates and then add the
relevant sample offset. The code to handle this was already largely
present from Christoph's earlier attempts to pipe images through back in
the dark ages, this just hooks it all up.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This is apparently legal. Drop any emit/restarts, and pass a 1 to the
hardware.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Not piping this all the way through yet, but no better place to note
this down. This will can be used with NV_viewport_array2.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes a lot of INVALID_VALUE errors reported by the card when
running dEQP tests.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.1 11.2" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Cull distances are just a special case of clip distances as far as the
hardware is concerned. Make sure that the relevant "planes" are enabled,
and flip the clip mode to cull for those.
Signed-off-by: Tobias Klausmann <[email protected]>
[imirkin: add enables on nvc0, add nv50 support]
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tobias Klausmann <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The SAMPLEMASK semantic should only return the bits set covered by the
current invocation. However we were always retrieving the covmask, which
returns the covered samples of the whole pixel.
When not doing per-sample invocation, this is precisely what we want.
However when doing per-sample invocation, we have to select the
sampleid'th bit and only return that. Furthermore, this means that we
have to have a 1:1 correlation for invocations and samples.
This fixes most
dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.*
tests. A few failures remain due to disagreements about nr_samples==1
logic as well as what happens with MSAA x2 RTs when the shading fraction
is 0.5.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Similar to surfaces validation for compute shaders.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Old surfaces validation code will be removed once images are completely
done for Fermi/Kepler, that explains why I only disable it for now.
This also introduces nvc0_get_surface_dims() which computes correct
dimensions regarding the given target.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Acked-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using the screen->parm buffer object which will be removed,
upload auxiliary constants to uniform_bo to be consistent regarding
what we already do for Fermi.
This breaks surfaces support (for compute only) but this will be
properly re-introduced later for ARB_shader_image_load_store.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
To not overwrite buffers and surfaces information, we need to use
a different offset in the driver constant buffer. Currently, OP_SUQ
is only supported for buffers but this will be slightly updated for
images support.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead make use of constants to improve readability.
The first 32 bytes of the driver constant buffer are unknown... This
doesn't seem to be used in the codegen part, but if the texBindBase
offset is shifted from 0x20 to 0x00, this breaks the universe for
really weird reasons. This sounds like to be related to textures.
Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be
enough for now.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Having two different variables for the driver constant buffer slot
is confusing and really useless.
Signed-off-by: Samuel Pitoiset <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
Acked-by: Pierre Moreau <[email protected]>
|
|
|
|
|
|
|
|
| |
This is loosely based on 3D. Shader buffers are bound on c15 (the
driver constbuf) at offset 0x200.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Changes from v3:
- add new validation state for COMPUTE driver constbuf
Changes from v2:
- always bind the driver consts even if user params come in via clover
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
This largely leaves the existing image logic alone. When image support
is added this will have to be harmonized somehow.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Quads need to respect winding order, too - not just triangles.
Fixes rendering in GFXBench 4.0's tessellation benchmark.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: "11.0 11.1" <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately flatshading is an all-or-nothing proposition on nvc0,
while GL 3.0 calls for the ability to selectively specify explicit
interpolation parameters on gl_Color/gl_SecondaryColor which would
override the flatshading setting. This allows us to fix up the
interpolation settings after shader generation based on rasterizer
settings.
While we're at it, we can add support for dynamically forcing all
(non-flat) shader inputs to be interpolated per-sample, which allows
st/mesa to not generate variants for these.
Fixes the remaining failing glsl-1.30/execution/interpolation piglits.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Indications are that if the colormask indicates a single bit set on
fermi, that value will always be read from $r0 instead of a potentially
higher register (if e.g. green is set). Not to upset the counting logic,
always set the header up with a full color mask for each RT. Such a
situation can basically only ever happen with generated blit shaders.
Fixes the following piglit on Fermi (Kepler is unaffected):
fbo-stencil blit GL_DEPTH32F_STENCIL8
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently this is necessary in order for tess factors to work in a tess
eval program without a tess control program bound. Probably because it
uses the fake program's shader header to work out the number of patch
constants.
Fixes vs-tes-tessinner-tessouter-inputs
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There's no need to attempt to avoid overlapping generic i/o with patch
i/o. By the same token, we can't merge patch and non-patch loads/stores.
This fixes at least the
tes-both-input-array-*-index-rd
tessellation variable-indexing tests.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Also add proper handling for PATCH semantics
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Uncomment the various functionality that was already there and add in
obvious missing bits that parallel vp/gp/fp functionality.
Signed-off-by: Ilia Mirkin <[email protected]>
|