| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
Protect signal-related code with PIPE_OS_UNIX test.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Apparently, this has been a bug since 2010 (c30f6e5d).
Also use ARRAY_SIZE instead of open coding it.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
There are no shaders, so it doesn't even make sense to expose the
extension.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
builtin_functions.cpp:5289:52: warning: unused parameter 'num_arguments' [-Wunused-parameter]
unsigned num_arguments,
^
builtin_functions.cpp:5290:52: warning: unused parameter 'flags' [-Wunused-parameter]
unsigned flags)
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I think the intention was to mark the "this" parameter as const, but
const goes on the other end to do that.
In file included from glsl_symbol_table.cpp:26:0:
ast.h:339:35: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
const bool is_single_dimension()
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows arbitrary non-constant indices on GS input arrays,
both for the vertex index, and any array offsets beyond that.
All indirects are handled via the pull model. We could potentially
handle indirect addressing of pushed data as well, but it would add
additional code complexity, and we usually have to pull inputs anyway
due to the sheer volume of input data. Plus, marking pushed inputs
as live due to indirect addressing could exacerbate register pressure
problems pretty badly. We'd need to be careful.
v2: Use updated MOV_INDIRECT opcode.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Abdiel Janulgue <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
- env GALLIUM_HUD_VISIBLE: control default visibility
- env GALLIUM_HUD_SIGNAL_TOGGLE: toggle visibility via signal
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This commit adds code for testing nir_shader_clone by running it after each
and every optimization pass and throwing away the old shader. Testing
nir_shader_clone is hidden behind a new INTEL_CLONE_NIR environment
variable.
Reviewed-by: Rob Clark <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit is heavily based on one by Rob Clark <[email protected]> but
reworked to re-use nir_create functions and do less hashing.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Failing to call nir_metadata_preserve() can have nasty consequences:
some pass breaks dominance information, but leaves it marked as valid,
causing some subsequent pass to go haywire and probably crash.
This pass adds a simple validation mechanism to ensure passes handle
this properly. We add a new bogus metadata flag that isn't used for
anything in particular, set it before each pass, and ensure it *isn't*
still set after the pass. nir_metadata_preserve will reset the flag,
so correct passes will work, and bad passes will assert fail.
(I would have made these functions static inline, but nir.h is included
in C++, so we can't bit-or enums without lots of casting...)
Thanks to Dylan Baker for the idea.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OPT() is the normal macro for passes that return booleans, while OPT_V()
is a variant that works for passes that don't properly report progress.
(Such passes should be fixed to return a boolean, eventually.)
These macros take care of calling nir_validate_shader() and setting
progress appropriately. In the future, it would be easy to add shader
dumping similar to INTEL_DEBUG=optimizer by extending the macro.
v2 (Jason Ekstrand):
- Fix an unused variable warning
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
This will simplify things somewhat in clone.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
No users.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The a4xx bits corresponding to 'freedreno/a3xx: add fake RGTC support
(required for GL3)'
TODO some more r/e.. maybe we get lucky and hw supports some of this
directly? For now this will help us enable gl3.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The main issue is that the current logic looked into cso->u.tex, which
is the wrong side of the union to look into for texture buffers. While I
was at it, it was easy enough to add the logic to handle offsets
(first_element).
- reduce texture buffer size limit (determined experimentally)
- don't look at first/last levels, instead look at first/last element
- include the first element offset
- set offset alignment to 16 (determined experimentally)
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A smarter implementation would make it possible to attach this to emit
state for the BY_REGION versions to avoid breaking the tiling. But this
is a start.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Also throw in LATC while we're at it (same exact format). This could be
made more efficient by keeping a shadow compressed texture to use for
returning at map time. However... it's not worth it for now...
presumably compressed textures are not updated often.
Lastly fix up Z32S8 transfers to non-0 layers.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The previously RE'd formats were from an ES driver implementing
OES_vertex_type_10_10_10_2 and thus backwards. A future change could add
the 2_10_10_10 support.
Signed-off-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'd end up in a state where shader uses no inputs, yet num_elements is
greater than zero. Triggered by a TF vertex shader which did:
gl_Position = vec4(0.0, 0.0, 0.0, 0.0);
resulting in a binning pass variant with no inputs.
Includes equiv fix in a4xx, even though we don't have binning-pass
enabled yet on a4xx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
point_size_per_vertex is always TRUE for GLES, causing us to configure
the hw as if gl_PointSize was written, even if it was not. Which makes
for grumpy hw.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In nv50, and in the python script that Rob circulated, we do:
bld.mkCmp(OP_SET, CC_GE, TYPE_U32, (s = bld.getSSA()), TYPE_U32, m, b);
Do the same in the nir div lowering pass. This fixes the large-udiv-udiv
piglit tests on freedreno.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch disables the use of VSX instructions, as they cause some
piglit tests to fail
For more details, see: https://llvm.org/bugs/show_bug.cgi?id=25503#c7
With this patch, ppc64le reaches parity with x86-64 as far as piglit test
suite is concerned.
v2:
- Added check that we have at least LLVM 3.4
- Added the LLVM bug URL as a comment in the code
v3:
- Only disable VSX if Altivec is supported, because if Altivec support
is missing, then VSX support doesn't exist anyway.
- Change original patch description.
Signed-off-by: Oded Gabbay <[email protected]>
Cc: "11.0" <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
Looks like this was forgotten in the commit which added the AFETCH
logic.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
3DSTATE_TE has partitioning, output topology, and domain fields,
each of which has several enumerated values. We'll also need to
switch on the domain, so enums (rather than #defines) seem like a
natural fit.
I chose to put these in brw_compiler.h because they'll be stored
in struct brw_tes_prog_data, which will live there.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We always want to prefer the VGPU10 formats over the VGPU9 ones when
we have VGPU10 support.
Original patch by Jose and updated by Brian.
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is important for the case of sampling from a depth texture. In
that case, we need to sample the texture as if it were a single-channel
color texture. For other/color formats, we can use the format as-is.
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
| |
This will be important for perfcounter queries.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
[Fixed a rebase conflict and re-tested before pushing.]
|
|
|
|
|
|
| |
We will need the clear_result override for the batch query implementation.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The idea here is that driver queries implemented outside of common code
will use the same query buffer handling with different logic for starting
and stopping the corresponding counters.
Reviewed-by: Marek Olšák <[email protected]>
[Fixed a rebase conflict and re-tested before pushing.]
|
|
|
|
|
|
|
|
|
| |
Move r600_query and r600_query_hw into the header because we will want to
reuse the buffer handling and suspend/resume logic outside of the common
radeon code.
Reviewed-by: Marek Olšák <[email protected]>
[Fixed a rebase conflict and re-tested before pushing.]
|
|
|
|
|
|
|
|
| |
Software queries are all queries that do not require suspend/resume
and explicit handling of result buffers.
Reviewed-by: Marek Olšák <[email protected]>
[Fixed a rebase conflict and re-tested before pushing.]
|
|
|
|
|
|
|
|
| |
The goal here is to be able to move the implementation details of hardware-
specific queries (in particular, performance counters) out of the common code.
Reviewed-by: Marek Olšák <[email protected]>
[Fixed a rebase conflict and re-tested before pushing.]
|
|
|
|
|
|
|
| |
More query-related structures will have to be moved into their own
header file to support hardware-specific performance counters.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are currently a bunch of formats that behave strangely when
sampling the cleared color from the MCS buffer on SKL. They seem to
mostly be formats that don't have an alpha component, although it's
not all of them, and we haven't yet found anything in the specs which
would explain this. For now to be on the safe side this patch just
prevents fast clears for MSRTs on SKL altogether so that when fast
clears are eventually enabled it will only be for single-sampled
surfaces. The assumption is that clears are probably more likely to be
used in single-sampled applications anyway so we can at least get them
working and we can enable MSRTs later once we understand the problem
better.
This patch should have no functional effect other than perhaps
receiving fewer perf_debug messages on SKL+.
v2: Improve the commit message to avoid saying the patch disables fast
clears because it will be merged before fast clears are enabled
for any surfaces so it doesn't actually disable anything.
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
DEQP likes to do math on uniforms, and the "fmaxabs dst, uni, uni" to get
the absolute value would get lowered. The lowering doesn't bother to try
to restrict the lifetime of the lowered uniforms, so we'd end up register
allocation failng due to this on 5 of the tests (More tests still fail in
RA, which look like we'll need to reduce lowered uniform lifetimes to
fix).
No changes on shader-db, though fewer extra MOVs are generated on even
glxgears (MOVs pair well enough that it ends up being the same instruction
count).
|
|
|
|
|
| |
This does actually happen in the wild (particularly fabs of a uniform), so
we'd like to support it.
|
| |
|
|
|
|
|
|
|
|
| |
It looks like nir_lower_idiv is going to use it soon, so add support.
With Ilia's change, this fixes one case in fs-op-div-large-uint-uint (with
GL 3.0 forced on).
Cc: "11.0" <[email protected]>
|
|
|
|
| |
PIPE_CONTOL!!!
|