| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Note that 'geometry shader properties' should be carried in the
selector state over the shader state in any case.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
It confuses my editor.
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Cc: [email protected]
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The hardware is capable of dealing with GL1-style user clip planes.
No clip vertex, no clip distances. Fixes a number of ucp tests, as well
as neverball.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
| |
According to NVIDIA, local performance counters (MP) are prefixed
with SM, while global performance counters (PCOUNTER) are called PM.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This code was broken by the tess merge, and I totally missed it
until now. I'm not sure this fixes anything but it stops the assert.
Cc: "11.0" <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise this will crash on 32-bit, and it gets rid of
warnings building on 32-bit.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On 32-bit we need to use PRIu64 flags for printfs,
otherwise this segfaults in R600_DEBUG=help otherwise.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shaders that contain instruction data after an instruction with EOP could end
up parsing that as an instruction, leading to various crashes and asserts in
SB as it gets very confused if it sees for instance a loop start instruction
jumping off to some random point.
Add a couple of asserts, and print EOP bit if set in old asm printer.
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
e8e443 missed adding check for undef values also in
unreserve function, leading to an assert triggering.
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR cursor API is exactly what we want for the builder's insertion
point. This simplifies the API, the implementation, and is actually
more flexible as well.
This required a bit of reworking of TGSI->NIR's if/loop stack handling;
we now store cursors instead of cf_node_lists, for better or worse.
v2: Actually move the cursor in the after_instr case.
v3: Take advantage of nir_instr_insert (suggested by Connor).
v4: vc4 build fixes (thanks to Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]> [v4]
Acked-by: Connor Abbott <[email protected]> [v4]
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If the packet encoding is defined in the same format as register definitions,
the python script can process them automatically and the parser support
becomes trivial.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds trace points to all IBs and the parser prints them and also
prints which trace points were reached (executed) by the CP.
This can help pinpoint a problematic packet, draw call, etc.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
Some of it is left there and it will be re-used in the next commit.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
GPU hang detection must be enabled by setting: GALLIUM_DDEBUG=[timeout in ms]
This may print too much information that we might not understand yet,
but some of the bits are very useful.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
This will be used by the IB parser.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes writing a good IB parser a lot easier.
It generates 2 tables:
- packet3 table
- register table with all registers, fields, and named values
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instruction encoding isn't needed in Mesa.
The border color address registers were duplicated.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
This is usually called after a draw call.
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: lots of improvements
This is like identity or trace, but simpler. It doesn't wrap most states.
Run with:
GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in miliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.
If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
/home/$username/dd_dumps/$processname_$pid_$index.
Note that the context is flushed after every draw/clear/copy/blit operation
and then waited for to find the exact call that hangs.
You can also do:
GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.
Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.
v2: drop pipe-loader changes, drop SConscript
rename dd.h -> dd_pipe.h
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows creating compute-only and debug contexts.
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise we get:
warning: 'num_user_sgprs' may be used uninitialized in this function
...
Reviewed-by: Michel Dänzer <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code (lifted straight from the extension) was doing things the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
getting rid of a misguided comment in the process (implement with table...)
- float division (replace with mul of reverse of those exponents)
This is like 3 times faster (measured for float3_to_rgb9e5), though it depends
(e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not,
division is not too bad on cpus with early-exit divs).
Note that keeping the double math for now (float x + 0.5), as the results may
otherwise differ.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly. The proper way is to set the insertion point and
then simply insert things there.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it easy for NIR passes to inspect what kind of shader they're
operating on.
Thanks to Michel Dänzer for helping me figure out where TGSI stores the
shader stage information.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit:
commit b49371b8ede380f10ea3ab333246a3b01ac6aca5
Author: Connor Abbott <[email protected]>
AuthorDate: Tue Jul 21 19:54:18 2015 -0700
nir: move control flow modification to its own file
split out some control flow related APIs into a separate header, but did
not update drivers.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The commit:
commit 8e0d4ef3410ea07d9621df3e083bc3e7c1ad2ab0
Author: Kenneth Graunke <[email protected]>
AuthorDate: Thu Aug 6 18:18:40 2015 -0700
nir: Delete the nir_function_impl::start_block field.
removed the start_block field without fixing up drivers..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes glamor, which wants to use R8 integer textures.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 567394112d904096abff1d994ab952f475dfb444.
It regressed performance. It looks like smaller IBs are better, because
the GPU goes idle quicker and there is less waiting for buffers and fences.
Cc: 11.0 <[email protected]>
|
|
|
|
|
|
|
| |
This fixes bin/ext_framebuffer_multisample-formats all_samples
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
|
| |
Same as commit 1af0641db but for nvc0. If an integer texture is
bound to RT0, don't do alpha-to-one or alpha-to-coverage.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|