| Commit message | Author | Age | Files | Lines |
| |
We're going to use "disassemble" for the function that disassembles
the whole program.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Has been misaligned since we added instruction offset prefixes.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
brw_disasm doesn't disassemble compacted instructions, so we uncompact
them before disassembling, which unsets the compaction control bit.
Instead, pass the bit as a separate argument.
Acked-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This gets us disasm of atomic ops.
v2: Fix fallthrough on pre-gen7. (bug caught by Ilia Mirkin).
Reviewed-by: Matt Turner <[email protected]>
| |
In commit e57d77280efcbfd6579a88f071426653287ef833, I fixed this for
destinations in the Vec4 backend, and sources in the scalar backend.
But not both types in both backends.
To prevent this mess from continuing, make the reg_encoding table
static, so only the disassembler can use it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
<4,1,1> isn't a real thing. We meant <4,4,1>, i.e., each component of
the whole register.
Reviewed-by: Kenneth Graunke <[email protected]>
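As a rough illustration of why <4,4,1> covers each component of the register, the sketch below computes the element offset that a region description selects for a given channel. It assumes the usual <VertStride,Width,HorzStride> reading of the three numbers; the function name sketch_region_offset is hypothetical and is not taken from the driver.

```c
#include <assert.h>

/*
 * Hypothetical helper (not from the driver): element offset selected by
 * a region <vstride,width,hstride> for one execution channel, assuming
 * the usual row/column interpretation of a region description.
 */
unsigned
sketch_region_offset(unsigned vstride, unsigned width, unsigned hstride,
                     unsigned channel)
{
   assert(width > 0);

   unsigned row = channel / width;   /* which row of the region */
   unsigned col = channel % width;   /* position inside that row */

   return row * vstride + col * hstride;
}
```

With <4,4,1>, channels 0 through 3 map to elements 0 through 3, so every component is visited in order.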
| |
On released hardware, values 4-6 are overloaded. For normal registers,
they mean UB/B/DF. But for immediates, they mean UV/VF/V.
Previously, we just created #defines for each name, reusing the same
value. This meant we could directly splat the brw_reg::type field into
the assembly encoding, which was fairly nice, and worked well.
Unfortunately, Broadwell makes this infeasible: the HF and DF types are
represented as different numeric values depending on whether the
source register is an immediate or not.
To preserve sanity, I decided to simply convert BRW_REGISTER_TYPE_* to
an abstract enum that has a unique value for each register type, and
write translation functions. One nice benefit is that we can add
assertions about register files and generations.
I've chosen not to convert brw_reg::type to the enum, since converting
it caused a lot of trouble due to C++ enum rules (even though it's
defined in an extern "C" block...).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
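The approach described above can be sketched roughly as follows. This is a minimal illustration, not the driver's code: the enum and function names are hypothetical, only the overloaded values 4-6 come from the message above, and the encodings used for UD, D, UW, W and F are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical abstract register types: one unique value per type. */
enum sketch_reg_type {
   SKETCH_TYPE_UD, SKETCH_TYPE_D, SKETCH_TYPE_UW, SKETCH_TYPE_W,
   SKETCH_TYPE_UB, SKETCH_TYPE_B, SKETCH_TYPE_DF, SKETCH_TYPE_F,
   SKETCH_TYPE_UV, SKETCH_TYPE_VF, SKETCH_TYPE_V,
};

/*
 * Translate an abstract type into a hardware encoding.  Values 4-6 are
 * overloaded: UB/B/DF for normal registers, UV/VF/V for immediates.
 * The remaining values are assumed for illustration only.
 */
unsigned
sketch_reg_type_to_hw(enum sketch_reg_type type, bool is_immediate)
{
   switch (type) {
   case SKETCH_TYPE_UD: return 0;   /* assumed encoding */
   case SKETCH_TYPE_D:  return 1;   /* assumed encoding */
   case SKETCH_TYPE_UW: return 2;   /* assumed encoding */
   case SKETCH_TYPE_W:  return 3;   /* assumed encoding */
   case SKETCH_TYPE_F:  return 7;   /* assumed encoding */

   /* Only valid for normal (non-immediate) registers. */
   case SKETCH_TYPE_UB: assert(!is_immediate); return 4;
   case SKETCH_TYPE_B:  assert(!is_immediate); return 5;
   case SKETCH_TYPE_DF: assert(!is_immediate); return 6;

   /* Only valid for immediates. */
   case SKETCH_TYPE_UV: assert(is_immediate); return 4;
   case SKETCH_TYPE_VF: assert(is_immediate); return 5;
   case SKETCH_TYPE_V:  assert(is_immediate); return 6;
   }

   assert(!"unknown register type");
   return 0;
}
```

Because the abstract values are unique, a single translation function is the only place that needs to know about the overloading, and it is also a natural home for the register-file assertions.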
| |
Three-source instructions use a different encoding for register types
(and have a much more limited set to choose from).
Previously, we translated those into BRW_REGISTER_TYPE_* values, then
reused the existing reg_encoding mapping.
Doing it directly is more straightforward and actually less code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
UB types have never been supported as immediates. On Gen4-5, register
encoding 4 is "Reserved." On Gen6+, it means UV.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
sed -i -e 's/GLuint/unsigned/g' -e 's/GLint/int/g' \
-e 's/GLfloat/float/g' -e 's/GLubyte/uint8_t/g' \
-e 's/GLshort/int16_t/g' \
brw_eu* brw_disasm.c brw_structs.h
Signed-off-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Eric Anholt <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Using the ADDC and SUBB instructions on Gen7.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Specifically
bfe - for bitfieldExtract()
bfi1 and bfi2 - for bitfieldInsert()
bfrev - for bitfieldReverse()
cbit - for bitCount()
fbh - for findMSB()
fbl - for findLSB()
Reviewed-by: Chris Forbes <[email protected]>
| |
Reviewed-by: Chris Forbes <[email protected]>
| |
Never existed? At least never supported. Doesn't appear in 965, G45,
or ILK documentation.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Like MAD, this is another three-source instruction.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
| |
The GLSL ES 3.00 operations packHalf2x16 and unpackHalf2x16 will emit
these opcodes.
- Define the opcodes BRW_OPCODE_{F32TO16,F16TO32}.
- Add the opcodes to the brw_disasm table.
- Define convenience functions brw_{F32TO16,F16TO32}.
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Paul Berry <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
| |
The bug: The printed horizontal stride was the numerical value of the
BRW_HORIZONTAL_$N enum.
The fix: Translate the enum before printing.
Note: This is a candidate for the stable releases.
Reviewed-by: Eric Anholt <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
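A minimal sketch of that translation step follows. It assumes the horizontal stride field encodes strides of 0, 1, 2 and 4 elements as the values 0, 1, 2 and 3; the function name sketch_hstride_elements is hypothetical and not the name used in brw_disasm.

```c
#include <assert.h>

/*
 * Hypothetical helper: convert an encoded horizontal stride into the
 * stride in elements before printing it.  Assumes the encodings
 * 0, 1, 2, 3 stand for strides of 0, 1, 2 and 4 elements.
 */
unsigned
sketch_hstride_elements(unsigned encoded)
{
   assert(encoded <= 3);

   /* Encoding 0 means stride 0; otherwise the stride is 2^(encoded-1). */
   return encoded ? 1u << (encoded - 1) : 0;
}
```

Printing the raw field would show 3 for a stride of 4 elements, which is the kind of mismatch the fix addresses.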
| |
Gen7 stores the JIP/UIP bits in different places.
Reviewed-by: Eric Anholt <[email protected]>
| |
This makes our output more consistent with other disasm tools, and
will be necessary when we start using f0.1.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
We've been calling it a register number, but it's actually the
subregister number, and things will get confusing once we start using it
if it isn't fixed.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
This is super basic, but it let me visualize a problem I had with
opt_compute_to_mrf().
Reviewed-by: Kenneth Graunke <[email protected]>
| |
v2: Make the strings in the tables const, too.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Reviewed-by: Kenneth Graunke <[email protected]>
| |
v2: Fix MRF handling on gen7.
Reviewed-by: Kenneth Graunke <[email protected]> (v1)
| |
msg_type moved by a bit, so the message type was being disassembled
incorrectly. In particular, render target writes were showing up as
"OWORD block write".
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
| |
Compared to sampler_gen5, simd_mode shifted by a bit and msg_type grew
by a bit. So we were printing slightly incorrect numbers.
NOTE: This is a candidate for stable release branches.
Signed-off-by: Kenneth Graunke <[email protected]>
| |
We care about the jump distance, not that the first src is always the
ip register.
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
Finding this bit in the documentation proved challenging. It wasn't in
the SEND instruction's message descriptor section, nor the data port
message descriptor section. It turns out to be part of the Render
Target Write message's control bits, and in the documentation is named
"Last Render Target Select".
Shaders that use Multiple Render Targets should set this bit on the last
RT write, but not on any prior ones.
The GPU does update the Pixel Scoreboard appropriately, but doesn't
document this bit as directly causing a scoreboard clear.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
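The rule for Multiple Render Targets can be sketched as below. This is only an illustration of "set the bit on the last RT write, but not on any prior ones"; the helper name sketch_mark_last_rt and the flag array are hypothetical and do not correspond to the driver's message descriptor code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: for num_targets render target writes, set
 * last_rt[i] to true only for the final write and false for the rest.
 */
void
sketch_mark_last_rt(bool *last_rt, size_t num_targets)
{
   assert(last_rt != NULL || num_targets == 0);

   for (size_t i = 0; i < num_targets; i++)
      last_rt[i] = (i + 1 == num_targets);
}
```

A shader with a single render target gets the bit on its only write, since that write is also the last one.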
| |
After printing the details of a specific message, we always print out
the message length and response length with nice "mlen" and "rlen"
labels.
For Gen5+ URB writes, we were dumping mlen and rlen a second time:
urb 0 urb_write interleave used complete mlen 5, rlen 0 mlen 5 rlen 0
Also, for Gen6 data port messages, we were including mlen and rlen in
the tuple of undecipherable integers.
Both of these are completely redundant. So, remove them.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
When reading the data port code, it was not clear to me what these
values meant, nor where I could find them in the documentation.
Especially since the latest BSpec and older PRMs document them in
radically different places...neither of which is near the descriptions
of individual messages.
Cite the documentation, and rename them to SFID to signify that these
are Shared Function IDs that one can read about in the GPU overview,
rather than arbitrary bitfields. While we're at it, make them an enum.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
The opcodes and strings were reversed. Quotient means division, and
modulus means remainder.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
This is actually just the message descriptor for Gen6+ dataport access;
it has nothing to do with the render cache. Access to the sampler cache
and constant cache also would use this struct; rename for clarity.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
| |
It's mostly like gen4 message descriptor setup, except that the sizes
of type/control changed to be like gen5. Fixes 21 piglit cases on
gm45, including the regressions in bug #32311 from increased VS
constant buffer usage.
| |
This is apparently required, as the thread will be initiated while it
still has dependencies, and this is what waits for those to be
resolved before writing color.
| |
The jump delta is now in the part of the instruction where the
destination fields used to be, and the src args are ignored (or not,
for the new non-predicated IF that we don't use yet).
| |
It instead sensibly appears in the src0 slot.
|