| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
While we're at it, we break it into two nicely named functions.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't go all the way of avoiding the txf_ms if it's fast-cleared,
however it does at least make us only do it once. This should improve
performance of MSAA resolves in the presence of lots of clear color.
Without the patch, enabling fast-clears in the multisampling Sascha demo
drops the framerate by about 10%. With this patch, enabling fast-clears
increases the demo's framerate by 25%.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That name is already taken by one of the helpers in blorp_nir_builder.h
and, while we haven't moved the guts of blorp_blit.c there yet, we'd
like to start using some things from that header.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we split an instruction that reads an uniform value
(vstride 0) we need to respect the vstride on the second
half of the instruction (that is, the second half should
read the same region as the first).
We were doing this already, but we didn't account for
stages that have interleaved input attributes which also
have a vstride of 0 and need the same treatment.
Fixes the following on Haswell:
KHR-GL45.enhanced_layouts.varying_locations
KHR-GL45.enhanced_layouts.varying_array_locations
KHR-GL45.enhanced_layouts.varying_structure_locations
Reviewed-by: Matt Turner <[email protected]>
Acked-by: Andres Gomez <[email protected]>
|
|
|
|
|
|
|
| |
This removes a few hundred warnings on debug builds with asserts off.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the kernel support flagging our BO, let's mark batch &
instruction BOs for capture so then can be included in the error
state.
v2: Only add EXEC_CAPTURE if supported (Kristian)
v3: Fix operator precedence issue (Lionel)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This will allow to set the flags on any anv_bo created/filled from a
state pool or block pool later.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The gen had to be changed from 4 to 6 so that we could test MAD, which
is new on Gen6.
mad_imm_float_neg_mov_sat tests the case fixed by the previous commit.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
MADs don't take immediate sources, but we allow them in the IR since it
simplifies a lot of things. I neglected to consider that case.
Fixes: 4009a9ead490 ("i965/fs: Allow saturate propagation to propagate
negations into MADs.")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103616
Reported-and-Tested-by: Ruslan Kabatsayev <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <[email protected]>
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_disasm_info header is included by certain tools in order to get
shader assembly from binaries so it's a semi-external header. Including
brw_cfg.h also pulls in brw_shader.h so you end up getting quite a bit
of our back-end compiler internals. Instead, make the couple of forward
declarations we need and make the header more stand-alone. This fixes
the meson build.
Reviewed-by: Matt Turner <[email protected]>
Fixes: 4f82b17287194ca7d10816f6cfe4712a3e0a03fc
|
|
|
|
|
|
|
|
| |
Fixes: 4f82b1728719 ("i965: Rewrite disassembly annotation code")
Cc: Matt Turner <[email protected]>
Signed-off-by: Andres Gomez <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
It was the only file named intel_* in the compiler.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old code used an array to store each "instruction group" (the new,
better name than the old overloaded "annotation"), and required a
memmove() to shift elements over in the array when we needed to split a
group so that we could add an error message. This was confusing and
difficult to get right, not the least of which was because the array
has a tail sentinel not included in .ann_count.
Instead use a linked list, a data structure made for efficient
insertion.
Acked-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I'm going to change the call in a later patch and with the difference in
indentation level it wasn't immediately obvious that the calls were
identical.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise, if the image is not bound to the start of the buffer, we're
going to be reading and writing its fast clear state in the wrong spot.
Reviewed-by: Lionel Landwerlin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
| |
Found by inspection
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
Original 965 sets bits 28:27 to 0, while G45 and later set it to 1.
Note that the G45 docs are incorrect in this regard - see the DevCTG+
note in the Ironlake PRMs.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
This isn't necessary and causes trouble for a project I'm working on.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We use the same hardware mechanism for both atomic counters and SSBO
atomics, so there's really no benefit to maintaining separate code to
handle each case. Instead, we can just use Rob's shiny new NIR pass to
convert atomic_uints to SSBOs, and delete piles of code.
The ssbo_start section of the binding table becomes a combined ABO and
SSBO section, with ABOs first, then SSBOs.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We already have this workaround in OpenGL driver.
See Mesa commit 3cf4fe2219.
Signed-off-by: Anuj Phogat <[email protected]>
Cc: Nanley Chery <[email protected]>
Cc: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit e8c9e65185de3e821e1e482e77906d1d51efa3ec.
With the actual bug fixed (by commit 6ac2d1690192), this is not
necessary. I'm doubtful of its correctness in any case.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The MOV instruction can extract bytes to words/double words, and
words/double words to quadwords, but not byte to quadwords.
For unsigned byte to quadword, we can read them as words and AND off the
high byte and extract to quadword in one instruction. For signed bytes,
we need to first sign extend to word and the sign extend that word to a
quadword.
Fixes the following test on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotBitmasks
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103628
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Fixes the following tests on CHV, BXT, and GLK:
KHR-GL46.shader_ballot_tests.ShaderBallotFunctionBallot
dEQP-VK.spirv_assembly.instruction.compute.uconvert.uint32_to_int64
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=103115
|
|
|
|
|
|
|
|
| |
This makes our MOCS settings significantly more flexible.
Cc: "17.3" <[email protected]>
Tested-by: Lyude Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Cc: "17.3" <[email protected]>
Tested-by: Lyude Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Cc: "17.3" <[email protected]>
Tested-by: Lyude Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a bit more annoying than your average shader - we need to look
at MEDIA_INTERFACE_DESCRIPTOR_LOAD in the batch buffer, then hop over
to the dynamic state buffer to read the INTERFACE_DESCRIPTOR_DATA, then
hop over to the instruction buffer to decode the program.
Now that we store all the buffers before decoding, we can actually do
this fairly easily.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
while loops skip the first field of the instruction/structure, which
is not what the code intended. It works out because the field we're
looking for doesn't happen to be first, but we ought to do it right
regardless.
Found while writing the next patch, where Kernel Start Pointer is
the first field of INTERFACE_DESCRIPTOR_DATA.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes aubinator_error_decode's shader dumping work like aubinator.
Instead of printing them after the fact, it prints them right inside the
3DSTATE_VS/HS/DS/GS/PS packet that references them. This saves you the
effort of cross-referencing things and jumping back and forth.
It also reduces a bunch of book-keeping, and eliminates the limitation
that we could only handle 4096 programs. That code was also broken and
failed to print any shaders if there were under 4096 programs.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This lets us complete parsing and storing of each buffer's data before
we begin decoding the batchbuffer. This makes it possible to inspect
the state buffer and program buffer, so we can properly decode any
indirect state or shader programs.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
| |
Ported from intel_error_decode. We don't want to run off the end.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
| |
Dead code.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Based on a similar patch to intel_error_decode by Chris Wilson.
While we're de-duplicating the gtt_offset calculation, we can simplify
it to assume two hex digits are there - the kernel has done this since
v4.6, and we already require error states from v4.10.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
| |
These three are the only we can reasonably decode with genxml.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
| |
Also change count from a pointer into a value. We were supposed to
be resetting it to 0 (and failed to), but that's gone since we dropped
the pre-ascii85 handling.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Error state files used to look like:
render ring --- gtt_offset = 0x0e8f6000
00000000 : 69040000
00000004 : 79090000
...
00007ffc : 00000000
--- ringbuffer = 0x00001000
There were thousands of lines between sections. The file format changed
with Kernel 4.10, and now has a single ascii85-encoded line following
each section heading. This is much easier to parse.
There are a bunch of bugs in our handling of the old style format,
where we'd decode the wrong data, at the wrong time. Fixing all of
these is going to be a giant pain. It's also a lot of extra code
complexity. In order to properly decode indirect state, or compute
shaders, we'll also need to parse data in advance of decoding, which
is going to be a giant pain with this ad-hoc "decode everywhere!"
mentality. So, let's just drop support for the older file format.
This unfortunately requires an error state generated by Kernel 4.10 or
later. That's probably not the end of the world, as we encourage users
to upgrade to the latest kernel when encountering GPU hangs anyway. It
might be a giant pain for people with LTS kernels, though...
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The dashes "---" may occur within an ascii85 block, but only an ascii85
block starts with ':' or '~'.
Ported from Chris Wilson's intel-gpu-tools commit:
bceec7e1d8a160226b783c6344eae8cbf4ece144
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
| |
It's a neat idea, and still useful in some cases, but the intel common
code is used by i965 and anvil only, this is a little clearer.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The previous iteration algorithm would advance the field pointer right
after we advance the group. This meant that you would end up with
skipping the first field of the group. In the common case, where the
only field is a struct (e.g. 3DSTATE_VERTEX_BUFFERS), it would get
skipped entirely.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
They serve no purpose other than to just fill empty space in the packet
so each dword has something. Just disallowing empty groups is a bit
easier on some of the tools. This does not change the generated packing
headers in any way.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
We renamed "Function Enable" to "Enable", which broke our detection
of whether shaders are enabled or not. So, we'd see a bunch of HS/DS
packets with program offsets of 0, and think that was a valid TCS/TES.
Fixes: c032cae9ff77e (genxml: Rename "Function Enable" to "Enable".)
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These flags are set for C sources, but not C++. This causes symbol
visibility leaks from the C++ parts of the Intel compiler.
Fixes: 700bebb958e93f4d ("i965: Move the back-end compiler to src/intel/compiler")
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|