| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
Noticed by James Legg @ Feral.
Cc: 17.2 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
From 25-26 min fps to 31, used the game in conjuction with a mod (full
invasion 2) beaumaris castle map and 200 bots.
|
|
|
|
|
|
|
|
|
|
|
|
| |
We'll fail to flag an error if the context flags appear after the
no-error attribute in the context attribute list.
Delay the check to after attribute parsing to fix this.
Fixes: 4909519a665 ("egl: Add EGL_KHR_create_context_no_error support")
Cc: [email protected]
[Emil Velikov: add fixes/stable tags, commit message polish]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
We have a few UUIDs, so lets be more specific.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The number of supported waves per thread group has been reduced to 16
with gfx9. Trying to use 32 waves causes hangs, and barriers might
not work correctly with > 16 waves.
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The firmware version numbers for SI were wrong. The new numbers are probably
too conservative (we don't have a definitive answer by the firmware team),
but DRAW_INDIRECT_MULTI has been confirmed to work with these versions on
Tahiti (by Gustaw) and on Verde (by myself).
While this is technically adding a feature, it's a feature we thought we had
for a long time. The change is small enough and we're early enough in the 17.2
release cycle that it should still go in.
Reported-by: Gustaw Smolarczyk <[email protected]>
Cc: 17.2 <[email protected]>
Acked-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The EU limit of 128 GRFs should allow 32 vertex elements of 4 GRFs.
However, the maximum allowed value of "Vertex URB Entry Read Length"
in SIMD8 is 15. And 15 * 8 = 120 gives us a limit of 30 vertex elements.
Because we also need to reserve a vertex buffer to upload
VertexIndex/InstanceIndex and another to upload DrawID when needed,
we can only expose 28.
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Cc: "17.2" <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Here the AUX_USAGE_* mode indicates that we have HiZ, so we will have
a HiZ buffer. But Coverity doesn't know that, so it thinks it might
be NULL because we checked hiz_buf != NULL earlier.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Caught by Coverity (CID 1415680).
Cc: "17.2" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Increase the value, not the pointer to the stack variable.
Caught by Coverity (CID 1415574). Not shipped in a real release.
Cc: "17.2" <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
No changes, just re-indent.
Signed-off-by: Andres Rodriguez <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
This allows us to drop the duplicate gl_uniform_block_packing enum.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
NewBufferObj() is called when the shared state is allocated so we
wouldn't get this far if it was NULL.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This avoids useless error checking.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
It should have been created by this point.
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
| |
This keeps the flags out of v3d_decode.c's output. In the generated code,
only the unpack functions see any change (where they now get the
restricted start value), and vc4 doesn't use the unpack functions yet.
|
|
|
|
|
|
|
|
|
|
| |
I was writing the XML such that the address field overlapped various flags
in the alignment bits, which caused pain when trying to unpack for decode.
Instead, keep the XML matching the docs (address fields don't overlap),
and just infer the appropriate shift value during decode.
During pack, the address is just applied to the appropriate bits
already, ignoring the sub-byte start/end fields.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We simply pick r4 if available (anything else would force a MOV), then
round-robin through accumulators (avoids physical regfile RAW delay
slots), then round-robin through the physical regfile.
The effect on instruction count is pretty impressive:
total instructions in shared programs: 76563 -> 74526 (-2.66%)
instructions in affected programs: 66463 -> 64426 (-3.06%)
and we could probably do better with a little heuristic of "if we're going
to choose a physical reg, and other operands of instructions using this as
a src have the same physical regfile, then use the other regfile".
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
VC4 has had a tension, similar to pre-Sandybridge Intel, where we want to
use low-numbered registers (more parallelism on Intel, fewer delay slots
on vc4), but in order to give instruction scheduling the most freedom to
avoid delays we want to round-robin between registers of the same cost.
Our two heuristics so far have chosen one end or the other of that
tradeoff.
The callback, instead, hands the driver the set of registers that are
available, and the driver gets to make its own choice. This will be used
in vc4 to round-robin between registers of the same cost, and might be
used in the future for improving bank selection.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
All the paths looping over adjacency had guards against considering
themselves (the non-obvious one was ra_any_neighbors_conflict(), which has
in_stack set).
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
I was going to indent this code another level, and decided it would be
easier to read as a helper.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Without this, a BlitFramebuffer would mark the whole framebuffer as being
changed (so we emit loads/stores of all of it) rather than just the
modified subset.
|
|
|
|
|
|
|
| |
I don't know how I managed to leave this here for so long. Found when
working on a 1:1 overlapping blit extension for X11.
Cc: [email protected]
|
|
|
|
|
|
| |
This gets us automatic CL decoding to a floating-point value, and drops a
magic number from the emit code. 250x250 shader runner tests now say they
have a center of 125.0 instead of 2000.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The VC4_DEBUG_CL output goes from:
0x00000010 0x00000010: 0x06 VC4_PACKET_START_TILE_BINNING
0x00000011 0x00000011: 0x38 VC4_PACKET_PRIMITIVE_LIST_FORMAT
0x00000012 0x00000012: 0x12
0x00000013 0x00000013: 0x66 VC4_PACKET_CLIP_WINDOW
0x00000014 0x00000014: 0x00
0x00000015 0x00000015: 0x00
0x00000016 0x00000016: 0x00
0x00000017 0x00000017: 0x00
0x00000018 0x00000018: 0xfa
0x00000019 0x00000019: 0x00
0x0000001a 0x0000001a: 0xfa
0x0000001b 0x0000001b: 0x00
to:
0x00000010 0x00000010: 0x06 Start Tile Binning
0x00000011 0x00000011: 0x38 Primitive List Format
Data Type: 1 (16-bit index)
Primitive Type: 2 (Triangles List)
0x00000013 0x00000013: 0x66 Clip Window
Clip Window Height in pixels: 250
Clip Window Width in pixels: 250
Clip Window Bottom Pixel Coordinate: 0
Clip Window Left Pixel Coordinate: 0
v2: Squash in robher's fixes for Android
|
|
|
|
|
|
|
| |
This is copied from Intel's XML decoder, modified to handle V3D's
byte-oriented packets.
v2: Squash in robher's fixes for Android
|
|
|
|
| |
This is the same 8-space style used in the vc4 and vc5 gallium drivers.
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The device doesn't directly support this feature so we implement it with
additional shader code which sets the color output(s) w component to
1.0 (or max_int or max_uint).
Fixes 16 Piglit ext_framebuffer_multisample/*alpha-to-one* tests.
v2: only support unorm/float buffers, not int/uint, per Roland.
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When we forcibly write white to FS outputs (for XOR mode emulation)
we were using a temp register. But that's not really necessary.
This also fixes the case of writing white to multiple color buffers.
Subsequent changes will build on this.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Performance delta on Core i5-4570 + Radeon R9 270:
Overlord: +20% in certain locations
Overlord II: +20% in certain locations
Oil Rush: +12% in most locations
War Thunder: +4-9% in benchmarks
Saints Row 2: +10-35% in certain locations
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batchs.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never now what's the fill rate of the
current batch buffer.
If we move the flushing at the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).
Fixes: adafe4b733c02 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <[email protected]>
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: "17.2 17.1" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Always initialise whandle.modifier for DRIImage modifier queries, so if
the driver doesn't support it then we return false for the query.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Fixes: d33fe8b84e45 ("st/dri: enable DRIimage modifier queries")
|
|
|
|
|
|
|
|
| |
In the DRIImage queryImage hook, check if resource_get_handle() failed
and return FALSE if so.
Signed-off-by: Daniel Stone <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Fixes rendercheck errors when using glamor acceleration in X server.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
To workaround an unknown bug.
Signed-off-by: Leo Liu <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
For textures we must not approximate the calculation with `stride *
height`, or `slice_stride * depth`, as that can easily lead to buffer
overflows, particularly for partial transfers.
This should address the issue that Bruce Cherniak found and diagnosed.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
To reduce code duplication.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
To reduce code duplication.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Add uintptr_t cast to fix 'cast to pointer from integer of different size'
warning on 32bit build (build error on Android M).
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Constantine Kharlamov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|