| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
V2: remove code duplication and one unnessecary variable, minor whitespace fix
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Tested-by: Benedikt Schemmer <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Note: this causes spurious regressions in some current piglit tests,
because the tests incorrectly assume that there is no denorm support for
doubles. I'm going to send out a fix for those tests as well.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
| |
The LLVM intrinsic has existed for a long time. The current name was
established in LLVM 3.9.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The status quo is quite the mess:
1. tgsi_exec will do a per-channel computation, and store the dst[0]
result (significand) correctly for each channel. The dst[1] result
(exponent) will be written to the first bit set in the writemask.
So per-component calculation only works partially.
2. r600 will only do a single computation. It will replicate the
exponent but not the significand.
3. The docs pretend that there's per-component calculation, but even
get dst[0] and dst[1] confused.
4. Luckily, st_glsl_to_tgsi only ever emits single-component instructions,
and kind-of assumes that everything is replicated, generating this for
the dvec4 case:
DFRACEXP TEMP[0].xy, TEMP[1].x, CONST[0][0].xyxy
DFRACEXP TEMP[0].zw, TEMP[1].y, CONST[0][0].zwzw
DFRACEXP TEMP[2].xy, TEMP[1].z, CONST[0][1].xyxy
DFRACEXP TEMP[2].zw, TEMP[1].w, CONST[0][1].zwzw
Settle on the simplest behavior, which is single-component calculation
with replication, document it, and adjust tgsi_exec and r600.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a warning caused by the fork (note the change in the function
signature):
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c: In function ‘r600_init_common_state_functions’:
../../../../../mesa-src/src/gallium/drivers/r600/r600_state_common.c:2974:36: warning: assignment from incompatible pointer type [-Wincompatible-pointer-types]
rctx->b.set_occlusion_query_state = r600_set_occlusion_query_state;
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes the extremely unlikely case that an application uses
0x80000000 or 0x3f800000 as border color for an integer texture and
helps in the also, but perhaps slightly less, unlikely case that 1 is
used as a border color.
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware does this automatically for unorm formats, but we need to
do it manually for unorm depth formats that have been upgraded to
Z32_FLOAT.
Fixes dEQP-GLES31.functional.texture.border_clamp.range_clamp.nearest_unorm_depth
and others.
Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The hardware usually does this automatically. However, we upgrade
depth to Z32_FLOAT to enable TC-compatible HTILE, which means the
hardware no longer clamps the comparison value for us.
The only way to tell in the shader whether a clamp is required
seems to be to communicate an additional bit in the descriptor
table. While VI has some unused bits in the resource descriptor,
those bits have unfortunately all been used in gfx9. So we use
an unused bit in the sampler state instead.
Fixes dEQP-GLES3.functional.texture.shadow.2d.linear.equal_depth_component32f
and many other tests in dEQP-GLES3.functional.texture.shadow.*
Fixes: d4d9ec55c589 ("radeonsi: implement TC-compatible HTILE")
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Not that those are super common or useful, but hey! Fun corner cases
of the API...
Fixes dEQP-GLES31.functional.geometry_shading.emit.*
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
| |
It has to happen after descriptor uploads since otherwise we'll print out
the wrong GPU list / incorrectly claim descriptor corruption.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Redundant with the recently added ac_llvm_context::chip_class.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Silences a compiler warning.
Reviewed-by: Roland Scheidegger <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Since our driver support arb_provoking_vertex, we can start
advertising PIPE_CAP_QUADS_FOLLOW_PROVOKING_VERTEX_CONVENTION
Fixes ./clipflat & ./arb-provoking-vertex-render piglit tests
Tested piglit, glretrace on Hw 11 and Hw 13
Reviewed-by: Charmaine Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently we are blitting the whole resource when the RS is used to
de-/tile a resource. This can be very inefficient for large resources
where the transfer is only changing a small part of the resource
(happens a lot with glTexSubImage2D).
Optimize this by only blitting the tile aligned subregion of the
resource, which the transfer is going to change.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-By: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
| |
This is useful if we only need to copy part of a larger resource, mostly
when using the RS engine to de-/tile on pipe transfers.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-By: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
| |
The RS can blit abitrary tile aligned subregions of a resource by
adjusting the buffer offset.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-By: Wladimir J. van der Laan <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I remember thinking "gosh, it would be nice if I could do a kernel-style
'if (!IS_ENABLED(DEBUG))' instead of using an #ifdef, so the code was
compiled on both builds", and then forgot to test a release build anyway.
Fixes: a8fd58eae596 ("vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.")
Reported-by: Derek Foreman <[email protected]>
|
|
|
|
|
|
|
| |
This has proven to be incredibly useful for debugging CMA allocation
failures and driving memory management improvements. However, we don't
want to burden entry and exit from the BO cache with the labeling ioctl's
overhead on release builds.
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
That's unnecessary to double-check that dcc_offset is not 0
because all callers already check that.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
Found by inspection.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
No need to check if screen->pipe != pipe, so we can just assign it. Just do it.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Swr caches fb contents in tiles. Those tiles are stored on a per-context
basis.
When switching contexts that share resources we need to make sure that
the tiles of the old context are being stored and the tiles of the new
context are being invalidated (marked as invalid, hence contents need
to be reloaded).
The context does not get any dirty bits to identify this case. This has
to be, then, coordinated by the resources that are being shared between
the contexts.
Add a "curr_pipe" hook in swr_resource that will allow us to identify a
MakeCurrent of the above form during swr_update_derived(). At that time,
we invalidate the tiles of the new context. The old context, will need to
have already store its tiles by that time, which happens during glFlush().
glFlush() is being called at the beginning of MakeCurrent.
So, the sequence of operations is:
- At the beginning of glXMakeCurrent(), glFlush() will store the tiles
of all bound surfaces of the old context.
- After the store, a fence will guarantee that the all tile store make
it to the surface
- During swr_update_derived(), when we validate the new context, we check
all resources to see what changed, and if so, we invalidate the
current tiles.
Fixes rendering problems with CEI/Ensight.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
cleared_and_retried is always reset to false when jumping to the retry
label, thus leading to an infinite retry loop.
Fix that by moving the cleared_and_retried variable definitions at the
beginning of the function. While we're at it, move the create variable
with the other local variables and explicitly reset its content in the
retry path.
Signed-off-by: Boris Brezillon <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Fixes: 78087676c98aa8884ba92 "vc4: Restructure the simulator mode."
|
|
|
|
|
|
|
|
|
|
|
| |
I was overwriting view->texture with the shadow resource when we need to
do shadow copies (retiling or baselevel rebase), but that tripped up some
critical new sanity checking in state_tracker (making sure that stObj->pt
hasn't changed from view->texture through TexImage-related paths).
To avoid that, move the shadow resource to the vc4_sampler_view struct.
Fixes: f0ecd36ef8e1 ("st/mesa: add an entirely separate codepath for setting up buffer views")
|
|
|
|
| |
Trivial
|
|
|
|
|
|
|
|
|
|
|
| |
This marks the end of code sharing between r600 and radeonsi.
It's getting difficult to work on radeonsi without breaking r600.
A lot of functions had to be renamed to prevent linker conflicts.
There are also minor cleanups.
Acked-by: Dave Airlie <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Supported in JitGatherVertices(); FetchJit::JitLoadVertices() may require
similar changes, will need address this if it is determined that this
path is still in use.
Handle Force Sequential Access in FetchJit::Create.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Move structure, as the size is significantly reduced due to dynamic
allocation of the GS buffers.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Add ForceSequentialAccessEnable and InstanceIDOffsetEnable bools to
FETCH_COMPILE_STATE.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
One piglit regression, which was a false pass:
[email protected]@execution@geometry@dynamic_input_array_index
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
These changes were generated using python's `2to3` tool.
Suggested-by: Ilia Mirkin <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These changes were generated using python's `2to3` tool.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102852
Reported-by: Alex Granni <[email protected]>
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Add missing includes after 6ace0b8 (etnaviv: don't enable RT
full-overwrite when logicop is enabled), otherwise the etnaviv driver
won't build because of missing macros.
Signed-off-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Tested-by: Andres Gomez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
util_pack_color may leave undefined values in the upper half of the packed
integer. As our hardware needs the upper 16 bits to mirror the lower 16bits,
this breaks clears of those formats if the undefined values aren't masked off.
I've only observed the issue with R5G6B5_UNORM surfaces, other 16bpp
formats seem to work fine.
Fixes: d6aa2ba2b2 (etnaviv: replace translate_clear_color with util_pack_color)
Cc: [email protected]
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Wladimir J. van der Laan <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We currently don't use these instructions, and since their API
changed in llvm-5.0 having them in the autogen files broke the mesa
release tarballs which ship with generated autogen files.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=102847
CC: [email protected]
Tested-by: Laurent Carlier <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Logicop is a form of blending with the framebuffer, so we must allow
framebuffer reads when logicop is enabled.
Fixes: piglit gl-1.0-logicop on GC3000, which has logicop support
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
| |
Denotes availability of 64bit int atomic instructions
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|