| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Acked-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
| |
I'm also sending out a piglit test, gl-2.0/vertexattribpointer-size-3,
which exposes this corner case.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
| |
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels_pbo.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even though glBlitFramebuffer cannot be used for SINT <-> UINT blits, we
still need to handle this type of blit here because it can happen as part
of texture uploads / downloads, e.g. uploading a GL_RGBA8I texture from
GL_UNSIGNED_INT data.
Fixes parts of GL45-CTS.gtf32.GL3Tests.packed_pixels.packed_pixels.
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
There is no functionality in swr to clamp either vertex or frag colors.
This could be added in swr_shader, at which point these could be
re-enabled.
Fixes arb_color_buffer_float-render
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
Fixes gl-3.2-basevertex-vertexid
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
Fixes glsl-arb-fragment-coord-conventions.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rendering could still be ongoing (or have yet to start) when the shader
is deleted. There's no refcounting on the shader text, so insert a
pipeline stall unconditionally when this happens.
[Note, we should instead introduce a way to attach work to
fences, so that the freeing can be done in the current fence.]
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The EXT_texture_integer test says that blending and alphatest should
all be disabled. st/mesa takes care of alphatest already.
Fixes the ext_texture_integer-fbo-blending piglit test.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The support in swr requires shaders to output the components as UINTs.
This is not how GL or Gallium work, and since this is not a
required-renderable format, just leave it out.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Fixes the texsubimage piglit and lets the copyteximage one get further.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
In a BGR10X2 or BGR5X1 situation, there's no need to try to quantize the
X channel - the default will have the proper quantization required.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Noticed by inspection.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
This is the format used for the primary surface of a
PIPE_FORMAT_Z32_FLOAT_S8X24_UINT resource.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Piglit regressions (radeonsi or LLVM bugs, they pass on softpipe):
- glsl-1.10/execution/variable-indexing/vs-output-array-vec3-index-wr
- glsl-1.10/execution/variable-indexing/vs-output-array-vec4-index-wr
- glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-col-row-wr
- glsl-110/execution/variable-indexing/vs-temp-array-mat2-index-row-wr
Totals:
SGPRS: 1132185 -> 1168801 (3.23 %)
VGPRS: 907856 -> 906204 (-0.18 %)
Spilled SGPRs: 2011 -> 2425 (20.59 %)
Spilled VGPRs: 368 -> 96 (-73.91 %)
Scratch VGPRs: 1344 -> 1060 (-21.13 %) dwords per thread
Code Size: 35916164 -> 35705372 (-0.59 %) bytes
LDS: 767 -> 767 (0.00 %) blocks
Max Waves: 194010 -> 194921 (0.47 %)
Wait states: 0 -> 0 (0.00 %)
Before:
VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR
alien_isolation 2938 38 40
bioshock-infinite 1769 245 732
dirt-showdown 548 85 72
f1-2015 776 0 320
ue4_lightroom_inter.. 74 0 180
After:
VGPR SPILLING APPS Shaders SpillVGPR ScratchVGPR
alien_isolation 2938 38 40
bioshock-infinite 1769 0 480
dirt-showdown 548 58 40
f1-2015 776 0 320
ue4_lightroom_inter.. 74 0 180
Bioshock and DiRT benefit.
If I set IF_THRESHOLD=4, tesseract starts spilling VGPRs
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Tested-by: Dieter Nützel <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Move to pass by value since most events are very small in size.
We can look at pass by reference but will need to create multiple
versions to handle temp objects.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Added events for tracking early/late Depth and stencil events,
TE patch info, GS prim info, and FrontEnd/BackEnd DrawEnd events.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
- Do proper culling of wireframe triangles (including non-culling of
degenerates)
- Fix degenerate culling of CCW front-facing triangles in wireframe and
conservative rast
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Alpha from render target 0 should always be used for alpha test for all
render targets, according to GL and DX9 specs. Previously we were using
alpha from the current render target.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
| |
Don't generate files when no events have been generated outside
the header events.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Buffer events ourselves and then when that's full or we're destroying
the context then write the contents to file. Previously, we're relying
ofstream to buffer for us.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
* All format combinations coded
* Fully emulated on AVX2 and AVX
* Known issue: the MSAA sample locations need to be adjusted for 8x2
Set ENABLE_AVX512_SIMD16 and USD_8x2_TILE_BACKEND to 1 in knobs.h to enable
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
| |
This is Jonas Pfeil's code from the kernel, brought back to Mesa by
anholt.
|
| |
|
|
|
|
|
|
| |
We don't allow the last thread switch to be inside control flow, to be
sure that we hit the last state exactly once. If the last texturing was
in control flow, fall back to single threaded.
|
|
|
|
|
| |
This is a suboptimal implementation, but Jonas Pfeil found that it was
still a massive performance gain.
|
| |
|
|
|
|
| |
This makes the raddr fixups compatible with FS threading.
|
|
|
|
|
|
| |
We have two major requirements: Make sure that only the bottom half of the
physical reg space is used, and make sure that none of our values are live
in an accumulator across a switch.
|
| |
|
|
|
|
|
|
| |
We have had no reason to separate ability to store in an accumulator from
ability to store in B, but with FS threading, we need to be able to force
values to be stored only in the physical regfiles.
|
|
|
|
| |
This is vaguely based off of Jonas Pfeil's thread switch support branch.
|
|
|
|
|
|
|
| |
This will eventually be generated at the QIR level, so that
vc4_qir_schedule.c can arrange the separation of tex_strb from tex_result
correctly. It will also be important so that register allocation set the
register classes appropriately for values that are live across the switch.
|
|
|
|
|
| |
These are both bugs we've run into along the way writing multithreaded FS
support.
|