| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
so that it can be removed and replaced with inline VBO descriptors,
and the pointer can be packed in unused bits of VBO descriptors.
This also removes the pointer from merged TES-GS where it's useless.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
We need to take num_input_sgprs from VS, not the second shader.
No apps suffered from this.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
VBO descriptor code will change a lot one day.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
On cayman this was hitting an assert later, which probably wasn't
see on non-cayman due to having the t slot.
Fixes: 9041730d1 (r600: add support for ARB_shader_clock.)
|
|
|
|
| |
This just adds the these to the debug prints.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Lowering fpow in NIR rather than LLVM can be beneficial.
Polaris results:
Totals from affected shaders:
SGPRS: 124928 -> 124896 (-0.03 %)
VGPRS: 68616 -> 68332 (-0.41 %)
Spilled SGPRs: 394 -> 413 (4.82 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 3668912 -> 3658368 (-0.29 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 18575 -> 18593 (0.10 %)
Wait states: 0 -> 0 (0.00 %)
Fixes: d6b753920677 "ac/nir: remove emission of nir_op_fpow"
Tested-by: Dieter Nützel <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Seems to have not been used since 16be87c90429
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We were ignoring the channel offset.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
ALU_EXTENDED needs 4 DWORDS instead of the usual 2, hence if the last ALU
clause within a IF-JUMP or ELSE branch is ALU_EXTENDED the target jump
offset needs to be adjusted accordingly.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104654
Cc: <[email protected]>
Signed-off-by: Gert Wollny <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
so that it's not done multiple times in branches
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
TCS_OUT_LAYOUT has 13 unused bits. That's enough for a 32-bit address
aligned to 512KB. Hey, it's a 13-bit pointer!
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If 32-bit pointers are supported, both pointers can be moved into s[0:1]
and then ESGS has exactly the same user data SGPR declarations as VS.
If 32-bit pointers are not supported, only one pointer can be moved into
s[0:1]. In that case, the 2nd pointer is moved before TCS constants,
so that the location is the same in HS and GS.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
| |
We will want to use SH registers outside of user data SGPRs, like the GFX9
special SGPRs.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
For a later patch.
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Cube maps are entire miptrees repeated, while 3D textures have each level
have all of its layers next to each other. Fixes tex3d and
tex-miplevel-selection GL2:texture() 3D.
|
|
|
|
|
|
|
|
| |
Like for vc4, the new DISPLAY_TARGET flag ended up causing no formats to
match. Just drop the whole retval == usage thing and return early when we
hit a known unsupported case.
Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
|
|
|
|
|
|
|
|
|
| |
LLVM requirement was bumped to 4.0.0 with earlier commit.
Hence any code tailored for older versions is now unreachable.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-By: George Kyriazis <[email protected]>
Reviewed-by: Andres Gomez <[email protected]>
|
|
|
|
| |
This got us into trouble recently, so just remove it entirely.
|
|
|
|
|
| |
Previously we would assertion fail about having no hardware format. This
is enough to get kmscube -M nv12-2img working.
|
|
|
|
|
|
|
| |
When we set up the shadow resource we were copying the original resource
as the template, including its prsc->next field. When we shadowed the
first YUV plane's resource for linear-to-tiled conversion, we would end up
unbalancing the refcount on the shadow resource's destruction.
|
|
|
|
|
| |
Trying to track down the YUV EGLImage use-after-free, it helps to see what
the mystery objects are that are being refcounted.
|
|
|
|
|
| |
It would be broken if NULL was passed to it anyway, since it wouldn't
participate in screen->bo_handles management.
|
|
|
|
| |
Improves u_debug_refcount output.
|
|
|
|
|
|
| |
This is part of supporting YUV textures -- MMAL will be handing us a
single GEM BO with the planes at offsets within it, and MMAL-decided
stride.
|
|
|
|
|
|
| |
We were failing the retval == usage check at the end.
Fixes: f7604d8af521 ("st/dri: only expose config formats that are display targets")
|
|
|
|
|
|
|
|
|
|
|
| |
TS tiles map to a fixed amount of bytes in the color/depth surface,
so the blocksize of the format needs to be taken into account when
calculating the number of tiles to fill.
The simplest fix is to just use the layer stride, which is the surface
size in bytes.
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
| |
Some of the 16bit formats misrender with missing tiles with the current
"2" state. As all the previously working formats also work with the "3"
state, just always use that one.
Signed-off-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This feature has caused some trouble already. Add a debug switch to
allow users to quickly check if a specific issue is caused by this
feature.
Signed-off-by: Lucas Stach <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
|
|
|
|
|
|
|
| |
Reduces size of struct etna_specs from 100 to 94 bytes.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Lucas Stach <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't want filtering for integer textures, same as depth/stencil.
Fixes: KHR-GL45.direct_state_access.renderbuffers_storage_multisample
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to mark the range as valid, and validate the resource using a
helper to ensure that the buffer status is marked properly.
Fixes some CTS pipeline stats query tests, and
KHR-GL45.direct_state_access.queries_functional
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Two things were off:
- valid range was not updated, which could affect waiting for future
maps
- fencing was done manually instead of using the *_resource_validate
helper, which resulted in a missed dirty buffer flag being set
Fixes: KHR-GL45.direct_state_access.buffers_clear
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Karol Herbst <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes a segfault exposed by a29d63ecf7 which occurs when swr is
used on an unsupported architecture.
v2: re-work to place logic in xmesa_init_display
Signed-off-by: Chuck Atkins <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Cc: [email protected]
Cc: George Kyriazis <[email protected]>
Cc: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes assert in the glsl-1.50-gs-max-output-components piglit test.
Note that the double handling will only work for doubles that
don't take up multiple slots i.e. double and dvec2. However
dual slot double handling is an existing bug which is made no
worse by this patch.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Delaying unrolling and allowing NIR to do it instead has been shown
to result in better code in drivers such as i965. shader-db results
appear to show the same is true for radeonsi.
The other advantage is that using NIR unrolling improves compile
times significantly.
Totals from affected shaders:
SGPRS: 9624 -> 10016 (4.07 %)
VGPRS: 6800 -> 6464 (-4.94 %)
Spilled SGPRs: 0 -> 2 (0.00 %)
Spilled VGPRs: 0 -> 0 (0.00 %)
Private memory VGPRs: 0 -> 0 (0.00 %)
Scratch size: 0 -> 0 (0.00 %) dwords per thread
Code Size: 359176 -> 332264 (-7.49 %) bytes
LDS: 0 -> 0 (0.00 %) blocks
Max Waves: 1355 -> 1432 (5.68 %)
Wait states: 0 -> 0 (0.00 %)
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes the following piglit tests:
tests/spec/arb_tessellation_shader/execution/double-array-vs-tcs-tes.shader_test
tests/spec/arb_tessellation_shader/execution/double-vs-tcs-tes.shader_test
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
We need this to be able to load 64bit varyings.
Reviewed-by: Marek Olšák <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
| |
Enable UVD encode for HEVC main profile
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
| |
Add UVD hevc encode pipe video codec creation entry
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
| |
Implement UVD hevc encode functions
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
| |
Implement required IBs for UVD HEVC encode.
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|
|
|
|
|
|
|
| |
Add hevc encode hardware interface for UVD
Signed-off-by: James Zhu <[email protected]>
Reviewed-by: Boyuan Zhang <[email protected]>
|