| Commit message | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
We normally call with stderr which is unbuffered, so this won't affect
that, but it does let me call nir_print_shader(nir, fopen("log", "w+"))
from gdb and actually get the whole shader in my file.
Reviewed-by: Tapani Pälli <[email protected]>
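A minimal sketch of why the flush matters, assuming the change adds an fflush() at the end of the printer. print_and_flush() and file_size() are hypothetical helpers written for illustration; they are not Mesa functions. A stream opened with fopen() on a regular file is normally fully buffered, so without the flush a short dump can sit in the stdio buffer until fclose(), which never happens when the call is made from gdb.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical stand-in for a printer such as nir_print_shader():
 * writes the text and then flushes, so the data reaches the file even
 * if the caller never closes the stream. */
static void print_and_flush(const char *text, FILE *fp)
{
    assert(fp != NULL);
    fputs(text, fp);
    fflush(fp);
}

/* Returns the size in bytes of the file at path, or -1 on error. */
static long file_size(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (!fp)
        return -1;
    if (fseek(fp, 0, SEEK_END) != 0) {
        fclose(fp);
        return -1;
    }
    long size = ftell(fp);
    fclose(fp);
    return size;
}
```

Calling file_size() on the log right after print_and_flush(), while the stream is still open, reports the full length of the text that was written.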
|
|
|
|
|
|
| |
I've been using this with the kmsro series to test v3d on VKMS without my
old KMS hack in the v3d kernel driver. KMSRO still needs some cleanup,
but v3d RO support seems reasonable.
|
|
|
|
|
| |
Fixes: 0d17b685b1ff ("gallium/u_tests: add a compute shader test that clears an image")
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improves performance in Talos by about 15% on kernel 4.19 and earlier.
RotR also improves significantly, and possibly other games, but I did
not benchmark them with the final patch.
On 4.20+ a similar effect comes from
433ca054949a "drm/amdgpu: try allocating VRAM as power of two"
v2: Do not impact the alignment of the physical memory.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
CC: <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since 1285f71d3e landed, the driver needs to provide apps with proper
sample positions for MSAA.
There is currently no way to query these from the hw, so they are taken
from the blob driver.
Fixes: dEQP-GLES31.functional.texture.multisample.samples_#.sample_position
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We set the equiv bit in SP_FS_CTRL_REG0. Somehow the hw doesn't hang
with this mismatched config, but it does run slower. It is faster with
neither bit set or with both bits set, and setting both is the fastest
of the three configurations. Worth a bit over 10% gain in glmark2.
Spotted-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
|
|
|
|
|
|
|
| |
blit_fp uses GENERIC as input, so blit_vp should match for linking.
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Adds all missing texture related logic. For everything to work it also
needs changes to ir2/fd2_program, which are part of the ir2 update patch.
Note: it needs rnndb update
Signed-off-by: Jonathan Marek <[email protected]>
[remove stray patch]
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Note: it needs rnndb update
Signed-off-by: Marek Vasut <[email protected]>
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
200: 256KiB GMEM A200 (imx53)
201: 128KiB GMEM A200 (imx51)
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
As it stands, it overflows to zero.
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
lowers ceil(x) as -floor(-x)
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
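A small sketch of the identity behind this lowering, ceil(x) = -floor(-x). floor_small() is a hypothetical stand-in for a hardware floor operation and only handles values that fit in a long; lowered_ceil() is likewise an illustrative name and not the NIR pass itself.

```c
#include <assert.h>

/* Hypothetical stand-in for a hardware floor instruction.
 * Only valid for values that fit in a long. */
static float floor_small(float x)
{
    assert(x > -1.0e9f && x < 1.0e9f);
    long i = (long)x;                 /* truncates toward zero */
    return (float)(i > x ? i - 1 : i);
}

/* ceil(x) expressed with floor and two negations only. */
static float lowered_ceil(float x)
{
    return -floor_small(-x);
}
```

For example, lowered_ceil(1.25f) evaluates -floor(-1.25) = -(-2) = 2, and lowered_ceil(-1.25f) evaluates -floor(1.25) = -1.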
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On older gens, the CLIP_ADJ bitfields were actually 3.6 fixed point,
which might make more sense. However, this formula comes up with values
pretty close to what the blob does for various viewport sizes (for at
least a5xx and a6xx), and seems to work.
Signed-off-by: Rob Clark <[email protected]>
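For reference, a sketch of what a 3.6 fixed-point encoding means: 3 integer bits and 6 fractional bits, so the stored value is the real value scaled by 64. The function name, the unsigned interpretation, the rounding, and the clamping are all assumptions made for illustration; the message does not say how the hardware field behaves.

```c
#include <assert.h>
#include <stdint.h>

/* Encode x as unsigned 3.6 fixed point (assumed layout: 9 bits total,
 * 6 of them fractional). Rounds to nearest and clamps to the range. */
static uint32_t float_to_ufixed_3_6(float x)
{
    const uint32_t max = (1u << 9) - 1;   /* 511, i.e. 7.984375 */
    uint32_t fixed;

    if (!(x > 0.0f)) {
        fixed = 0;                        /* negatives and NaN clamp to 0 */
    } else {
        float scaled = x * 64.0f + 0.5f;  /* 2^6 = 64 steps per unit */
        fixed = scaled >= (float)max ? max : (uint32_t)scaled;
    }
    assert(fixed <= max);
    return fixed;
}
```

Under these assumptions 1.0 encodes as 64 and 0.5 as 32, and anything at or above 8.0 saturates at 511.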
|
|
|
|
|
|
|
|
|
| |
f6131d4ec7a had the side effect of enabling LRZ w/ 32b depth buffers.
But there are some bugs with this, which aren't fully understood yet,
so for now just skip LRZ w/ z32.
Fixes: f6131d4ec7a freedreno/a6xx: Clear z32 and separate stencil with blitter
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We generate an IB to clear the gmem at flush time and jump to it
before rendering each tile. This lets us get rid of the command stream
patching for gmem offsets.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For RGB surfaces (for example) we don't really care that the colormask
is 0x7 instead of 0xf. This should not trigger clear_with_quad()
slowpath.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
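A sketch of the check described above, assuming the colormask is compared only against the channels the surface format actually stores. colormask_is_full() and the format_channels parameter are hypothetical names, not the state tracker's code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* format_channels has one bit per channel the format stores:
 * 0x7 for an RGB surface, 0xf for RGBA. The mask counts as "full"
 * when every stored channel is enabled, whatever the other bits say. */
static bool colormask_is_full(uint32_t colormask, uint32_t format_channels)
{
    assert(format_channels != 0);
    return (colormask & format_channels) == format_channels;
}
```

With this test a colormask of 0x7 on an RGB surface counts as a full mask, so it would not force the quad-based slow path, while the same mask on an RGBA surface still would.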
|
|
|
|
|
|
|
|
|
|
|
| |
If we can't clear all the buffers with pctx->clear() (say, for example,
because of ColorMask), push the buffers we *can* clear with pctx->clear()
first. Tilers want to see clears coming before draws to enable fast-
paths, and clearing one of the attachments with a quad-draw first
confuses that logic.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
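The ordering described above can be sketched as follows. The types and order_clears() are hypothetical; the sketch only illustrates the idea of splitting the requested buffers into those the fast path can handle and those that need a quad draw, and emitting the fast-path clear first.

```c
#include <assert.h>
#include <stddef.h>

enum clear_path { CLEAR_FAST, CLEAR_QUAD };

struct clear_step {
    enum clear_path path;
    unsigned buffers;      /* bitmask of buffers cleared in this step */
};

/* requested: buffers the app asked to clear.
 * needs_quad: buffers that cannot use the fast path (for example
 * because of a partial colormask).
 * Fills steps[] in submission order and returns the step count. */
static int order_clears(unsigned requested, unsigned needs_quad,
                        struct clear_step steps[2])
{
    assert(steps != NULL);
    unsigned fast = requested & ~needs_quad;
    unsigned quad = requested & needs_quad;
    int n = 0;

    if (fast)
        steps[n++] = (struct clear_step){ CLEAR_FAST, fast };
    if (quad)
        steps[n++] = (struct clear_step){ CLEAR_QUAD, quad };
    return n;
}
```

The fast-path step always comes first, so a tiler sees the clear before any draw on that framebuffer.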
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Move (most of) the ir3 compiler to src/freedreno/ir3 so that it can be
re-used by some future vulkan driver. The parts that are gallium
specific have been refactored out and remain in the gallium driver.
Getting the move done now so that it can happen before further
refactoring to support a6xx specific instructions.
NOTE also removes ir3_cmdline compiler tool from autotools build since
that was easier than fixing it and I normally use meson build. Waiting
patiently for the day that we can remove *everything* from the autotools
build.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Split the parts that are gallium specific into ir3_gallium so the rest
can move to a common location outside of gallium.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
A bit annoying to have to copy into our own struct. But this is
something the compiler really needs to know, at least on earlier
generations where streamout is implemented in shader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Clean up some of the low-hanging-fruit usages of freedreno_util.h
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
So I can drop env2u() helper from freedreno_util.h and get rid of one
small ir3 dependency on gallium/freedreno
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Move them to IR3_SHADER_DEBUG so we can remove ir3's dependency on
fd_mesa_debug.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Only used by ir3, so move it into ir3 to be more self contained.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Just massive search/replace for the most part.
Step towards removing ir3 dependency on disasm.h which is shared by
a2xx. One step closer to being able to move ir3 out of gallium.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
So that we can re-use at least parts of it for vulkan driver, and so
that we can move ir3 to a common location (which uses fd_bo to allocate
storage for shaders)
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Prep work to move drm to a common location.
Slightly hacky, but the softpin debug flag is only temporary.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
as well as os_memory*
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The compiler doesn't know that ny != 0, so x might be uninitialized for
the printf at the end.
Reviewed-by: Elie Tournier <[email protected]>
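A sketch of the pattern behind this kind of warning. The loop body is hypothetical; only the shape matters: a variable assigned inside a loop that may run zero times, then read after the loop. Giving it an explicit initial value makes the ny == 0 case well defined and silences the warning.

```c
#include <assert.h>

/* Returns the last value computed in the loop, or 0 when ny == 0.
 * Without the initializer the compiler cannot prove x is assigned. */
static int last_row_value(int ny)
{
    assert(ny >= 0);
    int x = 0;   /* explicit initial value covers the ny == 0 case */

    for (int y = 0; y < ny; y++)
        x = y * 2;

    return x;
}
```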
|
|
|
|
|
|
|
|
|
|
|
| |
Mirrors AMDVLK. Looks like if we go over the alignment of height
we actually start to change the addressing. Seems like the extra
miplevels actually work with this.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108245
Fixes: f6cc15dccd5 "radv/gfx9: fix block compression texture views. (v2)"
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
| |
L3 allocation table in h/w specification recommends using 4 KB
granularity for programming allocation fields in L3CNTLREG.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use L3 configuration specified in h/w specification.
V2: Drop configs which under-allocate the L3 cache.
Bump up the comment above table.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Everything else uses PACKAGE_VERSION, so let's be consistent, and
VERSION and PACKAGE_VERSION are currently defined to be the same in
meson and android, while VERSION is undefined in autotools and scons.
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Per chapter 3.2 "Instances":
> Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
> an apiVersion of 0 is equivalent to providing an apiVersion of
> VK_MAKE_VERSION(1,0,0).
Reported-by: Niklas Haas <[email protected]>
Fixes: 8c048af5890d43578ca4 "anv: Copy the appliation info into the instance"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
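A sketch of the rule quoted above, written without the Vulkan headers so it stands alone. SKETCH_MAKE_VERSION mirrors the bit layout of VK_MAKE_VERSION (major << 22, minor << 12, patch); struct sketch_app_info and effective_api_version() are hypothetical stand-ins for VkApplicationInfo and the driver's handling, not the actual anv code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same packing as VK_MAKE_VERSION: 10 bits major, 10 minor, 12 patch. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
    ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | \
     ((uint32_t)(patch)))

/* Hypothetical stand-in for the one field of VkApplicationInfo used. */
struct sketch_app_info {
    uint32_t api_version;
};

/* A NULL application info or an apiVersion of 0 both mean 1.0.0. */
static uint32_t effective_api_version(const struct sketch_app_info *info)
{
    uint32_t version = SKETCH_MAKE_VERSION(1, 0, 0);

    if (info != NULL && info->api_version != 0)
        version = info->api_version;

    assert(version != 0);   /* callers never see the "unset" value */
    return version;
}
```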
|
|
|
|
|
|
|
|
|
|
|
|
| |
This enum is also allowed by EXT_tessellation_shader, which is supported
on older i965 HW (as opposed to OES_geometry_shader). This was missed
when narrowing this code-path, leading to dEQP regressions.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108868
Fixes: f09d94fbd11 "mesa/main: fix validation of transform-feedback queries"
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If glGetTexImage or glGetnTexImage is called with a level that doesn't
exist, we get an error message of this form:
Mesa: User error: GL_INVALID_VALUE in glGetTexImage(depth = 0)
This is clearly nonsensical, because these APIs don't even have a
depth-parameter. The reason is that get_texture_image_dims() returns
all-zero dimensions for non-existent texture-images, and we go on to
validate these dimensions as if they were user-input, because
glGetTextureSubImage requires checking.
So let's split this logic in two, so glGetTextureSubImage can have
stricter input-validation. All arguments that are no longer validated
are generated internally by mesa, so there's no use in validating them.
Fixes: 42891dbaa12 "gettextsubimage: verify zoffset and depth are correct"
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Juan A. Suarez <[email protected]>
|