| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
On machines with many cores, you can run into that issue :
../mesa-9999/src/vulkan/overlay-layer/overlay.cpp:42:10: fatal error: vk_enum_to_str.h: No such file or directory
v2: Move declare_dependency around (Eric)
Signed-off-by: Lionel Landwerlin <[email protected]>
Reported-by: Jan Ziak
Cc: <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
shader-db's report.py will use this to see when we've changed loop
unrolling behavior on a shader and skip including other stats like
instruction count from being considered for that shader, since they won't
be useful as a proxy for real world performance in that case.
Reviewed-by: Rob Clark <[email protected]>
Tested-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
| |
The new location matches other drivers, and has a README about the rules
for updating it.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
__u64 is a ulonglong on x86_64, not uint64_t, so my gcc was complaining
about the wrong type being passed in.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The .editorconfig helps with the tabs, but we've got this
two-tabs-from-previous-indentation line continuation style that requires
whacking the c-file-offsets. This will throw emacs warnings when first
opening a file in the directory, press '!' to shut it up for the future.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
The editorconfig takes precedence over dir-locals in emacs26 with
editorconfig enabled, so the /.editorconfig was affecting these
directories.
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
It's used without being imported
|
|
|
|
|
|
|
|
|
| |
This can be used by both etnaviv and freedreno/a2xx as they are both vec4
architectures with some instructions being scalar-only.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Ofc legacy gl features that are broken don't trigger fails in deqp. I
should remember to test glxgears more often.
Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a6xx, we construct/emit a single VS const state used for both
binning pass and draw pass. So far we were mostly getting lucky that
there were not (obvious) mismatches between the const_state (like
different lowered immediates) between the binning and draw pass
VS ir3_shader_variant.
And I guess this situation will come up more as GS and tess is added
into the equation.
Since really everything about the const state is not specific to the
variant, move this. The main exception is lowered immediates, but these
are the last to appear in the layout, and it doesn't hurt for each new
shader variant to just append any immed's it lowers to the end of the
immediate state.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Next patch moves const_state to ir3_shader, before the compile context
is created. So move the code around in prep to call it earlier.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
They are really part of the constant state, and it will moving things
from ir3_shader_variant to ir3_shader if we combine them.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Combine the offsets of differenet parts of the constant space with (what
was formerly known as) ir3_driver_const_layout. Bunch of churn, but no
functional change.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Move to ir3_compiler so it doesn't depend on the compile context. Prep
work for moving constant state from variant (where we have compile
context) to shader (where we do not).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I tried to be very careful while updating all the various drivers, but I
don't have any of that hardware for testing. :(
i965 is the only platform that sets always_precise = true, and it is
only set true for fragment shaders. Gen4 and Gen5 both set lower_flrp32
only for vertex shaders. For fragment shaders, nir_op_flrp is lowered
during code generation as a(1-c)+bc. On all other platforms 64-bit
nir_op_flrp and on Gen11 32-bit nir_op_flrp are lowered using the old
nir_opt_algebraic method.
No changes on any other Intel platforms.
v2: Add panfrost changes.
Iron Lake and GM45 had similar results. (Iron Lake shown)
total cycles in shared programs: 188647754 -> 188647748 (<.01%)
cycles in affected programs: 5096 -> 5090 (-0.12%)
helped: 3
HURT: 0
helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
helped stats (rel) min: 0.12% max: 0.12% x̄: 0.12% x̃: 0.12%
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Driver which do not support native integers should use a lowering
pass to go from integers to floats.
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to Makefile.sources
In commit 2f0b9d22495 ("freedreno/ir3: lower
load_barycentric_at_offset") a new file was added that needs to
also be added to the Makefile.sources list used by Android and
SCons build system.
Cc: Rob Clark <[email protected]>
Cc: Emil Velikov <[email protected]>
Cc: Amit Pundir <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Alistair Strachan <[email protected]>
Cc: Greg Hartman <[email protected]>
Cc: Tapani Pälli <[email protected]>
Cc: Jason Ekstrand <[email protected]>
Fixes: 2f0b9d22495 ("freedreno/ir3: lower load_barycentric_at_offset")
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: John Stultz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add libfreedreno_drm/ir3 to the build
Cc: Rob Clark <[email protected]>
Cc: Emil Velikov <[email protected]>
Cc: Amit Pundir <[email protected]>
Cc: Sumit Semwal <[email protected]>
Cc: Alistair Strachan <[email protected]>
Cc: Greg Hartman <[email protected]>
Cc: Tapani Pälli <[email protected]>
Cc: Jason Ekstrand <[email protected]>
Fixes: b4476138d5a ("freedreno: move drm to common location")
Fixes: aa0fed10d35 ("freedreno: move ir3 to common location")
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Signed-off-by: Amit Pundir <[email protected]>
[jstultz: Tweaked to add extra ir3 files from master]
Signed-off-by: John Stultz <[email protected]>
|
|
|
|
|
|
| |
Corrects tex state ubwc pitch/size
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-GLES31.functional.ubo.random.all_per_block_buffers.13 and .20
ca3eb5db665cbcc2de5a5d3158e3dc68f86e5822 went from silently truncating
the constant state, which was also the wrong thing to do, to an assert.
Which then showed up in a couple of dEQPs. Actually there is nothing
wrong with larger constant file so just drop the assert.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Lower load_output to txf_ms_fb and add support for the new texture fetch
instruction.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
Needed for sampling from tile buffer (GMEM).
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
And a comment.. since we are mixing units of bytes/dwords/vec4,
hopefully this will avoid some unit confusion.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It isn't quite as simple as not running the pass, since with packed
varyings we get load_ubo for block==0 (ie. the "real" uniforms). So
instead run the pass normally but decline to lower anything in
block > 0
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since we emit UBO regions INDIRECTly (ie. not copied into cmdstream but
emit by EXT_SRC_ADDR) we need to keep them 4*vec4 aligned. Which the
code already mostly did, except for aligning the first UBO region itself
(ie. the one after block==0 which is the "real" uniforms).
Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Otherwise we zero out the state again, but all the UBO loads that we
could lower are already lowered. End result is that we didn't emit the
uniforms for lowered UBO access in any case where multiple shader
variants are used.
Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file
Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Needs to update max_half_reg, or be remapped to full reg and update
max_reg accordingly, depending on generation..
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These were updated in version 1.1.106 of vulkan.h to make more sense
with the extension names. We may as well keep with the times.
See also: 90108deb277d33d19233 "anv: Update to use the new features struct names"
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Chia-I Wu <[email protected]>
|
|
|
|
|
|
| |
Because who are we kidding... it is a sysval.
Signed-off-by: Rob Clark <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
The fd is -1, thus the block of if (fd != -1) close(fd) is dead code.
Cc: Chad Versace <[email protected]>
Reviewed-by: Chia-I Wu <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The compiler support for:
OES_sample_shading
OES_sample_variables
OES_shader_multisample_interpolation
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
The so->inputs[] table is in units of vec4
Fixes: 7ff6705b8d8 freedreno/ir3: convert to "new style" frag inputs
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Since this is what the value actually is. Cleanup the name before
adding more different i,j related values for sample-shading.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
tex instruction can actually return 16b values.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Calculates i,j at specified offset within a pixel. A new load_size_ir3
intrinsic is used in conjunction with fddx/fddy to translate the offset
into primitive space and adjust the i,j from load_barycentric_pixel
accordingly.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
This lowers load_barycentric_at_sample to load_sample_pos_from_id plus
load_barycentric_at_offset.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Pull in updates for sample shading.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
De-duplicate the "normal" and "flags" versions of the macros, and while
at it go ahead and add "flags" versions for all the remaining macros,
since we'll at least need INSTR1F in a following commit.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Couple more opcodes which don't take a sampler id as first arg.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
It takes an argument.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
And add corresponding enums for different sorts of varying
interpolation.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Update UABI header and add FD_PP_PGTABLE and FD_NR_FAULTS params.
Robustness can be supported by a kernel which provides the new ABI if it
also indicates that per-process pagetables are in use.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The dri options are optional. When the dri options are not provided
the WSI will not use adaptive sync.
FWIW I think for xf86-video-amdgpu this still requires an X11 config
option, so only people who opt in can get possible regressions from this.
So then the remaining question is: why do this in the WSI?
It has been suggested in another MR that the application sets this.
However, I disagree with that as I don't think we'll ever get a
reasonable set of applications setting it.
The next questions is whether this can be a layer. It definitely
can be as implemented now. However, I think this generally fits
well with the function of the WSI. Furthemore, for e.g. the DISPLAY
WSI this is much harder to do in a layer.
Of course, most of the WSI could almost be a layer, but I think
this still fits best in the WSI.
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Fixes: fe8c57e859d freedreno/ir3: use nir_src_as_uint in a few places
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Fixes a few warnings.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Bas Nieuwenhuizen <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: remove & operator in a couple of memsets
add some memsets
v3: fixup lima
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> (v2)
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Add even more places
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This required to calculate sizes correctly when we have bindless
samplers/images.
Reviewed-by: Marek Olšák <[email protected]>
|