Enable ETC support for BDW+. In Vulkan, an array lookup on
surface_format[] is used to determine HW support for certain
formats. In contrast, Mesa dynamically populates an array
which reports this information.
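For illustration, a static capability table like the one the Vulkan driver
indexes could look like this minimal C sketch; every name in it is
hypothetical rather than the driver's actual definitions:

    #include <stdbool.h>

    /* Hypothetical format enum and per-format capability entry. */
    enum surf_format { FMT_R8G8B8A8_UNORM, FMT_ETC2_RGB8, FMT_COUNT };

    struct format_caps {
       bool sampling;   /* the sampler can read this format */
       int min_gen;     /* first hardware generation with support */
    };

    static const struct format_caps surface_format[FMT_COUNT] = {
       [FMT_R8G8B8A8_UNORM] = { .sampling = true, .min_gen = 4 },
       [FMT_ETC2_RGB8]      = { .sampling = true, .min_gen = 8 }, /* BDW+ */
    };

    /* Constant-time lookup, in contrast to Mesa's dynamically
     * populated support array. */
    static inline bool
    format_supported(enum surf_format f, int gen)
    {
       return surface_format[f].sampling && gen >= surface_format[f].min_gen;
    }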
Signed-off-by: Kenneth Graunke <[email protected]>
The table has this marked as unsupported on all gens, but I don't really
believe that, given how early it is in the table. I've tested it and it
seems to work on Broadwell. The Bspec says that it should be renderable on
SKL+, but alpha blending is questionable.
Side note: we really need to audit the format table again.
Cc: Jordan Justen <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
And mark nir_op_pack_uvec4_to_uint unreachable, since it's only produced
by lowering pack[SU]norm4x8, which the vec4 backend does not need.
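In the vec4 NIR translator's ALU switch, that presumably reads along these
lines (a sketch; the placement and message wording are illustrative):

    switch (instr->op) {
    /* ... */
    case nir_op_pack_uvec4_to_uint:
       /* Only produced by lowering pack[su]norm4x8, which the vec4
        * backend does not need, so it must never reach this backend. */
       unreachable("not reached: nir_op_pack_uvec4_to_uint");
    /* ... */
    }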
The vec4 backend will lower it.
i965/fs was the only consumer, and we're now doing the lowering in NIR.
We'll want to have different lowering options set for scalar/vector
stages.
A future patch will want to use designated initializers, which aren't
available in C++, but this is C.
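For illustration, designated initializers let the per-stage NIR option
tables be written like the sketch below; the specific option fields are
assumptions about the NIR options struct of that era, not a verified list:

    #include "nir.h"   /* nir_shader_compiler_options */

    /* C99 designated initializers: fields are named, order doesn't
     * matter, and everything not mentioned is zero-initialized.
     * C++ (before C++20) has no equivalent, hence the move of this
     * code into a C file. */
    static const nir_shader_compiler_options scalar_nir_options = {
       .lower_pack_snorm_2x16 = true,   /* field names are assumptions */
       .lower_pack_unorm_2x16 = true,
       .native_integers = true,
    };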
Fixes a number of GLES31 CTS failures and hangs on various hardware:

ES31-CTS.texture_gather.plain-gather-depth-2d
ES31-CTS.texture_gather.plain-gather-depth-2darray
ES31-CTS.texture_gather.plain-gather-depth-cube
ES31-CTS.texture_gather.offset-gather-depth-2d
ES31-CTS.texture_gather.offset-gather-depth-2darray
ES31-CTS.layout_binding.sampler2D_layout_binding_texture_ComputeShader
ES31-CTS.layout_binding.sampler2DArray_layout_binding_texture_ComputeShader
ES31-CTS.explicit_uniform_location.uniform-loc-types-samplers
ES31-CTS.compute_shader.resources-texture

Some of them were actually passing by luck on some generations even
though we weren't uploading sampler state tables explicitly for the
compute stage, most likely because they relied on the cached sampler
state left from previous rendering to be close enough.

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92589
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93312
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93325
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93407
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93725
Reported-by: Marta Lofstedt <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
This reuses the NEW_SAMPLER_STATE_TABLE state bit (currently only used
on pre-Gen7 hardware) to signal that the sampler state tables have
changed, in order to make sure that the GPGPU interface descriptor is
updated.
Reviewed-by: Marta Lofstedt <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
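Wiring that up presumably amounts to listing the bit in the compute state
atom's dirty mask, roughly like this sketch (the exact atom contents are
from memory and may not match the tree):

    /* The atom's emit function re-runs whenever one of the listed
     * dirty bits fires, so a sampler table upload now triggers a
     * fresh GPGPU interface descriptor. */
    const struct brw_tracked_state brw_cs_state = {
       .dirty = {
          .mesa = 0,
          .brw  = BRW_NEW_CS_PROG_DATA |
                  BRW_NEW_SAMPLER_STATE_TABLE |
                  BRW_NEW_SURFACES,
       },
       .emit = brw_upload_cs_state,
    };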
Extends commit 6531ccb70 to silence the warning in release builds as
well.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
In the vertex and fragment stages, the hardware is nice to us and leaves
g0.2 zeroed out for us so we can use it for headers. However, in the
compute, geometry, and tessellation stages, the hardware is not so nice.
In particular, for compute shaders on BDW, the hardware places some debug
bits in 23:15. As it happens, bit 15 is interpreted by the sampler as the
alpha channel mask. This means that if you use a texturing instruction
with a header in a compute shader, you may randomly get the alpha channel
disabled. Since channel masks affect the return length of the sampler
message, this can lead the GPU to expect a different mlen than the one you
specified in the shader and this, in turn, hangs your GPU.
Cc: "11.1" <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "11.1" <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
BDW adds the following restriction: "When multiplying DW x DW, the dst
cannot be accumulator."
Cc: "11.1,11.0" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
This shouldn't hurt anything, and I'm about to introduce a pass that
will want it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
This makes it a pass, hiding the parameter structs and block callbacks
so it's simpler to work with.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Shorter than compiler->scalar_stage[MESA_SHADER_VERTEX], which can
help with line-wrapping.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
The RS and hardware binding tables are only supported on the 3D
pipeline and can lead to corruption if left enabled during a GPGPU
workload. Disable the RS when switching to the GPGPU (or media)
pipeline and re-enable it when switching back to the 3D pipeline.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Abdiel Janulgue <[email protected]>
pipeline.
This hardware bug can supposedly lead to a hang on IVB and VLV.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
AFAIK brw_emit_select_pipeline() is only called once during context
init on Gen4-5, at which point the pipeline is likely to be already
idle so it may just happen to work by luck regardless of the MI_FLUSH.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Switching the current pipeline while it's not completely idle, or while
the read and write caches aren't flushed, can lead to corruption. Fixes
misrendering of at least the following Khronos CTS test:
ES31-CTS.shader_image_load_store.basic-allTargets-store-fs
The stall and flushes are no longer required on Gen8+.
v2: Emit PIPE_CONTROL with non-zero post-sync op before the write
cache flush on SNB due to a hardware bug. (Ken)
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93323
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
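The ordering described above reads roughly like the following i965-style
sketch; the exact PIPE_CONTROL flag sets are an assumption, not the
precise combination the driver uses:

    /* Stall and flush the write caches while the old pipeline is
     * still selected... */
    if (brw->gen >= 6) {
       brw_emit_pipe_control_flush(brw,
                                   PIPE_CONTROL_RENDER_TARGET_FLUSH |
                                   PIPE_CONTROL_DEPTH_CACHE_FLUSH |
                                   PIPE_CONTROL_CS_STALL);
    }

    /* ...switch pipelines... */
    BEGIN_BATCH(1);
    OUT_BATCH(_3DSTATE_PIPELINE_SELECT << 16 | pipeline);
    ADVANCE_BATCH();

    /* ...then invalidate the read caches for the new pipeline. */
    if (brw->gen >= 6) {
       brw_emit_pipe_control_flush(brw,
                                   PIPE_CONTROL_TEXTURE_CACHE_INVALIDATE |
                                   PIPE_CONTROL_CONST_CACHE_INVALIDATE |
                                   PIPE_CONTROL_STATE_CACHE_INVALIDATE |
                                   PIPE_CONTROL_INSTRUCTION_INVALIDATE);
    }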
This hardware bug can cause a hang on context restore while the
current pipeline is set to GPGPU (BDWGFX HSD 1909593). In addition to
clearing the valid bit, mark the CC state as dirty to make sure that
the CC indirect state pointer is re-emitted when we switch back to the
3D pipeline.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
This will be used on Gen8+ to make sure that the color calculator
state pointers are re-emitted when switching back to the 3D pipeline
after some GPGPU workload, due to a hardware workaround. There are
other state bits already defined that could be used to achieve the
same effect, but they all cause a ton of unrelated state to be
re-emitted (e.g. BRW_NEW_STATE_BASE_ADDRESS), so just define a new
one; state bits are cheap.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Now that we properly handle write-masking, this should be safe.
For unspills (scratch reads), we can just set WE_all all the time because
we always unspill into a new GRF. For spills, we have two options: if the
instruction has a 32-bit-per-channel destination and "normal" regioning,
then we just do a regular write and it will interleave channels from
different control-flow paths properly. If, on the other hand, the
regioning is non-normal, then we have to unspill, run the instruction, and
spill afterwards. In this second case, we need to do the spill with
WE_all.
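In pseudo-C, the policy amounts to the following; the helper names are
hypothetical, not the register allocator's actual interface:

    /* Spill policy sketch: plain writes interleave correctly across
     * control-flow paths, anything else takes the read-modify-write
     * path with all channels enabled (WE_all). */
    if (dst_is_32bit_per_channel(inst) && has_normal_regioning(inst)) {
       emit_spill(inst, /* force_writemask_all */ false);
    } else {
       emit_unspill(inst, /* force_writemask_all */ true);
       /* ... run the instruction on the unspilled temporary ... */
       emit_spill(inst, /* force_writemask_all */ true);
    }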
We don't want to do this in the long run, but it's needed for passing the
NoContraction tests at the moment. Eventually, we want to plumb this
through NIR properly.
This currently sets the base and size of all push constants to the
entire push constant block. The idea is that we'll use the base and size
to eventually optimize the amount we actually push, but for now we don't
do that.
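Conceptually that is just a (base, size) pair per stage, as in this
hypothetical sketch:

    #include <stdint.h>

    /* Hypothetical bookkeeping for the pushed part of the push
     * constant block; for now it always covers the whole block. */
    struct push_range {
       uint32_t base;   /* offset of the first byte pushed */
       uint32_t size;   /* number of bytes pushed */
    };

    static void
    push_whole_block(struct push_range *r, uint32_t block_size)
    {
       r->base = 0;           /* later: tighten to what's actually read */
       r->size = block_size;
    }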
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations. The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing. Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up. To do this,
we use an algorithm similar to the one in split_virtual_grfs.
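A sketch of that flagging pass, in the spirit of split_virtual_grfs (names
and details are illustrative, not the exact tree code):

    #include <stdbool.h>

    /* Flag which uniform slots are read and which must stay glued to
     * their predecessor. Direct reads call this with len == 1; a
     * MOV_INDIRECT covering N consecutive slots calls it with len == N,
     * chaining the slots so the range is pushed or pulled as a unit. */
    static void
    mark_uniform_range(bool *is_live, bool *contiguous, int base, int len)
    {
       for (int i = 0; i < len; i++) {
          is_live[base + i] = true;
          if (i > 0)
             contiguous[base + i] = true;   /* keep attached to slot i-1 */
       }
    }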
This commit moves us to an instruction-based model rather than a
register-based model for indirects. This is more accurate anyway, as we
have to emit instructions to resolve the reladdr. It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.
One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants. This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.
It's not really doing enough anymore to justify a helper function.
Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode. With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore. The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other (for instance, one
indirect read covering slots 0-7 and another covering slots 4-11). In
order to deal with this, we make the first pass just flag what needs
pulling and defer assigning pull constant locations until later.
We aren't using it anymore.
Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load. We also have to
rework some of the other bits of the backend to handle this new form of
uniform load. The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.
While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant. This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible, so we should handle it.
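The shape of that fallback, as a hedged C sketch; the types and helpers
here are toy stand-ins for the C++ FS backend's real IR, not its API:

    #include <stdbool.h>

    /* Toy stand-ins (illustrative only). */
    struct reg { int file, nr, offset; };

    static bool src_is_const(struct reg src, int *out);    /* hypothetical */
    static void emit_mov(struct reg dst, struct reg src);  /* hypothetical */
    static void emit_mov_indirect(struct reg dst, struct reg base,
                                  struct reg indirect, unsigned range_bytes);

    static void
    emit_uniform_load(struct reg dest, struct reg base,
                      struct reg indirect, unsigned range_bytes)
    {
       int const_offset;
       if (src_is_const(indirect, &const_offset)) {
          /* NIR normally constant-folds this away; handle it anyway
           * by turning the indirect load into a direct one. */
          base.offset += const_offset;
          emit_mov(dest, base);
       } else {
          emit_mov_indirect(dest, base, indirect, range_bytes);
       }
    }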