| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we were trusting in the hardware to take the intersection
of the viewport clip with the drawing rectangle. Unfortunately,
3DSTATE_DRAWING_RECTANGLE is fairly expensive because it implicitly
does a full pipeline stall. If we're a bit more careful with our
viewport clipping, we can just emit it once at context creation
time.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
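As a rough illustration of the "more careful viewport clipping" described above, the sketch below intersects a viewport clip rectangle with the framebuffer bounds on the CPU, so nothing relies on the drawing rectangle to do it. The names `clip_rect`, `intersect_with_framebuffer` and `clip_rect_is_empty` are hypothetical and are not taken from the driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical inclusive clip rectangle in pixels; not a driver type. */
struct clip_rect {
    int32_t x_min, y_min, x_max, y_max;
};

/* Intersect a viewport clip with a framebuffer of fb_width x fb_height. */
static struct clip_rect
intersect_with_framebuffer(struct clip_rect vp, int32_t fb_width, int32_t fb_height)
{
    assert(fb_width > 0 && fb_height > 0);

    struct clip_rect out;
    out.x_min = vp.x_min > 0 ? vp.x_min : 0;
    out.y_min = vp.y_min > 0 ? vp.y_min : 0;
    out.x_max = vp.x_max < fb_width - 1 ? vp.x_max : fb_width - 1;
    out.y_max = vp.y_max < fb_height - 1 ? vp.y_max : fb_height - 1;
    return out;
}

/* The intersection is empty when min has moved past max on either axis. */
static bool
clip_rect_is_empty(struct clip_rect r)
{
    return r.x_min > r.x_max || r.y_min > r.y_max;
}
```

A caller would still need to decide what to program when the result is empty; that choice is outside this sketch.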
|
|
|
|
|
|
| |
Looks like a rebase mistake.
Fixes: 89fe5190a256 ("intel/compiler: Lower flrp32 on Gen11+")
|
|
|
|
|
|
|
|
|
|
|
|
| |
The delayed loading code would fail if we had control flow.
This fixes:
tests/spec/arb_shader_image_load_store/execution/image_checkerboard.shader_test
v2: don't use temp_reg before setting temp_reg up.
Tested-by: Gert Wollny <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
trivial fix.
|
|
|
|
|
|
|
| |
With the Align16 tests now disabled, we can run the rest of the tests in
ICL mode (and see them pass!)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Align16 is no more.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Gen11 only differs from SKL+ in that it uses a new datatype index table.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The LRP instruction is no more.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Align16 is no more. We previously generated an align16 ADD instruction
to calculate DDY:
add(16) g25<1>F -g23<4>.xyxyF g23<4>.zwzwF { align16 1H };
Without align16, we now implement it as:
add(4) g25<1>F -g23<0,2,1>F g23.2<0,2,1>F { align1 1N };
add(4) g25.4<1>F -g23.4<0,2,1>F g23.6<0,2,1>F { align1 1N };
add(4) g26<1>F -g24<0,2,1>F g24.2<0,2,1>F { align1 1N };
add(4) g26.4<1>F -g24.4<0,2,1>F g24.6<0,2,1>F { align1 1N };
where only the first two instructions are needed in SIMD8 mode.
Note: an earlier version of the patch implemented this in two
instructions in SIMD16:
add(8) g25<2>F -g23<4,2,0>F g23.2<4,2,0>F { align1 1N };
add(8) g25.1<2>F -g23.1<4,2,0>F g23.3<4,2,0>F { align1 1N };
but I realized that the channel enable bits will not be correct. If we
knew we were under uniform control flow, we could emit only those two
instructions however.
Reviewed-by: Kenneth Graunke <[email protected]>
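To make the align1 register regions above easier to follow, here is a small CPU model of the four-channel ADD with `<0,2,1>` source regions. It is only a sketch: `region_index`, `add4_neg_region` and `ddy_simd8` are hypothetical helper names, and the assumption that each group of four consecutive floats holds one 2x2 subspan (elements 0/1 on one row, 2/3 on the other) is mine, inferred from the `.xyxy`/`.zwzw` swizzles.

```c
#include <assert.h>
#include <stddef.h>

/* Element read by channel `ch` for an align1 region <vstride,width,hstride>
 * that starts at element `start` of the register. */
static size_t
region_index(size_t start, unsigned vstride, unsigned width,
             unsigned hstride, unsigned ch)
{
    assert(width > 0);
    return start + (ch / width) * vstride + (ch % width) * hstride;
}

/* Model of: add(4) dst<1>F  -src.a<0,2,1>F  src.b<0,2,1>F */
static void
add4_neg_region(float *dst, const float *src, size_t a, size_t b)
{
    for (unsigned ch = 0; ch < 4; ch++) {
        float s0 = src[region_index(a, 0, 2, 1, ch)];
        float s1 = src[region_index(b, 0, 2, 1, ch)];
        dst[ch] = -s0 + s1;
    }
}

/* SIMD8 case: one 8-float register, i.e. the first two ADDs from the message. */
static void
ddy_simd8(float dst[8], const float src[8])
{
    add4_neg_region(&dst[0], src, 0, 2); /* g25   = -g23<0,2,1>   + g23.2<0,2,1> */
    add4_neg_region(&dst[4], src, 4, 6); /* g25.4 = -g23.4<0,2,1> + g23.6<0,2,1> */
}
```

With a vertical stride of 0 and a width of 2, channels 2 and 3 re-read the same two elements as channels 0 and 1, which is how each subspan ends up with one difference per column replicated across both rows.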
|
|
|
|
|
|
| |
The brw_reg() constructor just obfuscates things here, in my opinion.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
In a future patch, generate_ddy will want to inspect inst->exec_size.
Change generate_ddx as well for consistency.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Like CHV et al., Gen11 does not support 32x32 -> 32/64-bit integer
multiplies.
Reviewed-by: Kenneth Graunke <[email protected]>
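The message does not say how such multiplies are lowered. One common decomposition, shown here purely as an assumption, builds the low 32 bits of the product from two 32x16-bit products. The function name `mul32_via_16bit_halves` is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Low 32 bits of a * b, using only products with a 16-bit second operand:
 *   a * b = a * b_lo + ((a * b_hi) << 16)   (mod 2^32)
 */
static uint32_t
mul32_via_16bit_halves(uint32_t a, uint32_t b)
{
    uint32_t b_lo = b & 0xffffu;
    uint32_t b_hi = b >> 16;
    assert(b_lo <= 0xffffu && b_hi <= 0xffffu);

    uint32_t lo = a * b_lo;         /* unsigned arithmetic wraps mod 2^32 */
    uint32_t hi = (a * b_hi) << 16; /* bits shifted past bit 31 are dropped */
    return lo + hi;
}
```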
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PLN instruction is no more. Its functionality is now implemented
using two MAD instructions with the new native-float type. Instead of
pln(16) r20.0<1>:F r10.4<0;1,0>:F r4.0<8;8,1>:F
we now have
mad(8) acc0<1>:NF r10.7<0;1,0>:F r4.0<8;8,1>:F r10.4<0;1,0>:F
mad(8) r20.0<1>:F acc0<8;8,1>:NF r5.0<8;8,1>:F r10.5<0;1,0>:F
mad(8) acc0<1>:NF r10.7<0;1,0>:F r6.0<8;8,1>:F r10.4<0;1,0>:F
mad(8) r21.0<1>:F acc0<8;8,1>:NF r7.0<8;8,1>:F r10.5<0;1,0>:F
... and in the case of SIMD8 only the first pair of MAD instructions is
used.
Reviewed-by: Kenneth Graunke <[email protected]>
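The arithmetic of the MAD pair above can be sketched on the CPU as follows, reading `mad dst src0 src1 src2` as `dst = src0 + src1 * src2`. The function name `pln_as_two_mads` is hypothetical, `coef[0]`, `coef[1]` and `coef[3]` stand in for `r10.4`, `r10.5` and `r10.7`, and using `double` for the intermediate is only a stand-in for the extra precision of the :NF accumulator, not a statement about its actual format.

```c
#include <assert.h>

/* One SIMD8 pair of MADs:
 *   acc = coef[3] + u * coef[0]
 *   dst = acc     + v * coef[1]
 */
static void
pln_as_two_mads(float dst[8], const float coef[4],
                const float u[8], const float v[8])
{
    assert(dst && coef && u && v);

    for (int ch = 0; ch < 8; ch++) {
        /* First MAD: result kept at higher precision (assumption: double). */
        double acc = (double)coef[3] + (double)u[ch] * (double)coef[0];
        /* Second MAD: consumes the accumulator and rounds to float. */
        dst[ch] = (float)(acc + (double)v[ch] * (double)coef[1]);
    }
}
```

For SIMD16 the same pair would simply run a second time on the next two payload registers, matching the four-instruction sequence in the message.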
|
|
|
|
|
|
|
| |
If multiple instructions are emitted, special handling of things like
conditional mod and NoDDClr/NoDDChk needs to be performed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This isn't technically broken, but the next patch will make this
function report whether it generated multiple instructions, and that
information will be used to disable the application of conditional mod
by the generic code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This new type exposes the additional precision offered by the
accumulator register and will be used in the next patch to implement the
functionality of the PLN instruction using a pair of MAD instructions.
One weird thing to note: align1 ternary instructions may only have an
accumulator in the dst or src1 normally, but when src0's type is :NF
the accumulator is read.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The hardware register types' encodings have changed on Gen11. Good thing
we have that superfluous looking brw_reg_type abstraction lying around!
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Gen11 does not support DF, Q, UQ types in hardware. As a result, we have
to disable some GL extensions until they can be reimplemented.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
anv_gem_set_context_param is to be used directly instead!
Fixes: 6d8ab53303 "anv: implement VK_EXT_global_priority extension"
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Fixes piglit tests that broke with 8a64593bde
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fix clipper validMask setting. We don't need to run frustum rejected
primitives through the clipper. Perform frustum culling with only
frustum clip codes. Guardband clip codes cannot be used because they
overlap frustum codes.
Reviewed-By: Bruce Cherniak <[email protected]>
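A minimal sketch of frustum rejection with per-vertex clip codes, assuming a clip volume of -w <= x, y, z <= w. The bit names, the `vec4` struct and both function names are hypothetical; the rasterizer's real clip-code layout is not given in the message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical frustum clip-code bits, one per frustum plane. */
enum {
    CLIP_NEG_X = 1u << 0,
    CLIP_POS_X = 1u << 1,
    CLIP_NEG_Y = 1u << 2,
    CLIP_POS_Y = 1u << 3,
    CLIP_NEG_Z = 1u << 4,
    CLIP_POS_Z = 1u << 5
};

struct vec4 { float x, y, z, w; };

/* Set one bit for every frustum plane the vertex lies outside of. */
static uint32_t
frustum_clip_code(struct vec4 v)
{
    uint32_t code = 0;
    if (v.x < -v.w) code |= CLIP_NEG_X;
    if (v.x >  v.w) code |= CLIP_POS_X;
    if (v.y < -v.w) code |= CLIP_NEG_Y;
    if (v.y >  v.w) code |= CLIP_POS_Y;
    if (v.z < -v.w) code |= CLIP_NEG_Z;
    if (v.z >  v.w) code |= CLIP_POS_Z;
    return code;
}

/* A primitive is frustum-rejected when all of its vertices are outside
 * the same plane, i.e. the AND of their clip codes is non-zero. */
static bool
frustum_rejects(const struct vec4 *verts, int count)
{
    assert(verts && count > 0);

    uint32_t common = ~0u;
    for (int i = 0; i < count; i++)
        common &= frustum_clip_code(verts[i]);
    return common != 0;
}
```

Primitives for which `frustum_rejects` returns true can be dropped before the clipper; only frustum bits take part in the test, in line with the note about guardband codes overlapping them.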
|
|
|
|
|
|
|
|
| |
Translate is now part of an overloaded LOAD call. This required changing
the code generator to skip the load functions so that they can be handled
manually and made virtual.
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Generate more compact code from gen_llvm.hpp.
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
- Have the draw type sent to DrawInfoEvent in handlers created in
archrast.cpp. The draw type no longer needs to be sent during the
AR_API_EVENT() call in api.cpp.
- Remove draw type from event definitions in events_private.proto, no
longer needed
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Populate pLastIndex, even for the non-indexed case. A zero pLastIndex
can cause the index offsets inside the fetcher to have nonsensical values
that can be either very large positive or very large negative numbers.
Reviewed-By: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
We were setting view to NULL if the iteration was larger than i.
But in fact if the view is NULL the code did nothing anyway...
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
There's no point, we know the highest non-null one.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
We already stored the highest (potentially) used number.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: add ANV_CONTEXT_REALTIME_PRIORITY (Chris)
use unreachable with unknown priority (Samuel)
v3: add stubs in gem_stubs.c (Emil)
use priority defines from gen_defines.h
v4: cleanup, add anv_gem_set_context_param (Jason)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]> (v2)
Reviewed-by: Chris Wilson <[email protected]> (v2)
Reviewed-by: Emil Velikov <[email protected]> (v3)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Chris Wilson <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We now have hopefully fixed all bugs regarding high addresses on Vega10 and
Raven. Start to use the high range to make room for SVM in the low
range.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to GLSL ES 3.2 spec, see table in 9.2.1 "Linked Shaders"
section, the precision qualifier should match for uniform variables.
This also applies to previous GLSL ES 3.x specs.
This 'if' checks the condition for uniform variables, while for UBOs
it is checked in link_interface_blocks.cpp.
Fixes: b50b82b8a553
("glsl/es31: precision qualifier doesn't need to match in shader interface block members")
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- Add the proper values to gen9+ (Jason)
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we had a check for 1D or narrow 2D textures. The narrow 2D
case caused GPU hangs, but the check was correct for 1D
textures.
This fixes a bunch of 1D image piglits for me.
Fixes: 7b8e1c089d (r600/texture: drop lowering 1d/2d images to linear.)
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the GLSL 4.60 spec Section 5.9 (Expressions):
"Dividing by zero does not cause an exception but does result in
an unspecified value."
Fixes: 89285e4d47a6 "nir: add new constant folding infrastructure"
Reviewed-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105271
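Since the spec leaves the result unspecified, a constant folder only has to avoid faulting on the host. The sketch below shows one way to do that for 32-bit signed division. The name `fold_idiv32` is hypothetical, returning 0 for a zero divisor is an arbitrary choice, and the `INT32_MIN / -1` guard is an extra precaution against C undefined behaviour that the message does not mention.

```c
#include <assert.h>
#include <stdint.h>

/* Fold a 32-bit signed division without trapping on the host. */
static int32_t
fold_idiv32(int32_t a, int32_t b)
{
    if (b == 0)
        return 0;         /* result is unspecified; 0 is an arbitrary pick */
    if (a == INT32_MIN && b == -1)
        return INT32_MIN; /* quotient overflows int32_t; avoid C UB */
    return a / b;
}

/* Usage: every operand pair now yields a value instead of a crash. */
static void
fold_idiv32_example(void)
{
    assert(fold_idiv32(7, 2) == 3);
    assert(fold_idiv32(7, 0) == 0);
}
```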
|
|
|
|
|
|
|
|
|
| |
Ideally the st_finalize_texture call would take care of that, but it
doesn't seem to with KHR-GL45.shader_image_size.advanced-nonMS-*. This
assertion makes sure that no such values are passed to the driver.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was segfaulting:
dEQP-VK.memory.pipeline_barrier.host_write_index_buffer.1024
Fixes: 8de6f797070 (ac/radeonsi: add load_base_vertex() to the abi)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes:
dEQP-VK.glsl.440.linkage.varying.component.*
Fixes: 1c57a6da5e3 (ac/shader: scan vertex inputs usage mask)
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This is never used.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
radeonsi, i965 and anv all treat fdd{x,y} opcodes the same as
fdd{x,y}_coarse by default. The SPIR-V spec lets the implementation
decide how it should be handled and radv was previously going
for the higher quality option. Here we change the shared amd
code to match how nir_op_fdd{x,y} is expected to be handled
by the other NIR drivers.
Fixes piglit test:
./bin/arb_shader_texture_lod-texgrad -auto
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
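The difference between the two quality levels can be sketched over a single 2x2 quad. The layout (0 = top-left, 1 = top-right, 2 = bottom-left, 3 = bottom-right), the choice of the top row for the coarse result and the function names are assumptions made for illustration, not the shared AMD code.

```c
#include <assert.h>

/* Coarse: one horizontal difference shared by all four pixels of the quad. */
static void
ddx_coarse(float dst[4], const float q[4])
{
    assert(dst && q);
    float d = q[1] - q[0];
    for (int i = 0; i < 4; i++)
        dst[i] = d;
}

/* Fine: each row of the quad gets its own horizontal difference. */
static void
ddx_fine(float dst[4], const float q[4])
{
    assert(dst && q);
    dst[0] = dst[1] = q[1] - q[0];
    dst[2] = dst[3] = q[3] - q[2];
}
```

The two agree whenever the value varies linearly across the quad and differ only when the rows have different slopes.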
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following piglit tests:
./bin/arb_shader_draw_parameters-basevertex basevertex -auto -fbo
./bin/arb_shader_draw_parameters-basevertex basevertex-baseinstance -auto -fbo
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|