| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Original blorp writes only one buffer per shader invocation. Once
the launch mechanism is shared with glsl-based programs there will
be need for supporting multiple render targets.
Also drop the always constant color write disable settings.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that the magic number of one in gen7 logic is replaced by
BRW_SF_URB_ENTRY_READ_OFFSET ( == 1 also) for clarity.
On gen6 the change from zero to one (BRW_SF_URB_ENTRY_READ_OFFSET)
has no effect for native blorp as blorp doesn't use any
additional attributes. In fact, regular pipeline setup always
uses BRW_SF_URB_ENTRY_READ_OFFSET even when there are no additional
attributes. Hence the change makes the two (blorp and regular)
consistent.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Ken): s/use_unorm_coords/non_normalized_coords/
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This was still needed when we had support for blorp clears but now
this is fixed to nop.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Use SET_FIELD() for sampler count, and for that reason
added GEN7_PS_SAMPLER_COUNT_MASK.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now the uploading depends only on the input parameters instead
of consulting the current gl-state.
v2: Rebased on top of sampler count clamping
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Matt): Moved * to the name.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Read and write parts of the state stage are also split into
explicit arguments allowing future patches to use constant
program data.
v2 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Note that brw_update_renderbuffer_surfaces() already had a helper
variable which was used in parallel to direct access of the current
draw buffer of the context.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Notice that in gen7_wm_surface_state.c there is also indentation
change in the surrounding code removing tabs.
v2 (Matt): Fixed whitespace: tabs -> spaces
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The value is actually clamped to 0-16 as sample state pointer
can be used to support more than 16 samplers.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The opt_sampler_eot optimisation of fs_visitor effectively assumes
that it is running on a fragment shader because it casts the program
key to a brw_wm_prog_key. However on Skylake fs_visitor can also be
used for vertex shaders. It looks like this usually works anyway
because the optimisation is skipped if key->nr_color_regions != 1.
However for a vertex shader the key is actually a brw_vs_prog_key so
the space for nr_color_regions is probably taken up by
key->base.program_string_id. This can end up making nr_color_regions
be 1 in which case the function will later assert when the last
instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP test
suite assert. Presumably this only happens there because that compiles
a lot of shaders so it would end up with a high value for
program_string_id.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Code generation is not allowed to fail for any reason - in fact,
fs_generator has no mechanism for failing. The visitor is responsible
for that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Copy over from brw_fs_visitor.cpp.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We have to use W/UW type for src1 of the multiply in the MUL/MACH macro,
but in order to read the low 16-bits of each 32-bit integer, we need to
set the appropriate stride.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 9f5e5bd34d8ba48c851b442fb88f742b1ba6a571.
I have no idea what made me believe these didn't apply to Gen > 7. They
do, and without them we generate bad code that causes failures on Gen 8.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
gen8_update_texture_surface().
This moves most of the surface state set-up logic that can be shared
between textures and shader images to a separate function.
|
|
|
|
|
|
|
| |
gen7_update_texture_surface().
This moves most of the surface state set-up logic that can be shared
between textures and shader images to a separate function.
|
|
|
|
| |
miptree.
|
|
|
|
|
|
|
|
|
|
| |
This makes INTEL_DEBUG=perf report shader recompiles due to CMS vs.
UMS/IMS differences and Sandybridge textureGather workarounds.
Previously, we just flagged them as "Something else".
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, sampler messages were decoded as
sampler (1, 0, 2, 2) mlen 6 rlen 8 { align1 1H };
I don't know how much time we've collectly wasted trying to read this
format. I can never recall which number is the surface index, sampler
index, message type, or...whatever that other number is. Figuring out
the message name from the numerical code is also painful.
Now they decode as:
sampler sample_l SIMD16 Surface = 1 Sampler = 0 mlen 6 rlen 8 { align1 1H };
This is easy to read at a glance, and matches the format I used for
render target formats.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Fixes assertion failures in three piglit tests on Gen 6 since commit
0087cf23e.
|
|
|
|
| |
Forgotten in commit 529064f6.
|
|
|
|
|
|
| |
Constant combining won't promote non-floats, so this isn't safe.
Fixes regressions since commit 0087cf23e.
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
The INTEL_DEBUG variable is a uint64_t and if we want a enum value higer
than 32 bits, you need to use ull. We might as well use it for all of them.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The BLT engine on Gen8+ requires linear surfaces to be cacheline
aligned. This restriction was added as part of converting the BLT to
use 48-bit addressing.
The main user, intel_emit_linear_blit, now handles this properly.
But we might also have linear miptrees; just refuse to blit those.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The BLT engine on Gen8+ requires linear surfaces to be cacheline
aligned. This restriction was added as part of converting the BLT to
use 48-bit addressing.
intel_emit_linear_blit needs to handle blits that are not cacheline
aligned, as we use it for arbitrary glBufferSubData calls and subrange
mappings.
Since intel_emit_linear_blit uses 1 byte per pixel, we can use the src/dst
pixel X offset field to represent the unaligned portion, and subtract
that from the address so it's cacheline aligned.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88521
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: [email protected]
|
|
|
|
| |
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This name better matches what it's actually used for. The patch was
generated with the following command:
for file in *; do
sed -i -e s/brw_compile/brw_codegen/g $file
done
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since commit 2881b123, we have used 0/~0 for representing booleans on all
gens. However, we still had a bunch of places in the visitor code where we
were still referring to ctx->Const.UniformBooleanTrue. Since this is
always ~0, we can just remove them.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This also involves moving revision checking to screen creation time and
passing that into brw_get_device_info so that we can get the right
device_info for early versions of SKL. Since the only place we used
revision was to check for SIMD16 3-src instruction support, it's safe to
remove the revision field from brw_context.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
It's basically just a copy of GEN7_FEATURES only with is_haswell set
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|