| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Also add code to brw_upload_state to set it when the compute program
changes.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If a send message is emitted with a message length that is less than
required for the message then the remaining parameters default to
zero. We can take advantage of this to save a register when a shader
passes constant zeroes as the final coordinates to the sample
function.
I think this might be useful for GLES applications that are using 2D
textures to simulate 1D textures.
On Skylake it will be useful for shaders that do
texelFetch(tex,something,0) which I think is fairly common. This helps
more on Skylake because in that case the order of the instruction
operands are u,v,lod,r which is good for 2D textures whereas before
they were u,lod,v,r which is only good for 1D textures.
On Haswell:
total instructions in shared programs: 8535730 -> 8533261 (-0.03%)
instructions in affected programs: 236968 -> 234499 (-1.04%)
helped: 1174
On Skylake:
total instructions in shared programs: 10345646 -> 10341237 (-0.04%)
instructions in affected programs: 293011 -> 288602 (-1.50%)
helped: 1218
Reviewed-by: Matt Turner <[email protected]>
v2: Applied suggestions by Kenneth Graunke:
- Only apply on Gen5+
- Apply to all texture opcodes, not just TEX and TXF.
Moved the optimisation into the loop as suggested by Matt Turner.
Fix the array index when there is a header.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Gen9+ there needs to be a header when sampling using SIMD4x2. The
header is set up by copying from the g0 register. Commit 07c571a39f
tried to fix this mov instruction to always use an exec size of 8
because previously it was incorrectly using 4. It did this by casting
the type of the destination register to vec8. This was done because
there is code in brw_set_dest to guess the exec size based on the
width of the dest register. However I misunderstood how this works
because it is actually only used when the width is less than 8. That
means the patch actually changed it to use the default exec size which
on SIMD16 would be 16 and the MOV would clobber over the first
register in the send message. This patch makes it additionally set the
default exec size to 8. This is similar to how the message is set up
in fs_generator::generate_tex.
I think this wasn't picked up by any Piglit tests because we don't
have any fragment shaders that hit this code path so nothing was using
SIMD16. However the patch caused failures in deqp tests.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90153
Reviewed-by: Matt Turner <[email protected]>
Tested-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The stage_abbrev and stage_name fields in backend_visitor provide what
we need without any additional effort. It also means we'll get the
right names for compute shaders, SIMD8 geometry shaders, and both kinds
of tessellation shaders.
This does unfortunately change the capitalization of the stage
abbreviation in the INTEL_DEBUG=optimizer output filenames. It doesn't
seem worth adding code to handle, though.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - move interop.cpp to clover/api
- change intptr_t to void* in the interface
- add a virtual function fence() to simplify some code
v3: - use bool in the interface
v4: - enclose the last two interop functions in try..catch
Reviewed-by: Francisco Jerez <[email protected]>
|
| |
|
| |
|
|
|
|
| |
v2: fix the SYNC_CONDITION query
|
| |
|
| |
|
|
|
|
|
| |
This is an empty extension whose presence means that EGL sync objects can be
used with ES contexts.
|
| |
|
|
|
|
|
|
|
|
| |
Commit a9d08a250 accidentally didn't make use of the new src1 variable.
Use it.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
src1 would contain the predicate, which would get emitted as a register
source by an undiscerning srcId helper. Work around this in the same way
as in emitTEX.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
| |
This was causing src0 to always have the absolute value flag set.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Original blorp writes only one buffer per shader invocation. Once
the launch mechanism is shared with glsl-based programs there will
be need for supporting multiple render targets.
Also drop the always constant color write disable settings.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Note that the magic number of one in gen7 logic is replaced by
BRW_SF_URB_ENTRY_READ_OFFSET ( == 1 also) for clarity.
On gen6 the change from zero to one (BRW_SF_URB_ENTRY_READ_OFFSET)
has no effect for native blorp as blorp doesn't use any
additional attributes. In fact, regular pipeline setup always
uses BRW_SF_URB_ENTRY_READ_OFFSET even when there are no additional
attributes. Hence the change makes the two (blorp and regular)
consistent.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Ken): s/use_unorm_coords/non_normalized_coords/
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This was still needed when we had support for blorp clears but now
this is fixed to nop.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: Use SET_FIELD() for sampler count, and for that reason
added GEN7_PS_SAMPLER_COUNT_MASK.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Now the uploading depends only on the input parameters instead
of consulting the current gl-state.
v2: Rebased on top of sampler count clamping
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Matt): Moved * to the name.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Read and write parts of the state stage are also split into
explicit arguments allowing future patches to use constant
program data.
v2 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Note that brw_update_renderbuffer_surfaces() already had a helper
variable which was used in parallel to direct access of the current
draw buffer of the context.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Notice that in gen7_wm_surface_state.c there is also indentation
change in the surrounding code removing tabs.
v2 (Matt): Fixed whitespace: tabs -> spaces
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The value is actually clamped to 0-16 as sample state pointer
can be used to support more than 16 samplers.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In response to another patch, Emil asked for some clarification how this
stuff works. Rather than just reply to the e-mail, I decided to update
the exlanation in the code.
Signed-off-by: Ian Romanick <[email protected]>
Cc: Emil Velikov <[email protected]>
|
| |
|
| |
|
|
|
|
|
| |
Acked-by: Francisco Jerez <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The opt_sampler_eot optimisation of fs_visitor effectively assumes
that it is running on a fragment shader because it casts the program
key to a brw_wm_prog_key. However on Skylake fs_visitor can also be
used for vertex shaders. It looks like this usually works anyway
because the optimisation is skipped if key->nr_color_regions != 1.
However for a vertex shader the key is actually a brw_vs_prog_key so
the space for nr_color_regions is probably taken up by
key->base.program_string_id. This can end up making nr_color_regions
be 1 in which case the function will later assert when the last
instruction is not FS_OPCODE_FB_WRITE. This was making the DEQP test
suite assert. Presumably this only happens there because that compiles
a lot of shaders so it would end up with a high value for
program_string_id.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Accidentally added since the introduction of the file.
Cc: "10.4 10.5" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Acked-by: Francisco Jerez <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
| |
Acked-by: Francisco Jerez <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
| |
Acked-by: Francisco Jerez <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
| |
Later we can remove the compat code
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously binding an unitialized managed texture
was causing a crash, and a workaround was added to
prevent the crash.
This patch removes this workaround and instead set the initial
state of managed textures as dirty, so that when the texture is bound
for the first time, it is always initialized.
Signed-off-by: Axel Davy <[email protected]>
|