| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
MESA_FORMAT_RGBA_DXT5 should actually be reserved for GL_RGBA[4]_DXT5_S3TC.
Also, Gallium and other dri drivers (radeon and nouveau) follow this mapping
scheme.
Reviewed-by: Chad Versace <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
As Glenn did for finalize_loop we need to update_cf when we
add a POP at the end of a shader.
I think this fixes one of the earlier problems we've spotted where a
shader runs off the end of memory.
Reviewed-by: Glenn Kennard <[email protected]>
Cc: "10.6" "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The docs specifically call out SEL with .l and .ge as the
implementations of MIN and MAX respectively. Among other things,
SEL with these conditional mods is commutative.
See commit 3b7f683f.
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Certain compressed formats require this setting. The docs don't go into much
detail as to why it's needed exactly.
This patch introduces no piglit regressions on gen9 (bsw is untested). Note that
the SKL "regressions" are fixed tests, and the egl_khr_gl_colorspace tests are
WTF. The patch also fixes nothing I can find.
http://otc-mesa-ci.jf.intel.com/job/Leeroy/127820/
v2:
Reworded commit message (Matt); Added piglit results link.
Restructured condition (Matt)
Moved check out to function (Nanley). I left the setting of the bit in the
surface state open coded because it seems to go better with the existing code.
v3:
Use an inline function only in gen8_emit_texture_surface_state() (Matt).
Cc: Matt Turner <[email protected]>
Cc: Nanley Chery <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
This can be done with a single pass for the instruction base,
and takes renumber_registers out of its spot on the profile.
Acked-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The glsl->tgsi converter does some temporary register reduction.
However, in profiling shader-db this shows up quite high,
so optimise things to reduce the number of loops we do through
all the instructions. This drops merge_registers
from 4-5% on the profile to 1%. I think this can be reduced
further by possibly optimising the renumber pass.
Acked-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Instead of looking this up lots, let's just cache it in the instruction
translation up front. I just noticed this function was high in a profile
of shader-db on radeonsi.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This should avoid C++ build failures when including this header.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
Unlike samplers we can get the correct value for subroutines from
component_slots()
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes better use of the work that the TGSI API has done for
us.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The selector is shared by all shader variants, so the
individual shaders shouldn't change it. Use tgsi_shader_scan()
results to set geometry properties within a
r600_create_shader_state() call and treat said properties in
the selector as read-only within r600_shader_from_tgsi().
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Note that 'geometry shader properties' should be carried in the
selector state over the shader state in any case.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
| |
It confuses my editor.
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
| |
Cc: [email protected]
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The hardware is capable of dealing with GL1-style user clip planes.
No clip vertex, no clip distances. Fixes a number of ucp tests, as well
as neverball.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
| |
According to NVIDIA, local performance counters (MP) are prefixed
with SM, while global performance counters (PCOUNTER) are called PM.
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
| |
Signed-off-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This code was broken by the tess merge, and I totally missed it
until now. I'm not sure this fixes anything but it stops the assert.
Cc: "11.0" <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise this will crash on 32-bit, and it gets rid of
warnings building on 32-bit.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On 32-bit we need to use PRIu64 flags for printfs;
otherwise this segfaults in R600_DEBUG=help.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
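
The 32-bit failure mode described above can be illustrated with a minimal sketch. The helper name format_debug_flag and the output layout are hypothetical and not taken from the r600 code; only the PRIx64/PRIu64 macros from <inttypes.h> are standard C. Passing a 64-bit value to a conversion such as "%lx" is undefined behavior where long is 32 bits, and a later "%s" may then read a garbage pointer, which is one plausible way to get a segfault.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: formats a named 64-bit debug flag into buf.
 * PRIx64 and PRIu64 expand to the correct length modifier for uint64_t
 * on both 32-bit and 64-bit targets, so the argument list always
 * matches the format string. */
static int format_debug_flag(char *buf, size_t size,
                             const char *name, uint64_t flag)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "%s = 0x%016" PRIx64 " (%" PRIu64 ")",
                    name, flag, flag);
}
```

The macros are string literals, so they are concatenated into the format string at compile time and add no runtime cost.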
|
|
|
|
|
|
|
|
|
| |
This greatly improves generated code, especially for the snorm variants,
since it is able to get rid of the lshift/rshift for sext, as well as
replacing each shift + mask with a single op.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
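
As a rough illustration of why a signed bitfield extract helps the snorm variants, here is a minimal C sketch that emulates the operation in software. The function names are hypothetical and this is not the GLSL lowering code itself; it only shows that one extract yields a sign-extended field, which would otherwise take a left shift followed by an arithmetic right shift.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates a signed bitfield extract: returns bits [offset, offset + bits)
 * of value, sign-extended to 32 bits. Hardware with a native instruction
 * does this in a single op. The final cast assumes two's complement. */
static int32_t bitfield_extract_signed(uint32_t value,
                                       unsigned offset, unsigned bits)
{
    assert(bits >= 1 && bits <= 32 && offset + bits <= 32);
    uint32_t field = value >> offset;
    if (bits < 32)
        field &= (UINT32_C(1) << bits) - 1;
    uint32_t sign = UINT32_C(1) << (bits - 1);
    return (int32_t)((field ^ sign) - sign);
}

/* Unpacks one 8-bit snorm component (0..3) to a float in [-1, 1],
 * following the clamp(v / 127.0, -1, 1) rule of unpackSnorm4x8. */
static float unpack_snorm8(uint32_t packed, unsigned component)
{
    assert(component < 4);
    int32_t v = bitfield_extract_signed(packed, component * 8, 8);
    float f = (float)v / 127.0f;
    return f < -1.0f ? -1.0f : f;
}
```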
|
|
|
|
|
|
|
|
|
| |
It is fairly tricky to detect the proper conditions for using bitfield
insert, but easy to just use it up front. This removes a lot of
instructions on nvc0 when invoking the packing builtins.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
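
The idea of emitting bitfield insert up front can be sketched in C as follows. This is an emulation with hypothetical function names, not the actual lowering pass: each component is placed into the packed word with one insert rather than a separate mask, shift, and OR.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates bitfield insert: replaces bits [offset, offset + bits) of base
 * with the low bits of insert. */
static uint32_t bitfield_insert(uint32_t base, uint32_t insert,
                                unsigned offset, unsigned bits)
{
    assert(bits >= 1 && bits <= 32 && offset + bits <= 32);
    uint32_t mask = (bits == 32) ? UINT32_MAX
                                 : ((UINT32_C(1) << bits) - 1);
    mask <<= offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

/* Packs four 8-bit components into one 32-bit word, component 0 in the
 * least significant byte, using one insert per remaining component. */
static uint32_t pack_4x8(const uint8_t c[4])
{
    uint32_t packed = c[0];
    for (unsigned i = 1; i < 4; i++)
        packed = bitfield_insert(packed, c[i], i * 8, 8);
    return packed;
}
```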
|
|
|
|
|
|
|
| |
No shader-db changes on g4x, snb, hsw, or bdw.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
start_to -> dst_start
end_to -> dst_end
start_from -> src_start
end_from -> src_end
var_to -> dst_var
var_from -> src_var
reg_to -> dst_reg
reg_to_offset -> dst_reg_offset
reg_from -> src_reg
Not sure how these made sense to me before.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
No need to walk through instructions in blocks we know don't contain our
registers' live ranges.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I always thought that the is_control_flow() -> return false check was a
bad hack, and some previous attempts to remove it have failed and have
been reverted.
The previous two patches fix some problems that caused register
coalescing to not notice some interference between registers, which the
is_control_flow() check apparently works around.
With that fixed, we can calculate interference more accurately.
total instructions in shared programs: 6261319 -> 6257917 (-0.05%)
instructions in affected programs: 346282 -> 342880 (-0.98%)
helped: 1552
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
equals() returns false for registers with different types, so using it
isn't appropriate to determine whether an instruction is overwriting a register.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Noticed when debugging things that lead to the next patch.
On G45 (and presumably ILK) this helps register coalescing:
total instructions in shared programs: 4077373 -> 4077340 (-0.00%)
instructions in affected programs: 43751 -> 43718 (-0.08%)
helped: 52
HURT: 2
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Zero sized uniforms can exist in the list, but they don't get any space
allocated in prog_data->params or in the param_size array, so the size
should not be set for them. This was previously fixed in:
commit: 781dc7c0e1f41502f18e07c0940af949a78d2792.
However,
commit: 259f7291de2387aa3ac5f856b39b7b934a1d8e7d
removed the fix.
Signed-off-by: Marta Lofstedt <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the sampler object has been deleted in the same context the binding
will have been cleared. If it has been deleted in another context, the
spec does not say what should be returned. None of the other binding point
queries check for deletion in another context.
Also, as names of deleted objects are free for reuse, the current code
didn't even work reliably.
Reviewed-by: Fredrik Höglund <[email protected]>
Signed-off-by: Fredrik Höglund <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This adds index queries (glGet*i_v) for GL_TEXTURE_BINDING_* and
GL_SAMPLER_BINDING, as well as texture queries
(glGetTex{,ture}Parameter*) for GL_TEXTURE_TARGET.
CC: "10.6 11.0" <[email protected]>
Reviewed-by: Fredrik Höglund <[email protected]>
Signed-off-by: Fredrik Höglund <[email protected]>
|
| |
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91785
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Fixes 'make check' on FreeBSD.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shader-db result on Ivy Bridge:
total instructions in shared programs: 145484 -> 145445 (-0.03%)
instructions in affected programs: 225 -> 186 (-17.33%)
helped: 5
HURT: 0
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Thomas Helland <[email protected]>
Signed-off-by: Boyan Ding <[email protected]>
|
|
|
|
|
|
| |
Never used.
Reviewed-by: Mark Janes <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Shaders that contain instruction data after an instruction with EOP could end
up parsing that as an instruction, leading to various crashes and asserts in
SB as it gets very confused if it sees for instance a loop start instruction
jumping off to some random point.
Add a couple of asserts, and print EOP bit if set in old asm printer.
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
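
The parsing rule described above, stop at the first instruction carrying EOP, can be sketched in C. The encoding here is purely hypothetical (a single flag bit in a 32-bit word) and does not reflect the real R600 instruction layout or the SB parser; it only shows why trailing data must never be decoded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical encoding for illustration only: bit 31 marks End Of Program. */
#define HYP_EOP_BIT (UINT32_C(1) << 31)

/* Returns how many words of the buffer are real instructions. Anything
 * after the first word with the EOP bit set is treated as trailing data
 * and is never interpreted. */
static size_t count_instructions(const uint32_t *words, size_t n)
{
    assert(words != NULL || n == 0);
    size_t parsed = 0;
    for (size_t i = 0; i < n; i++) {
        parsed++;
        if (words[i] & HYP_EOP_BIT)
            break; /* do not decode anything past EOP */
    }
    return parsed;
}
```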
|
|
|
|
|
|
|
|
|
| |
e8e443 missed adding the check for undef values in the
unreserve function as well, which led to an assert triggering.
Signed-off-by: Glenn Kennard <[email protected]>
Cc: <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cube maps are special in that they have separate teximages for each
face. We handled that by copying the data to them separately, but in
case zoffset != 0 or depth != 6 we would read off the end of the client
array or modify the wrong images.
zoffset/depth have already been verified by the time the code gets to
this stage, so no need to double-check.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Cc: "10.6 11.0" <[email protected]>
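
A minimal C sketch of the per-face copy, assuming each face lives in its own buffer and the client data is tightly packed. The function name, parameters, and storage layout are hypothetical and are not Mesa's actual texture storage code. The point is that only faces zoffset through zoffset + depth - 1 are written, and the source pointer advances by one face per iteration, so neither the client array nor the wrong images are touched.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NUM_CUBE_FACES 6

/* Copies depth consecutive faces, starting at face zoffset, from a packed
 * client array into separate per-face buffers. The caller is assumed to
 * have validated zoffset and depth already; the assert only documents
 * that precondition. */
static void copy_cube_faces(uint8_t *faces[NUM_CUBE_FACES],
                            const uint8_t *src, size_t face_bytes,
                            unsigned zoffset, unsigned depth)
{
    assert(zoffset + depth <= NUM_CUBE_FACES);
    for (unsigned i = 0; i < depth; i++) {
        memcpy(faces[zoffset + i], src, face_bytes);
        src += face_bytes; /* next face in the client array */
    }
}
```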
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR cursor API is exactly what we want for the builder's insertion
point. This simplifies the API, the implementation, and is actually
more flexible as well.
This required a bit of reworking of TGSI->NIR's if/loop stack handling;
we now store cursors instead of cf_node_lists, for better or worse.
v2: Actually move the cursor in the after_instr case.
v3: Take advantage of nir_instr_insert (suggested by Connor).
v4: vc4 build fixes (thanks to Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]> [v4]
Acked-by: Connor Abbott <[email protected]> [v4]
|