| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use nir_lower_clip pass for adding the VS/FS instructions to handle
user-clip-planes and CLIPDIST. Wire up support for load_user_clip_plane
intrinsic to fetch ucp[plane] values as driver-params (passed as const's
to the shader).
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vertex shader lowering adds calculation for CLIPDIST, if needed
(ie. user-clip-planes), and the frag shader lowering adds conditional
kills based on CLIPDIST value (which should be treated as a normal
interpolated varying by the driver).
Note that this won't quite do the right thing in the face of MSAA plus
user-clip-planes, since all the samples would be killed or not (rather
than potentially only a portion of them). But it's better than no UCP
support at all for drivers that don't have this in hw.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
For lowering user-clip-planes, we need a way to pass the enabled/used
user-clip-planes in to shader.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Used internally in freedreno/ir3 to calc stream-out position. Seems
like a generic enough way to implement stream-out (using str instrs),
plus it avoids compiler warnings by sneaking in a non-enum value in
switch statements.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
A small step towards un-TGSI'ifying ir3.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes the newly-added arb_texture_buffer_object-bufferstorage
piglit test.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When updating texture buffers, we might end up replacing the whole
buffer. Check that the tic address matches the resource address, and if
not, update the tic and reupload it.
This fixes:
arb_direct_state_access-texture-buffer
arb_texture_buffer_object-data-sync
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instructions with difference in PM field can actually be paired up if
the one without PM doesn't do packing/unpacking and non-NOP
packing/unpacking operations from PM instruction aren't added to the
other without PM.
total instructions in shared programs: 48209 -> 47460 (-1.55%)
instructions in affected programs: 11688 -> 10939 (-6.41%)
Signed-off-by: Boyan Ding <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The idea here is not that it gives register coalescing a little bit of a
helping hand. It doesn't actually fix the coalescing problems, but it
seems to help a good bit.
Shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1746280 -> 1683959 (-3.57%)
instructions in affected programs: 1259166 -> 1196845 (-4.95%)
helped: 11363
HURT: 148
v2 (Jason Ekstrand):
- Run nir_move_vec_src_uses_to_dest after going out of SSA
- New shader-db numbers
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Handle non-SSA sources and destinations
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
| |
The provided indices have the very nice property that if A dominates B then
A->index <= B->index. We should document that somewhere.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Various pieces of code to create compressed textures will first
generate an uncompressed RGBA texture into a temporary buffer,
and then read from that buffer while creating the final compressed
texture in the requested format.
The code reading from the temporary buffer assumes the buffer is
formatted as an array of bytes in RGBA order. However, the buffer
is filled using a _mesa_texstore call with MESA_FORMAT_R8G8B8A8_UNORM
format -- this is defined as an array of *integers* holding the
RGBA values in packed format (least-significant to most-significant).
This means incorrect bytes are accessed on big-endian systems.
This patch fixes this by using the MESA_FORMAT_A8B8G8R8_UNORM format
instead on big-endian systems when filling the buffer. This fixes
about 100 piglit test case failures on s390x for me.
Signed-off-by: Ulrich Weigand <[email protected]>
Tested-by: Oded Gabbay <[email protected]>
Cc: "10.6" "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
XA has been using L8_UNORM for a8 and yuv component surfaces.
This commit instead makes XA prefer R8_UNORM since it's assumed to have a
higher availability.
Also neither of these formats are suitable as destination formats using
destination alpha blending, so reject those operations.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From OpenGL 4.5 Core spec (7.13):
"If pipeline is a name that has been generated (without subsequent
deletion) by GenProgramPipelines, but refers to a program pipeline
object that has not been previously bound, the GL first creates a
new state vector in the same manner as when BindProgramPipeline
creates a new program pipeline object."
I interpret this as "If GetProgramPipelineiv gets called without a
bound (but valid) pipeline object, the state should reflect initial
state of a new pipeline object." This is also expected behaviour by
ES31-CTS.sepshaderobjs.PipelineApi conformance test.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From OpenGL ES 3.1 spec (7.12):
"Most properties set within program objects are specified not to
take effect until the next call to LinkProgram or ProgramBinary.
Some properties further require a successful call to either of
these commands before taking effect. GetProgramiv returns the
properties currently in effect for program, which may differ from
the properties set within program since the most recent call to
LinkProgram or ProgramBinary, which have not yet taken effect. If
there has been no such call putting changes to pname into effect,
initial values are returned."
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
| |
Specified in OpenGL ES 3.1 spec, Table 23.32: Program Object State.
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Marta Lofstedt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As a bonus we get indirect support for arrays of arrays for free.
V5: couple of small clean-ups suggested by Jason.
V4: fix struct member location caclulation, use nir_ssa_def rather than
nir_src for the indirect as suggested by Jason
V3: Use nir_instr_rewrite_src() with empty src rather then clearing
the use_link list directly for the old indirects as suggested by Jason
V2: Fixed validation error in debug build
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
V2: update comments
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will allow us to access the uniform later on without resorting to
building a name string and looking it up in UniformHash.
V3: remove line wrap change from this patch
V2: store slot number for all non-UBO uniforms to make code more
consitent, renamed explicit_binding to explicit_location and added
comment about what it does. Store the location at every shader stage.
Updated data.location comments in ir/nir.h.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is required so that the next patch can safely assign the slot id
to the var.
The ids are now assigned in the order we want before allocating storage
so there is no need to sort the storage array and move things around.
V2: rename variable to make code easier to follow as suggested by Jason
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows the correct offset to be easily calculated for indirect
indexing when a struct array contains multiple samplers, or any crazy
nesting.
The indices for the folling struct will now look like this:
Sampler index: 0 Name: s[0].tex
Sampler index: 1 Name: s[1].tex
Sampler index: 2 Name: s[0].si.tex
Sampler index: 3 Name: s[1].si.tex
Sampler index: 4 Name: s[0].si.tex2
Sampler index: 5 Name: s[1].si.tex2
Before this change it looked like this:
Sampler index: 0 Name: s[0].tex
Sampler index: 3 Name: s[1].tex
Sampler index: 1 Name: s[0].si.tex
Sampler index: 4 Name: s[1].si.tex
Sampler index: 2 Name: s[0].si.tex2
Sampler index: 5 Name: s[1].si.tex2
struct S_inner {
sampler2D tex;
sampler2D tex2;
};
struct S {
sampler2D tex;
S_inner si;
};
uniform S s[2];
V3: Update comments with suggestions from Jason
V2: rename struct array counter to have better name
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 48961fa3ba37999a6f8fd812458b735e39604a95.
glamor/Xwayland use this, the spec saying something when it
was written, and the fact that the comment says Mesa relies on it
hasn't changed.
I also don't have a copy of this patch in my mail archive, which
seems wierd, did it get posted to mesa-dev?
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Use the helper from the newly-updated generated header file.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Even though luminance formats don't have alpha, we still want the alpha
output to go to the blender. This fixes the luminance blending tests.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
(originally part of previous patch, split out to separate patch by Rob)
v2: squash in some fixes from Eric
v3: Another fix from Eric for point coords.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This avoids exceeding the size of the .index bitfield since it got
truncated, and should make our NIR look more like the NIR that the rest of
the NIR developers are working on.
v2: split out vc4 updates, first patch uses varying_slot_to_tgsi_semantic()
helper, and second patch does the actual conversion.
v3: add frag_result_to_tgsi_semantic() helper and don't try to map
frag_results to semantic name/index as if they were varying_slot's
v4: use VERT_ATTRIB_ for VS inputs
v5: Fix vc4 build.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
This is what the hardware supports, there never was any sort of 64K
limit.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes failures with the newly-submitted max-size texture buffer
piglit test for GPUs exposing >= 128M max texels.
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
| |
Move this code out of bufferobj.c since it's not strongly connected to
buffer objects.
Acked-by: Matt Turner <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
| |
v2: split out moving of FILE *fp into state structure into it's own
(more complete patch) to reduce the noise in this one
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename print_var_state to print_state, and stuff FILE ptr into the state
object. This avoids passing around an extra parameter everywhere.
v2: even more extensive conversion.. use state *everywhere* instead of
FILE ptr, and convert nir_print_instr() to use state as well
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Similar to fee0686c21c631d96d6042741267a3c218c23ffc, but in this case to
ensure that drm_gralloc and libGLES_mesa are sharing a single screen.
Bumps libdrm_freedreno version dependency, as it requires the new
fd_device_fd() API.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When preparing the barrier payload, the instructions should operate in
simd8 mode since we only use 1 payload register.
fs_inst::regs_read is also updated to indicate that it only reads one
register for SHADER_OPCODE_BARRIER.
These issues were flagged by:
commit cadd7dd384b33a779d46bd664f456bed4a21a5b7
Author: Jason Ekstrand <[email protected]>
Date: Thu Jul 2 15:41:02 2015 -0700
i965/fs: Add a very basic validation pass
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
C++ gets cranky if we take references of temporaries. This isn't a problem
yet in master because nir_builder is never used from C++. However, it will
be in the future so we should fix it now.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Both use the same layout for the buffer containing border-color values,
so rather than duplicating the logic in a4xx, split it out into a
helper.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have a replicating fdot instruction, we can actually coalesce
into the destinations of vec4 instructions. We couldn't really do this
before because, if the destination had to end up in .z, we couldn't
reswizzle the instruction. With a replicated destination, the result ends
up in all channels so we can just set the writemask and we're done.
Shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1747753 -> 1746280 (-0.08%)
instructions in affected programs: 143274 -> 141801 (-1.03%)
helped: 667
HURT: 0
It turns out that dot-products matter...
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|