| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
| |
The deferred allocation doesn't really make much sense anymore, since we no
longer allocate swizzled/linear memory in chunks and not per level / slice
neither.
This means we could fail resource creation a bit more (could already fail in
theory anyway) but should not fail maps later (right now, callers can't deal
with neither really).
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Just use a tex_data pointer directly - the description was no longer correct
neither.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Since switching to non-swizzled rendering we only have "normal", aka linear,
offsets.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
it looks since ce1a1372280d737a1b85279995529206586ae480 they are now included
in more places, in particular even for things buildable with msvc, and hence
those break the build.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
| |
|
|
|
|
| |
This is required for fallbacks to work with ARB_draw_indirect.
|
|
|
|
|
|
| |
v2:
Added comments to util_draw_indirect, clarified and fixed map size.
Removed unlikely().
|
|
|
|
|
| |
Intended for use with GL_ARB_draw_indirect's DRAW_INDIRECT_BUFFER
target or for D3D11_RESOURCE_MISC_DRAWINDIRECT_ARGS.
|
|
|
|
|
|
|
|
|
| |
Drop stdbool, due to the X server being a pain and having
struct members called bool, although I've sent a patch to fix
that we should retain stupidity here. Use unsigned char
which is what GLboolean is anyways.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 17c7ead7 exposed a bug in how uniform loading happens in the
presence of discard. It manifested itself in an application as
randomly incorrect pixels on the borders of conditional areas.
This is due to how discards jump to the end of the shader incorrectly
for some channels. The current implementation checks each 2x2
subspan to preserve derivatives. When uniform loading via samplers
was turned on, it uses a full execution mask, as stated in
lower_uniform_pull_constant_loads(), and only populates four channels
of the destination (see generate_uniform_pull_constant_load_gen7()).
It happens incorrectly when the first subspan has been jumped over.
The series that implemented this optimization was done before the
changes to use samplers for uniform loads. Uniform sampler loads
use special execution masks and only populate four channels, so we
can't jump over those or corruption ensues.
This fix only jumps to the end of the shader if all relevant channels
are disabled, i.e. all 8 or 16, depending on dispatch. This
preserves the original GLbenchmark 2.7 speedup noted in commit
beafced2.
It changes the shader assembly accordingly:
before : (-f0.1.any4h) halt(8) 17 2 null { align1 WE_all 1Q };
after(8) : (-f0.1.any8h) halt(8) 17 2 null { align1 WE_all 1Q };
after(16): (-f0.1.any16h) halt(16) 17 2 null { align1 WE_all 1H };
v2: Cleaned up comments and conditional ordering.
v3: Fix typo.
Signed-off-by: Cody Northrop <[email protected]>
Reviewed-by: Mike Stroyan <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79948
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
To aid in debugging.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Setting a couple of bits is the same cost or less as conditionally
setting a couple of bits.
|
|
|
|
|
|
|
| |
We've often created the CFG immediately before, so use it when
available.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
cfg, for instance, is a pointer to a local variable in
calculate_live_intervals, certainly not valid after that function has
returned.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
foreach_list_typed_const was never used as far as I can tell.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
| |
Makes it more clear what we're doing and requires less knowledge of
exec_list.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Acked-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
We want to have the dri common files compiled to define USE_DRICONF.
We need to check both NEED_OPENGL_COMMON and HAVE_DRICOMMON
Signed-off-by: Axel Davy <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
UniformBufferSize is in bytes so we need to divide by 16 to get the
number of constant buffer slots. Also, the ureg_DECL_constant2D()
function takes first..last parameters so we need to subtract one
for the last value.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we were always using the address register and indirect addressing
to index into a UBO constant buffer. With this change we only do that
when necessary.
Using the piglit bin/arb_uniform_buffer_object-rendering test as an
example:
Shader code:
uniform ub_rot {float rotation; };
...
m[1][1] = cos(rotation);
Before:
IMM[1] INT32 {0, 1, 0, 0}
1: UARL ADDR[0].x, IMM[1].xxxx
2: MOV TEMP[0].x, CONST[3][ADDR[0].x].xxxx
3: COS TEMP[1].x, TEMP[0].xxxx
After:
0: COS TEMP[0].x, CONST[3][0].xxxx
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise, if we were creating a const buffer src register for a UBO
the index into the UBO was always zero.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
To implement the unlit_centroid_workaround, previously we emitted
(+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 1Q };
(-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 1Q };
where the flag register contains the channel enable bits from g0.
Since the predicates are complementary, the pair of pln instructions
write to non-overlapping components of the destination, which is the
case that the dependency control hints are designed for.
Typically setting dependency control hints on predicated instructions
isn't safe (if an instruction doesn't execute due to the predicate, it
won't update the scoreboard, leaving it in a bad state) but since we
must have at least one channel executing (i.e., +f0 is true for some
channel) by virtue of the fact that the thread is running, we can put
the +f0 pln instruction last and set the hints:
(-f0) pln(8) g20<1>F g16.4<0,1,0>F g2<8,8,1>F { align1 NoDDClr 1Q };
(+f0) pln(8) g20<1>F g16.4<0,1,0>F g4<8,8,1>F { align1 NoDDChk 1Q };
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Maybe lets us skip some PLN instructions if whole subspans are disabled?
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
And plumb them through. Also make the assert in the generator look like
the vec4 one.
Reviewed-by: Kristian Høgsberg <[email protected]>
|