| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Christian Gmeiner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
u_upload_mgr sets it, so that util_range_add can skip the lock.
The time spent in tc_transfer_flush_region decreases from 0.8% to 0.2%
in torcs on radeonsi.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is based on the fix I used for the same problem on V3D. In this
case, it fixes all but the the
dEQP-GLES2.functional.texture.filtering.2d.*_npot cases of
dEQP-GLES2.functional.texture.filtering.2d.*'s failures.
Acked-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Consolidate a few more generic shaders setup regs in fd6_emit_shader.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
| |
This add generic stage state setup for HS/DS/GS to the program state
object.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Let's try to always order the stages in the pipeline order.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
| |
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
| |
We're using vs and fs now, and adding hs, ds and gs soon. It's
confusing enough that we have both DS/TCS and HS/TES. At least for VS
and FS there doesn't have to be multiple names.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
We'll be sharing this logic for new shader stages soon.
Signed-off-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use VPC_SO_OVERRIDE to control whether we do streamout in binning or
draw pass. Normally we want to do streamout in binning pass, except
when there is a single tile and binning passed is skipped.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We could bit doing streamout from binning pass. In this case we want to
use the full VS which doesn't have (potentially streamed out) varyings
stripped out.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When both UTIL_QUEUE_INIT_RESIZE_IF_FULL and
UTIL_QUEUE_INIT_USE_MINIMUM_PRIORITY are set, we can get into a
situation where the queue never executes and grows to a huge size
due to all other threads being busy.
This is the case with the shader cache when attempting to compile a
huge number of shaders up front. If all threads are busy compiling
shaders the cache queues memory use can climb into the many GBs
very fast.
The use of these two flags with the shader cache is intended to
allow shaders compiled at runtime to be compiled as fast as possible.
To avoid huge memory use but still allow the queue to perform
optimally in the run time compilation case, we now add the ability
to track memory consumed by the jobs in the queue and limit it to
a hardcoded 256MB which should be more than enough.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Compute the number of writes up front.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Also, swap vs and fs constructor or so fs comes first.
Signed-off-by: Kristian H. Kristensen <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using xfb and rasterizing, the fragment shader may have fewer
inputs than the vertex shader outputs. We can't rely on gl_Position to
be placed at fs->total_in, but have to instead remember where we add
it in the link map and use that location.
Fixes 100+ tesselation dEQPs under
dEQP-GLES31.functional.tessellation.primitive_discard.*
dEQP-GLES31.functional.tessellation.user_defined_io.*
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
The AnTuTu "garden" benchmark overflows the fixed size constbuffer
stateobject, so lets be more clever and calculate (a potentially
slightly pessimistic) actual size.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers]
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Fixes dEQP-GLES3.functional.texture.specification.texstorage3d.size.3d_2x2x2_2_levels
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If the lowest (largest) mipmap level is too small to tile, then don't
bother pretending.
Note that this requires initializing pipe->screen before
fd_resource_level_linear() is called.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kristian H. Kristensen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The driver can't determine PIPE_QUERY_PRIMITIVES_GENERATED or
PIPE_QUERY_PRIMITIVES_EMITTED once we support geometry or
tessellation, since these stages add primitives at runtime. Use the
WRITE_PRIMITIVE_COUNTS event to write back the primitive counts and
implement a hw query for this.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The GPU writes out streamout offsets as it goes to the FLUSH_BASE
pointer. We use that value with CP_MEM_TO_REG when appending to the
stream so that we don't have to track the offsets with the CPU in the
driver. This ensures that streamout continues to work once we enable
geometry and tessellation shader stages that add geometry.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For render formats, update fd2_pipe2color to only work with HW supported
render formats, and remove the format whitelist is_format_supported. This
patch enables float render formats (which work).
For vertex/texture formats, use a generic function which translates using
the bitsize of the channels. Since we fake support for some vertex formats,
check for these in is_format_supported to avoid enabling them as sampler
formats.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Use fd_gmem_restore_format() to avoid trying to use unsupported Z24S8/Z16
render formats for gmem restore.
Also apply this change to gmem2mem so it doesn't depend on fd2_pipe2color
working with depth formats.
gmem2mem/mem2gmem also doesn't need to use the swap/swizzle, since dst/src
formats are the same.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.polygon_offset.*
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes failures in the following deqp tests:
dEQP-GLES2.functional.fragment_ops.*src_alpha_saturate*
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
The fdph opcode is not supported.
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes the following deqp test:
dEQP-GLES2.functional.shaders.builtin_variable.pointcoord
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Some instructions generated by int/bool float lowering need to be lowered
by opt_algebraic.
Fixes: 43dbd7d6
Signed-off-by: Jonathan Marek <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Set of opcodes doesn't have enough flexibility in certain cases. E.g.
Utgard PP has vector conditional select operation, but condition is always
scalar. Lowering all the vector selects to scalar increases instruction
number, so we need a way to filter only those ops that can't be handled
in hardware.
Reviewed-by: Qiang Yu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This looks like clear copy-and-pasteos, and fixes:
dEQP-GLES2.functional.draw.random.40
(on A307 and A630, both tested in the new CI farm)
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Tiling mode was missing from fd3_emit_gmem_restore_tex().
emit_gmem2mem_surf() used LINEAR exclusiveley.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
* Fix 2D/2DArray/3D tiling parameters:
There is a bottom threshold for width and height.
* Renable tiling for Cubemap, after setting the right parameters.
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were clamping the LOD to force non-mipmap filtering, but that means
that the HW doesn't get to select between the min and mag filters.
Setting MIPFILTER_LINEAR_FAR appears to force non-mipmap filtering.
Fixes all failures in dEQP-GLES2.functional.texture.filtering.2d.*
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
If driver-params are required, we really should emit it on every draw
for correctness. And if not required, we should emit a DISABLE so that
un-applied state updates from previous draws don't corrupt the const
state.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Worth ~+20% on gl_driver2
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Which takes ownership of the stateobj. Useful for streaming state-
objs, to avoid an extra ref/unref
Worth ~5% at gl_driver2
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
To avoid emitting unneeded const state.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
Should be no functional change. Next step is to re-arrange various
const state into different stateobjs.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Move more of the code to deal just w/ screen, without requiring ctx.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Hoist them out of code-paths that will eventually be called directly for
various a6xx+ const related stateobjs.
This ends up duplicating one constlen check in ir3_emit_vs_consts(), to
avoid what could otherwise be an unnecessary WFI on older gens.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
framebuffer_barrier() still depends on the ctx, but the rest can move to
screen.
Signed-off-by: Rob Clark <[email protected]>
|