| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We currently use CL_INVOCATION_COUNT for the GL_PRIMITIVES_GENERATED
query, which involves passing all primitives to the clipper. When
rasterizer discard is enabled, we program the clipper in REJECT_ALL
mode, rather than using the SOL stage's "Rendering Disable" feature.
See commit f09b91f78247409f54c975f56cb10d5f350fe64e for an explanation
of why we implement GL_PRIMITIVES_GENERATED this way.
Apparently the SOL stage's "Rendering Disable" feature is a lot faster
than having the clipper reject all primitives. It's safe to use when
no GL_PRIMITIVES_GENERATED query is active, as we don't care about
CL_INVOCATION_COUNT incrementing.
This patch makes us use SO_RENDERING_DISABLE when no query is active,
but continues falling back to the clipper in REJECT_ALL mode when the
queries are enabled. It brings back the perf_debug for the clipper
case (which I removed in commit 1f9445ff57b, thinking it wasn't useful).
Improves performance in Gl32GSCloth by 84.8303% +/- 2.07132% (n = 10)
on my Broadwell GT2 laptop.
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
They're basically the same. Let's avoid the code duplication.
v2: Fix SO_BUFFER_ENABLE stuff to only happen on Gen < 8 (caught
by Jason Ekstrand).
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The scalar backend currently doesn't support variable indexing on
temporary arrays, but it does support it on uniform arrays, and
some stages support it for input arrays. Make sure these are
propagated through before exploding indirects into piles of
if-ladders unnecessarily.
On Broadwell, no instruction count change in shader-db.
total cycles in shared programs: 80675652 -> 80674928 (-0.00%)
cycles in affected programs: 649972 -> 649248 (-0.11%)
helped: 386
HURT: 165
This will help avoid code quality regressions in a future commit.
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Without vertex elements originating directly from vertex fetcher
are not passed to wm-state correctly.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
just as core upload logic does.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
just as core upload logic does.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Fixes regression introduced by af5ca43f2676bff7499f93277f908b681cb821d0
Cc: "12.0 11.2" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95419
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already store these in gl_shader and gl_program here we
remove it from gl_shader_program and just use the values
from gl_shader.
This will allow us to keep the shader cache restore code as
simple as it can be while making it somewhat clearer where these
values originate from.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already store this in gl_shader and gl_program here we
remove it from gl_shader_program and just use the values
from gl_shader.
This will allow us to keep the shader cache restore code as
simple as it can be while making it somewhat clearer where these
values originate from.
V2: remove unnecessary NULL check
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Iago Toral <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Christian Gmeiner <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise, we end up with a bogus value in the third component. On gen6-7
where we always use 2D textures, this can cause problems if the
SurfaceArray bit is set in the SURFACE_STATE.
Acked-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There's no real reason why we shouldn't set this bit. It does affect how
the sampler operates a bit but since you can have a 2D non-array view of a
2D_ARRAY texture that distinction is very weak. Also, this is what ISL
will do and we would like this change to be isolated from using ISL.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The PRM states that the values put in Width, Height, and Depth should be
various bits from the value size - 1. We seem to have done this wrong
more-or-less from the start.
Reviewed-by: Chad Versace <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
| |
This hasn't been used since 1cfb4bc890b8 where we deleted the meta stencil
blit path.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were incrementing length but not actually putting anything
in the Y coordinate. This meant that 1-D TXF operations had a garbage
array index. If the surface is emitted as 1-D non-array, the coordinate
gets discarded and it works fine. If it happens to be bound as an array
surface, it may count as an out-of-bounds array access and you get zero.
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We were adding in the base which is wrong because the values given in the
miptree are relative to zero and not the base layer/level.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The RenderTargetViewExtent field of RENDER_SURFACE_STATE is supposed to be
set to the depth of a 3-D texture when rendering. Unfortunatley, that
field is only 9 bits on Sandy Bridge and prior so we can't actually bind
a 3-D texturing for rendering if it has depth > 512. On Ivy Bridge, this
field was bumpped to 11 bits so we can go all the way up to 2048. On Iron
Lake and prior, we don't support layered rendering and we use OffsetX/Y
hacks to render to particular layers so 2048 is ok there too.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is basically a direct translation of what we do for gen7.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes texture views sort-of work. It doesn't add full texture view
support for gen4-5 but it is enough to fix the GL_ARB_copy_image formats
piglit test on Iron Lake.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83036
Cc: "11.1 11.2 12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our previous code worked for desktop GL, and ES without geometry or
tessellation shaders. But those features require fancier point size
handling. Fortunately, we can use one rule for all APIs.
Fixes a number of dEQP tests with EXT_tessellation_shader enabled:
dEQP-GLES31.functional.tessellation_geometry_interaction.point_size.*
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
We will reuse this for fs key generation for the on disk shader
cache.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
This avoids costly address recomputations, function overhead, and may trigger
large copy optimizations.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Cc: "12.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
v2: add ST_DEBUG flag for disabling (suggested by Ilia)
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
|
|
|
| |
Whenever a draw happens or some other function call might change the result
of future glReadPixels calls, we must invalidate the cache.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Found by inspection.
Cc: 11.2 12.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As far as I can tell, a sequence of glBitmap followed by texture functions
that refer to a texture bound as the framebuffer is well within what should
be allowed.
Found by inspection.
Cc: 11.2 12.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
In the unlikely case that a program uses glBitmap to render to a framebuffer
whose texture is bound in a compute shader.
Found by inspection.
Cc: 11.2 12.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This is more consistent with what we do elsewhere and will allow
us to only cache one of the values in the shader cache.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Cherryview and Broxton don't support DW x DW multiplication. We have
piles of code to handle this, but apparently weren't retyping in the
immediate case.
For example,
tests/spec/arb_tessellation_shader/execution/dvec3-vs-tcs-tes
makes the simulator angry about instructions such as:
mul(8) r18<1>:D r10.0<8;8,1>:D 0x00000003:D
Just retype to W or UW. It should be safe on all platforms.
Cc: "12.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
A nearly identical block already exists in the gen >= 6 block above.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
main/pipelineobj.c: In function ‘delete_pipelineobj_cb’:
main/pipelineobj.c:110:30: warning: unused parameter ‘id’ [-Wunused-parameter]
delete_pipelineobj_cb(GLuint id, void *data, void *userData)
^
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Make sure to pass the requisite information in draws, blits, and clears
that work on the context's draw buffer.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
To match what's done in the automake build.
v2: Use git rev-parse to get a 10-character hash ID
Fix Python imports
Reviewed-by: Eric Engestrom <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Cherryview's PRM, Volume 7, 3D Media GPGPU Engine, Register Region
Restrictions, page 844:
"When source or destination datatype is 64b or operation is integer DWord
multiply, indirect addressing must not be used."
v2:
- Fix it for Broxton too.
v3:
- Simplify code by using subscript() and not creating a new num_components
variable (Kenneth).
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Cc: "12.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Cherryview PRM, Volume 7, 3D Media GPGPU Engine,
Register Region Restrictions:
"When source or destination is 64b (...), regioning in Align1
must follow these rules:
1. Source and destination horizontal stride must be aligned to
the same qword.
(...)"
v2:
- Fix it for Broxton too.
v3:
- Remove inst->regs_written change as it is not necessary (Ken)
Cc: "12.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95462
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
I typo'd this.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I recently experimented with performing rasterizer discard in the SOL
unit instead of the clipper, and as far as I can tell, it's basically
the same performance. The clipper comes directly after SOL anyway,
and setting the clipper to REJECT_ALL should be pretty darn cheap.
Keep the perf_debug on Sandybridge, where the GS actually does work.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are almost no tests in any test suite, but what little I've found
seems to work. Ilia believes everything is in place.
v2: Predicate the enable on ES 3.1 being available (Gen8+) and also
ARB_compute_shader being available (requested by Ilia).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are quite a few pipelines that desktop applications (including a
bunch of piglit test) can expect to have run but don't meet the GLES
requirements. Instead of failing validation, just emit a debug message.
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96358
Cc: "12.0" <[email protected]>
Cc: Gregory Hainaut <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
Left over from 31dee99e052902bc08ddbb1009748dc982ac3211. It should fix
Clang Windows build.
Trivial.
|
|
|
|
| |
Acked-by: Kenneth Graunke <[email protected]>
|