| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Cc: [email protected]
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
I have a piglit test that hits this.
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's almost the same.
This enables tiling for HTILE. It also enables Hyper-Z for other texture
targets (1D, 1D_ARRAY, 2D_ARRAY, CUBE, CUBE_ARRAY, 3D, RECT).
2D array depth textures are tested by Unigine Sanctuary and my new piglit
test.
Acked-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
| |
This fixes rendering to non-zero layer/face/slice with HTILE.
v2: added the assertion
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes rendering to a non-zero layer/face/slice with HTILE.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72685
v2: added the assertion
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Requires Evergreen or later hardware.
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes crashes if the number of temporaries is greater than 4096.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66184
v2: added fail paths for realloc failures
Cc: 10.2 10.3 [email protected]
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
This should make a machine which is running piglit more responsive at times.
e.g. streaming-texture-leak can easily eat 600 MB because of how fast it
creates new textures.
|
| |
|
|
|
|
| |
Required for multi-GPU configuration where each GPU has its own X screen.
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We were only using it to get at its type, which we already know because
it's a builtin variable.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We were only using it to get at its type, which we already know because
it's a builtin variable.
v2 (Ken): Rebase on Matt's optimized gl_FrontFacing calculations.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The replicated data clear shader needs to be SIMD16, or else the GPU
will hang. So, compile it even if INTEL_DEBUG=no16 is set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Current method of generating distribution tar-balls involves manually
invoking make + target name in the appropriate places. This temporary
solution is used until we get 'make dist' working.
Currently it does not work, as in order to have the target (which is
also a filename) available in the final Makefile we need to add a PHONY
target + use the correct target name.
Cc: "10.2 10.3" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that saturate is implemented natively as an instruction,
we can cut down on unneeded functionality.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v3: Since the fs backend can emit saturate as a separate instruction, there is
no need to detect for min/max instructions and to rewrite the instruction tree
accordingly. On the other hand, we don't need to emit a separate saturated
mov either when the expression generating src can do saturate directly.
v4: Add can_do_saturate() check before enabling saturate modifer (Ken)
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that saturate is implemented natively as instruction,
we can cut down on unneeded functionality.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When sel conditon is bounded within 0 and 1.0. This allows code as:
mov.sat a b
sel.ge dst a 0.25F
To be propagated as:
sel.ge.sat dst b 0.25F
v3: - Syntax clarifications in inst->saturate assignment
- Remove extra parenthesis when assigning src_reg value
from copy_entry (Matt Turner)
v4: - Take channels into consideration when propagating saturated instructions.
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When sel conditon is bounded within 0 and 1.0. This allows code as:
mov.sat a b
sel.ge dst a 0.25F
To be propagated as:
sel.ge.sat dst b 0.25F
v3: Syntax clarifications in inst->saturate assignment (Matt Turner)
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - Output max(saturate(x),b) instead of saturate(max(x,b))
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is > 0.0 and
inner constant is 1.0.
- Fix comments to show that the optimization is a commutative operation
(Matt Turner)
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - Output min(saturate(x),b) instead of saturate(min(x,b)) suggested by Ilia Mirkin
- Make sure we do component-wise comparison for vectors (Ian Romanick)
v3: - Add missing condition where the outer constant value is zero and
inner constant is < 1
- Fix comments to reflect we are doing a commutative operation (Matt Turner)
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - Check that the base type is float (Ian Romanick)
v3: - Make sure comments reflect that we are doing a commutative operation
- Add missing condition where the inner constant is 1.0 and outer constant is 0.0
- Make indexing of operands easier to read (Matt Turner)
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we have the ir_unop_saturate implemented as a single
instruction, generate the correct simplified expression.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
| |
Needed when vertex programs doesn't allow saturate
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Abdiel Janulgue <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Use CLAMP macro (Ian Romanick)
Signed-off-by: Abdiel Janulgue <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Abdiel Janulgue <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Suggested by Matt. This patch combines and moves back the code-generation
functions from generate_vec4_instruction() into generate_code(). Makes
generate_code() a bit larger, but helps us to count loops in a
straightforward manner.
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the cited documentation section (but in the newer docs),
x_scaledown is the same for 2x and 4x MSAA.
+47 piglits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=83081
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.3" <[email protected]>
|
|
|
|
|
|
|
| |
In commit 04895f5c I added support for reswizzling writemasks. This test
was checking that we didn't support this.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=82881
|
|
|
|
|
|
|
|
| |
brw_meta_fast_clear.c:211:17: warning: 'x_scaledown' may be used
uninitialized in this function [-Wmaybe-uninitialized]
unsigned int x_scaledown, y_scaledown;
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
We asked MI commands to use GGTT only on GEN6.
|
|
|
|
| |
Fix max/min entries on GEN7.5 GT2/GT3.
|
|
|
|
|
| |
With e3c251071b0c9396c3ec76d1cf943c60ae297281, the magic values are gone. We
no longer need "cmd" to hide them. Replace it by dw0.
|
| |
|
| |
|
|
|
|
|
|
| |
Fix potential segfault in debug code.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are some cases where the scheduler can get itself into impossible
situations, by scheduling the wrong write to pred or addr register
first. (Ie. it could end up being unable to schedule any instruction if
some instruction which depends on the current addr/reg value also
depends on another addr/reg value.)
To solve this we'd need to be able to insert extra mov instructions
(which would also help when register assignment gets into impossible
situations). To do that, we'd need to move the nop padding from sched
into legalize.
But to start with, just detect when we get into an impossible situation
and bail, rather than sitting forever in an infinite loop. This way it
will at least fall back to the old compiler, which might even work if
you are lucky.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All of the GL image enums fit in 16-bits.
Also move the fields from the anonymous "image" structucture to the next
higher structure. This will enable packing the bits with the other
bitfield.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0
After (32-bit): 70 40,577,421,777 68,487,584 62,973,695 5,513,889 0
Before (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0
After (64-bit): 74 37,124,603,758 95,891,808 88,466,712 7,425,096 0
A real savings of 346KiB on 32-bit and 262KiB on 64-bit.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The only values allowed are 0 and 1, and the value is checked before
assigning.
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0
After (32-bit): 76 40,572,916,873 68,831,248 63,328,783 5,502,465 0
Before (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0
After (64-bit): 60 36,822,640,058 96,526,824 88,735,296 7,791,528 0
A real savings of 173KiB on 32-bit and no change on 64-bit.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Just use ir_variable::data.binding... because that's the where the
binding is stored for everything else that can use layout(binding=).
Valgrind massif results for a trimmed apitrace of dota2:
n time(i) total(B) useful-heap(B) extra-heap(B) stacks(B)
Before (32-bit): 50 40,564,927,443 69,185,408 63,683,871 5,501,537 0
After (32-bit): 74 40,580,119,657 69,186,544 63,506,327 5,680,217 0
Before (64-bit): 59 36,822,048,449 96,526,888 89,113,000 7,413,888 0
After (64-bit): 89 36,822,971,897 96,526,616 88,735,296 7,791,320 0
A real savings of 173KiB on 32-bit and 368KiB on 64-bit.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The VertexProgram and FragmentProgram have a Cache member for dealing
with fixed function programs. There are no fixed function geometry
programs, so this should never have existed, and was just copy and
pasted.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|