| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Massive list of constant data. Annotate it as such.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
| |
Both of which are no longer used. Use designated initializer to make
things obvious as people add/remove TGSI_OPCODEs.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
| |
... rather than the local one in inst_info->tgsi_opcode.
This will allow us to simplify struct r600_shader_tgsi_instruction.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Marek Olšák <marek.olsak@amd.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the very least, unreal4/sun-temple/102.shader_test uses this pattern
for a signed integer result. However, that shader did not hit the
optimization in the first place because it uses !gl_FrontFacing. I
changed the shader to use remove the logical-not and reverse the other
operands. I verified that incorrect code is generated before this
change and correct code is generated after.
Fixes fs-frontfacing-ternary-1-neg-1.shader_test.
No shader-db changes.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
| |
If we check for the case that is actually necessary, the asserts
become superfluous.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Espically on platforms that do not natively generate 0u and ~0u for
Boolean results, we generate a lot of sequences where a CMP is
followed by an AND with 1. emit_bool_to_cond_code does this, for
example. On ILK, this results in a sequence like:
add(8) g3<1>F g8<8,8,1>F -g4<0,1,0>F
cmp.l.f0(8) g3<1>D g3<8,8,1>F 0F
and.nz.f0(8) null g3<8,8,1>D 1D
(+f0) iff(8) Jump: 6
The AND.nz is obviously redundant. By propagating the cmod, we can
instead generate
add.l.f0(8) null g8<8,8,1>F -g4<0,1,0>F
(+f0) iff(8) Jump: 6
Existing code already handles the propagation from the CMP to the ADD.
Shader-db results:
GM45 (0x2A42):
total instructions in shared programs: 3550829 -> 3550788 (-0.00%)
instructions in affected programs: 10028 -> 9987 (-0.41%)
helped: 24
Iron Lake (0x0046):
total instructions in shared programs: 4993146 -> 4993105 (-0.00%)
instructions in affected programs: 9675 -> 9634 (-0.42%)
helped: 24
Ivy Bridge (0x0166):
total instructions in shared programs: 6291870 -> 6291794 (-0.00%)
instructions in affected programs: 17914 -> 17838 (-0.42%)
helped: 48
Haswell (0x0426):
total instructions in shared programs: 5779256 -> 5779180 (-0.00%)
instructions in affected programs: 16694 -> 16618 (-0.46%)
helped: 48
Broadwell (0x162E):
total instructions in shared programs: 6823088 -> 6823014 (-0.00%)
instructions in affected programs: 15824 -> 15750 (-0.47%)
helped: 46
No chage on Sandy Bridge or on any platform when NIR is used.
v2: Add unit tests suggested by Matt. Remove spurious writes_flag()
check on scan_inst when scan_inst is known to be BRW_OPCODE_CMP (also
suggested by Matt).
v3: Fix some comments and remove some explicit int() casts in fs_reg
constructors in the unit tests. Both suggested by Matt.
Signed-off-by: Ian Romanick <ian.d.romanick@intel.com>
Reviewed-by: Matt Turner <mattst88@gmail.com>
|
|
|
|
|
|
|
|
|
| |
text data bss dec hex filename
9663 0 0 9663 25bf intel_tiled_memcpy.o before
8215 0 0 8215 2017 intel_tiled_memcpy.o after
Reviewed-by: Carl Worth <cworth@cworth.org>
Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
|
|
|
|
| |
Reviewed-by: Chad Versace <chad.versace@intel.com>
|
|
|
|
|
|
|
| |
This fixes Bug 89616, a build failure due to line 1639 of bufferobj.c:
_mesa_error(ctx, GL_INVALID_OPERATION, func);
Trivial.
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
|
| |
v3: Review from Fredrik Hoglund
-Split cosmetic refactor of GetBufferPointerv out into a separate commit
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
|
| |
v3: Review from Fredrik Hoglund
-Split cosmetic refactor of GetBufferPointerv out into a separate commit
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
| |
v2: Split into a refactor commit and an entry point commit.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
|
| |
v2:-Remove "_mesa" from in front of static software fallback.
-Split out the refactor from the addition of the DSA entry points.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
|
|
| |
v2: review from Ian Romanick
- Restore VBO_DEBUG and BOUNDS_CHECK
- Remove _mesa from static software fallback unmap_buffer.
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
|
|
|
|
|
|
| |
v2: review from Jason Ekstrand
- Split refactor from addition of DSA entry points.
review from Ian Romanick
- Remove "_mesa" from static software fallback map_buffer_range
- Restore VBO_DEBUG and BOUNDS_CHECK
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Fredrik Höglund <fredrik@kde.org>
|
|
|
|
| |
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
|
|
| |
v2: review by Jason Ekstrand
- Split refactor of clear buffer sub data from addition of DSA entry
points.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
| |
v2: remove _mesa in front of static software fallback.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
|
| |
- More explicit error reporting.
- Removed legacy style.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
|
|
|
| |
v2: review by Ian Romanick
- Remove "_mesa" from name of static software fallback buffer_sub_data.
- Remove mappedRange from _mesa_buffer_sub_data.
- Removed some cosmetic changes to a separate commit.
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
| |
v2: review from Ian Romanick
- Fix space in ARB_direct_state_access.xml.
- Remove "_mesa" from the name of buffer_data static fallback.
- Restore VBO_DEBUG and BOUNDS_CHECK.
- Fix beginning of comment to start on same line as /*
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
| |
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
| |
Reviewed-by: Martin Peres <martin.peres@linux.intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 1ee000a0b6737d6c140d4f07b6044908b8ebfdc7.
Failures with the GLES3 conformance suite and Synmark2 OGLHdrBloom revealed
that this commit was in error.
Extensive testing with Piglit prior to patch review and upstreaming did not
reveal this problem because, in the few Piglit tests that test for cube
completeness, NumLayers = 6. This is because all of the existing tests use
TextureStorage to initialize the texture, which sets NumLayers.
A new Piglit test has been sent to the mailing list that reproduces the bug
related to this patch ("texturing: Testing
glGenerateMipmap(GL_TEXTURE_CUBE_MAP) without glTexStorage2D").
Reviewed-by: Jason Ekstrand <jason.ekstrand@intel.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 0ac4c272755c7 made it add a header for the send message when
using SIMD4x2 on Skylake because without this it will end up using
SIMD8D. However the patch missed the case when a sampler is being used
to implement constant loads from a buffer surface in a SIMD4x2 vertex
shader.
This fixes 29 Piglit tests, mostly related to the ARL instruction in
vertex programs.
Reviewed-by: Kristian Høgsberg <krh@bitplanet.net>
Tested-by: Anuj Phogat <anuj.phogat@gmail.com>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit bb33a31 introduced optimizations that transform cases of MAD
in to simpler forms but it did not take in to account that src[0]
can not be immediate and did not report progress. Patch switches
src[0] and src[1] if src[0] is immediate and adds progress
reporting. If both sources are immediates, this is taken care of by
the same opt_algebraic pass on later run.
v2: Fix for all cases, use temporary fs_reg (Matt, Kenneth)
Signed-off-by: Tapani Pälli <tapani.palli@intel.com>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89569
Reviewed-by: Francisco Jerez <currojerez@riseup.net> (v1)
Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
Cc: "10.5" <mesa-stable@lists.freedesktop.org>
|
|
|
|
|
|
|
| |
Before this actually ran into an infinite loop printing out "invalid"...
Reviewed-by: Brian Paul <brianp@vmware.com>
Reviewed-by: Jose Fonseca <jfonseca@vmware.com>
|
|
|
|
|
|
|
| |
st/dri/common hasn't been around for a while.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
| |
Squash this silly typo introduced with commit c63eb5dd5ec(auxiliary/os: get
the mmap/munmap wrappers working with android)
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
| |
Remove the forward declaration and make use of the DEBUG_PRINT macro for
debug builds.
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reviewed-by: Brian Paul <brianp@vmware.com>
|
|
|
|
|
|
|
|
|
|
|
| |
Required by fstat(), otherwise we'll error out due to implicit function
declaration.
Cc: "10.4 10.5" <mesa-stable@lists.freedesktop.org>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89530
Signed-off-by: Emil Velikov <emil.l.velikov@gmail.com>
Reported-by: Vadim Rutkovsky <vrutkovs@redhat.com>
Tested-by: Vadim Rutkovsky <vrutkovs@redhat.com>
|
|
|
|
|
|
|
| |
v2: Don't use the intrinsics, the shader backend can recognize these
patterns and generates optimal code automatically.
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
|
|
|
|
|
|
|
|
| |
This will be used a lot (especially by tessellation).
v2: don't use the bfe intrinsic
Reviewed-by: Tom Stellard <thomas.stellard@amd.com>
|
|
|
|
|
|
|
|
|
|
| |
IvyBridge and Haswell PRM say that the JIP should be emitted
with type W but we were using UD. The previous implementation
did not show adverse effects, but IMHO it is safer to follow
the specification thoroughly.
Reviewed-by: Matt Turner <mattst88@gmail.com>
Signed-off-by: Antia Puentes <apuentes@igalia.com>
|
|
|
|
|
|
|
|
| |
- move it to its own function
- do it after all states are emitted
- bump SI_MAX_DRAW_CS_DWORDS
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
|
|
| |
Do it only when the line stipple state is changed.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
|
|
|
|
|
| |
This requires enabling the optional GL provoking vertex behavior for quads.
+ some cosmetic changes, so that the register is set exactly the same as
on r600.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
|
|
|
| |
The fragment shader multiplies the alpha channel with gl_SampleMaskIn.
If blending is enabled, it looks like MSAA.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
|
|
| |
Sample locations are not updated as often as framebuffers.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
|
|
|
| |
This will be used for line and polygon smoothing.
This is GCN-only even though it's in shared code.
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <michel.daenzer@amd.com>
|