| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
Pretty nonsensical to have it as a method of the visitor just for access
to brw.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
The emit_math?_gen? functions serve to implement workarounds for the
math instruction, none of which exist on Gen8+.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As far as I can tell, Broadwell doesn't need any of the SURFACE_STATE
workarounds for textureGather() bugs, so there's no need to emit
a second set of identical copies.
To keep things simple, just point the gather surface index base to the
same place as the texture surface index base.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously the blorp blitter would only be used if the format is identical or
there is only a difference between whether there is an alpha component or not.
This patch makes it also allow the blorp blitter if the only difference is the
ordering of the RGB components (ie, RGB or BGR).
This is particularly useful since commit 61e264f4fcdba3623 because Mesa now
prefers RGB ordering for textures but the window system buffers are still
created as BGR. That means that the blorp blitter won't be used for the
(probably) common case of blitting from a texture to the window system buffer.
This doesn't cause any regressions in the FBO piglit tests on Haswell. On
Sandybridge it causes the fbo-blit-stretch test to fail but that is only
because it was failing anyway before the above commit and that commit hid the
problem.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix an off by one in the texture unit walk during texblend
setup on gen2. This caused the last enabled texunit to be
skipped resulting in totally messed up texturing.
This is a regression introduced here:
commit 1ad443ecdd694dd9bf3c4a5050d749fb80db6fa2
Author: Eric Anholt <[email protected]>
Date: Wed Apr 23 15:35:27 2014 -0700
i915: Redo texture unit walking on i830.
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.2" <[email protected]>
Signed-off-by: Ville Syrjälä <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the last context in a share group is destroyed, the hash table
containing all of the shader programs (ctx->Shared->ShaderObjects) is
destroyed, throwing away all of the shader programs.
Using a static variable to store program IDs ends up holding on to them
after this, so we think we still have a compiled program, when it
actually got destroyed. _mesa_UseProgram then hits GL errors, since no
program by that ID exists.
Instead, store the program IDs in the context, so we know to recompile
if our context gets destroyed and the application creates another one.
Fixes es3conform tests when run without -minfmt (where it creates
separate contexts for testing each visual).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77865
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit c1c1cf5f9 added infrastructure for saving and restoring draw
buffer state. However, it universially used MAX_DRAW_BUFFERS, but many
drivers support far fewer than that at limit. For example, the radeon
and i915 drivers only support 1. Using MAX_DRAW_BUFFERS causes meta to
generate GL errors.
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80115
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Kenneth Graunke <[email protected]> [on Broadwell]
Tested-by: [email protected]
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
This unit test demonstrates a subtle bug fixed by
4ddf51db6af36736d5d42c1043eeea86e47459ce.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
They have to be marked as structs for C code elsewhere. bblock_t is
already defined as a struct, and all of backend_instruction's fields are
public anyway.
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Let's this file compile with clang.
Reviewed-by: Frank Henigman <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
Warned about 'coord' being undefined in the default case, which is
unreachable.
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
This reverts commit 20be3ff57670529a410b30a1008a71e768d08428.
No evidence of ever being used.
|
|
|
|
|
|
| |
instructions in affected programs: 474 -> 462 (-2.53%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Helps Unigine Tropics and some (old) gstreamer shaders in shader-db.
instructions in affected programs: 792 -> 744 (-6.06%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
But only into non-load_payload instructions. Otherwise we would prevent
register coalescing from combining identical payloads.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since CSE creates instructions, if we let CSE generate things register
coalescing can't remove, bad things will happen. Only let CSE combine
non-copy load_payloads.
E.g., allow CSE to handle this
load_payload vgrf4+0, vgrf5, vgrf6
but not this
load_payload vgrf4+0, vgrf5+0, vgrf5+1
|
| |
|
| |
|
| |
|
|
|
|
|
| |
So that we don't have partial writes to a large VGRF. Will be cleaned up
by register coalescing.
|
| |
|
|
|
|
| |
Clean up with with register_coalesce()/dead_code_eliminate().
|
|
|
|
|
|
| |
Will be used to simplify the handling of large virtual GRFs in SSA form.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Like on Haswell, we need to use 8x4 aligned rectangle primitives for
hierarchical depth buffer resolves and depth clears. See the comments
in brw_blorp.cpp's brw_hiz_op_params() constructor. (The Broadwell
documentation confirms that this is still necessary.)
This patch makes the Broadwell code follow the same behavior as Chad and
Jordan's Gen7 BLORP code. Based on a patch by Topi Pohjolainen.
This fixes es3conform's framebuffer_blit_functionality_scissor_blit
test, with no Piglit regressions.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We only enable HiZ for miplevels which are aligned on 8x4 blocks. When
debugging HiZ failures, it's useful to know whether a particular
miplevel is using HiZ or not.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
In 25268b93, we added a new environment variable
(INTEL_COMPUTE_SHADER) to allow some constant values to be upgraded
for the ARB_compute_shader extension.
Now, we can look to see if the extension was enabled via the
MESA_EXTENSION_OVERRIDE environment variable.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
| |
v2: enable also for i915 (Ian)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Petri Latvala <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've used the LD sampler message for pull constant loads on earlier
hardware for some time, and also were already using it for the FS on
Broadwell. This patch makes us use it for Broadwell VS/GS as well.
I believe that when I wrote this code in 2012, we still used the data
port in some cases, and I somehow neglected to convert it while
rebasing.
Improves performance in GLBenchmark 2.7 Egypt by 416.978% +/- 2.25821%
(n = 17). Many other applications should benefit similarly: this speeds
up uniform array access in the VS, which is commonly used for skinning
shaders, among other things.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Tested-by: Ben Widawsky <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I actually added MOCS support for these things, but forgot to delete the
corresponding perf_debug() warnings.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
Somehow I missed this when adding all of the other MOCS values.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When faced with code such as:
mov vgrf31.0:UD, 960D
mov vgrf31.1:UD, vgrf30.xxxx:UD
The dead code eliminator didn't consider reg_offsets, so it decided that
the second instruction was writing was writing to the same register as
the first one, and eliminated the first one. But they're actually
different registers.
This fixes INTEL_DEBUG=shader_time for vertex shaders. In the above
code, vgrf31.0 represents the offset into the shader_time buffer where
the data should be written, and vgrf31.1 represents the actual time
data. With a completely undefined offset, results were...unexpected.
I think this is probably one of the few cases (maybe only case) where we
generate multiple MOVs to a large VGRF. Normally, we just use them as
texturing results; the other SEND-from-GRF uses a size 1 VGRF.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79029
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
| |
"shader_time_add" is a lot more informative than "op152".
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes several clang constant-logical-operand warnings such as
the following.
../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: warning: use of logical '||' with constant operand [-Wconstant-logical-operand]
if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
^ ~~~~~~~~~~~
../../../../../src/mesa/tnl_dd/t_dd_tritmp.h:130:32: note: use '|' for a bitwise operation
if (DO_TWOSIDE || DO_OFFSET || DO_UNFILLED || DO_TWOSTENCIL)
^~
|
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Cc: "10.2" <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.
https://bugs.freedesktop.org/show_bug.cgi?id=79729
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.1 10.2" <[email protected]>
|
|
|
|
|
|
|
| |
Too many levels of indirection.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We only need to alter the default state if we're emitting MOVs for
header related fields. So, we can simply move the push/pop of state in
to the if (header_present) block, bypassing it in the common case.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally
moved fire_fb_write() above the brw_pop_insn_state(), which caused the
SEND to lose its predication and change from WE_normal to WE_all.
Haswell uses predicated SENDs for discards, so this broke Piglit's
tests for discards.
We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked,
but the actual FB write itself should respect those. So, pop state
first, and force it again around the single MOV.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
|
|
|
|
|
|
|
| |
Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
I'm adding vec4 CSE, and I want to diff the files.
|
|
|
|
|
|
|
|
|
|
| |
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap otherwise we may loose components and effectively
apply the swizzle twice by the time these levels are sampled.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float. Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.
No piglit regressions on Ivybridge.
total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs: 136533 -> 133671 (-2.10%)
GAINED: 0
LOST: 0
Programs that are affected appear to save between 1 and 5 instuctions
(just by skimming the output from shader-db report.py.
v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove
extraneous fix_3src_operand (suggested by Matt). The latter change
required swapping the order of the operands and using predicate_inverse.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|