| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each of the subroutine emitters alters the predication state, but
otherwise doesn't change anything (or puts it back when it does).
Resetting predication at the end makes these functions idempotent with
regard to the default instruction state - which is a nice property.
With that in place, push/pop is no longer necessary.
v2: Improve whitespace (requested by Matt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
brw_MOV doesn't alter the default instruction state, so this does
nothing.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
brw_JMPI sets predicate_control to BRW_PREDICATE_NONE, but that's
already the value coming in. Otherwise, nothing changes state.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This field is only used to track the current value of the flag register
during the SF compile. It has no place in the common compiler code.
While we're changing every call, drop the 'brw' prefix from the function
since it's static.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
Only the Gen4-5 SF program compiler actually uses this function; move
it there. Soon the fields will be moved out of brw_compile.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
There's no point in pushing and popping the default state; the code
between the two stack operations doesn't alter anything.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
None of the assembly emitters called between push and pop actually
change the state. So, we can drop these.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, brw_CMP with a null destination implicitly set the default
state to make future instructions predicated. This is messy and
confusing - emitting a CMP that populates the flag register and later
using it to predicate instructions are logically separate. With the
main compiler, we may even schedule instructions between the CMP and the
user of the flag value.
This patch simplifies brw_CMP to just emit a CMP instruction, and not
mess with predication. It also updates all necessary callers. These
mostly fell into two patterns:
1. brw_CMP followed by brw_IF.
We don't need to do anything special here; brw_IF already sets up
predication appropriately.
2. brw_CMP followed by a single predicated instruction.
The old model was to call brw_CMP, emit the next (predicated)
instruction, then disable predication for any instructions beyond
that. Instead, just explicitly set predicate_control on the single
instruction we want to predicate. It's no more code, and requires
less cross-module knowledge.
This drops setting flag_value to 0xff as well, which is a field only
used by the SF compile. There is only one brw_CMP call in the SF code,
which is in do_twoside_caller, and called at the start of
brw_emit_tri_setup, where flag_value is already 0xff.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
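To illustrate pattern 2 above, here is a minimal sketch using a toy
model of the emitter. Every toy_* name is hypothetical and only stands
in for the real brw_* types and functions; the point is that the CMP
leaves the default state alone and the caller predicates exactly one
instruction.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; NOT the real brw_eu.h types. */
enum toy_predicate { TOY_PREDICATE_NONE = 0, TOY_PREDICATE_NORMAL = 1 };
enum toy_opcode { TOY_OP_CMP, TOY_OP_MOV };

struct toy_inst {
   enum toy_opcode opcode;
   enum toy_predicate predicate_control;
};

struct toy_compile {
   struct toy_inst store[16];
   size_t nr_insn;
   enum toy_predicate default_predicate; /* the "default instruction state" */
};

/* Append an instruction that inherits the current default state. */
static struct toy_inst *
toy_next_insn(struct toy_compile *p, enum toy_opcode opcode)
{
   assert(p->nr_insn < 16);
   struct toy_inst *insn = &p->store[p->nr_insn++];
   insn->opcode = opcode;
   insn->predicate_control = p->default_predicate;
   return insn;
}

/* New-style CMP: emits the instruction and leaves the defaults alone. */
static struct toy_inst *
toy_CMP(struct toy_compile *p)
{
   return toy_next_insn(p, TOY_OP_CMP);
}

static struct toy_inst *
toy_MOV(struct toy_compile *p)
{
   return toy_next_insn(p, TOY_OP_MOV);
}

/* Pattern 2: a CMP followed by one explicitly predicated instruction. */
static void
toy_emit_cmp_then_predicated_mov(struct toy_compile *p)
{
   toy_CMP(p);
   struct toy_inst *mov = toy_MOV(p);
   mov->predicate_control = TOY_PREDICATE_NORMAL;
}
```

In this model, any instruction emitted after
toy_emit_cmp_then_predicated_mov() is unpredicated again without the
caller having to reset anything.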
|
|
|
|
|
|
|
|
|
| |
Presumably, this was to reset the default state of predication_control
from brw_CMP. But brw_CMP only sets that if dst is ARF null, which it
isn't here.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When compiling any of the SF program variants, flag_value starts off as
0xff and will be modified when generating code.
brw_emit_anyprim_setup emits several subroutines, saving and restoring
flag_value across each of them. Since it starts out as 0xff, this is
equivalent to simply setting it to 0xff at the start of each subroutine.
Resetting the value makes more logical sense; each subroutine doesn't
know whether one of the others even executed, much less what it did
to the flag register.
This also lets us drop the brw_set_predicate_control_flag_value call
from brw_init_compile: predicate is already initialized to
BRW_PREDICATE_NONE by the memset, and the value of flag_value is
irrelevant (as it's only used by the SF compiler).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
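The equivalence claimed above can be sketched with a toy model. The
toy_* names and the clobber values are made up for illustration and are
not the real SF compiler code. Both schemes present 0xff to every
subroutine on entry, as long as flag_value starts out as 0xff.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the SF compile state. */
struct toy_sf_compile {
   unsigned flag_value;
   unsigned seen_at_entry[3]; /* what each subroutine saw on entry */
};

/* A subroutine body: records flag_value on entry, then clobbers it. */
static void
toy_subroutine(struct toy_sf_compile *c, size_t idx, unsigned new_value)
{
   assert(c != NULL && idx < 3);
   c->seen_at_entry[idx] = c->flag_value;
   c->flag_value = new_value;
}

/* Old scheme: save flag_value before each subroutine, restore it after. */
static void
toy_emit_all_save_restore(struct toy_sf_compile *c)
{
   static const unsigned clobber[3] = { 0x01, 0x0f, 0x3c };

   c->flag_value = 0xff; /* initial state of every SF compile */
   for (size_t i = 0; i < 3; i++) {
      unsigned saved = c->flag_value;
      toy_subroutine(c, i, clobber[i]);
      c->flag_value = saved;
   }
}

/* New scheme: reset flag_value to 0xff at the start of each subroutine. */
static void
toy_emit_all_reset(struct toy_sf_compile *c)
{
   static const unsigned clobber[3] = { 0x01, 0x0f, 0x3c };

   for (size_t i = 0; i < 3; i++) {
      c->flag_value = 0xff;
      toy_subroutine(c, i, clobber[i]);
   }
}
```

The two schemes differ only in what flag_value holds after the last
subroutine, which nothing in this sketch reads.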
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Leo Liu <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Meta path needs to keep the current texture object's state. Fixes
the following gles3 cts tests on bdw:
framebuffer_blit_functionality_negative_width_blit.test: fail
framebuffer_blit_functionality_all_buffer_blit.test: fail
framebuffer_blit_functionality_negative_height_blit.test: fail
framebuffer_blit_functionality_missing_buffers_blit.test: fail
framebuffer_blit_functionality_negative_dimensions_blit.test: fail
framebuffer_blit_functionality_minifying_blit.test: fail
framebuffer_blit_functionality_magnifying_blit.test: fail
Signed-off-by: Topi Pohjolainen <[email protected]>
Cc: "10.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Ken): Only restore the mode if it has changed.
Signed-off-by: Topi Pohjolainen <[email protected]>
Cc: "10.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
This broke when I separated declarations/shader.
|
|
|
|
|
| |
i915g's npot support is incomplete, so let's not use it for copies.
This fixes a bunch of piglit tests.
|
|
|
|
|
| |
We can handle depth, luminance,... copies by simply replacing the
format with a known format of the same bpp.
|
|
|
|
|
|
|
| |
Provide an accelerated path for ARB_clear_buffer_object
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
Number of compacted instructions: 827404 -> 833045 (0.68%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit 42a26cb5e441a01d5288b299980f23affaad53fe.
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78648
|
|
|
|
|
|
|
| |
This reverts commit 5ff1e446d44bb9d50f84883c7058635cb070e069.
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77704
|
|
|
|
|
|
| |
This reverts commit 55de1c035cbca2b7087b3aa21a8c3dfc900a4ad9.
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
| |
This reverts commit f770123f58b46459e8dbd27525162ee8ba89f30b.
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78692
|
|
|
|
|
|
|
|
|
|
| |
In commit af38ef907, I added a "fix" to color outputs not being assigned
correctly when sample mask was being output. This was totally wrong --
the color indices (i.e. "si" values) were the ones that were wrong. Undo
that hunk.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Commit c5d822dad90 added support for sample mask incorrectly. It became
treated as a color output, and messed up the color output indices.
Revert the hunk that did that, and add explicit support just like for
depth/stencil writes.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
clang <= 3.3 cpuid.h does not define constants for feature bits.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79095
Signed-off-by: Vinson Lee <[email protected]>
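One common way to cope with a header that lacks the feature-bit names
is a guarded fallback definition. The sketch below assumes the bit of
interest is SSE4.1 (CPUID leaf 1, ECX bit 19); that choice and the
toy_cpu_has_sse41 name are illustrative and not taken from the patch
itself.

```c
#include <assert.h>

#if defined(__i386__) || defined(__x86_64__)
#include <cpuid.h>
#endif

/* Fallback for cpuid.h versions that do not provide the constant. */
#ifndef bit_SSE4_1
#define bit_SSE4_1 (1 << 19)
#endif

/* Returns 1 if the CPU reports SSE4.1, 0 otherwise (or on non-x86). */
static int
toy_cpu_has_sse41(void)
{
   /* Whichever definition is in effect, it must name ECX bit 19. */
   assert(bit_SSE4_1 == (1 << 19));

#if defined(__i386__) || defined(__x86_64__)
   unsigned int eax, ebx, ecx, edx;

   /* __get_cpuid() returns 0 when the requested leaf is unsupported. */
   if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
      return 0;
   return (ecx & bit_SSE4_1) != 0;
#else
   return 0;
#endif
}
```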
|
|
|
|
|
|
|
|
| |
We had a handful of cases where we'd used brw_imm_*() to generate an
immediate, rather than fs_reg(). We shouldn't do that, but we shouldn't
limit scheduling flexibility on account of immediate arguments either.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Using brw_imm_* creates a source with file=HW_REG, and the scheduler
inserts barrier dependencies when it sees HW_REG. None of these are
hardware registers in the sense that they're special and scheduling
shouldn't touch them. A few of the modified cases already have HW_REGs
for other sources, so it won't allow extra flexibility in some cases.
Reviewed-by: Kenneth Graunke <[email protected]>
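A toy model of the effect described above, written in C for brevity.
The real scheduler is more involved, and every toy_* name here is
hypothetical. The sketch only shows why the register file chosen for an
immediate decides whether a barrier dependency is added.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical register files; only the HW_REG/IMM distinction matters. */
enum toy_file { TOY_BAD_FILE, TOY_GRF, TOY_IMM, TOY_HW_REG };

struct toy_src {
   enum toy_file file;
};

struct toy_sched_inst {
   struct toy_src src[3];
};

/* Conservative rule: any HW_REG source turns the instruction into a
 * scheduling barrier, since the scheduler cannot reason about it. */
static bool
toy_needs_barrier(const struct toy_sched_inst *inst)
{
   assert(inst != NULL);
   for (size_t i = 0; i < 3; i++) {
      if (inst->src[i].file == TOY_HW_REG)
         return true;
   }
   return false;
}
```

Under this rule, the same immediate is a barrier when it is built as a
HW_REG source and freely schedulable when it is built as an IMM source.
An instruction that has another HW_REG source stays a barrier either
way.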
|
|
|
|
|
|
|
|
|
|
| |
Turns out that the AC conditional did not include the
version-scripts as expected. Rather, it truncated
the remaining linker flags.
Cc: Jon TURNEY <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jon TURNEY <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Leave only the gl/glx and mangled gl symbols.
XMesa* was never an official interface and the only
user of it was mesa-demos, while they were still in
the same repo as mesa.
v2: Conditionally use the version-script.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Missed out with commit d4c3968c25885f6eb53dee4cc0c60d8d3f8fec32
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the presence of LLVM the final library exports every symbol from
the llvm namespace. Resolve this by using a version script (w/o the
version/name tag).
Considering that there are only ~35 symbols, explicitly list them
to minimize the chances of rogue symbols sneaking in.
v2: Conditionally include the version-script.
Reviewed-by: Thomas Hellstrom <[email protected]> (v1)
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The symbol was added with commit 45e2b51c853 (DRI2/GLX: check for
vblank_mode in DRI2 GLX code) but was never used as such according
to git log.
Possibly it was marked as public due to confusion with
__driConfigOptions which was used for dri1 drivers.
Acked-by: Jesse Barnes <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Do not wrap the code in ifdef HAVE_DRI3 (suggested by Keith)
Cc: "10.1 10.2" <[email protected]>
Cc: Keith Packard <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The profiles are present depending on the defines at build time.
Drop the extra functions and feed the defines directly into the
state-tracker at build time.
v2: Drop unused variable i.
Acked-by: Chia-I Wu <[email protected]> (v1)
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
... rather than the one defined in our internal interface (dri_interface.h)
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
If we make ann_count non-zero, annotation_finalize() won't bail.
Not modifying it seems to make the code clearer than modifying
annotation_finalize() would.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit a6860100b87415ab510d0d210cabfeeccebc9a0a.
Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.
Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component, might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77707
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2dfbbeca50b95ccdd714d9baa4411c779f6a20d9 with the
comment about MAC and implicit accumulator removed.
Why this code didn't work in all circumstances is unknown and without a
working Ironlake simulator (which uses a different AUB format) we'll
probably never know, short of a lot of experimentation, and spending a
bunch of time to try to optimize a few instructions on Ironlake is not
time well spent.
Moreover, for mix(vec4, vec4, vec4) using the accumulator introduces a
dependence between the otherwise independent per-component calculations.
Not using the accumulator, even if it means an extra instruction per
component, might be preferable. We don't know, we don't have data, and
we don't have the necessary register on Ironlake for shader_time to tell
us.
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=77703
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Number of compacted instructions: 817752 -> 827404 (1.18%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Enables the next commits to compact more instructions.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Note the weirdness with src1 subregs. The compacted immediate fields are
uncompacted to bits [127:96] and the high five bits of the subreg
mapping maps to bits [100:96].
Number of compacted instructions: 790085 -> 817752 (3.50%)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|