| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
2e01b8b440c1402c88a2755d89f40292e1f36ce5
That commit made possible that the items could be one just
after the other when their size was a multiple of ITEM_ALIGNMENT.
But compute_memory_prealloc_chunk still looked to leave a gap
between items. Resulting in that we got an infinite loop when
trying to add an item which would left no space between itself and
the next item.
Fixes piglit test: cl-custom-r600-create-release-buffer-bug
And the test for alignment I have just sent:
http://lists.freedesktop.org/archives/piglit/2014-June/011135.html
Sorry about this.
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes automake builds which are broken since
b52a530ce2aada1967bc8fefa83ab53e6a737dae.
v2: This patch also adds the FEATURE_* defines back to targets/egl-static for
Android and Scons that have been removed in the mentioned commit.
Signed-off-by: Niels Ole Salscheider <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79885
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit e62b7d38a1d (configure: autodetect video state-trackers
when non swrast driver is present) added a check that caused
the autodetection to be omitted when we have the swrast gallium
driver. Whereas it should have skipped the VL targets when only
swrast was selected.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79907
Cc: "10.2" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The code that parses LIBGL_DRIVERS_PATH was printing an
error for every attempted dlopen. It's not an error to
have to check multiple items in the path, only an error if
no suitable library is found. Reduced the load error to
a warning to match behavior of dynamic linker.
Signed-off-by: Courtney Goeltzenleuchter <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use the has_streamout flag as we do elsewhere to check if we need
to call pipe->set_stream_output_targets(). The driver might implement
the set_stream_output_targets() function, but not for all hardware
configurations.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When a multisampled texture is used for sampling the fast clear color value
needs to be programmed into the surface state. This was being left as all
zeroes so if the surface was cleared to a value other than black then it
wouldn't work properly. This doesn't matter for single-sample textures because
in that case the MCS buffer is resolved before it is used as a texture source.
https://bugs.freedesktop.org/show_bug.cgi?id=79729
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.1 10.2" <[email protected]>
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
| |
Too many levels of indirection.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We only need to alter the default state if we're emitting MOVs for
header related fields. So, we can simply move the push/pop of state in
to the if (header_present) block, bypassing it in the common case.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit dc2d3a7f5c217a7cee92380fbf503924a9591bea, Iago accidentally
moved fire_fb_write() above the brw_pop_insn_state(), which caused the
SEND to lose its predication and change from WE_normal to WE_all.
Haswell uses predicated SENDs for discards, so this broke Piglit's
tests for discards.
We want the Gen4-5 MOV to be uncompressed, unpredicated, and unmasked,
but the actual FB write itself should respect those. So, pop state
first, and force it again around the single MOV.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79903
|
|
|
|
|
|
|
|
|
|
| |
GBM_BO_USE_CURSOR_64X64 is kept so that existing users of GBM continue to
build, but it no longer rejects widths or heights other than 64.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=79809
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
Will simplify the automated conversion if we want to allow compiling the
driver for a single generation.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
I'm adding vec4 CSE, and I want to diff the files.
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
This isn't supposed to be difficult.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Please stop adding them.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes sure to use a no-op swizzle while iteratively rendering each
level of a mipmap otherwise we may loose components and effectively
apply the swizzle twice by the time these levels are sampled.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we would emit the comparison, emit an AND to mask off extra
bits from the comparison result, then convert the result to float. Now,
do the comparison, then use a cleverly constructed SEL to pick either
0.0f or 1.0f.
No piglit regressions on Ivybridge.
total instructions in shared programs: 1642311 -> 1639449 (-0.17%)
instructions in affected programs: 136533 -> 133671 (-2.10%)
GAINED: 0
LOST: 0
Programs that are affected appear to save between 1 and 5 instuctions
(just by skimming the output from shader-db report.py.
v2: s/b2i/b2f/ in commit subject (noticed by Chris Forbes). Remove
extraneous fix_3src_operand (suggested by Matt). The latter change
required swapping the order of the operands and using predicate_inverse.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
brw_vec4_visitor.cpp:2717:1: warning: unused parameter 'ir' [-Wunused-parameter]
brw_vec4_visitor.cpp:2723:1: warning: unused parameter 'ir' [-Wunused-parameter]
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
| |
And delete the incorrect comment.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
oops meant to move this.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This passes tests now on softpipe.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This enables GL3.3 on softpipe.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This just aligns the limits with llvmpipe.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes the limits for GL 3.2, and subsequently fixes
some segfaults in some varying packing tests and max varying tests
after the limits bumped.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This adds support for GL 3.2 layered rendering to softpipe.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This adds the layer info to the tile cache.
This changes clear_flags to be dynamically allocated as
MAX_LAYERS seems like a too big step.
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This passes the piglit depth clamp tests.
this is required for GL 3.2.
v2: move min/max up one level, could go further, thanks
to Roland for suggestion.
v1: Reviewed-by: Brian Paul <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This limits the number of emitted vertices to the shaders max output
vertices, and avoids us writing things into memory that isn't big
enough for it.
Reviewed-by: Zack Rusin <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add Intel driver hook for glGetTexImage to accelerate the case of reading
texture image into a PBO. This case gets huge performance gains by using
GPU BLIT directly to PBO rather than GPU BLIT to temporary texture followed
by memcpy.
No regressions on Piglit tests with Intel driver.
Performance gain (1280 x 800 FBO, Ivybridge):
glGetTexImage + glMapBufferRange with patch 1.45 msec
glGetTexImage + glMapBufferRange without patch 4.68 msec
v3: (by Kenneth Graunke)
- Fix compile after Eric's change to drop the tiling argument
to intel_miptree_create_for_bo.
- Add GL_TEXTURE_3D to blacklisted texture targets to prevent Piglit
regressions.
- Squash in several whitespace and coding style fixes.
|
|
|
|
|
|
|
|
|
| |
We need to invalidate the live intervals when inserting new
instructions.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
| |
When walking backwards, we want to stop at the head sentinel, which is
where scan_inst->prev->prev == NULL, not scan_inst->prev == NULL.
Fixes random crashes, as well as valgrind errors.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
Giving the meta clear program a meaningful name makes it easier to find
in output such as INTEL_DEBUG=fs or INTEL_DEBUG=shader_time. We already
did so for integer programs, but neglected to label the primary program.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These used to call different math emitters (brw_math vs. brw_math2).
Now that they both call gen6_math, they're virtually identical.
When unrolling SIMD16 to multiple SIMD8 operations, we should take care
not to apply sechalf to brw_null_reg for src1. Otherwise, we'd end up
with BRW_ARF_NULL + 1 as the register number, and I'm not sure if that's
valid.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These functions are basically identical, so we should combine them.
However, they're so trivial, we may as well just fold them into their
only call sites.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These are trivial to combine: we should just avoid checking the second
operand if it's brw_null_reg.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's now a single line of code, so we may as well fold it into the
caller.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Usually, I try to use "brw" for functions that apply to all generations,
and "gen4" for dead end/legacy code that is only used on Gen4-5.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our existing functions, brw_math and brw_math2, had unclear roles:
Gen4-5 used brw_math for both unary and binary math functions; it never
used brw_math2. Since operands are already in message registers, this
is reasonable.
Gen6+ used brw_math for unary math functions, and brw_math2 for binary
math functions, duplicating a lot of code. The only real difference was
that brw_math used brw_null_reg() for src1.
This patch improves brw_math2's assertions to allow both unary and
binary operations, renames it to gen6_math(), and drops the Gen6+ code
out of brw_math().
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
| |
This is more typical C++ style.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Thread switching on control flow instructions is a documented workaround
for Gen4-5 errata. As far as I can tell, it hasn't been needed since
Sandybridge. Thread switching is not free, so in theory this may help
performance slightly.
Flow control instructions with the "switch" flag cannot be compacted, so
removing it will make these instructions compactable. (Of course, we
still have to implement compaction for flow control instructions...)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 2081469 -> 2081248 (-0.01%)
instructions in affected programs: 22606 -> 22385 (-0.98%)
No programs were hurt by this patch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with IWYU. Compile-tested on my Ivy-bridge system.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
| |
Found with IWYU. Compile-tested on my Ivy-bridge system
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Only function-defs use glsl_type so forward declare instead.
Compile-tested on my Ivy-bridge system.
IWYU also suggests removing #include <new>, and this compiles fine.
I'm not familiar enough with memory management in C/C++ that I feel
comfortable removing this. Insights would be appreciated.
Reviewed-by: Tom Stellard <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Thomas Helland <[email protected]>
|