| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
This is the effective layer count, for clears etc. This differs from the
depth of the miptree level when views are involved.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We're about to need this in another place.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We're about to need this so we can determine the layer count of the
wrapper.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We'll still avoid MinLayer here since the fast path doesn't understand
arrays at all, but it's straightforward to do levels.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This allows core mesa's TexSubImage paths etc to work correctly
with views which have nonzero MinLevel or MinLayer.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This will eventually be relaxed, but we'll get the fallback path
working first.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
V4: Comment style, remove magic shift.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This is the actual mesa_format to use. In non-view cases this is always
the same as the mt's format.
V4: Comment style
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to wire the original texture's mt into the view. All the hard
work of setting up an appropriate tree of gl_texture_image structures
has already been done by core mesa.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
If we were to relayout the miptree, we'd break any views that are
sharing it.
(Simplified based on suggestions from Eric)
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
We will need to call this to munge view formats.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We will need this for munging the view's format.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
These formats can be cast to others (with different component types or
sizes) via ARB_texture_view or ARB_shader_image_load_store. We want
them to be laid out consistently so that we can just reinterpret the
memory with a different format.
In V1, this was done conditionally on a 'prefer_no_swizzle' flag which
was set in TexStorage/TextureView paths, but we need the same behavior
for ARB_shader_image_load_store (which also works with images created
via TexImage, so we don't want it to be conditional.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The sampler can handle R8G8B8X8 (and substitute 1.0 for the fourth
component) but we can't use it as a render target.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
None of the other 3-component 16bpc formats are directly supported, so
they get promoted to XRGB equivalents. *Not* promoting RGB16F the same
way makes texture views much more fiddly -- we don't want to have to do
crazy copying behind the scenes.
(with my other master + my experimental ARB_texture_view support) fixes
the piglit test: `spec/ARB_texture_view/view compare 48bit formats`
No regressions in gpu.tests on Haswell.
V4: Don't alter the formats table -- just don't match it to a mesa_format. [Kenneth]
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This is supported by all generations, and is required for memory layout
consistency for texture_view.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Now this is the preferred format for GL_SRGB8_ALPHA8.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
V4: Fix rebase conflicts with Brian's renaming of the texfetch
functions.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we would unpack the texels to floats using *_TO_FLOAT_TEX,
and then pack them into the desired format using FLOAT_TO_*. Unfortunately,
this isn't quite the inverse operation, and so some texel values would
end up off-by-one.
This fixes the GL_RGB8_SNORM and GL_RGB16_SNORM subcases in piglit's
arb_texture_view-format-consistency-get test on i965. The similar 1-, 2-
and 4-component cases already worked because they took the memcpy path
rather than repacking.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Filenames passed to dlopen() don't need to use the platform's default extension
for shared libraries.
Using the '.so' extension when dlopen()ing DRI drivers is hardcoded into mesa
and the X server, so it should be hardcoded here in the Makefile as well.
A similar fix is probably also needed for gallium DRI drivers.
(Consider that if we were starting from scratch, perhaps we would use a custom
extension like .dri instead)
Cc: Emil Velikov <[email protected]>
Signed-off-by: Jon TURNEY <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The "new" fragment shader backend has never supported the necessary
color conversion code for this to work. We began using the new backend
in Mesa 7.10 for GLSL (commit a81d423d93f22a948f3aa4bf73, October 2010),
and for ARB_fragment_program in Mesa 9.1 (commit 97615b2d8c7c3cea6fd3a4,
August 2012).
I haven't heard any complaints, so I don't think anyone will miss this
feature. I believe mplayer used it at one point, but these days
defaults to other paths anyway.
Cc: [email protected]
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This should help prevent situations where we render without proper index
bounds. For example: https://bugs.freedesktop.org/show_bug.cgi?id=59455
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
To fix MSVC build since cb4ad1368551b.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Applications frequently clear to colors other than 0.0 or 1.0, which
prevents us from doing fast color clears. In that case, we issue this
performance warning on basically every glClear call, resulting in so
much spam that it's nearly impossible to see any other messages.
Plus, I don't think it's useful. We aren't suggesting a better way to
do what the application developers want---we're just telling them it
would be faster to do something they don't want.
Driver developers have no control over the clear color, so this message
is totally useless to them.
A better alternative to get this sort of information is to use
INTEL_DEBUG=blorp, which tells you whether color clears were fast,
simd16 repdata, or slow.
v2: Rebase on has_color_component changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Decompressing ETC2 textures was causing intermitent segfault
by copying resulting 4x4 texel block to the destination texture
regardless of the size of the destination texture. Issue found
via application crash in GLBenchmark 3.0's Manhattan test.
v2: add more detail comment. Compute limit outside inner loops.
v3: add bugzilla reference
v4: Correct cc syntax in commit log
v5: really grab the right patch
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74988
Cc: "9.2 10.0 10.1" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]> [v1, suggested v2-3]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Given
mov vgrf7, vgrf9.xyxz
add vgrf9.xyz, vgrf4.xyzw, vgrf5.xyzw
add vgrf10.x, vgrf6.xyzw, vgrf7.wwww
the last instruction would be wrongly changed to
add vgrf10.x, vgrf6.xyzw, vgrf9.zzzz
during copy propagation.
The issue is that when deciding if a record should be cleared, the old code
checked for
inst->dst.writemask & (1 << ch)
instead of
inst->dst.writemask & (1 << BRW_GET_SWZ(src->swizzle, ch))
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=76749
Signed-off-by: Chia-I Wu <[email protected]>
Cc: Jordan Justen <[email protected]>
Cc: Matt Turner <[email protected]>
Reviewed-by: Ian Romainck <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Cc: "10.1" <[email protected]>
|
|
|
|
|
|
|
| |
I thought I was seeing a bug in the code while reviewing, but it's not
there.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
I'm going to be turning dual_src_output into an array in a moment.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Nothing bad came of this because they weren't used after visitor running,
but leaving them in a bad state seems like a recipe for pain later.
Suggested-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
All of a vec4 uniform was being printed as "u0"
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When you've got a simple solid-color shader that doesn't generate
pixel_x/y interpolation, we were deciding that the first vgrf was both the
undefined pixel_x and pixel_y, and extending its live interval to avoid
the stride problem. That tricked other optimization that tries to see if
a particular instruction is the last use of a variable.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We're walking the whole instruction stream, so we know the declaration
will be found.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We stopped doing variable index lowering for uniforms in
a64c1eb9b110f29b8abf803a8256306702629bdc, 5 months after the comment was
added.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
is_power_of_two() is now provided by mesa so its definition must be removed
from the i915 driver code.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The next few patches will introduce an optimization that only works when
integers are not represented as floating point values.
v2: Re-word-wrap a line, as requested by Ian Romanick.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vector backend already implemented this optimization, but
surprisingly, we never bothered to implement it in the scalar backend.
In addition to saving two instructions, this eliminates a use of the
accumulator as an explicit source, which is unsupported in SIMD16 mode
on Gen7+, which could help us gain SIMD16 programs.
Cuts 19.23% of the instructions in dolphin/efb2ram.shader_test.
v2: Rebase on is_16bit_integer_constant -> is_uint16_constant rename.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The i965 MUL instruction doesn't natively support 32-bit by 32-bit
integer multiplication; additional instructions (MACH/MOV) are required.
However, we can avoid those if we know one of the operands can be
represented in 16 bits or less. The vector backend's is_16bit_constant
static helper function checks for this.
We want to be able to use it in the scalar backend as well, which means
moving the function to a more generally-usable location. Since it isn't
i965 specific, I decided to make it an ir_constant method, in case it
ends up being useful to other people as well.
v2: Rename from is_16bit_integer_constant to is_uint16_constant, as
suggested by Ilia Mirkin. Update comments to clarify that it does
apply to both int and uint types, as long as the value is
non-negative and fits in 16-bits.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This makes the function available from core Mesa code, including the
GLSL compiler.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Performance warnings are logged via KHR_debug in addition to when the
INTEL_DEBUG=perf environment variable is set. Without this, messages in
debug contexts would have "(null)" for the reason.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are clearly needed---the comments in the function are even present
for each one of them. I originally had two separate state atoms for
3DSTATE_SBE and 3DSTATE_SBE_SWIZ. When I combined the functions, I must
have forgotten to add the atoms for 3DSTATE_SBE_SWIZ.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Nothing actually uses this---we handle rasterizer discard in the
clipper in order for statistics counters to work.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This is the equivalent of commit 43e77215b13b2f86e461cd8a62b542f.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
I think this was used for coalescing out partly dead large virtual
registers, but the patch that enabled that caused regressions and didn't
make it upstream.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's more likely that we won't find writes to all channels than one will
interfere, and calculating interference is more expensive. This change
will also help prepare for coalescing load_payload instructions'
operands.
Also update the live intervals for all channels, and not just the last
that we saw.
Reviewed-by: Eric Anholt <[email protected]>
|