| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
| |
This commit moves us from a miptree model to a surf+bo+offset model. In
the GL driver, miptrees are almost always at the start of the bo so the
offset is zero but we don't want to always make that assumption. In the
short term, gen6 stencil and HiZ will be at an offset but, in the long term,
any Vulkan surface is liable to be at a non-zero offset.
Reviewed-by: Topi Pohjolainen <[email protected]>
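As a rough illustration of the surf+bo+offset model described in the message above, here is a minimal sketch in C. All type and function names (`sketch_bo`, `sketch_surf`, `sketch_surface_info`, `sketch_surface_address`) are hypothetical stand-ins, not the driver's real types; the only point is that the surface address is the buffer base plus an offset that is not assumed to be zero.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the driver's real types carry far more state. */
struct sketch_bo {
    uint64_t gpu_address;   /* where the buffer object starts */
    uint64_t size;          /* size of the buffer object in bytes */
};

struct sketch_surf {
    uint32_t width, height; /* dimensions in pixels */
    uint32_t row_pitch;     /* bytes between consecutive rows */
};

struct sketch_surface_info {
    struct sketch_surf surf; /* layout description */
    struct sketch_bo *bo;    /* backing storage */
    uint32_t offset;         /* byte offset of the surface within bo */
};

/* Address of the first byte of the surface: buffer base plus offset. */
static uint64_t
sketch_surface_address(const struct sketch_surface_info *info)
{
    assert(info->offset < info->bo->size);
    return info->bo->gpu_address + info->offset;
}
```

With a buffer based at 0x10000, a surface at offset 0 resolves to 0x10000, while an auxiliary surface placed 4096 bytes into the same buffer resolves to 0x11000.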
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The previous HiZ support was bogus because all of get_aux_isl_surf looked
at mt->mcs_mt directly. For HiZ buffers, you need to look at either
mt->hiz_buf or mt->hiz_buf->mt.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In order for the calculations of things such as fast clear rectangles to
work, we need more details of the auxiliary surface to be correct. In
particular, we need to be able to trust the width and height fields.
(These are not necessarily what you want coming out of the miptree.) The
only values state setup really cares about are the row and array pitch and
those we can safely stomp from the miptree.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
At one point, we were doing this correctly. It must have gotten lost in
one of the many rebases.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The only reason why we need layer or level is that we need the z-offset for
3-D surfaces. Let's just have the one field for that.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The data comes in via ISL in a format that's almost directly usable by the
hardware so we can avoid some of the conversion headache.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Now that the generic blorp path uses base level/layer, there's no need to
make gen8 special.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since the dawn of time, blorp has used offsets directly to get at different
mip levels and array slices of surfaces. This isn't really necessary since
we can just use the base level/layer provided in the surface state. While
it may have simplified blorp's original design, we haven't been using the
blorp path for surface state on gen8 thanks to render compression and
there's really no good need for it most of the time. This commit restricts
such surface munging to the cases of fake W-tiling and fake interleaved
multisampling.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The layer field is in terms of physical layers which isn't quite what the
sampler will want for 2-D MS array textures.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
Multisample array surfaces on IVB don't support the minimum array element
surface attribute so it needs to come through the sampler message. We may
as well just pass it through everything.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
At the moment, the minify operation does nothing because
params.depth.view.base_level is always zero. However, as soon as we start
using actual base miplevels and array slices, we are going to need the
minification. Also, we only need to align the surface dimensions in the
case where we are operating on miplevel 0. Previously, it didn't matter
because it aligned on miplevel 0 and, for all other miplevels, the miptree
code guaranteed that the level was already aligned.
Reviewed-by: Topi Pohjolainen <[email protected]>
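The minify-then-align rule described in the message above can be sketched as follows. This is an illustration under stated assumptions: the helper names are hypothetical, and the alignment value passed in is a placeholder, not the value the hardware actually requires.

```c
#include <assert.h>
#include <stdint.h>

/* Size of a dimension at a given miplevel: halve per level, never below 1. */
static uint32_t
sketch_minify(uint32_t value, uint32_t levels)
{
    assert(levels < 32);
    uint32_t shifted = value >> levels;
    return shifted > 0 ? shifted : 1;
}

/* Round value up to a power-of-two alignment. */
static uint32_t
sketch_align(uint32_t value, uint32_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (value + alignment - 1) & ~(alignment - 1);
}

/* Always minify by the base level; align only when operating on level 0. */
static uint32_t
sketch_level_width(uint32_t width0, uint32_t base_level, uint32_t alignment)
{
    uint32_t w = sketch_minify(width0, base_level);
    return base_level == 0 ? sketch_align(w, alignment) : w;
}
```

With a base level of zero the minify step is a no-op, which matches the observation that it currently does nothing; it only starts to matter once real base miplevels are used.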
|
|
|
|
|
|
|
|
|
| |
The sampling hardware can handle them ok. It just looks at the tiling to
determine whether it's the new gen9 1-D layout or the old one. The render
hardware isn't so smart.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Instead, we manually mutate the surface size as needed.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The helper does a full transformation on the surface to turn it into a new
2-D single-layer single-level surface representing the original layer and
level in memory.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
For the moment, we still call the old miptree function; we just assert that
the two are equal.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Eventually, this will be the actual view that gets passed into isl to
create the surface state. For now, we just use it for the format and the
swizzle.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Previously we multiplied the full x/y offsets and resolved the tile-aligned
buffer offset and intra-tile offset based on that. Now we let ISL take the
MSAA setting into account and we only multiply the resolved intra-tile
offsets.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
We put all of the code for fake IMS together. This requires moving a bit
of the program key setup code further down so that it gets the right values
out of the final surface.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
We also remove brw_blorp_surface_info::msaa_layout.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Now that we're carrying around the isl_surf, we can just modify it
directly instead of passing an extra bit around.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
We have a handy little function in ISL that does exactly the same thing.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
It's only used to stomp the tiling to Y and it's only used by blorp so
there's no reason why blorp can't do it itself.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The current logic used to determine the execution size of sampler
messages was based on special-casing several argument and opcode
combinations, which unsurprisingly missed the possibility that some
messages could exceed the payload size limit or not depending on the
number of coordinate components present. In particular:
- The TXL, TXB and TEX messages (the latter on non-FS stages only)
would attempt to use SIMD16 on Gen7+ hardware even if a shadow
reference was present and the texture was a cubemap array, causing
it to overflow the maximum supported sampler payload size and
crash.
- The TG4_OFFSET message with shadow comparison was falling back to
SIMD8 regardless of the number of coordinate components, which is
unnecessary when two or fewer coordinates are present.
Both cases have been handled incorrectly ever since cubemap arrays and
texture gather were respectively enabled (the current logic used by
the SIMD lowering pass is almost unchanged from the previous no16
fall-back logic used pre-SIMD lowering times).
Fixes the following GL4.5 conformance test on Gen7-8 (the bug also
affects Gen9+ in principle, but SKL passes the test by luck because it
manages to use the TXL_LZ message instead of TXL):
GL45-CTS.texture_cube_map_array.sampling
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97267
Reviewed-by: Kenneth Graunke <[email protected]>
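The idea in the message above, choosing the execution size from the payload size rather than from a list of special cases, can be sketched like this. The function name and the register limit of 11 are assumptions made for illustration, and the sketch ignores any message header; it is not the driver's actual implementation.

```c
#include <assert.h>

/* Assumed upper bound on the sampler payload, in registers. */
#define SKETCH_MAX_SAMPLER_PAYLOAD_REGS 11

/*
 * Pick the widest execution size whose payload still fits.
 * In SIMD16 each scalar payload component occupies two registers,
 * in SIMD8 it occupies one.
 */
static unsigned
sketch_sampler_simd_width(unsigned exec_size, unsigned num_payload_components)
{
    assert(exec_size == 8 || exec_size == 16);

    unsigned regs_simd16 = num_payload_components * 2;
    if (exec_size == 16 && regs_simd16 > SKETCH_MAX_SAMPLER_PAYLOAD_REGS)
        return 8; /* payload would overflow: fall back to SIMD8 */

    return exec_size;
}
```

Under these assumptions a message with six payload components needs twelve registers in SIMD16 and is lowered to SIMD8, while one with five components needs ten and can stay at SIMD16. Counting components this way covers both cases listed above without naming individual opcodes.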
|
|
|
|
|
|
|
|
|
| |
This makes it easier for the caller to find out how many scalar
components are actually read by the instruction. As a bonus we no
longer need to special-case BAD_FILE in the implementation of
fs_inst::regs_read.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This simplifies the code slightly and will allow the SIMD lowering
pass to find out easily what the actual texturing opcode is in order
to determine the maximum execution size of texturing instructions.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This is required following the change in 8X sample positions.
Fixes the recently modified multisample-scaled-blit piglit tests.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There are no standard sample positions defined in OpenGL and OpenGL
ES specs. Implementations have the freedom to pick the positions
which give plausible results. But the Vulkan 1.0 spec does define
standard sample positions for different sample counts. Defined
positions in Vulkan for all the sample counts except 8X match with
the positions we set in i965. We plan to share the
blorp code between the OpenGL and Vulkan drivers in the near future. Keeping
the 8X sample positions the same on both drivers will help us move
in that direction.
Here is an argument by Neil Roberts (from commit 20250e85) against
any advantage of current 8X sample positions over the new ones:
"The comment above for the 8x sample positions says that the hardware
implements centroid interpolation by picking the centre-most sample
that is inside the primitive. That implies that it might be worthwhile
to pick a pattern that includes 0.5,0.5. However by experimentation
this doesn't seem to actually be the case. With the sample positions
in this patch, if I modify the piglit test below so that it instead
reports the centroid position, it reports 0.492188,0.421875 which
doesn't match any of the positions. If I modify the sample positions
so that they include one at exactly 0.5,0.5 it doesn't help and it
reports another position which is even further from the center for
some reason.
arb_gpu_shader5-interpolateAtSample-different
Kenneth Graunke experimented with some other patterns that have a
higher standard deviation but I think after some discussion it was
decided that it would be better to pick the same pattern as the other
graphics API in case there are games that rely on this pattern."
Observed no regressions in jenkins testing.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The pass isn't really control-flow aware and you can get into a case where it
tries to combine instructions from different blocks. This can actually
lead to an assertion failure when removing unneeded instructions if part of
the vector is set in one block and part in another. This prevents
regressions in the next commit.
Signed-off-by: Jason Ekstrand <[email protected]>
Cc: "12.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
As requested with the initial creation of util/bitscan.h
now move other bitscan related functions into util.
v2: Split into two patches.
Signed-off-by: Mathias Fröhlich <[email protected]>
Tested-by: Brian Paul <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
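The message above does not say which functions were moved. As a generic illustration of the kind of helper a bitscan header provides, here is a self-contained sketch of scanning and clearing the lowest set bit of a mask. The name `sketch_bit_scan` is hypothetical and the loop is written portably rather than with compiler builtins.

```c
#include <assert.h>

/*
 * Return the index of the lowest set bit in *mask and clear that bit.
 * The mask must be non-zero on entry.
 */
static int
sketch_bit_scan(unsigned *mask)
{
    assert(*mask != 0);

    int i = 0;
    while (!((*mask >> i) & 1u))
        i++;

    *mask &= *mask - 1; /* clears the lowest set bit */
    return i;
}
```

A caller would typically loop with `while (mask) { int i = sketch_bit_scan(&mask); ... }` to visit each set bit once, from lowest to highest.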
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenGL 4.4 specifies that BlitFramebuffer should perform sRGB encode
and decode like ES 3.x does, but only when GL_FRAMEBUFFER_SRGB is
enabled. This is technically incompatible in certain cases, but is
more consistent across GL, ES, and WebGL, and more flexible.
The NVIDIA 367.35 drivers appear to follow this behavior.
For the awful spec analysis, please read Piglit's
tests/spec/arb_framebuffer_srgb/blit.c, which explains the differences
between GL 4.1, 4.2, 4.3 (2012), 4.3 (2013), and 4.4, and why this
is the right rule to implement.
Note that ctx->Color.sRGBEnabled is initialized to _mesa_is_gles(ctx),
and ES doesn't have enable/disable flags for GL_FRAMEBUFFER_SRGB, so
it's effectively on all the time. This means the ES behavior should
be unchanged.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
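The note above about initialization means a single flag can serve both APIs. A minimal sketch, with hypothetical type and function names standing in for the real context structure:

```c
#include <assert.h>
#include <stdbool.h>

enum sketch_api { SKETCH_API_GL, SKETCH_API_GLES };

struct sketch_context {
    enum sketch_api api;
    bool srgb_enabled; /* stands in for the GL_FRAMEBUFFER_SRGB state */
};

/* The flag starts enabled on ES and disabled on desktop GL. */
static void
sketch_context_init(struct sketch_context *ctx, enum sketch_api api)
{
    assert(ctx);
    ctx->api = api;
    ctx->srgb_enabled = (api == SKETCH_API_GLES);
}

/* Only desktop GL can toggle the flag; on ES it stays on. */
static bool
sketch_set_framebuffer_srgb(struct sketch_context *ctx, bool enable)
{
    if (ctx->api == SKETCH_API_GLES)
        return false;
    ctx->srgb_enabled = enable;
    return true;
}

/* The blit performs sRGB encode/decode exactly when the flag is set. */
static bool
sketch_blit_uses_srgb(const struct sketch_context *ctx)
{
    return ctx->srgb_enabled;
}
```

Because the ES flag can never be cleared in this sketch, checking the one flag reproduces the ES behavior unchanged while giving desktop GL applications control.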
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should have no effect, as all drivers which support BLORP also
support ES 3.0 - so ES 2.0 would be promoted and follow the ES 3 rules.
ES 1.0 doesn't have BlitFramebuffer.
This is purely to clarify the next patch a bit.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I've never quite understood the purpose of this hack - supposedly,
doing resolves in the sRGB colorspace is slightly more accurate.
Currently, BlitFramebuffer() ignores sRGB encoding and decoding
on OpenGL, although it encodes and decodes in GLES 3.x.
The updated OpenGL 4.4 rules also allow for encoding and decoding
if GL_FRAMEBUFFER_SRGB is enabled, allowing the application to
control what colorspace blits are done in. I don't think this hack
makes any sense in such a world - the application can do what it
wants, and we shouldn't second guess them.
A related Piglit patch, "Make multisample accuracy test set
GL_FRAMEBUFFER_SRGB when resolving." makes the Piglit MSAA accuracy
test explicitly request SRGB encoding/decoding during resolves when
running "srgb" subtests. Without that patch, this commit will regress
those tests, but with it, they should continue to work just fine.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Modern OpenGL BlitFramebuffer requires sRGB encode/decode when
GL_FRAMEBUFFER_SRGB is enabled. The blitter can't handle this,
so we need to bail. On Gen4-5, this means falling back to Meta,
which should handle it.
We allow sRGB <-> sRGB blits, as decode then encode ought to be a noop
(other than potential precision loss, which nobody wants anyway).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
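The bail-out condition described in the message above can be sketched as a small predicate. The function name and its boolean parameters are hypothetical; the real check works on format descriptions rather than flags.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * The blitter copies bits without conversion, so it can only be used
 * when no net sRGB encode or decode is required. With the flag enabled,
 * that means source and destination must share the same encoding.
 */
static bool
sketch_blitter_can_handle(bool srgb_enabled, bool src_is_srgb, bool dst_is_srgb)
{
    if (srgb_enabled && src_is_srgb != dst_is_srgb)
        return false; /* would need a real encode or decode: bail */
    return true;
}
```

An sRGB to sRGB blit passes because decode followed by encode is treated as a no-op, while an sRGB to linear blit with the flag enabled is rejected so that a slower path can perform the conversion.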
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, for every input, we moved the dispatch mask to the flag
register, then emitted two predicated PLN instructions, one with
centroid barycentric coordinates (for normal pixels), and one with
pixel barycentric coordinates (for unlit helper pixels).
Instead, we can simply emit a set of predicated MOVs at the top of
the program which copy the pixel barycentric coordinates over the
centroid ones for unlit helper pixel channels. Then, we can just
use normal PLNs.
On Sandybridge:
total instructions in shared programs: 7538470 -> 7534500 (-0.05%)
instructions in affected programs: 101268 -> 97298 (-3.92%)
helped: 705
HURT: 9 (all of which are SIMD16 programs)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
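The effect of the predicated MOVs described above can be modeled on the CPU as a per-channel select driven by the dispatch mask. This is only a model of the idea, assuming an 8-channel wide program; the struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_CHANNELS 8

/* One pair of barycentric coordinates per channel. */
struct sketch_barys {
    float u[SKETCH_CHANNELS];
    float v[SKETCH_CHANNELS];
};

/*
 * For every channel that is not set in the dispatch mask (an unlit
 * helper pixel), overwrite the centroid barycentrics with the pixel
 * ones. Lit channels keep their centroid values.
 */
static void
sketch_fix_unlit_centroid(struct sketch_barys *centroid,
                          const struct sketch_barys *pixel,
                          uint8_t dispatch_mask)
{
    for (int ch = 0; ch < SKETCH_CHANNELS; ch++) {
        bool lit = (dispatch_mask >> ch) & 1;
        if (!lit) {
            centroid->u[ch] = pixel->u[ch];
            centroid->v[ch] = pixel->v[ch];
        }
    }
}
```

Doing this once up front is what lets every later interpolation use the centroid values unconditionally, instead of choosing between two sets of coordinates per input.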
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we allocated a new VGRF for every undefined definition.
Instead, this patch makes us allocate a new VGRF for every use of an
undefined definition. This makes sure that undefined values are
fully independent of one another, and have live ranges limited to
their single use. This allows register coalescing to combine the
source and destination of MOVs from undefined sources, eliminating
the MOV altogether.
On Broadwell:
total instructions in shared programs: 11641187 -> 11640214 (-0.01%)
instructions in affected programs: 70199 -> 69226 (-1.39%)
helped: 213
HURT: 1
v2: Add a comment (based on Iago's suggested one).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to include mt->offset in the calculation of src pointer because its
value may be non-zero, for example in a cubemap texture.
Signed-off-by: Haixia Shi <[email protected]>
Cc: Jason Ekstrand <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Change-Id: I461ad5b204626d5a1c45611fc6b63735dcf29f63
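The fix described above amounts to adding the miptree's offset when computing the source pointer. A minimal sketch, assuming a linear layout and using a hypothetical `sketch_miptree` struct with only the fields needed here:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical subset of a miptree: only what the address math needs. */
struct sketch_miptree {
    uint32_t offset; /* byte offset of the miptree within its buffer */
    uint32_t pitch;  /* bytes between consecutive rows */
    uint32_t cpp;    /* bytes per pixel */
};

/*
 * Pointer to pixel (x, y) in a mapped buffer. Leaving out mt->offset
 * would read from the start of the buffer instead of from the miptree.
 */
static const uint8_t *
sketch_src_pointer(const uint8_t *bo_map, const struct sketch_miptree *mt,
                   uint32_t x, uint32_t y)
{
    assert(bo_map != NULL);
    return bo_map + mt->offset + (size_t)y * mt->pitch + (size_t)x * mt->cpp;
}
```

For an offset of 64, a pitch of 16 and 4 bytes per pixel, pixel (1, 2) lives 64 + 32 + 4 = 100 bytes into the mapping.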
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Once upon a time (commit 8313f44409) Paul added code for the unlit
centroid workaround (WaCopyUnlitCentroidBarys). His commit message
claims it fixed the EXT_framebuffer_multisample/interpolation {2,4}
{centroid-deriv,centroid-deriv-disabled} piglit tests but does not say
on which platform, though he cites the IVB PRM.
"3DSTATE_WM [DevIVB, DevHSW]" says
"[DevIVB]: Workaround: When Centroid Barycentric mode is required, HW
may produce incorrect interpolation results when a 2X2 pixels have
unlit pixels."
I later disabled it for Haswell (commit f6db414f3c) with no known ill
effects.
The Sandybridge page does not have this text, but the workarounds
database (see WaCopyUnlitCentroidBarys) says the issue applies *only*
to Sandybridge, and in fact in commit 1a2de7dce8fc I note that disabling
the workaround on Sandybridge causes the tests Paul originally mentioned
to fail.
So this is, and always has been, a huge confusing mess.
Disabling the workaround indeed causes the tests Paul originally
mentioned to fail on Sandybridge but not on Ivybridge/Baytrail.
On Ivybridge:
total instructions in shared programs: 6914901 -> 6909599 (-0.08%)
instructions in affected programs: 106766 -> 101464 (-4.97%)
helped: 884
total cycles in shared programs: 70874764 -> 70813774 (-0.09%)
cycles in affected programs: 794144 -> 733154 (-7.68%)
helped: 688
HURT: 186
LOST: 1
GAINED: 6
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|