| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Experimentation with different values of src/dst horizontal/vertical
alignment showed that these fileds are not used on gen9 hardware.
A recent update in graphics specs has removed these fields from
XY_FAST_COPY_BLT command.
Cc: Ben Widawsky <[email protected]>
Cc: Chad Versace <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For now, only enable it on platforms that actually support ETC2.
At this point, Broadwell is only failing 5 (out of 8358) dEQP tests:
dEQP-GLES31.functional.copy_image.non_compressed.viewclass_32_bits.
srgb8_alpha8_r11f_g11f_b10f.renderbuffer_to_texture3d
srgb8_alpha8_rgb10_a2ui.renderbuffer_to_cubemap
srgb8_alpha8_rgb10_a2ui.renderbuffer_to_renderbuffer
srgb8_alpha8_rgb10_a2.renderbuffer_to_texture2d
srgb8_alpha8_rgb9_e5.renderbuffer_to_texture3d
These fail with all methods (meta, blorp, blitter, memcpy).
All are blacklisted from the Android mustpass list, which makes me
wonder whether there's an issue with the tests. The formats in
question work with other targets, and the targets in question work
with other formats...
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We're dropping Meta in favor of BLORP everywhere we can.
This also fixes bugs when copying cubemaps to 2D, which is currently
broken in the meta pass. BLORP just works.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94198
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The BLT can't handle S8 because it's W-tiled (at least without
additional funny business, and I'm not sure we care). Disallow
it so it falls back to the CPU path, which works.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
| |
The Meta path handles this, but the CPU/BLT fallbacks did not.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, it only contains the BLT/CPU fallbacks, so the name is a bit
too generic. But eventually this will use BLORP as well, at which point
the name will make more sense.
The next patch will introduce a second call.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This simplifies things a little - now we only have one (tex or rb?)
if-ladder for src, and a second for dst, rather than four.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes Piglit's arb_copy_image-texview test with the Meta path disabled
(so we hit the blitter/CPU fallback paths).
v2: Add MinLayer even for cube maps (suggested by Ilia).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
| |
Fixes the following cts test:
GL42-CTS.vertex_attrib_64bit.limits_test
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The nested declaration of 'height' shadows a parameter and uses
uninitialized memory. Fix by renaming to 'plane_height' which also makes
the code clearer.
This would typically break the bo size computation, but we don't use
that except when mmaping, and we don't mmap YUV buffers much.
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reported-by: Mathias Fröhlich <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GL_KHR_robustness adds the GL_CONTEXT_LOST error and five new entry
points that we already implement. This patch adds a new dispatch table
that returns GL_CONTEXT_LOST from all entry points and implements the
GL_LOSE_CONTEXT_ON_RESET strategy by setting that table when we learn
that we've lost the context.
With the GL_CONTEXT_LOST reporting in place and dispatch for the new
entry points we can turn on GL_KHR_robustness.
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
| |
The buffer_range_* arrays are indexed by buffer index not element index.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It appears that UV immediates aren't working on Ivy Bridge. In this
case, a signed version will work, and this fixes the piglit
tests/spec/glsl-4.50/execution/helper-invocation.shader_test test.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Lift the resctriction we had before and allow creation of images with
multiple planes. We still require all the planes to be within the same
bo.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This function now only creates the mt and we then call
intel_set_texture_image_mt() in intel_image_target_texture_2d() to set
it for the texture image.
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
Create the mt for the drawable bo directly and call our new
intel_miptree_create_for_bo() helper instead.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
This factors out the work of setting up a miptree as the backing for a
texture image into a new helper.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For double-precision vertex inputs we need to measure them in dvec4
terms, and for single-precision vertex inputs we need to measure them in
vec4 terms.
For the later case, we use type_size_vec4() function. For the former
case, we had a wrong implementation based on type_size_vec4().
This commit introduces a proper type_size_dvec4() function, that we use
to measure vertex inputs.
Measuring double-precision vertex inputs as dvec4 is required because
ARB_vertex_attrib_64bit states that these uses the same number of
locations than the single-precision version. That is, two consecutives
dvec4 would be located in location "x" and location "x+1", not "x+2".
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This extension appears to be a strict subset of the ARB version. Also
remove it from GL3.txt since it doesn't seem relevant.
Signed-off-by: Ilia Mirkin <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
With this, we can delete the surface format table in brw_surface_formats.c
because all of the information we need is now in ISL.
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This prevents array overflow when the block is actually an array of UBOs or
SSBOs. On some hardware such as i965, such overflows can cause GPU hangs.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Previously, we were using the size of the whole BO which may be
substantially larger than the actual index buffer size.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Previously, we were using the size of the BO which may be substantially
larger than the actual vertex buffer size.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For a long time, several of the 3-channel vertex formats didn't exist so we
faked them with 4-channel versions. Starting with Sandy Bridge, we can use
R16G16B16_FLOAT and 8 and 16-bit integer formats become available on
Haswell and Bay Trail.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Bay Trail and Haswell added a bunch of new vertex formats. There was also
the addition of 64-bit passthrough formats for BDW+.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The old code always divided rounded down and then subtracted 1. What we
wanted was to divide rounded up and then subtract 1 which is equivalent to
subtracting 1 and then dividing rounded down.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we only handled the "I don't know what's going on" case for
things with InstanceDivisor == 0. However, in the DrawIndirect case we can
get num_instances == 0 and we don't know what's going on with the instanced
ones either. This commit makes the worst-case bound the default and then
conservatively tightens the bound.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous code got the BO the first time we encountered it. However,
this can potentially lead to problems if the BO is used for multiple arrays
with the same buffer object because the range we declare as busy may not be
quite right. By delaying the call to intel_bufferobj_buffer, we can ensure
that we have the full range for the given buffer.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The vbo layer passes an index_bounds_valid flag that we should be using
instead. This also fixes a bug when min_index == -1 and basevertex != 0
where we were actually comparing min_index + basevertex == -1 which was
false and we were getting the wrong buffer-sizing path.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Now the lowering pass is fixed we can reenable culling.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This format does not support alpha blending, according to the SNB PRM.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
gcc6 warns about this.
Acked-by: Matt Turner <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Technically, this was introduced with GL 4.4. However, I believe it
was intended to be retroactive. As far as I know, AMD has never
supported primitive restart with patches, while NVidia and Intel do.
This necessitated the need for a query which would allow applications
to figure out whether this was usable or not.
I decided to expose it everywhere ARB_tessellation_shader is exposed.
(It's also in both OES and EXT_tessellation_shader.)
Enable this for i965 and Gallium drivers which expose the capability.
v2: Fix a bug in the state_tracker code (caught by Ilia Mirkin).
Bugzilla: https://cvs.khronos.org/bugzilla/show_bug.cgi?id=10364
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This lets the rest of the backend know that the uniform pull constant
load opcodes don't respect channel enables -- Without this the
register allocator has no way to know that the return payload of a
pull constant load is not per-channel and spills of the destination
will be broken under non-uniform control flow.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
This should be working fine now.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94997
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently the spilling code attempts to guess the scratch message
block size from the dispatch width of the shader, which is plain wrong
for SIMD-lowered instructions (frequently but not exclusively
encountered in SIMD32 shaders) or for instructions with register
region data types of size other than 32 bit.
Instead try to use the SIMD component size of the instruction which in
some cases will allow the dataport to apply the correct channel mask
to the scratch data read or written. In the spill case the block size
needs to be clamped to the number of MRF registers reserved for
spilling. In the unspill case I didn't even bother because we
currently have no 100% accurate way to determine whether a source
region is per-channel or whether it contains things like headers that
don't respect channel boundaries -- That's fine, because the unspill
is marked force_writemask_all we can just use the largest allowable
scratch message size.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
instruction.
This prevents the application of an incorrect channel mask by the
scratch write instruction for spilled variables that don't have an
exact one-to-one correspondence between channels of the variable and
32-bit components of the scratch write instruction.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This makes sure that unspills restore the exact contents of the
variable in scratch space into the GRF without applying channel
masking, which is incorrect under control flow for things like message
headers or vectors of heterogeneous types that don't properly respect
channel boundaries.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes emit_(un)spill even more stupid by removing the logic that
decides what execution size each scratch read or write send message
should have and instead relying on the caller to specify an
appropriate execution size via the builder argument. This makes sense
because the caller will need to act differently based on the scratch
message width (e.g. emit an additional unspill before the instruction
if the execution width and channel layout of the spill doesn't match
the instruction's).
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This seems cleaner than exposing an implementation detail of
brw_fs_reg_allocate.cpp to the world, and will give the caller control
over the instruction execution flags (e.g. force_writemask_all) that
are applied to the scratch read and write instructions.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Until now the execution controls (e.g. channel group,
force_writemask_all, exec_size) of the instruction had been completely
ignored by spilling, even though that can lead to a mismatch between
the channel mask applied to the contents of the (un)spilled memory and
the GRF source or destination of the instruction. In some cases we'll
actually want the (un)spill messages to be marked force_writemask_all
regardless of whether the instruction has it set, but that will have
to be handled specially by the caller.
Reviewed-by: Jason Ekstrand <[email protected]>
|