| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
CMP.Z doesn't work on Gen4-5 because the boolean isn't guaranteed to be
0 or 0xFFFFFFFF - only the low bit is defined.
We can call emit_bool_to_cond_code to generate the condition in f0.0;
the last instruction will generate the flag value. We can patch it to
use f0.1, and negate the condition.
Fixes discard tests on Gen4-5.
Haswell shader-db stats:
total instructions in shared programs: 5770279 -> 5769112 (-0.02%)
instructions in affected programs: 64342 -> 63175 (-1.81%)
helped: 1069
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
| |
Not needed by anything in that header. Include math.h or c99_math.h
where needed instead.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
A layered PBO image is now interpreted as a single tall 2D image so
the z argument in _mesa_meta_bind_fbo_image is ignored. Therefore this
was just redundantly rebinding the same image repeatedly.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
| |
intel_horizontal_texture_alignment_unit
This will be used by next patch in the series.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
| |
before calling _mesa_meta_pbo_TexSubImage(). This will be used in
later patches and will be required in Skylake to get the tile
resource mode of miptree before calling _mesa_meta_pbo_TexSubImage().
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
| |
No functional changes in the patch. Just makes the code look cleaner.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
| |
Y tiling is supported in blitter on SNB+.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
| |
to a temporary pbo created in _mesa_meta_pbo_GetTexSubImage().
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
| |
create_texture_for_pbo() is shared by _mesa_meta_pbo_GetTexSubImage()
and _mesa_meta_pbo_TexSubImage() functions. So, we need to account
for both pack and unpack buffer objects.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
create_texture_for_pbo() is used by both _mesa_meta_pbo_GetTexSubImage()
and _mesa_meta_pbo_TexSubImage() functions with different PBO targets.
Use GL_STREAM_READ with GL_PIXEL_PACK_BUFFER and GL_STREAM_DRAW with
GL_PIXEL_UNPACK_BUFFER.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
| |
before using it for derefrencing.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
| |
otherwise samples=0 passes the check, which is invalid.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Sandybridge doesn't support y-tiling for surface formats with 16 or
more bpp. There was previously an override to explicitly allow this
for Gen7. However, this restriction is also removed in Gen8+ so we
should use y-tiling there too.
This is important to do for Skylake which doesn't support x-tiling for
3D surfaces.
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since 87d3ae0b45b6b6bb50b583dafa55eb109449a005
driQueryRendererIntegerCommon handles __DRI2_RENDERER_PREFFERED_PROFILE
too.
Signed-off-by: Andreas Boll <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Corrects the way that _mesa_meta_pbo_TexSubImage and
_mesa_meta_pbo_GetTexSubImage handle 1D_ARRAY textures. Fixes a failure in
the Piglit arb_direct_state_access/gettextureimage-targets test.
Reviewed-by: Jason Ekstrand <[email protected]>
Tested-by: Laura Ekstrand <[email protected]>
Cc: "10.4, 10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Changes PBO uploads and downloads to use a tall (height * depth) 2D texture
for blitting. This fixes the bug where 2D_ARRAY, 3D, and CUBE_MAP_ARRAY
textures are not properly uploaded and downloaded.
Removes the option to use a 2D ARRAY texture for the PBO during upload and
download. This option didn't work because the miptree couldn't be set up
reliably.
v2: Review from Jason Ekstrand and Neil Roberts:
-Delete the depth parameter from create_texture_for_pbo
-Abandon the option to create a 2D ARRAY texture in create_texture_for_pbo
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "10.4, 10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the line setting immutability for the texture to after
_mesa_initialize_texture_object so that the initializer function will not
cancel it out. Moreover, because of the ARB_texture_view extension, immutable
textures must have NumLayers > 0, or depth will equal (0-1)=0xFFFFFFFF during
SURFACE_STATE setup, which triggers assertions.
v2: Review from Kenneth Graunke:
- Include more explanation in the commit message.
- Make texture setup bug fixes into a separate patch.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "10.4, 10.5" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the previous optimization in place, some shaders wind up with
multiple discard jumps in a row, or jumps directly to the next
instruction. We can remove those.
Without NIR on Haswell:
total instructions in shared programs: 5777258 -> 5775872 (-0.02%)
instructions in affected programs: 20312 -> 18926 (-6.82%)
helped: 716
With NIR on Haswell:
total instructions in shared programs: 5773163 -> 5771785 (-0.02%)
instructions in affected programs: 21040 -> 19662 (-6.55%)
helped: 717
v2: Use the CFG rather than the old instructions list. Presumably
the placeholder halt will be in the last basic block.
v3: Make sure placeholder_halt->prev isn't the head sentinel (caught
twice by Eric Anholt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The discard condition tells us which channels we want killed. We want
to invert that condition to get the channels that should survive (remain
live) in f0.1. Emit a CMP to negate it.
Nothing generates these today, but that will change shortly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0d8f27eab7b7e8b7a16e76aabd3f6a0ab4880497.
"This doesn't seem to be necessary." <- I was wrong!
Tested-by: Markus Wick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
Fixes rendering with Dolphin.
Tested-by: Markus Wick <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
| |
total instructions in shared programs: 5695356 -> 5689775 (-0.10%)
instructions in affected programs: 486231 -> 480650 (-1.15%)
helped: 2604
LOST: 1
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen8+ support was just broken, since MUL now consumes 32-bits from both
sources. Fixes 986 piglit tests on my BDW.
total instructions in shared programs: 7753873 -> 7753522 (-0.00%)
instructions in affected programs: 28164 -> 27813 (-1.25%)
helped: 77
GAINED: 47
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
| |
total instructions in shared programs: 7756214 -> 7753873 (-0.03%)
instructions in affected programs: 455452 -> 453111 (-0.51%)
helped: 2333
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Signed-off-by: Alex Henrie <[email protected]>
|
|
|
|
| |
Signed-off-by: Alex Henrie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
"(...)Let w be the width rounded to the nearest integer (...). If the
line segment has endpoints given by (x0,y0) and (x1,y1) in window
coordinates, the segment with endpoints (x0,y0-(w-1)/2) and
(x1,y1-(w-1/2)) is rasterized, (...)"
The hardware it not rounding the line width, so we should do it.
Also, we should be careful not to go beyond the hardware limits
for the line width after it gets rounded. Gen6-7 define a maximum line
width slightly below 8.0, so we should advertise a maximum line
width lower than 7.5 to make sure that 7.0 is the maximum integer
line width that we can select. Since the line width granularity in these
platforms is 0.125, we choose 7.375. Other platforms advertise rounded
maximum line widths, so those are fine.
Fixes the following 3 dEQP tests:
dEQP-GLES3.functional.rasterization.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.texture_2d.primitives.lines_wide
dEQP-GLES3.functional.rasterization.fbo.rbo_singlesample.primitives.lines_wide
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes:
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_dst_y_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_x_linear
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_nearest
dEQP-GLES3.functional.fbo.blit.rect.out_of_bounds_reverse_src_dst_y_linear
No piglit regressions.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Across the board of the various generations, the intial few atoms in
all of the atom lists are basically the same, (performing uploads for
the various programs). The only difference is that prior to gen6
there's an ff_gs upload in place of the later gs upload.
In this commit, instead of using the atom lists for this program state
upload, we add a new function brw_upload_programs that calls into the
per-stage upload functions which in turn check dirty bits and return
immediately if nothing needs to be done.
This commit is intended to have no functional change. The motivation
is that future code, (such as the shader cache), wants to have a
single function within which to perform various operations before and
after program upload, (with some local variables holding state across
the upload).
It may be worth looking at whether some of the other functionality
currently handled via atoms might also be more cleanly handled in a
similar fashion.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
immediates and uniforms.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes metadata guess when instructions in the program specify a
destination register with non-zero reg_offset and when the payload of
a LOAD_PAYLOAD spans several registers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
MRFs cannot be read from anyway so they cannot possibly be a valid
source of LOAD_PAYLOAD.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It's perfectly fine to read the second half of a register written with
force_writemask_all from a first half MOV instruction or vice versa, and
lower_load_payload shouldn't mark the whole MOV as belonging to the second
half in that case. Replicate the same metadata to both halves of the
destination when writemasking is disabled.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Cc: "10.5" <[email protected]>
Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=540962
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89260
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When under dispatch_width=16 the previous code would allocate 2 registers for
the payload when only one is needed. This manifested itself through bugs on SKL
which needs to mess with this instruction.
Ken though this might impact shader-db, but apparently it doesn't
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89118
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=88999
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Tested-by: Timo Aaltonen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_imm_ud will yield a HW_REG which then will introduce a barrier
for certain optimization opportunities.
No piglit regressions seen with gen8 (simd8vs).
Suggested-by: Matt Turner <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For fragment programs, we pull this mask from the payload header. The same
mask doesn't exist for compute shaders, so we set all bits to enabled.
Previously we were setting 0xff to support SIMD8 VS, but with CS we
support SIMD16, and therefore we change this to 0xffff.
Related commits for SIMD8 VS:
commit d9cd982d556be560af3bcbcdaf62b6b93eb934a5
Author: Ben Widawsky <[email protected]>
Date: Sun Feb 15 20:06:59 2015 -0800
i965/simd8vs: Fix SIMD8 atomics
commit 4a95be9772a255776309f23180519a4a8560f2dd
Author: Jordan Justen <[email protected]>
Date: Tue Feb 17 09:57:35 2015 -0800
i965/simd8vs: Fix SIMD8 atomics (read-only)
Note: this mask is ANDed with the execution mask, so some channels may not end
up issuing the atomic operation.
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
Fixes xlib driver build after e8c5cbfd921680c.
Acked-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|