| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
A couple of functions are missing because there are no implementations of
them yet. These are:
glFramebufferParameteri (from GL_ARB_framebuffer_no_attachments)
glGetFramebufferParameteriv (from GL_ARB_framebuffer_no_attachments)
glMemoryBarrierByRegion
v2: Rebase on updated dispatch_sanity.cpp test.
v3: Add support for glDraw{Arrays,Elements}Indirect in vbo_exec_array.c.
The updated dispatch_sanity.cpp test discovered this omission.
v4: Rebase on glapi changes.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
| |
Cc: "10.6" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90629
Tested-by: Markus Wick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the multiplication's result is unused, except by a conditional_mod,
the destination will be null. Since the final instruction in the lowered
sequence is a partial-write, we can't put the conditional mod on it and
we have to store the full result to a register and do a MOV with a
conditional mod.
Cc: "10.6" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90580
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Starting with GEN8, there is documentation that the multisample state command
must be emitted before the 3DSTATE_WM_HZ_OP command any time the multisample
count changes. The 3DSTATE_WM_HZ_OP packet gets emitted as a result of an
intel_hiz_exec(), which is called upon a fast clear and/or a resolve. This can
happen before the state atoms are checked, and so the multisample state must be
put directly in the function.
v1:
- In v0, I was always emitting the command, but Ken came up with the condition to
determine whether or not the sample count actually changed.
- Ken's recommendation was to set brw->num_multisamples after emitting
3DSTATE_MULTISAMPLE. This doesn't work. I put my best guess as to why in the XXX
(it was causing 7 regressions on BDW).
v2:
Flag NEW_MULTISAMPLE state. As Ken found, in state upload we check for the
multisample change to determine whether or not to emit certain packets. Since
the hiz code doesn't actually care about the number of multisamples, set the
flag and let the later code take care of it.
Jenkins results:
http://otc-mesa-ci.jf.intel.com/view/dev/job/bwidawsk/136/
Fixes around 200 piglit tests on SKL. I'm somewhat surprised that it seems to
have no impact on BDW as the restriction is needed there as well.
Cc: "10.5 10.6" <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Neil Roberts <[email protected]> (v0)
Reviewed-by: Kenneth Graunke <[email protected]> (v2)
|
|
|
|
|
|
|
| |
BRW_NEW_NUM_SAMPLES is sufficient.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This workaround is documented in the 3DSTATE_GS documentation. It
appears to only apply to early steppings of Broadwell and Skylake.
I don't think it ever affected production hardware, so at this point it
probably makes sense to delete it.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Encapsulate the knowledge about how to build the nop table in a new
_mesa_new_nop_table function. This makes it easier for dispatch_sanity
to keep working now and in the future.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Tested-by: Mark Janes <[email protected]>
Cc: 10.6 <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 4bdbb588a9d38 introduced new _glapi_new_nop_table() and
_glapi_set_nop_handler() functions in the glapi dispatcher (which
live in libGL.so). The calls to those functions from context.c
would be undefined (i.e. an ABI break) if the libGL used at runtime
was older.
For the time being, use the old single generic_nop() function for
non-Windows builds to avoid this problem. At some point in the future
it should be safe to remove this work-around. See comments for more
details.
v2: Incorporate feedback from Emil. Use _WIN32 instead of
GLX_DIRECT_RENDERING to control behavior, move comments.
Cc: 10.6 <[email protected]>
Reviewed-and-tested-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using SIMD4x2 on Skylake, the sampler instructions need a message
header to select the correct mode. This was added for most sample
instructions in 0ac4c2727 but the TXF_MCS instruction is emitted
separately and it was missed.
This fixes a bunch of Piglit tests which test texelFetch in a geometry
shader, for example:
spec/arb_texture_multisample/texelfetch/2-gs-sampler2dms
Cc: [email protected]
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "10.5 10.6" <[email protected]>
|
|
|
|
| |
Trivial. Deleted the 2 unneeded lines.
|
|
|
|
|
|
|
|
| |
We build the entire message in the generator so all the MRF writes are
implied.
Cc: "10.5 10.6" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the prog_to_nir pass was directly generating uniform load/store
intrinsics. This converts it to use a single giant "parameters" variable
and we now depend on lowering to get the uniform load/store intrinsics.
One advantage of this is that we now have one code-path after we do the
initial conversion into NIR.
No shader-db changes.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
... since it's always .x, and also always print the subreg offset when
using repctrl.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Passing -module to glibtool causes the resulting library to be called
libSomething.so rather than libSomething.dylib on darwin.
Regardless of whether libOSMesa is a library or a module, it has been used as
the former for quite some time. Update the build to reflect that and
resolve the naming issue.
Cc: "10.5 10.6" <[email protected]>
Signed-off-by: Jeremy Huddleston Sequoia <[email protected]>
[Emil Velikov: Tweak the commit message.]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes regression from commit 5b2d3480f57168d50ad24cf0b8c9244414bd3701
Cc: "10.5 10.6" <[email protected]>
Signed-off-by: Alan Coopersmith <[email protected]>
Reviewed-by: Jeremy Huddleston Sequoia <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we used intrinsic->const_index[1] to represent "the number of
array elements to load" for load/store intrinsics. However, this is set to 1
by every pass that ever creates a load/store intrinsic. Also, while it
might make some sense for registers, it makes no sense whatsoever in SSA.
On top of that, the i965 backend was the only backend to ever support it;
freedreno and vc4 just assert that it's always 1. Let's just delete it.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Connor Abbott <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes a crash in nouveau which can't handle
set_constant_buffer(PIPE_SHADER_TESS_*).
Cc: 10.6 <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From ARB_program_interface_query:
"Note that if an interface enumerates a single active resource list
entry for an array variable (e.g., "a[0]"), a <name> identifying
any array element other than the first (e.g., "a[1]") is not
considered to match."
It doesn't apply to arrays of interface blocks but just to array
variables.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
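The matching rule quoted above can be sketched as follows. This is only an illustration of the rule for array variables, not the Mesa implementation; the function name is hypothetical. It also assumes the companion rule from the same extension that a name matches if appending "[0]" to it yields the enumerated name.

```python
def name_matches_resource(name, resource_name):
    """Return True if `name` identifies the enumerated resource.

    `resource_name` is the string the interface enumerates, e.g. "a[0]"
    for an array variable. Hypothetical helper, for illustration only.
    """
    # An exact match always identifies the resource.
    if name == resource_name:
        return True
    # Assumed companion rule: "a" matches an entry enumerated as "a[0]".
    # Any other element, such as "a[1]", does not match.
    return resource_name.endswith("[0]") and name + "[0]" == resource_name
```

With an entry enumerated as "a[0]", both "a" and "a[0]" match, while "a[1]" does not.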
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This just created extra upkeep and the push to move extern
C's into mesa code would mean a large number of extern's
in core Mesa driver interfaces. The Haiku Gallium renderers
are mostly insulated via the C-based Haiku state tracker.
As any future hardware support in Haiku will be gallium
based, let's just drop swrast.
Haiku has a Mesa 7.12 fork for gcc2 that uses swrast.
This commit fixes the last of the Haiku build issues.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL IR vs. NIR shader-db results for SIMD8 vertex shaders on Broadwell:
total instructions in shared programs: 2742062 -> 2681339 (-2.21%)
instructions in affected programs: 1514770 -> 1454047 (-4.01%)
helped: 5813
HURT: 1120
The gained programs are ARB vertex programs that were previously going
through the vec4 backend. Now that we have prog_to_nir, ARB vertex
programs can go through the scalar backend so they show up as "gained" in
the shader-db results.
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Acked-by: Matt Turner <[email protected]>
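The percentages in the shader-db summary above are plain relative changes in instruction count. A minimal sketch of that arithmetic (the helper name is hypothetical and is not part of shader-db):

```python
def percent_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# Figures taken from the summary above.
total = percent_change(2742062, 2681339)     # shared programs
affected = percent_change(1514770, 1454047)  # affected programs
print(f"{total:.2f}% {affected:.2f}%")
```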
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OLD:
0x00007340: 0x00800000: BLEND:
0x00007344: 0x84202100: BLEND:
NEW:
0x00007340: 0x00800000: BLEND: Alpha blend/test
0x00007344: 0x0000000b84202100: BLEND_ENTRY00:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x0000734c: 0x0000000b84202100: BLEND_ENTRY01:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x00007354: 0x0000000b84202100: BLEND_ENTRY02:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x0000735c: 0x0000000b84202100: BLEND_ENTRY03:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x00007364: 0x0000000b84202100: BLEND_ENTRY04:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x0000736c: 0x0000000b84202100: BLEND_ENTRY05:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x00007374: 0x0000000b84202100: BLEND_ENTRY06:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
0x0000737c: 0x0000000b84202100: BLEND_ENTRY07:
Color Buffer Blend factor ONE,ONE,ONE,ONE (src,dst,src alpha, dst alpha)
function ADD,ADD (color, alpha), Disables: ----
v2: Line length fixes, and const usage (Topi)
Safer initialization of name string (Topi)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch is optional in the series. It does make the output much cleaner, but
there is some risk.
Sample output (v3):
0x00007e80: 0x231d7000: SURF000: 2D R8G8B8A8_UNORM VALIGN4 HALIGN4 Y-tiled
0x00007e84: 0x05000000: SURF000: MOCS: 0x5 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007e88: 0x009f009f: SURF000: 160x160 [AUX_NONE]
0x00007e8c: 0x0000027f: SURF000: 1 slices (depth), pitch: 640
0x00007e90: 0x00000000: SURF000: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007e94: 0x00000000: SURF000: x,y offset: 0,0, min LOD: 0
0x00007e98: 0x00000000: SURF000: AUX pitch: 0 qpitch: 0
0x00007e9c: 0x09770000: SURF000: Clear color: R(0)G(0)B(0)A(0)
0x00007ea0: 0x00001000: SURF000: 0x00001000
0x00007ea4: 0x00000000: SURF000: 0x00000000
0x00007ea8: 0x00000000: SURF000: 0x00000000
0x00007eac: 0x00000000: SURF000: 0x00000000
0x00007e40: 0x234df000: SURF001: 2D R11G11B10_FLOAT VALIGN4 HALIGN16 Y-tiled
0x00007e44: 0x09000000: SURF001: MOCS: 0x9 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007e48: 0x009f009f: SURF001: 160x160 [AUX_CCS_D (Uncompressed, MULTISAMPLE_COUNT=1)]
0x00007e4c: 0x0000027f: SURF001: 1 slices (depth), pitch: 640
0x00007e50: 0x00000000: SURF001: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007e54: 0x00000000: SURF001: x,y offset: 0,0, min LOD: 0
0x00007e58: 0x00000001: SURF001: AUX pitch: 0 qpitch: 0
0x00007e5c: 0x09770000: SURF001: Clear color: R(0)G(0)B(0)A(0)
0x00007e60: 0x0002b000: SURF001: 0x0002b000
0x00007e64: 0x00000000: SURF001: 0x00000000
0x00007e68: 0x0002a000: SURF001: 0x0002a000
0x00007e6c: 0x00000000: SURF001: 0x00000000
v2: Rebased on Topi's recent series which changed around some of the gen8
surface setup code.
v3: Use ralloc_asprintf instead of asprintf to be more friendly to non-GNU
platforms.
Signed-off-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen9 surface state is very similar to the previous generation. The important
changes here are aux mode, and the way clear colors work.
NOTE: There are some things intentionally left out of this decoding.
v2: Redo the string for the aux buffer type to address compressed variants.
v3: Use the shift for compression enable (instead of compression mode) (Topi)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
AFAICT, none of the old data was wrong (the gen7 decoder), but it was missing a
bunch of stuff.
Adds a tick (') to denote the beginning of the surface state for easier reading.
This will be replaced later with some better, but more risky code.
OLD:
0x00007980: 0x23016000: SURF: 2D BRW_SURFACEFORMAT_B8G8R8A8_UNORM
0x00007984: 0x18000000: SURF: offset
0x00007988: 0x00ff00ff: SURF: 256x256 size, 0 mips, 1 slices
0x0000798c: 0x000003ff: SURF: pitch 1024, tiled
0x00007990: 0x00000000: SURF: min array element 0, array extent 1
0x00007994: 0x00000000: SURF: mip base 0
0x00007998: 0x00000000: SURF: x,y offset: 0,0
0x0000799c: 0x09770000: SURF:
0x00007940: 0x231d7000: SURF: 2D BRW_SURFACEFORMAT_R8G8B8A8_UNORM
0x00007944: 0x78000000: SURF: offset
0x00007948: 0x001f001f: SURF: 32x32 size, 0 mips, 1 slices
0x0000794c: 0x0000007f: SURF: pitch 128, tiled
0x00007950: 0x00000000: SURF: min array element 0, array extent 1
0x00007954: 0x00000000: SURF: mip base 0
0x00007958: 0x00000000: SURF: x,y offset: 0,0
0x0000795c: 0x09770000: SURF:
NEW (v1):
0x00007980: 0x23016000: SURF': 2D B8G8R8A8_UNORM VALIGN4 HALIGN4 X-tiled
0x00007984: 0x18000000: SURF: MOCS: 0x18 Base MIP: 0.0 (0 mips) Surface QPitch: 0
0x00007988: 0x00ff00ff: SURF: 256x256 [AUX_NONE]
0x0000798c: 0x000003ff: SURF: 1 slices (depth), pitch: 1024
0x00007990: 0x00000000: SURF: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007994: 0x00000000: SURF: x,y offset: 0,0, min LOD: 0
0x00007998: 0x00000000: SURF: AUX pitch: 0 qpitch: 0
0x0000799c: 0x09770000: SURF: Clear color: ----
0x00007940: 0x231d7000: SURF': 2D R8G8B8A8_UNORM VALIGN4 HALIGN4 Y-tiled
0x00007944: 0x78000000: SURF: MOCS: 0x78 Base MIP: 0 (0 mips) Surface QPitch: ff0000
0x00007948: 0x001f001f: SURF: 32x32 [AUX_NONE]
0x0000794c: 0x0000007f: SURF: 1 slices (depth), pitch: 128
0x00007950: 0x00000000: SURF: min array element: 0, array extent 1, MULTISAMPLE_1
0x00007954: 0x00000000: SURF: x,y offset: 0,0, min LOD: 0
0x00007958: 0x00000000: SURF: AUX pitch: 0 qpitch: 0
0x0000795c: 0x09770000: SURF: Clear color: ----
0x00007920: 0x00007980: BIND0: surface state address
0x00007924: 0x00007940: BIND1: surface state address
v2: Style cleanups (Matt)
Fix aux mode dword 7->6 (Topi)
Use exp2 instead of pow (Matt)
Add dwords 8-12 to the dump
v3: Needed to update the surface format name getter for the change in the first
patch in the series
Signed-off-by: Ben Widawsky <[email protected]>
Cc: Matt Turner <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
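The size and pitch lines in the dump above are consistent with fields stored as "value minus one". A rough sketch of that decoding follows. The bit positions (width in bits 13:0 and height in bits 29:16 of the third dword, pitch in bits 17:0 of the fourth) are assumptions chosen to match the sample output, and the function names are hypothetical.

```python
def decode_surface_size(dword2):
    """Decode (width, height) from the third surface-state dword.

    Assumes width in bits 13:0 and height in bits 29:16, both stored
    as value minus one.
    """
    width = (dword2 & 0x3FFF) + 1
    height = ((dword2 >> 16) & 0x3FFF) + 1
    return width, height


def decode_surface_pitch(dword3):
    """Decode the pitch in bytes, assuming bits 17:0 hold pitch minus one."""
    return (dword3 & 0x3FFFF) + 1
```

For the first surface in the sample, 0x00ff00ff decodes to 256x256 and 0x000003ff to a pitch of 1024.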
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OLD:
0x00007e00: 0x10000000: WM SAMP0: filtering
0x00007e04: 0x000d0000: WM SAMP0: wrapping, lod
0x00007e08: 0x00000000: WM SAMP0: default color pointer
0x00007e0c: 0x00000090: WM SAMP0: chroma key, aniso
NEW:
0x00007e00: 0x10000000: SAMPLER_STATE 0: Disabled = no, Base Mip: 0.0, Mip/Mag/Min Filter: NONE/NEAREST/NEAREST, LOD Bias: 0.0
0x00007e04: 0x000d0000: SAMPLER_STATE 0: Min LOD: 0.0, Max LOD: 13.0
0x00007e08: 0x00000000: SAMPLER_STATE 0: Border Color
0x00007e0c: 0x00000090: SAMPLER_STATE 0: Max aniso: RATIO 2:1, TC[XYZ] Address Control: CLAMP|CLAMP|WRAP
v2: Move GET_BITS macro to here (with paren protection) Ben/Topi
Add const to the sampler pointer (Topi)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
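A bit-extraction helper in the spirit of the GET_BITS macro mentioned in v2 might look like the sketch below. It is an illustration, not the macro from the patch. The Max LOD field position (bits 19:8 of the second dword) and its fixed-point format (8 fractional bits) are assumptions chosen to reproduce the sample output.

```python
def get_bits(dword, high, low):
    """Extract the inclusive bit range [high:low] from a 32-bit value."""
    width = high - low + 1
    return (dword >> low) & ((1 << width) - 1)


def fixed_point_to_float(raw, fraction_bits=8):
    """Convert an unsigned fixed-point field to a float."""
    return raw / float(1 << fraction_bits)


# Second dword of the sample sampler state; field position is an assumption.
max_lod = fixed_point_to_float(get_bits(0x000d0000, 19, 8))
```

Under those assumptions max_lod evaluates to 13.0, which agrees with the "Max LOD: 13.0" line in the NEW output.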
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
0x00007da0: 0xc1da740e: SF_CLIP VP: guardband xmin = -27.306667
0x00007da4: 0x41da740e: SF_CLIP VP: guardband xmax = 27.306667
0x00007da4: 0x41da740e: SF_CLIP VP: guardband ymin = -23.405714
0x00007da8: 0xc1bb3ee7: SF_CLIP VP: guardband ymax = 23.405714
0x00007db0: 0x00000000: SF_CLIP VP: Min extents: 0.00x0.00
0x00007db8: 0x00000000: SF_CLIP VP: Max extents: 299.00x349.00
While here, fix the wrong offsets for the guardband (I didn't check if it used
to be valid on GEN4).
v2: Remove leftover GET_BITS which belongs later in the series. (Topi)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
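The guardband values in the dump are 32-bit floats printed next to their raw dwords. A minimal sketch of that reinterpretation, using only the standard library (the helper name is hypothetical):

```python
import struct


def dword_to_float(dword):
    """Reinterpret a 32-bit unsigned integer as an IEEE-754 single float."""
    return struct.unpack("<f", struct.pack("<I", dword))[0]


# Raw dwords taken from the sample output above.
xmin = dword_to_float(0xC1DA740E)
xmax = dword_to_float(0x41DA740E)
print(f"{xmin:f} {xmax:f}")
```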
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's true that not all surfaces apply for every gen, but for the most part this
is what we want. (The unfortunate case is when we use a valid surface, but not
for the specific GEN).
This was automated with a vim macro.
v2: Shortened common forms such as R8G8B8A8->RGBA8. Note that this makes some of
the sample output in subsequent commits slightly incorrect.
v3: Use the name from the table (Ken). This requires declaring the surface
format array as extern, and declaring the struct in the .h file.
v4: Move the struct back and create a helper function to obtain the name (Ken)
Get rid of the now useless helper in the state_dump.c
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]> (v3)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Recommended-by: Kenneth Graunke <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Ivybridge and Baytrail can't use mach with 2Q quarter control, so just
do it without the accumulator. Stupid accumulator.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
The next commit uses an add(16) with a UW destination with a stride of
2, which needs compression control since it's writing two registers. The
old code would have failed to set compression control correctly.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Ivybridge (and presumably Baytrail) have a bug that prevents this from
working.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Used in the next commit.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen8+'s MUL instruction doesn't ignore the high 16-bits of one source
like on earlier platforms, so we can constant propagate into it without
worry. Integer multiplies (not into the accumulator, which is done for
imul_high) are lowered in lower_integer_multiplication(), so it's safe
there as well.
On Broadwell, fragment shaders only:
total instructions in shared programs: 4377769 -> 4377451 (-0.01%)
instructions in affected programs: 48064 -> 47746 (-0.66%)
helped: 156
On Broadwell, vertex shaders only:
total instructions in shared programs: 2858885 -> 2856313 (-0.09%)
instructions in affected programs: 26380 -> 23808 (-9.75%)
helped: 134
On Broadwell, vertex shaders only (with INTEL_USE_NIR=1):
total instructions in shared programs: 2911688 -> 2865984 (-1.57%)
instructions in affected programs: 1421715 -> 1376011 (-3.21%)
helped: 6186
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
32-bit x 32-bit integer multiplication requires multiple instructions
until Broadwell. This patch just lets us treat the MUL instruction in
the FS backend like it operates on Broadwell, and after optimizations
we lower it into a sequence of instructions on older platforms.
Doing this will allow us to do some extra optimization on integer
multiplies.
Reviewed-by: Jason Ekstrand <[email protected]>
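The arithmetic behind splitting a 32-bit x 32-bit multiply into narrower pieces can be sketched as below. This shows only the identity for the low 32 bits of the product, assuming one operand is split into 16-bit halves. It does not claim to be the instruction sequence the backend emits, and the function name is hypothetical.

```python
MASK32 = 0xFFFFFFFF


def mul32_via_16bit_halves(a, b):
    """Low 32 bits of a * b, using only 32x16-bit partial products.

    `a` and `b` are treated as unsigned 32-bit values.
    """
    low_part = (a * (b & 0xFFFF)) & MASK32   # a * low half of b
    high_part = (a * (b >> 16)) & MASK32     # a * high half of b
    # The high partial product only contributes to the upper 16 bits.
    return (low_part + (high_part << 16)) & MASK32
```

Because a * b = a * b_lo + (a * b_hi) * 2^16, truncating each step to 32 bits gives the same low 32 bits as the full product.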
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, when the MinFilter is GL_LINEAR or GL_NEAREST we hide the
actual miplevel count from the hardware (and we avoid re-creating
the miptree structure with all the levels), since we don't expect
levels other than the base level to be needed. Unfortunately,
GLSL's textureSize() function is an exception to this rule. This
function takes a lod parameter that we need to use to return the
size of the appropriate miplevel (if it exists). The spec only
requires that the miplevel exists, so even if the sampler is
configured with a linear or nearest MinFilter, as long as the user
has uploaded miplevels for the texture, textureSize() should return
the appropriate sizes.
This patch fixes this by exposing the actual miplevel count for all
sampling engine textures while keeping the original implementation
for render targets (for render target textures we do not provide
the miplevel count but the actual LOD we are writing to, so we
want to make sure that we make this the base level).
Fixes 28 dEQP tests in the following category:
dEQP-GLES3.functional.shaders.texture_functions.texturesize.*
Reviewed-by: Ben Widawsky <[email protected]>
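The size that textureSize() is expected to report for a given lod follows the usual mip-chain rule. A minimal sketch, assuming each level halves the previous one and clamps at 1; the function name is hypothetical and this is not the driver code:

```python
def mip_level_size(base_width, base_height, lod):
    """Return (width, height) of mip level `lod` for a 2D texture.

    Each level is half the size of the previous one, never below 1.
    """
    if lod < 0:
        raise ValueError("lod must be non-negative")
    return (max(1, base_width >> lod), max(1, base_height >> lod))
```

For a 256x64 base level, lod 3 gives 32x8. The driver can only answer such queries correctly if the hardware sees the real miplevel count, which is what the patch above exposes.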
|
|
|
|
|
|
|
| |
Found by Coverity.
Reported-by: Ilia Mirkin <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mesa does not (and probably never will) support GL_ARB_geometry_shader4,
so this function will never exist. Having a function that is
exec="skip" and offset="assign" is just weird.
There are still a couple 'exec="skip" offset="assign"' functions
remaining. These remain because we either support GLX protocol for them
(glSampleMaskSGIS and glSamplePatternSGIS) or older DRI drivers still
need them in the dispatch table (glResizeBuffersMESA). The SGIS
functions can be removed later.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
With DSA we can no longer rely on this being done in st_validate_state
in response to the framebuffer bindings having changed.
This fixes the ext_framebuffer_multisample-bitmap piglit test.
Signed-off-by: Fredrik Höglund <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 3687d75 changed the fs_visitor constructors, but it didn't update
all the users. As a result, 'make check' fails.
I added the explicit cast to the gl_program* parameter to make it more
clear which NULL was which.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For scalar GS support, we either need to add a fourth constructor which
takes the GS structures, or combine the existing two and pass the shader
stage.
Given that they're not significantly different, I opted for the latter.
v2: Remove more stuff from the .h file (Jason and Jordan).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Assume that all drivers that advertise support for NPOT textures
are able to support GL 2.0.
v2: Add a comment.
Signed-off-by: Fredrik Höglund <[email protected]>
Reviewed-by: Adam Jackson <[email protected]>
|