| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This was the only actual difference between Gen4-6 and Gen7+ in terms of
the values we program. The rest was just mechanical structure
rearrangement.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of stuffing bits directly into the brw_sampler_state structure,
we now store them in local variables, then use brw_emit_sampler_state()
to assemble the packet. This separates the decision about what values
to use from the actual packet emission, which makes the code more
reusable across generations.
v2: Put const on a bunch of local variables and move declarations,
as suggested by Topi Pohjolainen.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This simply assembles all the SAMPLER_STATE fields into their proper bit
locations. Making it work on all generations was easy enough; some of
the fields are even in the same place.
Not used by anything yet, but will be soon. I made it non-static so
BLORP can use it too.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
It doesn't edit the value, and this lets us use const in more places.
Needed to implement Topi's review comments for the next patch.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We'll use these to replace the existing structures.
I've adopted the convention that "BRW" applies to all hardware, and
"GENX" applies starting with generation X, but might be replaced by some
later generation.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes it easy to tell that they're grouped together, and also
improves gdb printing.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
brw_upload_sampler_state_table now handles all generations, so we don't
need the vtable mechanism either.
There's still a lot of code duplication; the next patches will address
that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies a few changes from gen7_upload_sampler_state_table; the next
patch will delete that function.
Gen7+ has per-stage sampler state pointer update packets, so we emit
them as soon as we emit a new table for a stage. On Gen6 and earlier,
we have a single packet, so we delay until we've changed everything
that's going to be changed.
v2: Split 3DSTATE_SAMPLER_STATE_POINTERS_XS packet emission into a
helper function (suggested by Topi Pohjolainen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Gen4-6 and Gen7+ code is virtually identical, but both use different
structure types. Switching to use a uint32_t pointer and operate on the
number of DWords will make it possible to share code.
It turns out that SURFACE_STATE is the same number of DWords on every
platform currently; it will be easy to handle a change there, though.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Other than this, brw_update_sampler_state only deals with a single
SAMPLER_STATE structure, and doesn't need to know which position it is
in the table. The caller takes care of dealing with multiple surface
states.
Pushing this up a level allows us to drop the ss_index parameter.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
This was copied from the Gen4-6 code, but is unused.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
sdc_offset is produced and consumed in the same function, so there's no
need to store it in the context, nor pass pointers to it through various
call chains.
Saves 128 bytes per brw_stage_state structure, and makes the code
clearer as well.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's just an array of four floats, and we have an array of four floats,
so this is literally just a memcpy...but with custom structs and strange
macros to give the appearance of doing something more.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The old one has been inaccurate for years.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the driver was originally written, it only supported texturing in
the pixel shader backend; vertex and geometry shader texturing came much
later. Originally, the pixel shader was referred to as "WM" (the
Windowizer/Masker unit). So, this code happened to only be relevant for
the WM stage, at the time.
However, sampler state really applies to all stages, so putting "wm" in
the filename doesn't make sense. I dropped it in gen7_sampler_state.c;
at this point the asymmetry just trips people up.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The "Min/Mag State Not Equal" bit is supposed to be set when the min/mag
filters or address rounding modes differ. BLORP uses identical min/mag
settings, so the bit should be unset.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1D array miptrees were being laid out as a 2D texture with 1 slice.
This happened due to the mesa core storing the 1D array slice count in
the height field. On Intel hardware, we want to create a 2D array with
a height of 1 for the 1D array case.
Fixes assertion failure in piglit (gen6, gen8):
spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow
In release builds of Mesa, this test was observed to cause a GPU hang
on gen8.
Signed-off-by: Jordan Justen <[email protected]>
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450
Tested-by: Ben Widawsky <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
GL_SAMPLE_SHADING is specified as a valid pname for glGet in the
GL_ARB_sample_shading extension. It seems as if we forgot to add it to the
table of pnames.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
mesa/mesa/src/mesa/drivers/dri/common/xmlconfig.c:104:10: warning: #warning "Per application configuration won't work with your OS version." [-Wcpp]
# warning "Per application configuration won't work with your OS version."
Signed-off-by: Yaakov Selkowitz <[email protected]>
Reviewed-by: Jon TURNEY <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes some of the UE4 engine demos (Stylized, Mobile Temple)
render correctly, tested on Intel Haswell machine.
Signed-off-by: Tapani Pälli <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78716
|
|
|
|
|
|
|
|
|
|
| |
This new name isn't so confusing.
I also changed the gallivm limit, because it looked wrong.
Reviewed-by: Brian Paul <[email protected]>
v2: use sizeof(float[4])
|
|
|
|
|
|
|
|
| |
With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
* arb_compute_shader-minmax
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to a quick micro-benchmark, this new version is 20% faster on my
Haswell laptop.
v2: Removed the XXX note about x86_64 from the comment
v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC
support for free.
v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Will clarify make the next commit easier to read.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
... to eliminate an ELSE instruction followed immediately by an ENDIF.
instructions in affected programs: 704 -> 700 (-0.57%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Since intel is always going to be little-endian,
GL_UNSIGNED_INT_8_8_8_8_REV is the same as GL_UNSIGNED_BYTE for RGBA and
BGRA textures, so the same acceleration code will work. We might as well
use it.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Obvious copy-and-paste bug.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Piglit's EXT_framebuffer_multisample/alpha-to-coverage-dual-src-blend
test, key->nr_color_regions == 2, but the dual source blend FB write has
ir->target set to 0. So we failed to set "Last Render Target Select" on
any FB write message.
We only emit one FB write per render target, so my comment about setting
LastRT on every FB write directed at the last color region is a bit...
misinformed. According to the documentation, depth buffer writes and
scoreboard updates happen on the FB write with LastRT set, so I believe
we want to set it only once.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
| |
Largely via copy and paste.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will be useful for INTEL_DEBUG=optimizer in the vec4 backend, which
needs to know whether it's currently processing a VS or GS. It isn't
worth adding virtual methods for this case.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Dropping this helps most lines fit in an 80 column terminal. The
absence of WE_normal also helps call attention to WE_all, where
something unusual is going on.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds an implementation of the ClearTexSubImage driver entry point that tries
to set up an FBO to render to the texture and then calls glClearBuffer with a
scissor to perform the actual clear. If an FBO can't be created for the
texture then it will fall back to using _mesa_store_ClearTexSubImage.
When used in combination with _mesa_store_ClearTexSubImage this should provide
an implementation that works for all DRI-based drivers. However as this has
only been tested with the i965 driver it is currently only enabled there.
v2: Only enable the extension for the i965 driver instead of all DRI drivers.
Remove an unnecessary goto. Don't require GL_ARB_framebuffer_object. Add
some more comments.
v3: Use glClearBuffer* to avoid having to modify glClearColor and friends.
Handle sRGB textures. Explicitly disable dithering.
Reviewed-by: Topi Pohjolainen <topi.pohjolainen at intel.com>
|
|
|
|
|
|
|
|
|
| |
The Meta implementation of glClearTexSubImage is going to want to ensure that
dithering is disabled so that it can get a consistent color across the whole
texture when clearing. This adds a state flag to easily save it and set it to
the default value when performing meta operations.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Adds an implmentation of the ClearTexSubImage driver entry point that just
maps the texture and writes the values in. The extension is not yet enabled by
default because it doesn't work with multisample textures as they don't have a
simple linear layout.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds the driver entry point for glClearTexSubImage and fills in the
_mesa_ClearTexImage and _mesa_ClearTexSubImage functions that call it.
v2: Don't clear some of the images if only one of them makes an error
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
In texture_error_check() there was a snippet of code to check whether the
given format and internal format are basically compatible. This has been split
out into its own static helper function so that it can be used by an
implementation of glClearTexImage too.
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Neil Roberts <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
because float depth texture data needs clamping to [0.0, 1.0]. Let the
_mesa_texstore() fallback to slower path.
Fixes Khronos GLES3 CTS tests:
shadow_execution_vert
shadow_execution_frag
V2: Move the check to _mesa_texstore_can_use_memcpy() function.
Add check for floating point data types.
Cc: <[email protected]>
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
We actually want to use mov(16), not mov(8).
Fixes 7 Piglit tests: ARB_sample_shading/builtin-gl-sample-mask [2468]
and ARB_sample_shading/builtin-gl-sample-mask-simple [468].
Signed-off-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We might be able to do this without an extra program key field, but this
is non-invasive and fixes the bug, for now.
This fixes the following Piglit tests on Broadwell:
- ARB_sample_shading/builtin-gl-sample-id 2
- ARB_sample_shading/builtin-gl-sample-position 2
- EXT_framebuffer_multisample/multisample-blit 2 color
- EXT_framebuffer_multisample/multisample-blit 2 color linear
- EXT_framebuffer_multisample/multisample-blit 2 depth
- EXT_framebuffer_multisample/no-color 2 depth combined
- EXT_framebuffer_multisample/no-color 2 depth separate
- EXT_framebuffer_multisample/no-color 2 depth single
- EXT_framebuffer_multisample/no-color 2 depth-computed combined
- EXT_framebuffer_multisample/no-color 2 depth-computed separate
- EXT_framebuffer_multisample/no-color 2 depth-computed single
- EXT_framebuffer_multisample/unaligned-blit 2 color msaa
- EXT_framebuffer_multisample/unaligned-blit 2 depth msaa
Signed-off-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=80991
Reviewed-by: Matt Turner <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise, the performance warning for shader recompiles will just say
"something else".
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
It doesn't exist, so attempting to read it will trigger generation
assertions in the brw_inst API.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Printing the hex offsets makes it basically impossible to diff assembly:
if you add even a single instruction, the entire shader shows up as a
difference. So, every time I want to compare assembly, I have to strip
this out.
The hex offsets might be useful when debugging compaction, or when
inspecting the program cache buffer. Since it's occasionally useful,
but uncommon, this patch disables it by default, but makes it easy to
re-enable it temporarily when the need arises.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Avoids regenerating it unnecessarily.
Every program in shader-db improved, none by an amount less than a 1/3
reduction. One Dota2 shader decreased from 62 -> 24.
cfg calculations: 429492 -> 193197 (-55.02%)
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Will let us abstract how the instructions are stored.
Reviewed-by: Topi Pohjolainen <[email protected]>
|