| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
On Haswell, this cuts 1-3 instructions from 183 vertex shaders in
"Shadowrun Returns", "Shatter", and "Trine 2." It adds 2 instructions
to a single fragment shader in "Closure."
total instructions in shared programs: 278803 -> 278546 (-0.09%)
instructions in affected programs: 41930 -> 41673 (-0.61%)
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes gles3conform failures in:
ES3-CTS.shaders.uniform_block.single_nested_struct.per_block_buffer_packed
ES3-CTS.shaders.uniform_block.single_nested_struct_array.per_block_buffer_packed
ES3-CTS.shaders.uniform_block.random.scalar_types.7
ES3-CTS.shaders.uniform_block.random.basic_arrays.4
ES3-CTS.shaders.uniform_block.random.basic_arrays.6
ES3-CTS.shaders.uniform_block.random.basic_instance_arrays.2
ES3-CTS.shaders.uniform_block.random.nested_structs.9
ES3-CTS.shaders.uniform_block.random.all_shared_buffer.3
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
Fixes the non-DRI build.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This gathers macros that have been included across components into util so
that the include chain can be more vertical. In particular, this makes
util stand on its own without any dependence whatsoever on the rest of
mesa.
Signed-off-by: "Jason Ekstrand" <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This hash table is used in core Mesa, the GLSL compiler, and the i965
driver, which makes it a good candidate for the new src/util module.
It's much faster than program/hash_table.[ch] (see commit 6991c2922f5
for data), and José's u_hash_table.c has a comment saying Gallium should
probably consider switching to a linear probing hash table at some point.
So this seems like the best candidate for a shared data structure.
Signed-off-by: Kenneth Graunke <[email protected]>
v2 (Jason Ekstrand): Pick up another hash_table use and patch up scons
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a long time, we've wanted a place to put utility code which isn't
directly tied to Mesa or Gallium internals. This patch creates a new
src/util directory for exactly that purpose, and builds the contents as
libmesautil.la.
ralloc seemed like a good first candidate. These days, it's directly
used by mesa/main, i965, i915, and r300g, so keeping it in src/glsl
didn't make much sense.
Signed-off-by: Kenneth Graunke <[email protected]>
v2 (Jason Ekstrand): More realloc uses and some scons fixes
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
both array and index are unsigned types
Signed-off-by: Jan Vesely <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Native integers imply a somewhat different handling of booleans. Instead
of being 1.0/0.0 floats, they are 0 (true) / -1 (false) integers. As such
the original optimization no longer applies.
Reported-by: Glenn Kennard <[email protected]>
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Cc: "10.2" <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]> (v1)
v2: fix src register, use index2D for base of 1
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
In commit 16060c5adcd4d809f97e874fcde763260c17ac18, Eric changed the
code to not relayout just for baselevel changes - only if the range of
miplevels actually increases. So this comment is now wrong.
Notably, the i915 version of the code actually does what the comment
says.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We've moved to using bitshifts (like we did for surface state); nothing
uses the structures anymore.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
These are the last users of struct gen7_sampler_state.
v2: Use a local sampler_state_size variable, to help distinguish the
various 16s (suggested by Topi Pohjolainen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is the last user of the structure.
v2: Use a local variable with a sensible name so people know what 16 is.
(Suggested by Topi Pohjolainen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This simplifies the code, removes use of the old structures, and also
allows us to combine the Gen6 and Gen7+ code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Although the Gen4-6 and Gen7+ variants used different structure types,
they didn't use any of the fields - only the size, which is identical.
So both decoders did exactly the same thing.
Someday we should implement useful decoders for SAMPLER_STATE.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that gen7_sampler_state.c is gone, everything is once again in a
single file.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The code in brw_sampler_state.c now handles all generations; we don't
need the extra Gen7+ only code anymore.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This was the only actual difference between Gen4-6 and Gen7+ in terms of
the values we program. The rest was just mechanical structure
rearrangement.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of stuffing bits directly into the brw_sampler_state structure,
we now store them in local variables, then use brw_emit_sampler_state()
to assemble the packet. This separates the decision about what values
to use from the actual packet emission, which makes the code more
reusable across generations.
v2: Put const on a bunch of local variables and move declarations,
as suggested by Topi Pohjolainen.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This simply assembles all the SAMPLER_STATE fields into their proper bit
locations. Making it work on all generations was easy enough; some of
the fields are even in the same place.
Not used by anything yet, but will be soon. I made it non-static so
BLORP can use it too.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
It doesn't edit the value, and this lets us use const in more places.
Needed to implement Topi's review comments for the next patch.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We'll use these to replace the existing structures.
I've adopted the convention that "BRW" applies to all hardware, and
"GENX" applies starting with generation X, but might be replaced by some
later generation.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes it easy to tell that they're grouped together, and also
improves gdb printing.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
brw_upload_sampler_state_table now handles all generations, so we don't
need the vtable mechanism either.
There's still a lot of code duplication; the next patches will address
that.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This copies a few changes from gen7_upload_sampler_state_table; the next
patch will delete that function.
Gen7+ has per-stage sampler state pointer update packets, so we emit
them as soon as we emit a new table for a stage. On Gen6 and earlier,
we have a single packet, so we delay until we've changed everything
that's going to be changed.
v2: Split 3DSTATE_SAMPLER_STATE_POINTERS_XS packet emission into a
helper function (suggested by Topi Pohjolainen).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Gen4-6 and Gen7+ code is virtually identical, but both use different
structure types. Switching to use a uint32_t pointer and operate on the
number of DWords will make it possible to share code.
It turns out that SURFACE_STATE is the same number of DWords on every
platform currently; it will be easy to handle a change there, though.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Other than this, brw_update_sampler_state only deals with a single
SAMPLER_STATE structure, and doesn't need to know which position it is
in the table. The caller takes care of dealing with multiple surface
states.
Pushing this up a level allows us to drop the ss_index parameter.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
This was copied from the Gen4-6 code, but is unused.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
sdc_offset is produced and consumed in the same function, so there's no
need to store it in the context, nor pass pointers to it through various
call chains.
Saves 128 bytes per brw_stage_state structure, and makes the code
clearer as well.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It's just an array of four floats, and we have an array of four floats,
so this is literally just a memcpy...but with custom structs and strange
macros to give the appearance of doing something more.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
The old one has been inaccurate for years.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the driver was originally written, it only supported texturing in
the pixel shader backend; vertex and geometry shader texturing came much
later. Originally, the pixel shader was referred to as "WM" (the
Windowizer/Masker unit). So, this code happened to only be relevant for
the WM stage, at the time.
However, sampler state really applies to all stages, so putting "wm" in
the filename doesn't make sense. I dropped it in gen7_sampler_state.c;
at this point the asymmetry just trips people up.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The "Min/Mag State Not Equal" bit is supposed to be set when the min/mag
filters or address rounding modes differ. BLORP uses identical min/mag
settings, so the bit should be unset.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
1D array miptrees were being laid out as a 2D texture with 1 slice.
This happened due to the mesa core storing the 1D array slice count in
the height field. On Intel hardware, we want to create a 2D array with
a height of 1 for the 1D array case.
Fixes assertion failure in piglit (gen6, gen8):
spec/glsl-1.30/execution/tex-miplevel-selection textureOffset 1DArrayShadow
In release builds of Mesa, this test was observed to cause a GPU hang
on gen8.
Signed-off-by: Jordan Justen <[email protected]>
Cc: "10.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81450
Tested-by: Ben Widawsky <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
GL_SAMPLE_SHADING is specified as a valid pname for glGet in the
GL_ARB_sample_shading extension. It seems as if we forgot to add it to the
table of pnames.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
| |
mesa/mesa/src/mesa/drivers/dri/common/xmlconfig.c:104:10: warning: #warning "Per application configuration won't work with your OS version." [-Wcpp]
# warning "Per application configuration won't work with your OS version."
Signed-off-by: Yaakov Selkowitz <[email protected]>
Reviewed-by: Jon TURNEY <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This makes some of the UE4 engine demos (Stylized, Mobile Temple)
render correctly, tested on Intel Haswell machine.
Signed-off-by: Tapani Pälli <[email protected]>
Acked-by: Anuj Phogat <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=78716
|
|
|
|
|
|
|
|
|
|
| |
This new name isn't so confusing.
I also changed the gallivm limit, because it looked wrong.
Reviewed-by: Brian Paul <[email protected]>
v2: use sizeof(float[4])
|
|
|
|
|
|
|
|
| |
With MESA_EXTENSION_OVERRIDE=GL_ARB_compute_shader, this fixes piglit:
* arb_compute_shader-minmax
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to a quick micro-benchmark, this new version is 20% faster on my
Haswell laptop.
v2: Removed the XXX note about x86_64 from the comment
v3: Use an intrinsic instead of an __asm__ block. This should give us MSVC
support for free.
v4: Enable it for all x86_64 builds, not just with USE_X86_64_ASM
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Will clarify make the next commit easier to read.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
... to eliminate an ELSE instruction followed immediately by an ENDIF.
instructions in affected programs: 704 -> 700 (-0.57%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|