| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
Based on the description of __assume at:
http://msdn.microsoft.com/en-us/library/1b3fsfxw.aspx
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This started hitting an assertion recently. Only affects Haswell
(Ivybridge doesn't support this meddling with the sampler state pointer,
and ARB_gpu_shader5 is not enabled yet on Broadwell)
14 Piglits crash->pass.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Allows the driver to advertise DMA-BUF and throttling.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Improves performance in GLBenchmark 2.7 TRex by 3.88889% +/- 0.336383%
(n=80) at 1280x720 on Broadwell GT3. Together with the previous patch,
it improves performance by 5.42738% +/- 0.541971% (n=10) at 1920x1080.
Note that without the PMA stall fix, this would instead decrease
performance by 22%.
v2: Update comment (noticed by Kristian Høgsberg).
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Certain non-promoted depth cases typically incur stalls. In very
specific cases, we can enable a workaround which improves performance.
Improves performance in GLBenchmark 2.7 TRex by 1.17762% +/- 0.448765%
(n=75) at 1280x720 on Broadwell GT3.
Haswell has this feature as well, but we can't currently write registers
from userspace batches (and we'd incur additional software batch
scanning overhead as well), so we haven't enabled it. Broadwell allows
us to write CACHE_MODE_1. Backporters beware: the formula and flushing
incantation differs between Haswell and Broadwell.
v2: Move pma_stall_bits from brw->state to brw itself (requested by
Kristian Høgsberg).
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This patch adds macros needed for the HiZ PMA stall optimization.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Matt requested this in review feedback on the original patch, which I
completely missed when pushing this series. Kristian also made this
change, but I grabbed the wrong version of the patch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise, calling glPopAttrib on drivers that don't support
ARB_clip_control gives you a GL error, which is surprising at best.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
We're not programming the clear values yet, so this won't work.
This patch should be (effectively) reverted eventually.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Skylake, the MOCS bits are an index into a table of 63 different,
configurable cache configurations. As for previous GENs, we only care about
WB and WT, which are available in the documented default set. Define
SKL_MOCS_WB and SKL_MOCS_WT to the indices for those configucations and use
those for the Skylake MOCS values.
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Skylake has separate controls for enabling the Z Clip Test for the near
and far planes. For now, maintain the legacy behavior by setting both.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Skylake's 3DSTATE_DS packet has a few more fields; we don't support
domain shaders yet though.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
| |
They are the same as for BDW, so just add a case for SKL to the init switch.
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
SKL updates the resolve rectangle scaling factors again.
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
On SKL, 3DSTATE_CONSTANT_* command is not committed until we give
the corresponding 3DSTATE_BINDING_TABLE_POINTERS_* command. If we
fail to do so, the constant buffers wont be read and push constants
will be wrong.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise they overlap and horrible things happen. All the new DWords
are for fast color clear values, which we don't do yet.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We will need to allocate more DWords on Skylake.
v2: Don't mark brw_context parameter const. It's modified.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Skylake introduces a new base address for a feature we don't yet expose.
Setting these to 0 should be safe.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Skylake uploads the stencil reference values in DW3 of the
3DSTATE_WM_DEPTH_STENCIL packet, rather than in COLOR_CALC_STATE.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Skylake has some extra bits in PIPELINE_SELECT, none of which are
interesting for a 3D driver. In order to selectively change them, it
also introduces new "mask bits" in 15:8. We care about the "Pipeline
Selection" bits (1:0), so set the mask to 0x3.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commands has seen the addition of 2 dwords that allow to specify
which channels of which attributes need to be forwarded to the fragment
shader.
v2: Rebase forward a year (done by Ken).
Signed-off-by: Damien Lespiau <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
| |
The CSE pass now prints out why it thinks a value is not a candidate for
adding to the AE set.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
No differences in shader-db on Haswell (Gen 7.5).
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The optimization in commit d056863b covers these cases, which were the
first optimizations I added to the GLSL compiler.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85683
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85691
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Should trigger CL_INVALID_VALUE if device_list is NULL and num_devices
is greater than zero.
Introduced by e5468dfa523be2a7a0d04bb9efcf8ae780957563
Reported by: EdB
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Between release 3.2 and 3.3 LLVM stopped aligning properly when certain
conditions (no allocas, but large number of vectors causing spills to
the stack, and frame pointer omission enabled).
We were already disabling frame-pointer-omission on several build types,
but we now disable it on all build types.
It's not clear whether this affects 32-bits x86 processes only, or if it
can also affect 64-bits x86_64 processes when AVX registers are
available and used. So disable frame-pointer-omission on both
x86/x86_64 to be on the safe side.
See also:
- http://llvm.org/PR21435
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
To help recognize what's supposed to do.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
AFAICT the number of threads is 80, not 70. I am not sure if Ken knows
something I do not.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Need to do a sqrt().
FWIW, the html that Sphinx 1.1.3 generates for the math expressions
looks completely broken.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
| |
To match tgsi_alloc_tokens().
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Use the new helper functions in the tgsi_transform.h file to emit
declarations and instructions.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
Pass and return tgsi_token buffers instead of pipe_shader_state.
And update softpipe driver (the only user of this function).
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Fixes polygon stipple if both DO_PSTIPPLE_IN_DRAW_MODULE and
DO_PSTIPPLE_IN_HELPER_MODULE are zero/off.
Reviewed-by: Charmaine Lee <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was a regression introduced by
611d66fe4513e53bde052dd2bab95d448c909a2a
Passing a binary program to clBuildProgram() is legal, but passing one
to clCompileProgram() is not.
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This factors out the validation that is common with clBuildProgram().
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
v2:
- Drop dependency on LLVM >= 3.5.1
- Rename si_create_shader() to si_shader_binary_read()
|
|
|
|
|
| |
v2:
- Drop dependency on LLVM >= 3.5.1
|
|
|
|
|
|
|
| |
This adds a query which allows drivers to access the config
information of a specific function within the LLVM generated ELF
binary. This makes it possible for the driver to handle ELF
binaries with multiple kernels / global functions.
|
|
|
|
|
|
| |
It's annoying with octave. Reported by Michael Burian.
Cc: 10.2 10.3 <[email protected]>
|
|
|
|
| |
Signed-off-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We are about to change mesa to spawn threads for deferred glCompileShader and
glLinkProgram, and we need to make sure those threads can send compiler
warnings/errors to the debug output safely.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
glsl_type has several static hash tables and a static ralloc context. They
need to be protected by a mutex as they are not thread-safe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|