| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
driCreateContextAttribs() emits an error if bit
__DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But,
EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of
robust ES contexts. One requests a robust ES context by setting the
EGL_CONTEXT_OPENGL_ROBUST_ACCESS *attribute*, which Mesa's EGL layer
translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS *bit*.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations. The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing. Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up. To do this,
we use an algorithm similar to the on in split_virtual_grfs.
Reviewed-by: Kristian Høgsberg <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This commit moves us to an instruction based model rather than a
register-based model for indirects. This is more accurate anyway as we
have to emit instructions to resolve the reladdr. It's also a lot simpler
because it gets rid of the recursive reladdr problem by design.
One side-effect of this is that we need a whole new algorithm in
move_uniform_array_access_to_pull_constants. This new algorithm is much
more straightforward than the old one and is fairly similar to what we're
already doing in the FS backend.
Reviewed-by: Kristian Høgsberg <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we have MOV_INDIRECT opcodes, we have all of the size information
we need directly in the opcode. With a little restructuring of the
algorithm used in assign_constant_locations we don't need param_size
anymore. The big thing to watch out for now, however, is that you can have
two ranges overlap where neither contains the other. In order to deal with
this, we make the first pass just flag what needs pulling and handle
assigning pull constant locations until later.
Reviewed-by: Kristian Høgsberg <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We aren't using it anymore.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of using reladdr, this commit changes the FS backend to emit a
MOV_INDIRECT whenever we need an indirect uniform load. We also have to
rework some of the other bits of the backend to handle this new form of
uniform load. The obvious change is that demote_pull_constants now acts
more like a lowering pass when it hits a MOV_INDIRECT.
Reviewed-by: Kristian Høgsberg <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
While we're at it, we also add support for the possibility that the
indirect is, in fact, a constant. This shouldn't happen in the common case
(if it does, that means NIR failed to constant-fold something), but it's
possible so we should handle it.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The subnr field is in bytes so we don't need to multiply by type_sz.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
It should work fine without it and the visitor can set it if it wants.
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This will fix the spurious error message: "Failed to query GPU properties."
that was unintentionally added in cc01b63d730.
This patch changes the function to return an int so that the caller is able to
do stuff based on the return value.
The equivalent of this patch was in the original series that fixed up the
warning, but I dropped it at the last moment. It is required to make the desired
behavior of not warning when trying to query GPU properties from the kernel
unless there is something the user can do about it.
v2: Use strerror (Jason)
Make EINVAL check similar in all places (Ian)
NOTE: Broadwell appears to actually have some issue where the kernel returns
ENODEV when it shouldn't be. I will investigate this separately.
Reported-by: Chris Forbes <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reveiewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The old version of the pass only worked on globals and locals and always
left inputs, outputs, uniforms, etc. alone.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The old GLSL IR based lowering doesn't quite work right in all cases,
and fails several dEQP-GLES31 and Vulkan CTS tests. Jason's new
approach in NIR passes all the tests. There's not likely to be a ton
of advantage to lowering early in GLSL IR anyway, so...switch.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
It's not really doing enough anymore to justify a helper function.
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reveiewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
libasan is never linked to shared objects (which doesn't go well with
-z,defs). It must either be linked to the main executable, or (more
practically for OpenGL drivers) be pre-loaded via LD_PRELOAD.
Otherwise works.
I didn't find anything with llvmpipe. I suspect the fact that the
JIT compiled code isn't instrumented means there are lots of errors it
can't catch.
But for non-JIT drivers, the Address/Leak Sanitizers seem like a faster
alternative to Valgrind.
Usage (Ubuntu 15.10):
scons asan=1 libgl-xlib
export LD_LIBRARY_PATH=$PWD/build/linux-x86_64-debug/gallium/targets/libgl-xlib
LD_PRELOAD=libasan.so.2 any-opengl-application
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Copy and paste error in commit eafeb8db66dae7619ff3cb039706b990d718cba7:
i965/tiled_memcpy: Unroll bytes==64 case.
Signed-off-by: Kristian Høgsberg Kristensen <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The existing code uses SSSE3, and because it isn't compiled in a
separate file compiled with that, it is usually not used (that, of
course, could be fixed...), whereas SSE2 is always present with 64-bit
builds. This should be pretty much as fast as the pshufb version,
albeit those code paths aren't really used on chips without llc in any
case.
v2: fix andnot argument order, add comments
v3: use pshuflw/hw instead of shifts (suggested by Matt Turner), cut comments
v4: [mattst88] Rebase
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
| |
This will make adding SSE2 code a lot cleaner.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Replaces four byte loads and four byte stores with a load, bswap,
rotate, store; or a movbe, rotate, store.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It is incorrect to assume that pixel format is always in BGR byte order.
We need to check bitmask parameters (such as |redMask|) to determine whether
the RGB or BGR byte order is requested.
v2: reformat code to stay within 80 character per line limit.
v3: just fix the byte order problem first and investigate SRGB later.
v4: rebased on top of the GLES3 sRGB workaround fix.
v5: rebased on top of the GLES3 sRGB workaround fix v2.
Signed-off-by: Haixia Shi <[email protected]>
Reviewed-by: Stéphane Marchesin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The XMesaVisual instances freed in the visuals table on display close
are being freed with a free() call, instead of XMesaDestroyVisual(),
causing a memory leak.
Signed-off-by: John Sheu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It is incorrect to assume BGRA byte order for the GLES3 sRGB workaround.
v2: use _mesa_get_srgb_format_linear to handle all formats
Signed-off-by: Haixia Shi <[email protected]>
Reviewed-by: Stéphane Marchesin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes the extra multiply visible to NIR's algebraic optimizations
(for constant reassociation) as well as constant folding. This means
that when the result of sin/cos are multiplied by an constant, we can
eliminate the extra multiply altogether, reducing the cost of the
workaround.
It also means we only have to implement it one place, rather than in
both backends.
This makes INTEL_PRECISE_TRIG=1 cost nothing on GPUTest/Volplosion,
which has a ton of sin() calls, but always multiplies them by an
immediate constant. The extra multiply gets folded away.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
I want to be able to read other fields.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
| |
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It will only end up getting exposed on gen8+ since it requires GL ES
3.1, but it should be ready to go on gen7 when support for GL ES 3.1 is
completed there.
Signed-off-by: Ilia Mirkin <[email protected]>
Tested-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We just never bothered to decode this.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
| |
Bit 15 means "interleave" for most messages, but for SIMD8 messages it
means "use channel masks".
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that we can use the much simpler rgba8_copy function, we don't need to
hand different functions out based on direction.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This splits the two copy functions into three: One for unaligned copies,
one for aligned sources, and one for aligned destinations. Thanks to the
previous commit, we are now guaranteed that the aligned ones will *only*
operate on aligned memory so they should be safe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=93962
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Each of the [de]tiling functions has three mem_copy calls:
1) Left edge to tile boundary
2) Tile boundary to tile boundary in a loop
3) Tile boundary to right edge
Copies 2 and 3 start at a tile edge so the pointer to tiled memory is
guaranteed to be at least 16-byte aligned. Copy 1, on the other hand,
starts at some arbitrary place in the tile so it doesn't have any such
alignment guarantees.
Cc: "11.1 11.2" <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
Now that the check is restricted to gen8+, we should always get back a non-zero
positive value for the EU and subslice counts.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Older gen platforms do not actually return a value for sublice and eu total
(IMO, confusingly) they return -ENODEV. This patch defers the SSEU setup until
we have the actual GPU generation to avoid useless warnings when running on
older platforms with older kernels.
Reported-by: Mark Janes <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were walking over the shader source to figure out which
inputs should be marked flat. Now, we can just pull it out of prog_data.
This is needed for properly setting up 3DSTATE_SF/SBE for Vulkan and it
also means that it will get properly cached.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This is needed by the Vulkan driver
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|