| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
Noticed while looking at other OSMesa bugs.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
| |
Given that we occasionally touch this code and probably nobody really
wants to think about it, introduce a minimal test so that we know we
haven't completely broken OSMesa.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When run in optirun, applications that linked to `libGLX.so` and then
proceeded to querying Mesa for extension strings caused a SEGV in Mesa.
`glXQueryExtensionsString` was calling a chain of functions that
eventually led to `__glXQueryServerString`. This function would call
`xcb_glx_query_server_string` then `xcb_glx_query_server_string_reply`.
The latter for some unknown reason returned `NULL`. Passing this `NULL`
to `xcb_glx_query_server_string_string_length` would cause a SEGV as the
function tried to dereference it.
The reason behind the function returning `NULL` is yet to be determined,
however, simply checking that the ptr is not `NULL` resolves this. A
similar check has been added to `__glXGetString` for completeness sake,
although not immediately necessary.
In addition to that, we stumbled into a similar problem in
`AllocAndFetchScreenConfigs` which tries to access the configs to free
them if `__glXQueryServerString` fails. This, of course, SEGVs, because the
configs are yet to have been allocated. Simply continuing past the configs
if their config ptrs are `NULL` resolves this. We also switch to `calloc`
to make sure that the config ptrs are `NULL` by default, and not some
uninitialized value.
Cc: [email protected]
Fixes: 24b8a8cfe821 "glx: implement __glXGetString, hide __glXGetStringFromServer"
Fixes: cb3610e37c4c "Import the GLX client side library, formerly from xc/lib/GL/glx. Build it "
Reviewed-by: Adam Jackson <[email protected]>
Signed-off-by: Hal Gentz <[email protected]>
|
|
|
|
|
|
| |
It's somewhat annoying that these are so similar for so little benefit.
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The precision for a function return type is now stored in
ir_function_signature. This will later be useful to implement mediump
to float16 lowering. In the meantime it is also useful to catch errors
where a function is redeclared with a different precision.
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is ported from the fragment shader code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is mostly ported from the fragment shader code.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This just adds the CS invocations counter.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This adds the dispatch code. It creates a job for the number
of blocks in the grid, and dispatches them to the threadpool
implementation. The threadpool then calls the JIT code to
execute the coroutines.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This creates the coroutine execution environment and the
main compute shaders that get executed inside it.
Each compute shader block is executed in it's own coroutine
execution shader, which each "thread" being a coroutine executed
inside it in sequence.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
This doesn't actually build any of the shaders yet, but just
builds up the framework necessary to start building the shaders
and variants.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Compute doesn't share dirty state with the fragment pipeline
so create a separate path for it.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This is mostly a port of the fragment shader framework
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
This adds the jit interface for compute shaders, it's based
on the fragment shader one.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
These mirror the fragment shader structs, this is just a framework.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
The compute shader will need it's own context like the frag shader
has, this just introduces the framework struct and allocates/frees
for it in the right places.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
When the code is executing an hits a barrier, it will suspend
the coroutine and return control to the coroutine dispatcher.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
In order to efficiently run a number of compute blocks, use
a threadpool that just allows for jobs with unique sequential
ids to be dispatched.
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
| |
In order to share the texture/image/sampler code with compute
shaders we need to reorg them to be at the front of context
same as draw does for vs/gs sharing.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
coroutines require a proper pass manager, so add the passes
to the correct places
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
| |
These wrap the coroutine intrinsics and also add some higher
level wrappers around coroutine begin, end and suspend procedures
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
This allows the counter value to be forced to a certain value
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We were only handling the modifiers case and not counting the number of
planes in actual planar images.
Fixes Piglit's ext_image_dma_buf_import-export.
Fixes: fc12fd05f56 ("iris: Implement pipe_screen::resource_get_param")
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111509
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We were previously not doing at least some of the checks. This uses the
same logic that is used in glTexImage*.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Each BSD has slightly different sysctl for retrieving per-CPU times.
FreeBSD returns long while NetBSD returns uint64_t. On OpenBSD return
type differs between summation and per-CPU times. DragonFly is
compatible with FreeBSD.
Signed-off-by: Jan Beich <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Based on the vc4 implementation.
Fixes Android RenderEngine::flush() routine:
android.googlesource.com/platform/frameworks/native/+/refs/tags/android-o-mr1-iot-release-smart-clock-fcs/services/surfaceflinger/RenderEngine/RenderEngine.cpp#225
Signed-off-by: Roman Stratiienko <[email protected]>
Reviewed-by: Qiang Yu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Try more aggressive approach with cloning uniform and coord loads.
Uniform load can be inserted into any instruction, so let's do that. ARM site
claim that penalty for cache miss is one clock, so we don't lose anything if
we merge it into instruction that uses the result. As side effect we can also
pipeline it and thus decrease reg pressure.
Do the same for varyings that hold texture coords, but for different reason:
looks like there's a special path for coords that increases precision if
varying that holds it is pipelined. If we don't pipeline it and load coords
from a register its precision is fp16 and thus only 10 bits which is not enough
to accurately sample textures of size 1024 or larger.
Since instruction can hold only one uniform load and one varying load,
node_to_instr now creates a move using helper introduced in previous commit if
slot is already taken. As side effect of this change we can also try to
pipeline texture loads and create a move if attempt fails.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
It can load value from varying directly as well. Also load_regs is the
only op that has a source, so add src_num field to load node and set it
accordingly.
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
| |
Introduce common helper for creating movs to avoid code duplication
Reviewed-by: Erico Nunes <[email protected]>
Signed-off-by: Vasily Khoruzhick <[email protected]>
|
|
|
|
|
|
|
| |
Fixes: 2cf59861a8128a91bfdd ("nir: Add partial redundancy elimination for compares")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
Fixes: 494ecef6b42198ab6c3e ("freedreno: Add support for drm-shim.")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Fixes: 9775894f102535a79186 ("anv: Move size check from anv_bo_cache_import() to caller (v2)")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Fixes: 955c63d3643f30d7db0c ("util/os_file: resize buffer to what was actually needed")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
Fixes: cb0980e69aa921af7086 ("egl: move alloc & init out of _eglBuiltInDriver{DRI2,Haiku}")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes: 4d0b2c7aaac3cf3de5af ("ttn: Update shader->info as we generate code.")
Signed-off-by: Eric Engestrom <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
When lowering from ubo, use the constant base field in the load_uniform
instruction for the constant part of the offset. Doesn't change much
for constant indexing, but this will help for indirect indexing because
constant-folding can't completely clean up the result.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Should shift before splitting 64b iova into dwords
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This looks like clear copy-and-pasteos, and fixes:
dEQP-GLES2.functional.draw.random.40
(on A307 and A630, both tested in the new CI farm)
Reviewed-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We can get all the information we need from NIR. It's slightly less
accurate, but radeonsi doesn't use the extra information. The old code
also overcounted atomic counters, which led to problems when everything
was used at once.
Fixes KHR-GL45.compute_shader.resources-max.
Reviewed-by: Marek Olšák <[email protected]>
|