path: root/src/gallium/drivers/v3d
Commit message (author, date, files changed, lines removed/added)
* nir: rename nir_var_function to nir_var_function_temp (Karol Herbst, 2019-01-19, 1 file, -1/+1)
  Signed-off-by: Karol Herbst
  Acked-by: Jason Ekstrand
  Reviewed-by: Eric Anholt
  Reviewed-by: Kenneth Graunke
  Reviewed-by: Bas Nieuwenhuizen
* v3d: Restructure RO allocations using resource_from_handle. (Eric Anholt, 2019-01-16, 1 file, -29/+38)
  I had bugs in the old path where I was laying out as tiled (so we'd render tiled) but then only allocating space in the shared object for linear rendering. The resource_from_handle makes it so the same layout choices are made in both the import and export scanout cases. Also, fixes a leak of the fd that was tripping up the CTS. Now that we're checking PIPE_BIND_SHARED to choose to use RO, the DRM_FORMAT_MOD_LINEAR check wasn't needed any more. Fixes visual corruption and MMU faults in X in renderonly mode.
  Fixes: bd09bb1629a7 ("v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear.")
* v3d: If the modifier is not known on BO import, default to linear for RO. (Eric Anholt, 2019-01-16, 1 file, -1/+3)
  Part of fixing DRI3 rendering with RO on X11.
  Fixes: e113b21cb779 ("v3d: Add renderonly support.")
* v3d: SHARED but not necessarily SCANOUT buffers on RO must be linear. (Eric Anholt, 2019-01-14, 1 file, -1/+1)
  We don't have a way to talk to RO about modifiers it can do yet, so assume the minimum.
* v3d: Add support for shader_image_load_store. (Eric Anholt, 2019-01-14, 5 files, -2/+196)
  This is only exposed on V3D 4.1+, because we didn't have the TMU write operations for images on 3.3 (To do GLES 3.1 there, you have to lower it to SSBO load/stores, which is a problem to solve later).
* v3d: Add SSBO/atomic counters support. (Eric Anholt, 2019-01-14, 6 files, -1/+102)
  So far I assume that all the buffers get written. If they weren't, you'd probably be using UBOs instead.
* v3d: Drop the GLSL version level. (Eric Anholt, 2019-01-14, 1 file, -1/+1)
  This was an arbitrary "we support lots of stuff" value when I started the driver. However, at 400 we expose OES_gpu_shader5, which claims support for dynamically indexing samplers, which the driver doesn't do yet.
* v3d: Add an isr to the simulator to catch GMP violations. (Eric Anholt, 2019-01-14, 3 files, -0/+39)
  Otherwise, the simulator raises the GMP interrupt and waits for it to be handled, and v3d ends up spinning in v3d_hw_tick(). Aborting right when the violation happens gives us a chance to look at the backtrace of whatever thread triggered the violation.
* v3d: Add support for GL_ARB_framebuffer_no_attachments. (Eric Anholt, 2019-01-14, 3 files, -2/+19)
  Fixes dEQP-GLES31.functional.state_query.integer.max_framebuffer_height_getboolean when GLES3 is enabled.
* v3d: Add support for flushing dirty TMU data at job end. (Eric Anholt, 2019-01-14, 2 files, -0/+20)
  This will be needed for SSBOs and image_load_store.
* v3d: Enable GL_ARB_texture_gather on V3D 4.x. (Eric Anholt, 2019-01-08, 1 file, -0/+5)
  This is part of GLES 3.1, and with the NIR lowering we're now passing the GLES31 testcases.
* nir: rename global/local to private/function memory (Karol Herbst, 2019-01-08, 1 file, -1/+1)
  The naming is a bit confusing no matter how you look at it. Within SPIR-V "global" memory is memory accessible from all threads. glsl "global" memory normally refers to shader thread private memory declared at global scope. As we already use "shared" for memory shared across all threads of a work group, the solution everybody could be happy with is to rename "global" to "private" and use "global" later for memory usually stored within system accessible memory (be it VRAM or system RAM if keeping SVM in mind). glsl "local" memory is memory only accessible within a function, while SPIR-V "local" memory is memory accessible within the same workgroup.
  v2: rename local to function as well
  v3: rename vtn_variable_mode_local as well
  Signed-off-by: Karol Herbst
  Reviewed-by: Jason Ekstrand
* v3d: Fix up VS output setup during precompiles. (Eric Anholt, 2019-01-04, 1 file, -6/+10)
  I noticed that a VS I was debugging was missing all of its output stores -- outputs_written was for POS, VAR0, VAR3, while the shader's variables were POS, VAR9, and VAR12. I'm not sure what outputs_written is supposed to be doing here, but we can just walk the declared variables and avoid both this bug and the emission of extra stvpms for less-than-vec4 varyings.
* v3d: Refactor compiler entrypoints. (Eric Anholt, 2019-01-02, 1 file, -26/+6)
  Before, I had per-stage entrypoints with some helpers shared between them. As I extended for compute shaders and shader-db, it turned out that the other common code in the middle wanted to be shared too.
* v3d: Don't forget to include RT writes in precompiles. (Eric Anholt, 2019-01-02, 1 file, -0/+10)
  Looking at some assembly dumps for an optimization, we were clearly missing important parts of the shader!
* v3d: Fix segfault when failing to compile a program. (Eric Anholt, 2019-01-02, 1 file, -2/+4)
  We'll still fail at draw time, but this avoids a regression in shader-db execution once I enable TLB writes in precompiles.
  Fixes: b38e4d313fc2 ("v3d: Create a state uploader for packing our shaders together.")
* v3d: Add support for requesting the sample offsets. (Eric Anholt, 2018-12-30, 1 file, -0/+22)
* v3d: Hook up some shader-db output to GL_ARB_debug_output. (Eric Anholt, 2018-12-30, 1 file, -0/+12)
  This allows the original shader-db project's run.c runner to parse things easily, and is probably a good thing to have for GL_ARB_debug_output in general. I formatted it more like Intel's so I can mostly reuse their report script.
* v3d: Add a "precompile" debug flag for shader-db. (Eric Anholt, 2018-12-29, 1 file, -0/+76)
  I've been using my apitrace-based shader-db so far, but it's slow (apitrace decompression), intrusive (apitrace windows spamming the screen), and doesn't have much coverage. The original shader-db provides a lot more coverage and compiles faster, at the expense of not having the actual runtime variant key. As v3d has a lot less runtime variation than vc4 did, this tradeoff makes more sense.
* v3d: Hook up perf_debug() output to GL_ARB_debug output as well. (Eric Anholt, 2018-12-20, 2 files, -0/+3)
  This is the right channel to report these things, so that end-users don't need to know each driver's custom debug options.
* v3d: Wire up core pipe_debug_callback (Rhys Kidd, 2018-12-20, 2 files, -0/+14)
  This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
  Signed-off-by: Rhys Kidd
  Reviewed-by: Eric Anholt
* v3d: Drop shadow comparison state from shader variant key. (Eric Anholt, 2018-12-20, 1 file, -2/+0)
  The shadow state is now in the sampler.
* v3d: Fix simulator mode on i915 render nodes. (Eric Anholt, 2018-12-20, 1 file, -28/+73)
  i915 render nodes refuse the dumb ioctls, so the simulator would crash on the original non-apitrace shader-db. Replace them with direct i915 calls if we detect that we're on one of their gem fds.
* v3d: Load and store aligned utiles all at once. (Eric Anholt, 2018-12-19, 1 file, -8/+114)
  This calls the expensive uif offset function once per utile, but it still gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over calling it on each pixel.
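  As a rough illustration of why working per utile helps, the sketch below only counts how often an offset function would be called in each scheme. The helper names are hypothetical and the 4x4-pixel utile size used in the note is an assumption for illustration; neither is taken from the driver.

```c
#include <stddef.h>

/* Hypothetical helper: one offset computation for every pixel. */
static size_t
offset_calls_per_pixel(size_t width, size_t height)
{
        return width * height;
}

/* Hypothetical helper: one offset computation for every utile,
 * rounding up so that partially covered utiles are counted too. */
static size_t
offset_calls_per_utile(size_t width, size_t height,
                       size_t utile_w, size_t utile_h)
{
        size_t utiles_x = (width + utile_w - 1) / utile_w;
        size_t utiles_y = (height + utile_h - 1) / utile_h;

        return utiles_x * utiles_y;
}
```

  Under the assumed 4x4 utile, a 1024x1024 image needs 65536 offset computations instead of 1048576.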
* v3d: Implement texture_subdata to reduce teximage upload copies. (Eric Anholt, 2018-12-19, 1 file, -29/+85)
  This lets us store the non-PBO glTexImage data directly into the tiled image without making an extra untiled memcpy for the gallium transfer. Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around in the kernel mapping and unmapping the transfer's temporary area.
* v3d: Remove dead prototypes for load/store utile functions. (Eric Anholt, 2018-12-19, 1 file, -2/+0)
* v3d: Don't try to create shadow tiled temporaries for 1D textures. (Eric Anholt, 2018-12-19, 1 file, -1/+2)
  They're raster order anyway, so we'd assertion fail along with wasting bandwidth.
  Fixes: 6ad9e8690d14 ("v3d: Add support for texturing from linear.")
* v3d: Fix check for TFU job completion in the simulator. (Eric Anholt, 2018-12-19, 1 file, -1/+1)
  We're waiting for the jobs-completed count to increment (with wrapping), not to reach its starting state. This mostly ended up working out because the next v3d_hw_tick() for a submit CL would end up doing the TFU operation first, but it did fail when a blit was used for glReadPixels() at the end of a test.
  Fixes: ee0549ff9ab3 ("v3d: Add the V3D TFU submit interface to the simulator.")
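  A minimal sketch of the kind of check described above: the job is done once the completed-jobs counter equals the starting value plus one, modulo the counter width. The 8-bit width and both helper names are assumptions for illustration, not the simulator's actual code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed counter width for this sketch: 8 bits. */
#define JOB_COUNT_MASK 0xffu

/* Hypothetical helper: the value the counter holds once one more
 * job has completed, accounting for wraparound. */
static uint32_t
next_job_count(uint32_t start_count)
{
        return (start_count + 1) & JOB_COUNT_MASK;
}

/* Hypothetical helper: true once the counter has advanced by one.
 * Comparing against the starting value itself would report
 * completion before any job had finished. */
static bool
job_completed(uint32_t start_count, uint32_t current_count)
{
        return (current_count & JOB_COUNT_MASK) == next_job_count(start_count);
}
```

  A caller would sample the counter before submitting and then poll until job_completed() returns true.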
* v3d: Put the dst bo first in the list of BOs for TFU calls. (Eric Anholt, 2018-12-19, 1 file, -2/+2)
  In the UAPI, the first BO is the destination, and the one the kernel should do an exclusive reservation on. Currently we only do exclusive reservations, anyway. However, in the simulator path I was only copying back the "destination" BO (actually src in this case), and this caused regressions once I fixed the simulator to actually complete TFU before returning (since otherwise, the TFU op would happen at the start of the next CL submit and the draw would get the right contents).
  Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
* v3d: Drop in a bunch of notes about performance improvement opportunities. (Eric Anholt, 2018-12-14, 2 files, -1/+13)
  These have all been floating in my head, and while I've thought about encoding them in issues on gitlab once they're enabled, they also make sense to just have in the area of the code you'll need to work in.
* v3d: Use the uniform pretty-printer in v3d_write_uniforms()'s debug code. (Eric Anholt, 2018-12-14, 1 file, -1/+3)
  This will be a lot easier than my usual "38400.000000? that looks like a viewport scale" decoding strategy.
* v3d: Move uinfo->data[] dereference to the top of v3d_write_uniforms(). (Eric Anholt, 2018-12-14, 1 file, -15/+13)
  Follows 3954331aff23 ("vc4: Pull uinfo->data[i] dereference out to the top of the loop.") which showed a large performance win for vc4, but also cleans up the code a decent bit.
* v3d: Add support for draw indirect for GLES3.1. (Eric Anholt, 2018-12-14, 2 files, -2/+31)
  In trying to enable compute shaders, I found that a bunch of deqp-gles31's compute stuff wanted to interact with indirect dispatch. This was easy to do on its own.
* v3d: Add safety checks for resource_create(). (Eric Anholt, 2018-12-14, 1 file, -0/+6)
  This should ease my debugging next time I screw it up.
* v3d: Add support for texturing from linear. (Eric Anholt, 2018-12-14, 6 files, -3/+110)
  Just like vc4, we have to support linear shared BOs for X11 on arbitrary displays. When we're faced with a request to texture from one of those, make a shadow image that we copy using the TFU at the start of the draw call.
* v3d: Add support for using the TFU to do some blits. (Eric Anholt, 2018-12-14, 1 file, -42/+129)
  This will be useful in particular for blits from raster to UIF for X11.
* v3d: Don't forget to bump the number of writes when doing TFU ops. (Eric Anholt, 2018-12-14, 1 file, -0/+2)
  generatemipmap is just filling out the rest of the mipmap that's already been written (by a mapping or a draw call), so it didn't matter. As I reuse the TFU code for linear-to-UIF conversions, it'll start mattering.
* v3d: Set up the right stride for raster TFU. (Eric Anholt, 2018-12-14, 1 file, -1/+1)
  I didn't have any raster images in the generatemipmap path, so the pixels-vs-bytes mixup didn't matter here.
* v3d: Don't forget to wait for our TFU job before rendering from it. (Eric Anholt, 2018-12-14, 1 file, -0/+8)
  Otherwise we may race to read old contents. This didn't show up in the CTS and piglit for me, but it did once I started using the TFU to do linear->UIF blits for X11.
  Fixes: 2ebca177dc18 ("v3d: Use the TFU to do generatemipmap.")
* shader-packing (Eric Anholt, 2018-12-07, 1 file, -1/+2)
* tfu (Eric Anholt, 2018-12-07, 1 file, -1/+1)
* v3d: Fix a leak of the transfer helper on screen destroy. (Eric Anholt, 2018-12-07, 1 file, -0/+2)
  Fixes: 7a30517cce8f ("broadcom/vc5: Start adding support for rendering to Z32F_S8X24_UINT.")
* v3d: Add VIR dumping of TMU config p0/p1. (Eric Anholt, 2018-12-07, 1 file, -14/+6)
  I had a bit of it for V3D 3.x, but didn't update it for 4.x.
* v3d: Garbage collect unused uniforms code. (Eric Anholt, 2018-12-07, 1 file, -88/+0)
* v3d: Split most of TEXTURE_SHADER_STATE setup out of sampler views. (Eric Anholt, 2018-12-07, 1 file, -58/+69)
  For shader image load/store, we want most of this logic to be shared.
* v3d: Avoid confusing auto-indenting in TEXTURE_SHADER_STATE packing (Eric Anholt, 2018-12-07, 1 file, -4/+4)
  Having "v3dx_pack() {" under each #if branch would confuse emacs's indenter.
* v3d: Fix handling of texture first_layer offsets for 3D textures. (Eric Anholt, 2018-12-07, 1 file, -5/+5)
  I think this bug predated adding v3d_layer_offset(). Noticed during an unrelated refactor.
* v3d: Return the right gl_SampleMaskIn[] value. (Eric Anholt, 2018-12-07, 1 file, -8/+0)
  It's supposed to be the dispatched sample mask for this pixel, not the GL state's sample mask.
* v3d: Don't forget to flush writes to UBOs. (Eric Anholt, 2018-12-07, 2 files, -5/+16)
  If someone did TF into a UBO, we might have left the TF job un-flushed at the point of reading.
* v3d: Make an array for frag/vert texture state in the context. (Eric Anholt, 2018-12-07, 7 files, -42/+21)
  This simplifies a bunch of our texture handling, while introducing the slots necessary for adding new shader stages.