path: root/src/gallium
Commit message | Author | Age | Files | Lines
* freedreno/a6xx: improve setup_slices() debug msgs | Rob Clark | 2018-12-22 | 1 | -6/+5
  Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: simplify special case for 3d layout | Rob Clark | 2018-12-22 | 1 | -9/+10
  This logic can be re-written as the two cases for 3d (ie. before/after the miplevel sizes start reducing) vs everything else. I think it is easier to read this way.
  Signed-off-by: Rob Clark <[email protected]>
* freedreno: combine fd_resource_layer_offset()/fd_resource_offset() | Rob Clark | 2018-12-22 | 1 | -13/+2
  We really only need this logic in one place.
  Signed-off-by: Rob Clark <[email protected]>
* gallivm: abort when trying to use non-existing intrinsic | Roland Scheidegger | 2018-12-21 | 1 | -0/+10
  Whenever llvm removes an intrinsic (we're using), we're hitting segfaults due to llvm doing calls to address 0 in the jitted code instead. However, Jose figured out we can actually detect this with LLVMGetIntrinsicID(), so use this to abort, so we don't have to wonder what got broken. (Of course, someone still needs to fix the code to no longer use this intrinsic.)
  Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: don't use pavg.b intrinsic on llvm >= 6.0 | Roland Scheidegger | 2018-12-21 | 2 | -51/+95
  This intrinsic disappeared with llvm 6.0; using it ends up in segfaults (due to llvm issuing a call to a NULL address in the jitted shaders). Add code doing the same thing as the autoupgrade code in llvm so it can be matched and replaced back with a pavgb.
  While here, also improve lp_test_format so it tests both with and without cache (as it was, it tested the cache versions only, whereas cache is actually disabled in llvmpipe, and in any case even with it enabled vertex and geometry shaders wouldn't use it). (Although at least for the unorm8 uncached fetch, the code is still quite different from what llvmpipe is using, since that would use the unorm8x16 type, whereas the test code is using the unorm8x4 type, hence disabling some intrinsic paths.)
  Fixes: 6f4083143bb8 ("gallivm: use llvm jit code for decoding s3tc")
  Reviewed-by: Jose Fonseca <[email protected]>
  Tested-by: Michel Dänzer <[email protected]>
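  The message above does not spell out the replacement sequence. As a rough scalar model, and assuming the usual widen, add, add one, shift right, truncate pattern for a rounding byte average, the per-lane arithmetic could be sketched like this. The function names are hypothetical and this is not the gallivm code, which builds LLVM IR rather than C.

```c
#include <stdint.h>

/* Rounding average of two unsigned bytes: (a + b + 1) >> 1.
 * Widening to 16 bits first keeps the sum from overflowing. */
static uint8_t avg_round_u8(uint8_t a, uint8_t b)
{
    uint16_t wide = (uint16_t)((uint16_t)a + (uint16_t)b + 1u);
    return (uint8_t)(wide >> 1);
}

/* Apply the same operation to each lane of a 16-byte vector. */
static void avg_round_u8x16(uint8_t dst[16],
                            const uint8_t a[16], const uint8_t b[16])
{
    for (int i = 0; i < 16; i++)
        dst[i] = avg_round_u8(a[i], b[i]);
}
```

  Expressing the operation with generic widening arithmetic, instead of a named intrinsic, is what lets a backend pattern-match it back to a single instruction where one exists.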
* pipe-loader: meson: reference correct library | Emil Velikov | 2018-12-13 | 1 | -1/+1
  The library is called libgalliumvl_stub - note singular.
  Fixes: 42ea0631f10 ("meson: build clover")
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Dylan Baker <[email protected]>
* vc4: Hook up perf_debug() output to GL_ARB_debug_output as well. | Eric Anholt | 2018-12-20 | 2 | -0/+3
  This is the right channel to report these things, so that end-users don't need to know each driver's custom debug options.
* vc4: Wire up core pipe_debug_callback | Rhys Kidd | 2018-12-20 | 2 | -0/+14
  This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
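  To illustrate the idea of routing driver performance warnings through a callback installed by the API layer, here is a minimal sketch. The struct and function names are simplified stand-ins invented for this example; the real gallium debug callback has a different signature and more fields.

```c
#include <stdarg.h>
#include <stdio.h>

/* Simplified stand-in for a debug callback (hypothetical layout). */
struct debug_callback {
    void (*message)(void *data, const char *fmt, va_list args);
    void *data;
};

/* Hypothetical helper: forward a formatted performance warning to the
 * callback, if one was installed. */
static void perf_message(const struct debug_callback *cb, const char *fmt, ...)
{
    if (!cb || !cb->message)
        return;

    va_list args;
    va_start(args, fmt);
    cb->message(cb->data, fmt, args);
    va_end(args);
}

/* Example sink: formats the message into a caller-provided buffer,
 * which must be at least 128 bytes. */
static void capture_message(void *data, const char *fmt, va_list args)
{
    vsnprintf((char *)data, 128, fmt, args);
}
```

  With a sink like this installed, the application sees the warning through the standard debug-output channel and does not need a driver-specific environment variable.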
* v3d: Hook up perf_debug() output to GL_ARB_debug_output as well. | Eric Anholt | 2018-12-20 | 2 | -0/+3
  This is the right channel to report these things, so that end-users don't need to know each driver's custom debug options.
* v3d: Wire up core pipe_debug_callback | Rhys Kidd | 2018-12-20 | 2 | -0/+14
  This lets the driver use pipe_debug_message() for GL_ARB_debug_output.
  Signed-off-by: Rhys Kidd <[email protected]>
  Reviewed-by: Eric Anholt <[email protected]>
* v3d: Drop shadow comparison state from shader variant key. | Eric Anholt | 2018-12-20 | 1 | -2/+0
  The shadow state is now in the sampler.
* v3d: Fix simulator mode on i915 render nodes. | Eric Anholt | 2018-12-20 | 1 | -28/+73
  i915 render nodes refuse the dumb ioctls, so the simulator would crash on the original non-apitrace shader-db. Replace them with direct i915 calls if we detect that we're on one of their gem fds.
* gallivm: use llvm jit code for decoding s3tc | Roland Scheidegger | 2018-12-20 | 7 | -383/+2239
  This is (much) faster than using the util fallback. (Note that there are two methods here: one would use a cache, similar to the existing code (although the cache was disabled), except the block decode is done with jit code; the other directly decodes the required pixels. For now don't use the cache (being direct-mapped is suboptimal, but it's difficult to come up with something better which doesn't have too much overhead).)
  Reviewed-by: Jose Fonseca <[email protected]>
* v3d: Load and store aligned utiles all at once. | Eric Anholt | 2018-12-19 | 1 | -8/+114
  This calls the expensive uif offset function once per utile, but it still gets us a 212.218% +/- 2.41216% (n=10) win on 1024x1024 glTexImage over calling it on each pixel.
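  The saving comes from hoisting the address computation out of the per-pixel loop. The sketch below shows that shape only. It assumes a 32bpp format with 4x4-pixel, 64-byte utiles whose interior is in raster order, and `utile_offset()` is a placeholder that lays utiles out in plain raster order, which is not the real UIF layout. All names are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed geometry for a 32bpp format: 4x4 pixels, 64 bytes per utile. */
enum { CPP = 4, UTILE_W = 4, UTILE_H = 4,
       UTILE_BYTES = UTILE_W * UTILE_H * CPP };

/* Counts offset computations so the saving can be observed. */
static unsigned offset_calls;

/* Placeholder for the expensive address computation (NOT real UIF). */
static size_t utile_offset(unsigned ux, unsigned uy, unsigned utiles_per_row)
{
    offset_calls++;
    return ((size_t)uy * utiles_per_row + ux) * UTILE_BYTES;
}

/* Copy a linear image into the tiled buffer one whole utile at a time.
 * width and height must be multiples of the utile size ("aligned"). */
static void store_aligned_utiles(uint8_t *tiled, const uint8_t *linear,
                                 unsigned width, unsigned height)
{
    unsigned utiles_per_row = width / UTILE_W;
    size_t linear_stride = (size_t)width * CPP;

    for (unsigned uy = 0; uy < height / UTILE_H; uy++) {
        for (unsigned ux = 0; ux < utiles_per_row; ux++) {
            /* One offset computation per utile instead of one per pixel. */
            uint8_t *dst = tiled + utile_offset(ux, uy, utiles_per_row);
            const uint8_t *src = linear
                + (size_t)uy * UTILE_H * linear_stride
                + (size_t)ux * UTILE_W * CPP;

            /* Each utile row is contiguous in both layouts. */
            for (unsigned row = 0; row < UTILE_H; row++)
                memcpy(dst + (size_t)row * UTILE_W * CPP,
                       src + (size_t)row * linear_stride,
                       UTILE_W * CPP);
        }
    }
}
```

  Under these assumptions an 8x8 image needs 4 offset computations instead of 64, which is the same 16x reduction per utile that the commit exploits.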
* vc4: Move the utile load/store functions to a header for reuse by v3d. | Eric Anholt | 2018-12-19 | 2 | -202/+11
  These implementations of whole-utile load/stores would be the same for v3d, though the layout of blocks of utiles has changed.
* v3d: Implement texture_subdata to reduce teximage upload copies. | Eric Anholt | 2018-12-19 | 1 | -29/+85
  This lets us store the non-PBO glTexImage data directly into the tiled image without making an extra untiled memcpy for the gallium transfer. Improves 1024x1024 TexImage perf by ~19%, mostly from not thrashing around in the kernel mapping and unmapping the transfer's temporary area.
* v3d: Remove dead prototypes for load/store utile functions. | Eric Anholt | 2018-12-19 | 1 | -2/+0
* v3d: Don't try to create shadow tiled temporaries for 1D textures. | Eric Anholt | 2018-12-19 | 1 | -1/+2
  They're raster order anyway, so we'd assertion fail along with wasting bandwidth.
  Fixes: 6ad9e8690d14 ("v3d: Add support for texturing from linear.")
* v3d: Fix check for TFU job completion in the simulator. | Eric Anholt | 2018-12-19 | 1 | -1/+1
  We're waiting for the jobs-completed count to increment (with wrapping), not to reach its starting state. This mostly ended up working out because the next v3d_hw_tick() for a submit CL would end up doing the TFU operation first, but it did fail when a blit was used for glReadPixels() at the end of a test.
  Fixes: ee0549ff9ab3 ("v3d: Add the V3D TFU submit interface to the simulator.")
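  A completion test of the kind described, "the counter has advanced by one, allowing for wraparound", might look like the following sketch. The counter width (`JOB_COUNT_MASK`) and the function name are assumptions made for illustration; the commit message does not give the register layout.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical width of the jobs-completed counter field. */
#define JOB_COUNT_MASK 0x7fu

/* True once the counter has advanced by one from the value sampled
 * before the job was submitted, taking wraparound into account. */
static bool job_completed(uint32_t count_before, uint32_t count_now)
{
    uint32_t expected = (count_before + 1u) & JOB_COUNT_MASK;
    return (count_now & JOB_COUNT_MASK) == expected;
}
```

  Comparing against `count_before + 1` rather than `count_before` matters because a check against the starting value is already satisfied before the job has run.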
* v3d: Put the dst bo first in the list of BOs for TFU calls. | Eric Anholt | 2018-12-19 | 1 | -2/+2
  In the UAPI, the first BO is the destination, and the one the kernel should do an exclusive reservation on. Currently we only do exclusive reservations, anyway. However, in the simulator path I was only copying back the "destination" BO (actually src in this case), and this caused regressions once I fixed the simulator to actually complete TFU before returning (since otherwise, the TFU op would happen at the start of the next CL submit and the draw would get the right contents).
  Fixes: 976ea90bdca2 ("v3d: Add support for using the TFU to do some blits.")
* winsys/amdgpu: Pull in LLVM CFLAGS | Michel Dänzer | 2018-12-19 | 2 | -1/+2
  Fixes build failure if the LLVM headers aren't in a standard include directory.
  Fixes: ec22dd34c88f "radeonsi: move SI_FORCE_FAMILY functionality to winsys"
  Reviewed-by: Nicolai Hähnle <[email protected]>
* virgl: move resource creation / import / destruction to common code | Gurchetan Singh | 2018-12-19 | 4 | -114/+89
  We can remove some duplicated code.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: move resource metadata into base resource | Gurchetan Singh | 2018-12-19 | 4 | -91/+71
  A resource is just a buffer with some metadata.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: modify how we handle GL_MAP_FLUSH_EXPLICIT_BIT | Gurchetan Singh | 2018-12-19 | 4 | -69/+25
  Previously, we ignored the glUnmap(..) operation and flushed the data before flushing the cbuf. Now, let's just flush the data when we unmap.
  Neither method is optimal, for example:
    glMapBufferRange(.., 0, 100, GL_MAP_FLUSH_EXPLICIT_BIT)
    glFlushMappedBufferRange(.., 25, 30)
    glFlushMappedBufferRange(.., 65, 70)
  We'll end up flushing 25 --> 70. Maybe we can fix this later.
  v2: Add fixme comment in the code (Elie)
  Reviewed-by: Elie Tournier <[email protected]>
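  One way a result like "flushing 25 --> 70" comes about is when the driver keeps a single bounding range per mapping and grows it on every explicit flush. The sketch below shows that bookkeeping in the start/end notation the message uses. The struct and function are hypothetical and are not taken from the virgl source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tracker: one bounding range per mapping. */
struct flush_range {
    bool valid;
    size_t start;
    size_t end;
};

/* Grow the bounding range so that it covers [start, end]. */
static void flush_range_add(struct flush_range *r, size_t start, size_t end)
{
    if (!r->valid) {
        r->start = start;
        r->end = end;
        r->valid = true;
        return;
    }
    if (start < r->start)
        r->start = start;
    if (end > r->end)
        r->end = end;
}
```

  Adding the ranges 25..30 and 65..70 leaves a single range of 25..70, so the untouched bytes in between get flushed as well. Tracking a list of disjoint ranges would avoid that at the cost of more bookkeeping.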
* virgl: make virgl_buffers use resource helpers | Gurchetan Singh | 2018-12-19 | 2 | -20/+11
  We can reuse the helpers we created.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: make transfer code work with PIPE_BUFFER targets | Gurchetan Singh | 2018-12-19 | 1 | -2/+4
  util_format_get_blocksize returns 1 for R8 formats (all PIPE_BUFFERs are R8).
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: consolidate transfer code | Gurchetan Singh | 2018-12-19 | 5 | -59/+73
  We could allocate and destroy transfers in one place.
  v2: Keep l_stride around.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: store layer_stride in metadata | Gurchetan Singh | 2018-12-19 | 2 | -6/+6
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: move vrend_get_tex_image_offset to common code | Gurchetan Singh | 2018-12-19 | 3 | -26/+28
  Will be reused.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: move virgl_resource_layout to common code | Gurchetan Singh | 2018-12-19 | 3 | -42/+51
  Will be reused.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: move texture metadata to common code | Gurchetan Singh | 2018-12-19 | 2 | -12/+18
  Will be reused.
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: remove unnecessary code | Gurchetan Singh | 2018-12-19 | 1 | -3/+0
  With commit 89b479, we moved to tracking buffer cleanliness when binding.
  TEST=dEQP-GLES31.functional.image_load_store.buffer.load_store.r32ui
  Reviewed-by: Elie Tournier <[email protected]>
* virgl: texture_transfer_pool --> transfer_pool | Gurchetan Singh | 2018-12-19 | 6 | -11/+11
  It's used for all types of resources.
  Reviewed-by: Elie Tournier <[email protected]>
* radeonsi: const-ify the si_query_ops | Nicolai Hähnle | 2018-12-19 | 3 | -5/+5
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: split perfcounter queries from si_query_hw | Nicolai Hähnle | 2018-12-19 | 1 | -50/+93
  Remove a level of indirection to make the code more explicit -- should make it easier to follow what's going on.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: factor si_query_buffer logic out of si_query_hw | Nicolai Hähnle | 2018-12-19 | 4 | -110/+99
  This is a move towards using composition instead of inheritance for different query types. This change weakens out-of-memory error reporting somewhat, though this should be acceptable since we didn't consistently report such errors in the first place.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move query suspend logic into the top-level si_query struct | Nicolai Hähnle | 2018-12-19 | 3 | -44/+62
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move remaining perfcounter code into si_perfcounter.c | Nicolai Hähnle | 2018-12-19 | 7 | -766/+643
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: track constant buffer bind history in si_pipe_set_constant_buffer | Nicolai Hähnle | 2018-12-19 | 1 | -2/+3
  Other callers of si_set_constant_buffer don't need it.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use si_set_rw_shader_buffer for setting streamout buffers | Nicolai Hähnle | 2018-12-19 | 1 | -50/+11
  Reduce the number of places that encode buffer descriptors.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add an si_set_rw_shader_buffer convenience function | Nicolai Hähnle | 2018-12-19 | 2 | -45/+64
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: avoid using hard-coded SI_NUM_RW_BUFFERS | Nicolai Hähnle | 2018-12-19 | 1 | -1/+2
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: show the fixed function TCS in debug dumps | Nicolai Hähnle | 2018-12-19 | 1 | -2/+8
  This is rather important for merged VS/TCS as LSHS shaders...
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: const-ify si_set_tesseval_regs | Nicolai Hähnle | 2018-12-19 | 1 | -2/+2
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename SI_RESOURCE_FLAG_FORCE_TILING to clarify its purpose | Nicolai Hähnle | 2018-12-19 | 3 | -4/+4
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: don't set RAW_WAIT for CP DMA clears | Nicolai Hähnle | 2018-12-19 | 1 | -1/+2
  There is never a read-after-write hazard because the command doesn't read.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: use SET_UCONFIG_REG_INDEX packets when available | Nicolai Hähnle | 2018-12-19 | 2 | -5/+15
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_init_draw_functions and make some functions static | Nicolai Hähnle | 2018-12-19 | 4 | -22/+22
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: extract declare_vs_blit_inputs | Nicolai Hähnle | 2018-12-19 | 1 | -18/+25
  Prepare for some later refactoring.
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move SI_FORCE_FAMILY functionality to winsys | Nicolai Hähnle | 2018-12-19 | 2 | -34/+36
  This helps some debugging cases by initializing addrlib with slightly more appropriate settings.
  Reviewed-by: Marek Olšák <[email protected]>