path: root/src/gallium
Commit message / Author / Age / Files / Lines
* freedreno/a6xx: Share shader state constructor and destructorKristian H. Kristensen2019-09-186-190/+76
| | | | | | | Also, swap the vs and fs constructor order so fs comes first. Signed-off-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a6xx: Track location of gl_Position out as we link itKristian H. Kristensen2019-09-181-3/+5
| | | | | | | | | | | | | | When using xfb and rasterizing, the fragment shader may have fewer inputs than the vertex shader outputs. We can't rely on gl_Position to be placed at fs->total_in, but have to instead remember where we add it in the link map and use that location. Fixes 100+ tessellation dEQPs under dEQP-GLES31.functional.tessellation.primitive_discard.* dEQP-GLES31.functional.tessellation.user_defined_io.* Reviewed-by: Eric Anholt <[email protected]>
* iris: Avoid uploading SURFACE_STATE descriptors for UBOs if possibleKenneth Graunke2019-09-183-17/+53
| | | | | | | | | | | | | | | | | | If we can entirely push uniform data, we don't need a SURFACE_STATE descriptor for pulling data. Since constant uploads are a very common operation, and being able to push all data is also very common, we would like to avoid the overhead in this case. This patch defers uploading new descriptors. Instead of handling that at iris_set_constant_buffer, we do it at iris_update_compiled_shaders, where we can see the currently bound shader variants. If any need pull descriptors, and descriptors are missing, we update them and flag that the binding table also needs to be refreshed. Improves performance in GFXBench5 gl_driver2 on an i7-6770HQ by 31.9774% +/- 1.12947% (n=15). Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
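The deferral described in this message can be sketched roughly as follows. This is a simplified illustration, not the iris code: every `example_*` name and field is hypothetical, and a single constant-buffer slot stands in for the real per-stage state.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified state; these are not the real iris structs. */
struct example_shader {
    bool needs_pull;          /* false when all uniform data can be pushed */
};

struct example_stage {
    bool cbuf_bound;          /* a constant buffer is bound */
    bool cbuf_has_descriptor; /* a descriptor has been uploaded for it */
    struct example_shader *shader;
    bool bindings_dirty;      /* binding table needs a refresh */
    int descriptor_uploads;   /* counts the expensive operation */
};

/* Binding only records the buffer; the descriptor upload is deferred. */
static void example_set_constant_buffer(struct example_stage *st)
{
    st->cbuf_bound = true;
    st->cbuf_has_descriptor = false;
}

/* Once the bound shader variant is known, upload the descriptor only if
 * that variant pulls data and the descriptor is still missing. */
static void example_update_compiled_shaders(struct example_stage *st)
{
    assert(st->shader != NULL);
    if (st->shader->needs_pull && st->cbuf_bound && !st->cbuf_has_descriptor) {
        st->cbuf_has_descriptor = true;
        st->descriptor_uploads++;
        st->bindings_dirty = true;
    }
}
```

With a push-only shader bound, repeated constant uploads never touch the descriptor path in this sketch; the cost is paid once, and only when a variant that pulls is bound.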
* iris: Track per-stage bind history, reduce work accordinglyKenneth Graunke2019-09-184-6/+16
| | | | | | | | | | | | | | We now track per-stage bind history for constant and shader buffers, shader images, and sampler views by adding an extra res->bind_stages field to go with res->bind_history. This lets us flag IRIS_DIRTY_CONSTANTS for only the specific stages involved, and also skip some CPU overhead in iris_rebind_buffer. Cuts 4% of 3DSTATE_CONSTANT_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
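A minimal sketch of the per-stage tracking idea, assuming a plain bitmask per resource. Only the field names `bind_history` and `bind_stages` come from the message; the enum, the bind-type bit, and the helper functions are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stage enumeration and bind-type bit, for illustration only. */
enum example_stage { EX_VS, EX_TCS, EX_TES, EX_GS, EX_FS, EX_CS, EX_STAGE_COUNT };
#define EX_BIND_CONSTANT_BUFFER (1u << 0)

struct example_resource {
    uint32_t bind_history; /* which kinds of binding this resource has seen */
    uint32_t bind_stages;  /* bitmask of stages it has been bound to */
};

/* Record both the kind of binding and the stage it was bound to. */
static void example_note_bind(struct example_resource *res,
                              uint32_t bind_type, enum example_stage stage)
{
    assert(stage < EX_STAGE_COUNT);
    res->bind_history |= bind_type;
    res->bind_stages |= 1u << stage;
}

/* Stages whose constants should be flagged dirty when the resource changes.
 * Conservative: bind_stages covers every kind of binding, not only UBOs. */
static uint32_t example_dirty_constant_stages(const struct example_resource *res)
{
    if (!(res->bind_history & EX_BIND_CONSTANT_BUFFER))
        return 0;
    return res->bind_stages;
}
```

A buffer that was only ever bound as a fragment-shader constant buffer then dirties one stage instead of all six.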
* iris: Don't flag IRIS_DIRTY_BINDINGS for constant usage historyKenneth Graunke2019-09-181-2/+1
| | | | | | | | | | | | | | | | | | | The underlying buffer isn't changing - so we don't need to update any SURFACE_STATE descriptors - we just might have new constants, meaning we need to re-emit 3DSTATE_CONSTANT_XS. On Gen9, this means we need to update 3DSTATE_BINDING_TABLE_POINTERS_XS too, but that's now handled by the explicit check in the previous patch. On Gen9, this should cause us to re-emit the binding table /pointer/ on writing to a buffer with PIPE_BIND_CONSTANT_BUFFER, rather than emitting a whole new /table/. On Gen8 and Gen11, this avoids binding table churn altogether. Cuts 61% of 3DSTATE_BINDING_TABLE_POINTERS_XS packets in a Shadow of Mordor trace on Icelake. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* iris: Explicitly emit 3DSTATE_BTP_XS on Gen9 with DIRTY_CONSTANTS_XSKenneth Graunke2019-09-181-1/+6
| | | | | | | | | | | | | Right now, we usually flag both IRIS_DIRTY_{CONSTANTS,BINDINGS}_XS, because we have SURFACE_STATE for constant buffers in case the shaders access them via pull mode. But this flagging is overkill in many cases. Gen8 and Gen11 don't need it at all. Gen9 doesn't need that large of a hammer in all cases. Just handle it explicitly so the right thing happens. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* iris: Flag IRIS_DIRTY_BINDINGS_XS on constant buffer rebindsKenneth Graunke2019-09-181-1/+2
| | | | | | | We upload a new SURFACE_STATE for the UBO/SSBO in question, which means that we need new binding tables as well. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radeonsi: include drm_fourcc.h to fix the buildMarek Olšák2019-09-181-0/+1
|
* radeonsi: implement pipe_screen::resource_get_paramMarek Olšák2019-09-181-22/+78
| | | | | | v2: return DRM_FORMAT_MOD_INVALID from the function Reviewed-by: Kenneth Graunke <[email protected]> (v1)
* gallium: extend resource_get_param to be as capable as resource_get_handleMarek Olšák2019-09-187-16/+56
| | | | | Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* ac: move ac_get_num_physical_vgprs into radeon_infoMarek Olšák2019-09-181-3/+3
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move ac_get_num_physical_sgprs into radeon_infoMarek Olšák2019-09-181-2/+2
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move ac_get_max_wave64_per_simd into radeon_infoMarek Olšák2019-09-181-1/+1
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move num_sdp_interfaces into radeon_infoMarek Olšák2019-09-181-15/+1
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: move PBB MAX_ALLOC_COUNT into radeon_infoMarek Olšák2019-09-181-31/+1
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* etnaviv: fix two-sided stencilJonathan Marek2019-09-185-30/+44
| | | | | | | | | | | | | * Set missing STENCIL_CONFIG_EXT2 bits * Swap stencil sides when rendering CCW Fixes following deqp tests (which were 99% failing): dEQP-GLES2.functional.fragment_ops.depth_stencil.* Note: deqp tests require --deqp-gl-config-name=rgba8888d24s8ms0 Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* panfrost: Allocate tiler and scratchpad BOs per-batchBoris Brezillon2019-09-184-41/+68
| | | | | | | | | If we want to execute several batches in parallel they need to have their own tiler and scratchpad BOs. Let's move those objects to panfrost_batch and allocate them on a per-batch basis. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add FBO BOs to batch->bos earlierBoris Brezillon2019-09-184-3/+17
| | | | | | | | | | If we want the batch dependency tracking to work correctly we must make sure all BOs are added to the batch->bos set early enough. Adding FBO BOs when generating the fragment job is clearly too late. Add a panfrost_batch_add_fbo_bos helper and call it in the clear/draw path. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add the panfrost_batch_create_bo() helperBoris Brezillon2019-09-184-25/+28
| | | | | | | | This helper automates the panfrost_bo_create()+panfrost_batch_add_bo()+ panfrost_bo_unreference() sequence that's done for all per-batch BOs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't return imported/exported BOs to the cacheBoris Brezillon2019-09-182-0/+9
| | | | | | | | We don't know who else is using the BO in that case, and thus shouldn't re-use it for something else. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
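The rule in this message can be illustrated with a small reference-counting sketch. It assumes a fixed-size cache and a single `shared` flag covering both imported and exported BOs; all `example_*` names are hypothetical and do not mirror the panfrost structures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical buffer object: just a refcount and a sharing flag. */
struct example_bo {
    int refcnt;
    bool shared;   /* imported from or exported to another user */
};

#define EX_CACHE_SLOTS 4
struct example_cache {
    struct example_bo *slots[EX_CACHE_SLOTS];
    int count;
};

/* Refuse shared BOs: someone else may still be using the memory. */
static bool example_cache_put(struct example_cache *cache, struct example_bo *bo)
{
    if (bo->shared || cache->count == EX_CACHE_SLOTS)
        return false;
    cache->slots[cache->count++] = bo;
    return true;
}

/* Drop one reference. Returns true only if the BO was actually freed;
 * false means it is still referenced or was parked in the cache. */
static bool example_bo_unreference(struct example_cache *cache, struct example_bo *bo)
{
    assert(bo->refcnt > 0);
    if (--bo->refcnt > 0)
        return false;
    if (example_cache_put(cache, bo))
        return false;
    free(bo);
    return true;
}
```

Private BOs are recycled through the cache, while shared ones are always released, so a cached BO is never handed out while another process may still hold it.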
* panfrost: Add panfrost_bo_{alloc,free}()Boris Brezillon2019-09-181-76/+68
| | | | | | | | | Thanks to that we avoid the recursive call into panfrost_bo_create() and we can get rid of panfrost_bo_release() by inlining the code in panfrost_bo_unreference(). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop using panfrost_bo_release() outside of pan_bo.cBoris Brezillon2019-09-184-7/+8
| | | | | | | | | | | | | | | panfrost_bo_unreference() should be used instead. The only difference caused by this change is that the scratchpad, tiler_heap and tiler_dummy BOs are now returned to the cache instead of being freed when a context is destroyed. This is only a problem if we care about context isolation, which apparently is not the case since transient BOs are already returned to the per-FD cache (and all contexts share the same address space anyway, so enforcing context isolation is almost impossible). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop passing screen around for BO operationsBoris Brezillon2019-09-187-37/+37
| | | | | | | | Store a screen pointer in panfrost_bo so we don't have to pass a screen object to all functions manipulating the BO. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Don't check if BO is mmaped before calling panfrost_bo_mmap()Boris Brezillon2019-09-181-5/+1
| | | | | | | panfrost_bo_mmap() already takes care of that. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop exposing panfrost_bo_cache_{fetch,put}()Boris Brezillon2019-09-182-8/+2
| | | | | | | | They are not expected to be called directly, users should use panfrost_bo_{create,release}() instead. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move the BO API to its own headerBoris Brezillon2019-09-1816-74/+112
| | | | | | | | Right now, the BO API is spread over pan_{allocate,resource,screen}.h. Let's move all BO related definitions to a separate header file. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: s/PAN_ALLOCATE_/PAN_BO_/Boris Brezillon2019-09-187-19/+19
| | | | | | | | Change the prefix for BO allocation flags to make it consistent with the rest of the BO API. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move panfrost_bo_{reference,unreference}() to pan_bo.cBoris Brezillon2019-09-182-19/+20
| | | | | | | | This way we have all BO related functions placed in the same source file. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of pan_drm.cBoris Brezillon2019-09-1812-444/+382
| | | | | | | | | | | | | | | | pan_drm.c was only meaningful when we were supporting 2 kernel drivers (mali_kbase, and the drm one). Now that there's no kernel-driver abstraction we're better off moving those functions where they belong: * BO related functions in pan_bo.c * fence related functions + query_gpu_version() in pan_screen.c * submit related functions in pan_job.c While at it, we rename the functions according to the place they're being moved to. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop passing has_draws to panfrost_drm_submit_vs_fs_batch()Boris Brezillon2019-09-183-5/+4
| | | | | | | | has_draws can be inferred directly from the batch->last_job value, no need to pass it around. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Kill a useless memset(0) in panfrost_create_context()Boris Brezillon2019-09-181-1/+0
| | | | | | | | ctx is allocated with rzalloc() which takes care of zero-ing the memory region. No need to call memset(0) on top. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
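The same point holds for any zeroing allocator. The sketch below uses the standard `calloc()` as a stand-in for `rzalloc()` (an assumption made so the example builds without Mesa's ralloc); the context struct and function name are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical context with integer-only fields. */
struct example_context {
    int id;
    unsigned dirty_flags;
};

static struct example_context *example_create_context(void)
{
    /* calloc() stands in for rzalloc(): both return zeroed memory. */
    struct example_context *ctx = calloc(1, sizeof(*ctx));

    /* A memset(ctx, 0, sizeof(*ctx)) here would only repeat that work. */
    return ctx;
}
```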
* panfrost: Add polygon_list to the batch BO set at allocation timeBoris Brezillon2019-09-182-4/+7
| | | | | | | | | That's what we do for other per-batch BOs, and we'll soon add a helper to automate this create_bo()+add_bo()+bo_unreference() sequence, so let's prepare the code to ease this transition. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add missing panfrost_batch_add_bo() callsBoris Brezillon2019-09-181-1/+4
| | | | | | | | | | Some BOs are used by batches but never explicitly added to the BO set. This is currently not a problem because we wait for the execution of a batch to be finished before releasing a BO, but we will soon relax this rule. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use the correct type for the bo_handle arrayBoris Brezillon2019-09-181-1/+2
| | | | | | | | The DRM driver expects an array of u32, let's use the correct type, even if using an int works in practice because it's still a 32-bit integer. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop exposing internal panfrost_*_batch() functionsBoris Brezillon2019-09-182-14/+3
| | | | | | | | panfrost_{create,free,get}_batch() are only called inside pan_job.c. Let's make them static. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* etnaviv: disable ARB_shadowChristian Gmeiner2019-09-181-0/+2
| | | | | | | | | Looks like only HALT2 GPUs have support for it but that is not yet implemented so disable ARB_shadow for now. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jonathan Marek <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* Revert "gallium: remove PIPE_CAP_TEXTURE_SHADOW_MAP"Christian Gmeiner2019-09-183-0/+7
| | | | | | | | | | There are GPUs that do not support this feature. This reverts commit e871abe452ad40efcccb0bab6b88fc31d0551e29 Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* virgl: Remove wrong EAGAIN handling for drmIoctlLepton Wu2019-09-181-3/+3
| | | | | | | | | | | drmIoctl handles EAGAIN itself and actually it always returns -1 on errors. Remove the wrong handling of its return value. Also, print a warning when it fails. v2: - use _debug_printf instead of fprintf (Gurchetan Singh) Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (v1)
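To show why checking the return value for EAGAIN is wrong, here is a sketch of a wrapper with the retry behaviour the message describes: it loops on EINTR and EAGAIN and otherwise returns -1 with `errno` set. The function-pointer parameter and the two fake ioctls are hypothetical, added only so the example runs without a DRM device.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

typedef int (*example_ioctl_fn)(int fd, unsigned long request, void *arg);

/* Retry transient failures internally; report everything else as -1. */
static int example_retrying_ioctl(example_ioctl_fn do_ioctl, int fd,
                                  unsigned long request, void *arg)
{
    int ret;

    assert(do_ioctl != NULL);
    do {
        ret = do_ioctl(fd, request, arg);
    } while (ret == -1 && (errno == EINTR || errno == EAGAIN));
    return ret;
}

/* Fake ioctl: fails with EAGAIN *arg times, then succeeds. */
static int example_flaky_ioctl(int fd, unsigned long request, void *arg)
{
    int *remaining_failures = arg;

    (void)fd;
    (void)request;
    if (*remaining_failures > 0) {
        (*remaining_failures)--;
        errno = EAGAIN;
        return -1;
    }
    return 0;
}

/* Fake ioctl: always fails with a non-transient error. */
static int example_failing_ioctl(int fd, unsigned long request, void *arg)
{
    (void)fd;
    (void)request;
    (void)arg;
    errno = EINVAL;
    return -1;
}
```

A caller of such a wrapper never observes EAGAIN, and the error code lives in `errno` rather than in the return value, so a test like `ret == -EAGAIN` can never be true.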
* iris: Skip allocating a null surface when there are 0 color regions.Kenneth Graunke2019-09-172-2/+9
| | | | | | | | | | | The compiler now sets the "Null Render Target" bit in the RT write extended message descriptor, causing it to write to an implicit null surface without us needing to set one up in the binding table. Together with the last patch, this improves performance in Car Chase on an Icelake 8x8 (locked to 700MHz) by 0.0445526% +/- 0.0132736% (n=832). Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* gallium/xlib: Fix glXMakeCurrent(dpy, None, None, ctx)Adam Jackson2019-09-172-27/+40
| | | | | This is entirely legal in GL 3.0+. I wonder how many more times I'll need to fix this specific bug.
* gallium/xlib: Remove MakeCurrent_PrevContextAdam Jackson2019-09-171-12/+5
| | | | | As the comment notes, this is not thread-safe. You can just as easily use GetCurrentContext instead, so, do that.
* gallium/xlib: Remove drawable caching from the MakeCurrent pathAdam Jackson2019-09-171-32/+3
| | | | | AFAICT this only exists to avoid hitting XMesaFindBuffer, which is a linear search. But you don't have that many GLX drawables, so whatever.
* ci: Run tests on i386 cross buildsAdam Jackson2019-09-171-1/+3
| | | | | | | | | Yes, some tests fail, but we can turn those into XFAILs at meson time. Better to keep the things that work working than not cover them at all. Unfortunately XPASS results will not cause the build to fail until we update CI to meson 0.51 or newer. Reviewed-by: Daniel Stone <[email protected]>
* iris: close screen fd on iris_destroy_screenTapani Pälli2019-09-171-0/+1
| | | | | | | | | Otherwise it never gets closed, this fixes errors seen with deqp-egl where we end up opening 1024 files. Fixes: 2dce0e94 ("iris: Initial commit of a new 'iris' driver for Intel Gen8+ GPUs.") Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
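The leak pattern can be sketched with a screen object that owns a file descriptor. This is a POSIX-only illustration with hypothetical names; `/dev/null` in the usage note stands in for the DRM device node.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical screen object that owns one file descriptor. */
struct example_screen {
    int fd;
};

static struct example_screen *example_create_screen(const char *path)
{
    struct example_screen *screen = calloc(1, sizeof(*screen));

    if (!screen)
        return NULL;
    screen->fd = open(path, O_RDONLY);
    if (screen->fd < 0) {
        free(screen);
        return NULL;
    }
    return screen;
}

static void example_destroy_screen(struct example_screen *screen)
{
    assert(screen != NULL);
    /* Without this close(), every create/destroy cycle leaks one fd. */
    close(screen->fd);
    free(screen);
}
```

Creating and destroying a screen a few thousand times on `/dev/null` keeps succeeding with the `close()` in place; without it, `open()` starts failing once the process reaches its descriptor limit, which matches the failure after many opened files described in the message.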
* swr: Limit DEBUG workaround to LLVM < 7Michel Dänzer2019-09-173-3/+21
| | | | | | As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <[email protected]>
* gallivm: Limit DEBUG workaround to LLVM < 7Michel Dänzer2019-09-171-0/+4
| | | | | | As of version 7, LLVM uses LLVM_DEBUG instead of just DEBUG. Reviewed-by: Timothy Arceri <[email protected]>
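A version-guarded macro workaround of this kind might look like the sketch below. It assumes the workaround hides the project's own `DEBUG` macro with `#pragma push_macro`/`pop_macro` (supported by GCC, Clang and MSVC) around the LLVM includes; `EXAMPLE_LLVM_VERSION_MAJOR` and the helper functions are hypothetical, and no LLVM header is actually included.

```c
#include <assert.h>

#define DEBUG 1                        /* stand-in for the project's own DEBUG flag */
#define EXAMPLE_LLVM_VERSION_MAJOR 6   /* hypothetical; the real value comes from the build */

/* Only older LLVM headers define their own DEBUG macro, so hide ours
 * only when building against them. */
#if EXAMPLE_LLVM_VERSION_MAJOR < 7
#pragma push_macro("DEBUG")
#undef DEBUG
#endif

/* The LLVM headers would be included here. Record what they would see. */
#ifdef DEBUG
#define EXAMPLE_DEBUG_SEEN_BY_HEADERS 1
#else
#define EXAMPLE_DEBUG_SEEN_BY_HEADERS 0
#endif

#if EXAMPLE_LLVM_VERSION_MAJOR < 7
#pragma pop_macro("DEBUG")
#endif

static int example_debug_seen_by_headers(void)
{
    return EXAMPLE_DEBUG_SEEN_BY_HEADERS;
}

/* After the guarded region, the project's DEBUG value is back. */
static int example_debug_after_headers(void)
{
#ifdef DEBUG
    return DEBUG;
#else
    return 0;
#endif
}
```

Setting `EXAMPLE_LLVM_VERSION_MAJOR` to 7 or higher skips both pragmas, so the headers would see `DEBUG` unchanged, which is harmless once LLVM uses `LLVM_DEBUG` instead.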
* etnaviv: a bit of micro-optimizationChristian Gmeiner2019-09-172-1/+4
| | | | | Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Jonathan Marek <[email protected]>
* lima: reset scissor state if scissor test is disabledIcenowy Zheng2019-09-171-0/+4
| | | | | | | | | | | The PLBU seems to preserve scissor state between draws, and since lima doesn't emit PLBU_CMD_SCISSORS() if scissor test is disabled, it uses state from previous draw. Fix it by emitting PLBU_CMD_SCISSORS() for full fb if scissor test is disabled. Signed-off-by: Icenowy Zheng <[email protected]> Reviewed-by: Vasily Khoruzhick <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
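The fix amounts to always emitting some scissor rectangle. A minimal sketch, assuming the rectangle is expressed as min/max pixel bounds; the struct and function are hypothetical, and the real command is `PLBU_CMD_SCISSORS()`, whose arguments are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical scissor rectangle in framebuffer pixels. */
struct example_rect {
    unsigned minx, miny, maxx, maxy;
};

/* Pick the rectangle to emit for this draw: the user's scissor when the
 * test is enabled, otherwise the whole framebuffer, so that stale state
 * from a previous draw cannot leak into this one. */
static struct example_rect example_effective_scissor(bool scissor_enabled,
                                                     struct example_rect scissor,
                                                     unsigned fb_width,
                                                     unsigned fb_height)
{
    struct example_rect full = { 0, 0, fb_width, fb_height };

    assert(fb_width > 0 && fb_height > 0);
    return scissor_enabled ? scissor : full;
}
```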
* gallium/gdi: use GALLIUM_FOO rather than HAVE_FOOErik Faye-Lund2019-09-162-10/+10
| | | | | | | | | This matches what other targets do, and makes it easier to port to meson. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* scons: Make scons and meson agree about path to glapi generated headersDylan Baker2019-09-162-1/+2
| | | | | | | | Currently scons puts them in src/mapi/glapi, meson puts them in src/mapi/glapi/gen. This results in some things being compilable only by one or the other. Put them in the same place so that everyone is happy. Reviewed-by: Eric Engestrom <[email protected]>