path: root/src
Commit message / Author / Age / Files / Lines
* radeonsi: move texture storage allocation outside of radeonsiMarek Olšák2019-09-094-51/+97
| | | | | | possible code sharing with radv Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: move HTILE allocation outside of radeonsiMarek Olšák2019-09-094-91/+93
| | | | | | | ac_surface computes it for amdgpu. radeon_drm_surface computes it for radeon. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: handle NO_DCC earlyMarek Olšák2019-09-091-5/+7
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac/surface: add RADEON_SURF_NO_FMASKMarek Olšák2019-09-094-12/+14
| | | | | | This controls FMASK and CMASK computation for MSAA. Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* r300,r600,radeonsi: set winsys_handle::stride,offset in drivers, not winsysesMarek Olšák2019-09-096-20/+12
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* r300,r600,radeonsi: read winsys_handle::stride,offset in drivers, not winsysesMarek Olšák2019-09-096-47/+20
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: fix wave occupancy computationsMarek Olšák2019-09-094-21/+49
| | | | | Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: only support at most 1024 threads per blockMarek Olšák2019-09-091-8/+2
| | | | | | LLVM 10 won't support 2048. Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi: disable DCC when importing a texture from an incompatible driverMarek Olšák2019-09-091-4/+12
| | | | | | and unify the code. Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: don't call gfx10_destroy_query with compute-only contextsMarek Olšák2019-09-091-1/+1
| | | | | | | This fixes a crash. Cc: 19.2 <[email protected]> Reviewed-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* radeonsi/gfx10: use fma for TGSI_OPCODE_FMAMarek Olšák2019-09-093-5/+16
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: use fma on gfx10Marek Olšák2019-09-092-1/+9
| | | | Acked-by: Pierre-Eric Pelloux-Prayer <[email protected]>
* ac: enable LLVM atomic optimizationsMarek Olšák2019-09-091-1/+9
|
* virgl: Fix pipe_resource leaks under multi-sample.Lepton Wu2019-09-101-1/+3
| | | | | | | Fixes: 900a80f9e4f ("virgl: virgl_transfer should own its virgl_resource") Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Chia-I Wu <[email protected]>
* iris: Avoid flushing for cache history on transfer range flushesKenneth Graunke2019-09-092-2/+13
| | | | | | | | | | | | | | | | | The VBO module maps a buffer with GL_MAP_FLUSH_EXPLICIT, keeps appending data, and calls glFlushMappedBufferRange(). We were invalidating the VF cache each time it flushed a new range, which results in a ton of VF flushes. If the contents of the destination in the target range are undefined (never even possibly written), this patch makes us assume that it's likely not in the cache and so no cache invalidations are required. If the destination range is defined, we continue cache flushing as we may need to expunge stale data. This eliminates 88% of the VF cache invalidates on Manhattan 3.0. Improves performance in Manhattan 3.0 on my Icelake 8x8 with the GPU frequency locked to 700MHz by 0.376724% +/- 0.0989183% (n=10).
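A minimal sketch of the idea described above, assuming the buffer tracks a single "possibly written" byte range. The struct and function names are hypothetical and are not the iris code; the range is kept as one span, so gaps between written regions count as defined.

```c
#include <stdbool.h>
#include <limits.h>

/* Hypothetical stand-in for a buffer's "valid range": the span of bytes
 * that may ever have been written. */
struct valid_range {
   unsigned start;   /* inclusive */
   unsigned end;     /* exclusive; start >= end means nothing written yet */
};

void valid_range_init(struct valid_range *r)
{
   r->start = UINT_MAX;
   r->end = 0;
}

bool valid_range_intersects(const struct valid_range *r,
                            unsigned start, unsigned end)
{
   unsigned lo = start > r->start ? start : r->start;
   unsigned hi = end < r->end ? end : r->end;
   return lo < hi;
}

/* Returns true when flushing [start, end) should also invalidate caches,
 * i.e. when the destination may already hold defined (cacheable) data.
 * The flushed range is then recorded as defined. */
bool flush_range_needs_invalidate(struct valid_range *r,
                                  unsigned start, unsigned end)
{
   bool had_defined_contents = valid_range_intersects(r, start, end);

   if (start < r->start)
      r->start = start;
   if (end > r->end)
      r->end = end;

   return had_defined_contents;
}
```

With this shape, an append-only pattern such as the VBO module's never reports an invalidation, while rewriting an already flushed region does.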
* iris: Optimize out redundant sampler state bindsKenneth Graunke2019-09-091-2/+8
| | | | | | This cuts roughly 85% of the 3DSTATE_SAMPLER_STATE_POINTERS_PS calls in the J2DBench images test. For some reason, the state tracker is calling bind_sampler_state with the same sampler state in a bunch of cases.
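The redundant-bind check can be sketched as follows. This is an illustration only: `MAX_SAMPLERS`, `struct sampler_bindings` and `bind_samplers()` are hypothetical names, and the real driver hook has a different signature.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_SAMPLERS 16   /* hypothetical limit for this sketch */

struct sampler_bindings {
   const void *samplers[MAX_SAMPLERS];
};

/* Binds `count` sampler states starting at slot `start`. A NULL `states`
 * array unbinds the slots. Returns true only if at least one slot actually
 * changed, so the caller can skip re-emitting sampler state pointers when
 * the same states are bound again. */
bool bind_samplers(struct sampler_bindings *b, unsigned start, unsigned count,
                   const void *const *states)
{
   bool dirty = false;

   assert(start + count <= MAX_SAMPLERS);

   for (unsigned i = 0; i < count; i++) {
      const void *state = states ? states[i] : NULL;

      if (b->samplers[start + i] != state) {
         b->samplers[start + i] = state;
         dirty = true;
      }
   }

   return dirty;
}
```

A caller would set its dirty flag only when the function returns true.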
* iris: Add support for the always_flush_cache=true debug option.Kenneth Graunke2019-09-097-0/+39
| | | | This can be useful for debugging missing flushes.
* mesa: Eliminate gl_config::rgbModeAdam Jackson2019-09-098-68/+31
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Eliminate gl_config::have{Accum,Depth,Stencil}BufferAdam Jackson2019-09-0913-46/+18
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: Remove unused gl_config::indexBitsAdam Jackson2019-09-095-7/+1
| | | | | Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/xlib: Fix an obvious thinkoAdam Jackson2019-09-091-1/+1
| | | | | x == !GLX_DIRECT_COLOR is a fancy way of writing x == 0, which is clearly not what was meant.
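To see why the comparison is wrong, here is a small standalone sketch. The constant values are the ones from the GLX headers; the function names and the `visual_type` parameter are hypothetical, and the "intended" version is an assumption about what was meant, not a copy of the fix.

```c
#include <stdbool.h>

/* Redefined here so the sketch compiles without GLX headers. */
#define GLX_TRUE_COLOR   0x8002
#define GLX_DIRECT_COLOR 0x8003

/* The thinko: '!' is logical NOT, and GLX_DIRECT_COLOR is nonzero,
 * so !GLX_DIRECT_COLOR evaluates to 0. */
bool not_direct_color_buggy(int visual_type)
{
   return visual_type == !GLX_DIRECT_COLOR;   /* same as visual_type == 0 */
}

/* Presumably intended (assumption): "is anything but GLX_DIRECT_COLOR". */
bool not_direct_color(int visual_type)
{
   return visual_type != GLX_DIRECT_COLOR;
}
```

For `GLX_TRUE_COLOR` the buggy version returns false and the second version returns true.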
* iris: Ignore line stipple information if it's disabledKenneth Graunke2019-09-091-3/+5
| | | | | | | | | | | | | | | The line stipple pattern and factor only matter if line stippling is actually enabled. Otherwise, we can safely ignore it. PBO upload may give us zero for line stipple information, while normal drawing tends to give us an actual stipple pattern such as 0xffff. This was causing us to flag IRIS_DIRTY_LINE_STIPPLE way too often, leading to useless 3DSTATE_LINE_STIPPLE commands, which are non-pipelined and thus very expensive. Improves performance in Manhattan 3.0 on Skylake GT4e by 0.149261% +/- 0.0380796% (n=210). On an Icelake 8x8 with the GPU frequency locked at 700MHz, improves by 0.423756% +/- 0.222843% (n=3).
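One way to get this behaviour, sketched under the assumption that the stipple fields are normalized to zero whenever stippling is disabled before two states are compared. The types and function names are hypothetical and are not taken from iris.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical "effective" stipple state used only for change detection. */
struct stipple {
   uint16_t pattern;
   uint8_t  factor;
};

/* Ignore pattern and factor entirely when stippling is disabled, so that
 * 0 and 0xffff compare equal in that case. */
struct stipple effective_stipple(bool enabled, uint16_t pattern, uint8_t factor)
{
   struct stipple s = { 0, 0 };

   if (enabled) {
      s.pattern = pattern;
      s.factor = factor;
   }
   return s;
}

/* True when the stipple command would need to be re-emitted. */
bool stipple_dirty(struct stipple old_state, struct stipple new_state)
{
   return old_state.pattern != new_state.pattern ||
          old_state.factor != new_state.factor;
}
```

Switching between a PBO upload (pattern 0) and normal drawing (pattern 0xffff) with stippling disabled then leaves the state clean.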
* lima/ppir: drop fge/flt/feq/fne optionsVasily Khoruzhick2019-09-091-4/+0
| | | | | | | | | These are supposed to be lowered into sge/slt/seq/sne equivalents. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima: run opt_algebraic between int_to_float and bool_to_float for vsVasily Khoruzhick2019-09-091-4/+5
| | | | | | | | | int_to_float emits ftrunc and ftrunc lowering generates bool ops. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: fix warning in gpir disassemblerVasily Khoruzhick2019-09-091-1/+1
| | | | | | | | | | | | | Fixes following warning: ../src/gallium/drivers/lima/ir/gp/disasm.c: In function ‘print_src’: ../src/gallium/drivers/lima/ir/gp/disasm.c:241:20: warning: array subscript 28 is above array bounds of ‘char[5]’ [-Warray-bounds] 241 | "xyzw"[src - gpir_codegen_src_attrib_x]); Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
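The warning comes from indexing the 5-byte string literal "xyzw" with a value the compiler cannot prove is in range. A generic way to avoid that pattern is to check the index first. This is a sketch only: `component_name()` and its parameters are hypothetical, and it is not necessarily how the disassembler was changed.

```c
/* Maps (src - base) to a component letter, returning '?' for anything
 * outside 0..3 instead of reading past the end of "xyzw". */
char component_name(int src, int base)
{
   static const char names[] = "xyzw";
   int idx = src - base;

   if (idx < 0 || idx > 3)
      return '?';

   return names[idx];
}
```

With a base of 0, index 2 yields 'z' while the out-of-range index 28 from the warning yields '?'.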
* lima/gpir: lower fceilVasily Khoruzhick2019-09-091-0/+1
| | | | | | | | | GP doesn't support fceil so we need to lower it. Reviewed-by: Connor Abbott <[email protected]> Reviewed-by: Erico Nunes <[email protected]> Reviewed-by: Andreas Baierl <[email protected]> Signed-off-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Disallow moves for schedule_first nodesConnor Abbott2019-09-091-1/+5
| | | | | | | | | | | | The entire point of schedule_first is that the node has to be scheduled as soon as possible without any moves because it doesn't produce a proper floating-point value, or its value changes depending on where you read it. We were still introducing a move for preexp2 in some cases though, even if it got scheduled as soon as possible, which broke some exp() tests. Fix that. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Fix fake dep handling for schedule_first nodesConnor Abbott2019-09-092-10/+30
| | | | | | | | | | | | | The whole point of schedule_first nodes is that they need to be scheduled as soon as possible, so if a schedule_first node is the successor in a fake dependency that prevents it from being scheduled after its parent, that can cause problems. We need to add these fake dependencies to the parent as well, and we need to guarantee that the pre-RA scheduler puts schedule_first nodes right before their parents in order to prevent this from adding cycles to the dependency graph. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Fix schedule_first insertion logicConnor Abbott2019-09-091-2/+3
| | | | | | | | | | | The idea was to make sure schedule_first nodes were always first in the ready list. I made sure they were inserted first, but not that other nodes wouldn't later be scheduled ahead of them. Fixes spec@glsl-1.10@execution@built-in-functions@vs-exp-float and probably others. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Ignore unscheduled successors in can_use_complex()Connor Abbott2019-09-091-1/+2
| | | | | | | | | | The point of the function is to avoid creating a complex move which is used by certain slots in the next instruction, but unscheduled successors will never be in the next instruction. Found while debugging a crash that the previous commit fixed. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* lima/gpir: Do all lowerings before rschedConnor Abbott2019-09-093-23/+2
| | | | | | | | | | | | | | | | | | | | The scheduler assumes that load nodes are always duplicated so that they can always be scheduled eventually and therefore they never need to be spilled. But some lowerings were running after the pre-RA scheduler, whereas duplication has to happen before then since it's needed for the scheduler to do a better job reducing register pressure. This meant that lowerings were introducing multiple uses of a load instruction, which broke the scheduler's expectation and resulted in infinite loops in situations where the only nodes available to spill were load nodes. Spilling load nodes would be silly, so we want to fix the lowerings rather than the scheduler. Just do all lowerings before the pre-RA scheduler, which also helps with reducing pressure since the scheduler can more accurately compute the pressure. Fixes lima/mesa#104. Reviewed-by: Vasily Khoruzhick <[email protected]> Tested-by: Vasily Khoruzhick <[email protected]>
* android: anv: libmesa_vulkan_common: add libmesa_util static dependencyMauro Rossi2019-09-081-1/+2
| | | | | | | | | | | | | Change needed to fix the following building error: In file included from external/mesa/src/intel/vulkan/anv_device.c:43: external/mesa/src/util/xmlpool.h:115:10: fatal error: 'xmlpool/options.h' file not found ^~~~~~~~~~~~~~~~~~~ 1 error generated. Fixes: 4dcb1ff ("anv: add support for driconf") Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* panfrost: Rename pan_bo_cache.c into pan_bo.cBoris Brezillon2019-09-082-1/+1
| | | | | | | | So we can move all the BO logic into this file instead of having it spread over pan_resource.c, pan_drm.c and pan_bo_cache.c. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of the now unused SLAB allocatorBoris Brezillon2019-09-083-47/+0
| | | | | | | | | The last users have been converted to use plain BOs. Let's get rid of this abstraction. We can always consider adding it back if we need it at some point. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of unused panfrost_context fieldsBoris Brezillon2019-09-081-4/+0
| | | | | | | | Some fields in panfrost_context are unused (probably leftovers from previous refactor). Let's get rid of them. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Convert ctx->{scratchpad, tiler_heap, tiler_dummy} to plain BOsBoris Brezillon2019-09-083-18/+21
| | | | | | | | | ctx->{scratchpad,tiler_heap,tiler_dummy} are allocated using panfrost_drm_allocate_slab() but they never use any of the SLAB-based allocation logic. Let's convert those fields to plain BOs. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Make transient allocation rely on the BO cacheBoris Brezillon2019-09-085-104/+16
| | | | | | | | | Right now, the transient memory allocator implements its own BO caching mechanism, which is not really needed since we already have a generic BO cache. Let's simplify things a bit. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Stop passing a ctx to functions being passed a batchBoris Brezillon2019-09-084-21/+23
| | | | | | | | The context can be retrieved from batch->ctx. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass a batch to panfrost_drm_submit_vs_fs_batch()Boris Brezillon2019-09-083-9/+9
| | | | | | | | | Given the function name it makes more sense to pass it a job batch directly. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: s/job/batch/Boris Brezillon2019-09-0816-259/+264
| | | | | | | | | | | | What we currently call a job is actually a batch containing several jobs all attached to a rendering operation targeting a specific FBO. Let's rename structs, functions, variables and fields to reflect this fact. Suggested-by: Alyssa Rosenzweig <[email protected]> Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* egl: Add GL_MESA_EGL_sync supportHeinrich Fink2019-09-082-4/+8
| | | | | | | | This commit follows OES_EGL_sync to universally enable use of EGL sync objects with desktop OpenGL contexts. Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* registry: update gl.xml with GL_MESA_EGL_sync tokenHeinrich Fink2019-09-081-0/+1
| | | | | | | As added by upstream GL registry changes Reviewed-by: Daniel Stone <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* android: fix linking issues with liblogTapani Pälli2019-09-072-1/+4
| | | | | | | | Fixes Android build errors observed in Intel CI. Fixes: f9f7cbc1aa3 "util: android logging support" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* iris: Support the disable_throttling=true driconf option.Kenneth Graunke2019-09-063-0/+6
|
* nir/dead_cf: Repair SSA if the pass makes progressJason Ekstrand2019-09-061-2/+13
| | | | | | | | | | | | | | | | | | | | The dead_cf pass calls into the CF manipulation helpers which attempt to keep NIR's SSA form sane. However, when the only break is removed from a loop, dominance gets messed up anyway because the CF SSA clean-up code only looks at phis and doesn't consider the case of code becoming unreachable. One solution to this would be to put the loop into LCSSA form before we modify any of its contents. Another (and the approach taken by this pass) is to just run the repair_ssa pass afterwards because the CF manipulation helpers are smart enough to keep all the use/def stuff sane; they just don't always preserve dominance properties. While we're here, we clean up some bogus indentation. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111405 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111069 Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/repair_ssa: Insert deref casts when neededJason Ekstrand2019-09-061-2/+29
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/repair_ssa: Repair dominance for unreachable blocksJason Ekstrand2019-09-061-4/+8
| | | | | | | | | | | | NIR currently assumes that unreachable blocks are trivially dominated by everything. However, when considering well-formed SSA, there is no path from any block to an unreachable block. Therefore, we can break any use-def chains where the use is in an unreachable block. This removes any dependencies on code created by uses in unreachable blocks and lets DCE do a better job of cleaning it up. Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
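The notion of an unreachable block can be illustrated on a toy control-flow graph. Everything here (`struct toy_cfg`, `toy_block_is_unreachable()`, block 0 as the start block) is a hypothetical stand-in for illustration and does not use NIR's data structures or its dominance metadata.

```c
#include <stdbool.h>

#define MAX_BLOCKS 8

/* Toy CFG: block 0 is the start block and every block has up to two
 * successors, with -1 meaning "no successor". */
struct toy_cfg {
   int num_blocks;
   int succ[MAX_BLOCKS][2];
};

/* Depth-first walk that marks every block reachable from `block`. */
static void mark_reachable(const struct toy_cfg *cfg, int block,
                           bool *reachable)
{
   if (block < 0 || block >= cfg->num_blocks || reachable[block])
      return;

   reachable[block] = true;
   mark_reachable(cfg, cfg->succ[block][0], reachable);
   mark_reachable(cfg, cfg->succ[block][1], reachable);
}

/* A block is unreachable when no path leads to it from the start block.
 * Uses located in such a block can never execute. */
bool toy_block_is_unreachable(const struct toy_cfg *cfg, int block)
{
   bool reachable[MAX_BLOCKS] = { false };

   mark_reachable(cfg, 0, reachable);
   return !reachable[block];
}
```

For example, if block 1 is a loop whose only break has been removed, the blocks that used to follow the loop are no longer reachable from block 0.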
* nir: Add a block_is_unreachable helperJason Ekstrand2019-09-062-0/+15
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Don't infinitely recurse in lower_ssa_defs_to_regs_blockJason Ekstrand2019-09-061-5/+15
| | | | | Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Handle complex derefs in nir_split_array_varsJason Ekstrand2019-09-061-2/+5
| | | | | | | | We already bail and don't split the vars but we were passing a NULL to _mesa_hash_table_search which is not allowed. Fixes: f1cb3348f1 "nir/split_vars: Properly bail in the presence of ..." Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>