path: root/src
Commit message | Author | Age | Files | Lines
...
* panfrost: Move the batch submission logic to panfrost_batch_submit() | Boris Brezillon | 2019-09-13 | 5 | -172/+127
| | | | | | | | | | | | | | We are about to patch panfrost_flush() to flush all pending batches, not only the current one. In order to do that, we need to move the 'flush single batch' code to panfrost_batch_submit(). While at it, we get rid of the existing pipelining logic, which is currently unused and replace it by an unconditional wait at the end of panfrost_batch_submit(). A new pipeline logic will be introduced later on. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Move the fence creation in panfrost_flush() | Boris Brezillon | 2019-09-13 | 4 | -15/+15
| | | | | | | | | panfrost_flush() is about to be reworked to flush all pending batches, but we want the fence to block on the last one. Let's move the fence creation logic in panfrost_flush() to prepare for this situation. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Delay payloads[].offset_start initialization | Boris Brezillon | 2019-09-13 | 1 | -3/+3
| panfrost_draw_vbo() might call the primeconvert/without_prim_restart helpers, which will enter ->draw_vbo() again. Let's delay payloads[].offset_start initialization so we don't initialize them twice. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Prepare things to avoid flushes on FB switch | Boris Brezillon | 2019-09-13 | 2 | -8/+12
| | | | | | | | | | | panfrost_attach_vt_xxx() functions are now passed a batch, and the generated FB desc is kept in panfrost_batch so we can switch FBs without forcing a flush. The postfix->framebuffer field is restored on the next attach_vt_framebuffer() call if the batch already has an FB desc. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass a batch to panfrost_set_value_job() | Boris Brezillon | 2019-09-13 | 1 | -4/+2
| | | | | | | | So we can emit SET_VALUE jobs for a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use ctx->wallpaper_batch in panfrost_blit_wallpaper() | Boris Brezillon | 2019-09-13 | 1 | -4/+5
| | | | | | | | | | We'll soon be able to flush a batch that's not currently bound to the context, which means ctx->pipe_framebuffer will not necessarily be the FBO targeted by the wallpaper draw. Let's prepare for this case and use ctx->wallpaper_batch in panfrost_blit_wallpaper(). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass a batch to functions emitting FB descs | Boris Brezillon | 2019-09-13 | 6 | -53/+44
| | | | | | | | So we can emit such jobs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Pass a batch to panfrost_{allocate,upload}_transient() | Boris Brezillon | 2019-09-13 | 10 | -41/+57
| | | | | | | | | | | We need that if we want to upload transient buffers to a batch that's not currently bound to the context, which in turn will be needed if we want to relax the batch serialization we have right now (only flush batches when we need to: on a flush request, or when one batch depends on the result of other batches). Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Allow testing if a specific batch is targeting a scanout FB | Boris Brezillon | 2019-09-13 | 5 | -24/+23
| | | | | | | | | | | Rename panfrost_is_scanout() into panfrost_batch_is_scanout(), pass it a batch instead of a context and move the code to pan_job.c. With this in place, we can now test if a batch is targeting a scanout FB even if this batch is not bound to the context. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Get rid of the unused 'flush jobs accessing res' infra | Boris Brezillon | 2019-09-13 | 3 | -49/+0
| Will be replaced by something similar but using BOs as keys instead of resources. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use a pipe_framebuffer_state as the batch key | Boris Brezillon | 2019-09-13 | 2 | -38/+17
| | | | | | | | | | This way we have all the fb_state information directly attached to a batch and can pass only the batch to functions emitting CMDs, which is needed if we want to be able to queue CMDs to a batch that's not currently bound to the context. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* radeon/vcn: exclude raven2 from vcn 2.0 encode initialization | Indrajit Das | 2019-09-13 | 1 | -1/+1
| | | | | Signed-off-by: Indrajit Das <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* panfrost: Rework midgard_pair_load_store() to kill the nested foreach loop | Boris Brezillon | 2019-09-13 | 1 | -34/+29
| mir_foreach_instr_in_block_safe() is based on list_for_each_entry_safe(), which is designed to protect against removal of the current entry, but removing the entry placed just after the current one will lead to a use-after-free situation. Luckily, the midgard_pair_load_store() logic guarantees that the instruction being removed (if any) is never placed just after ins, which in turn guarantees that the hidden __next variable always points to a valid object. Took me a bit of time to realize that this code was safe, so I'm suggesting to get rid of the inner mir_foreach_instr_in_block_from() loop and rework the code so that the removed instruction is always the current one (which is what the list_for_each_entry_safe() API was initially designed for). While at it, we also get rid of the unnecessary insert(ins)/remove(ins) dance by simply moving the instruction around. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
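The iteration rule described above can be sketched with a tiny stand-alone list. This is only an illustration under stated assumptions: `struct node`, `list_init()`, `list_add_tail()`, `list_del()`, `list_length()` and `remove_odd()` are hypothetical names written for this example and are not Mesa's util/list.h or the MIR helpers. The walk saves the next pointer before touching the current node, so unlinking the current node is safe, whereas unlinking the saved next node would leave the walk holding a dangling pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal intrusive list; not Mesa's util/list.h. */
struct node {
    struct node *prev, *next;
    int value;
};

static void list_init(struct node *head)
{
    head->prev = head->next = head;
}

static void list_add_tail(struct node *head, struct node *n)
{
    n->prev = head->prev;
    n->next = head;
    head->prev->next = n;
    head->prev = n;
}

static void list_del(struct node *n)
{
    assert(n->prev != NULL && n->next != NULL);
    n->prev->next = n->next;
    n->next->prev = n->prev;
    n->prev = n->next = NULL; /* poison: a stale walk would now hit NULL */
}

static int list_length(const struct node *head)
{
    int len = 0;
    for (const struct node *n = head->next; n != head; n = n->next)
        len++;
    return len;
}

/* "Safe" walk: the next pointer is saved before the current node is
 * examined. Only the current node is ever unlinked, so the saved
 * pointer always refers to a node that is still on the list. */
static int remove_odd(struct node *head)
{
    int removed = 0;
    struct node *cur = head->next;
    struct node *next = cur->next;

    while (cur != head) {
        if (cur->value % 2 != 0) {
            list_del(cur);
            removed++;
        }
        cur = next;
        next = cur->next;
    }
    return removed;
}
```

If the loop body unlinked `next` instead of `cur`, the following step would read `next->next` from a node that is no longer on the list, which is the hazard the commit message describes.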
* panfrost: Fix a list_assert() in schedule_block() | Boris Brezillon | 2019-09-13 | 1 | -4/+6
| | | | | | | | | list_for_each_entry() does not allow modifying the current item pointer. Let's rework the skip-instructions logic in schedule_block() to not break this rule. Signed-off-by: Boris Brezillon <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* v3d: fix TF primitive counts for resume without draw | Iago Toral Quiroga | 2019-09-13 | 3 | -1/+17
| The V3D documentation states that primitive counters are reset when we emit Tile Binning Mode Configuration items, which we do at the start of each draw call. However, in the actual hardware this doesn't seem to take effect when transform feedback is not active (this doesn't happen in the simulator). This causes a problem in the following scenario:
|
|   glBeginTransformFeedback()
|   glDrawArrays()
|   glPauseTransformFeedback()
|   glDrawArrays()
|   glResumeTransformFeedback()
|   glEndTransformFeedback()
|
| The TF pause will trigger a flush of the primitive counters, which results in a correct number of primitives up to that point. In theory, the counter should then be reset when we execute the draw after pausing TF, but that doesn't happen. Since TF is enabled again by the resume command before we end recording, we check the counters again by the time we end the transform feedback recording, but instead of reading 0, we read the same value we read at the time we paused, incorrectly accumulating that value again.
|
| In theory, we should be able to avoid this by using the other method to reset the primitive counters: using operation 1 instead of 0 when we flush the counts to the buffer at the time we pause. But again, this doesn't seem to work and we still see obsolete counts by the time we end transform feedback.
|
| This patch fixes the problem by not accumulating TF primitive counts unless we know we have actually queued draw calls during transform feedback, since that seems to effectively reset the counters. This should also be more performant, since it saves unnecessary stalls for the primitive counters to be updated when we know there haven't been any new primitives drawn.
|
| Fixes CTS tests: dEQP-GLES3.functional.transform_feedback.*
| Reviewed-by: Eric Anholt <[email protected]>
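The guard described in this message can be modelled in a few lines. This is a sketch under assumptions, not the v3d code: `struct tf_state` and the `tf_*()` functions are hypothetical, and the stale hardware counter is simulated by passing the same value to both reads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical driver-side bookkeeping, not the real v3d structures. */
struct tf_state {
    bool active;            /* transform feedback currently recording */
    uint32_t draws_queued;  /* draws queued while recording, since last read */
    uint32_t prims_written; /* running total kept by the driver */
};

static void tf_begin_or_resume(struct tf_state *tf)
{
    tf->active = true;
}

static void tf_draw(struct tf_state *tf)
{
    /* Only draws issued while TF is recording count towards the tally. */
    if (tf->active)
        tf->draws_queued++;
}

/* hw_counter stands in for the value read back from the hardware, which
 * may be stale if no draw happened while TF was recording. */
static void tf_pause_or_end(struct tf_state *tf, uint32_t hw_counter)
{
    assert(tf->active);
    if (tf->draws_queued > 0) {
        tf->prims_written += hw_counter;
        tf->draws_queued = 0;
    }
    tf->active = false;
}
```

Replaying the begin/draw/pause/draw/resume/end sequence from the message with a stale counter value of 10 on both reads leaves `prims_written` at 10, because the second read is skipped when no draw was queued during recording.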
* v3d: remove redundant update of queued draw calls | Iago Toral Quiroga | 2019-09-13 | 1 | -2/+0
| | | | | | | | This was updating the counter for the indexed draw path only, but we are already updating the counter for all paths a bit later, so this is only duplicating counts for indexed paths. Reviewed-by: Eric Anholt <[email protected]>
* v3d: make sure we have enough space in the CL for the primitive counts packet | Iago Toral Quiroga | 2019-09-13 | 1 | -0/+1
| | | | | | Fixes: 0f2d1dfe65 ("v3d: use the GPU to record primitives written to transform feedback") Reviewed-by: Eric Anholt <[email protected]>
* v3d: add missing line break for performance debug message | Iago Toral Quiroga | 2019-09-13 | 1 | -1/+1
| | | | Reviewed-by: Eric Anholt <[email protected]>
* panfrost/ci: Use releases for Volt dEQP | Tomeu Vizoso | 2019-09-13 | 2 | -4/+6
| | | | | | So we can better correlate different results to versions of the runner. Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost/ci: Update kernel to 5.3-rc8 | Tomeu Vizoso | 2019-09-13 | 2 | -2/+2
| | | | | | | We haven't updated in a long time, so better do it now and again when 5.3 is released. Signed-off-by: Tomeu Vizoso <[email protected]>
* panfrost/ci: Run dEQP with the surfaceless platform | Tomeu Vizoso | 2019-09-13 | 4 | -24/+44
| | | | | | | | | Instead of running it with the Wayland platform, which introduces unwanted dependencies and complexity. Makes tests run 30% faster, as well. Signed-off-by: Tomeu Vizoso <[email protected]>
* radv: fix allocating number of user sgprs if streamout is used | Samuel Pitoiset | 2019-09-13 | 1 | -1/+1
| streamout_buffers is assigned after that function, so the previous fix was completely wrong. This probably fixes something when streamout buffers and push constants are used/inlined in the same shader. Fixes: 378e2d24143 ("radv: fix computing number of user SGPRs for streamout buffers") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/fs: Handle UNDEF in split_virtual_grfs | Jason Ekstrand | 2019-09-13 | 1 | -1/+25
| When the UNDEF instruction was added, we didn't do anything special in split_virtual_grfs. This meant that anything with an UNDEF wasn't getting split, which causes problems for the compiler. Among other things, it makes RA harder because things are in bigger chunks. It also meant that dvec4s weren't getting split, which means that they are larger than the maximum register size.
|
| Shader-db results on Kaby Lake:
|
|   total instructions in shared programs: 14959202 -> 14960035 (<.01%)
|   instructions in affected programs: 96197 -> 97030 (0.87%)
|   helped: 140
|   HURT: 128
|   helped stats (abs) min: 1 max: 17 x̄: 1.62 x̃: 1
|   helped stats (rel) min: 0.09% max: 6.15% x̄: 0.65% x̃: 0.45%
|   HURT stats (abs) min: 1 max: 825 x̄: 8.28 x̃: 1
|   HURT stats (rel) min: 0.13% max: 139.83% x̄: 1.70% x̃: 0.50%
|   95% mean confidence interval for instructions value: -2.96 9.18
|   95% mean confidence interval for instructions %-change: -0.56% 1.51%
|   Inconclusive result (value mean confidence interval includes 0).
|
|   total loops in shared programs: 4372 -> 4372 (0.00%)
|   loops in affected programs: 0 -> 0
|   helped: 0
|   HURT: 0
|
|   total cycles in shared programs: 352646771 -> 352840997 (0.06%)
|   cycles in affected programs: 218600800 -> 218795026 (0.09%)
|   helped: 21167
|   HURT: 21411
|   helped stats (abs) min: 1 max: 2924 x̄: 36.89 x̃: 10
|   helped stats (rel) min: <.01% max: 41.90% x̄: 2.97% x̃: 0.98%
|   HURT stats (abs) min: 1 max: 26027 x̄: 45.54 x̃: 10
|   HURT stats (rel) min: <.01% max: 324.46% x̄: 3.88% x̃: 1.06%
|   95% mean confidence interval for cycles value: 2.87 6.26
|   95% mean confidence interval for cycles %-change: 0.40% 0.55%
|   Cycles are HURT.
|
|   total spills in shared programs: 8840 -> 8953 (1.28%)
|   spills in affected programs: 126 -> 239 (89.68%)
|   helped: 1
|   HURT: 2
|
|   total fills in shared programs: 21782 -> 21914 (0.61%)
|   fills in affected programs: 431 -> 563 (30.63%)
|   helped: 1
|   HURT: 3
|
|   LOST: 0
|   GAINED: 5
|
| Shader-db results on Haswell:
|
|   total instructions in shared programs: 13320918 -> 13320769 (<.01%)
|   instructions in affected programs: 40998 -> 40849 (-0.36%)
|   helped: 146
|   HURT: 56
|   helped stats (abs) min: 1 max: 8 x̄: 2.73 x̃: 2
|   helped stats (rel) min: 0.16% max: 8.60% x̄: 2.52% x̃: 2.22%
|   HURT stats (abs) min: 2 max: 23 x̄: 4.45 x̃: 4
|   HURT stats (rel) min: 0.21% max: 10.26% x̄: 6.83% x̃: 10.26%
|   95% mean confidence interval for instructions value: -1.26 -0.21
|   95% mean confidence interval for instructions %-change: -0.62% 0.77%
|   Inconclusive result (%-change mean confidence interval includes 0).
|
|   total loops in shared programs: 4373 -> 4373 (0.00%)
|   loops in affected programs: 0 -> 0
|   helped: 0
|   HURT: 0
|
|   total cycles in shared programs: 374518258 -> 374384193 (-0.04%)
|   cycles in affected programs: 231101954 -> 230967889 (-0.06%)
|   helped: 21427
|   HURT: 19438
|   helped stats (abs) min: 1 max: 2035 x̄: 31.09 x̃: 8
|   helped stats (rel) min: <.01% max: 40.95% x̄: 2.42% x̃: 0.86%
|   HURT stats (abs) min: 1 max: 20875 x̄: 27.38 x̃: 8
|   HURT stats (rel) min: <.01% max: 59.09% x̄: 2.49% x̃: 0.80%
|   95% mean confidence interval for cycles value: -4.49 -2.07
|   95% mean confidence interval for cycles %-change: -0.14% -0.04%
|   Cycles are helped.
|
|   total spills in shared programs: 23406 -> 23411 (0.02%)
|   spills in affected programs: 3 -> 8 (166.67%)
|   helped: 0
|   HURT: 2
|
|   total fills in shared programs: 34845 -> 34850 (0.01%)
|   fills in affected programs: 3 -> 8 (166.67%)
|   helped: 0
|   HURT: 2
|
|   LOST: 0
|   GAINED: 0
|
| Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111566
| Fixes: f4ef34f207d1 "intel/fs: Add an UNDEF instruction to avoid..."
| Reviewed-by: Francisco Jerez <[email protected]>
* mesa: fix texStore for FORMAT_Z32_FLOAT_S8X24_UINT | Jiadong Zhu | 2019-09-12 | 1 | -3/+3
| _mesa_texstore_z32f_x24s8 calculates the source rowStride in 64-bit steps, which produces an inaccurate offset if the width of the source image is an odd number. Change the src pointer to int_32* since the source image format is gl_float, which is 32 bits per pixel. Reviewed by Ilia Mirkin Signed-off-by: Jiadong Zhu <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
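The stride problem can be illustrated with plain arithmetic. This is a sketch of the general failure mode, assuming a byte stride that gets converted to element units by integer division; the actual texstore code is not shown in the log, and `stride_in_elements()` and `row_byte_offset()` are hypothetical helpers. For a 3-pixel-wide source with 4 bytes per pixel (tightly packed), each row is 12 bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Row stride expressed in elements of elem_size bytes. Integer division
 * truncates when the byte stride is not a multiple of elem_size. */
static size_t stride_in_elements(size_t row_stride_bytes, size_t elem_size)
{
    assert(elem_size > 0);
    return row_stride_bytes / elem_size;
}

/* Byte offset actually reached for `row` when stepping a pointer whose
 * pointee type is elem_size bytes wide. */
static size_t row_byte_offset(size_t row_stride_bytes, size_t elem_size,
                              size_t row)
{
    return row * stride_in_elements(row_stride_bytes, elem_size) * elem_size;
}
```

With a 12-byte row, stepping in 8-byte units lands row 1 at byte 8 (12 / 8 truncates to 1), while stepping in 4-byte units lands it at byte 12 as intended. Even widths give a byte stride that is a multiple of 8, which is why only odd widths are affected.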
* freedreno/a6xx: pre-calculate userconst stateobj size | Rob Clark | 2019-09-12 | 5 | -3/+45
| The AnTuTu "garden" benchmark overflows the fixed-size constbuffer stateobject, so let's be more clever and calculate the actual size (a potentially slightly pessimistic one). Signed-off-by: Rob Clark <[email protected]>
* gallium: Restore VSX for llvm >= 4 | Adam Jackson | 2019-09-12 | 1 | -0/+14
| Accidentally dropped in 4fdd455eeb7cffadee86f06c685005a3b64ce94b. Fixes: 4fdd455e ("gallium: Require LLVM >= 3.4") Reported-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* egl/android: Fix build since the DRI fourcc removal. | Eric Anholt | 2019-09-12 | 1 | -0/+1
| | | | | | | Fixes: 272f9cfe6a19 ("dri: Use DRM_FORMAT_* instead of defining our own copy.") Reviewed-by: John Stultz <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno: fix compiler warning | Rob Clark | 2019-09-12 | 1 | -1/+1
| | | | | | | fd6_blitter.c:724:31: warning: passing argument 1 of ‘fd_resource_level_linear’ discards ‘const’ qualifier from pointer target type [-Wdiscarded-qualifiers] Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* mesa/st: Fallback to name lookup when the variable have no Parameter | Caio Marcelo de Oliveira Filho | 2019-09-12 | 1 | -2/+46
| | | | | | | | | | | | | | This brings back the fallback previously present in st_nir_lookup_parameter_index(): if there's no parameter associated with the variable, use a parameter from a variable with the same prefix. We'll have to sort out something for SPIR-V, but in the meantime let's fix GLSL. Fixes: b6384e57f5f ("mesa/st: Lookup parameters without using names") Reviewed-by: Eric Anholt <[email protected]> Tested-by: Eric Anholt <[email protected]>
* glx: Remove unused indirection for glx_context->fillImage | Adam Jackson | 2019-09-12 | 4 | -22/+10
| | | | | | | This slot is always filled in with __glFillImage. Reviewed-by: Michel Dänzer <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson/v3d: replace partial list of nir dep files with idep_nir_headers | Eric Engestrom | 2019-09-12 | 1 | -2/+2
| | | | | | | "partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* meson/iris: replace partial list of nir dep files with idep_nir_headers | Eric Engestrom | 2019-09-12 | 1 | -3/+2
| | | | | | | "partial" because `nir_intrinsics_h` was missing. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* v3d: flag dirty state when binding compute states | Jose Maria Casanova Crespo | 2019-09-12 | 4 | -38/+47
| As introduced in "v3d: flag dirty state when binding new sampler states" we need to add support for compute states. New flags VC5_DIRTY_COMPTEX and VC5_DIRTY_UNCOMPILED_CS are introduced. Reaching 33 flags in the dirty field forces us to change the type to uint64_t. Flags are reordered and empty contiguous bits are available for future pipeline stages. v2: Update flag conditions to compile cs shader. (Eric Anholt) Now dirty flags use uint64_t and flags are reordered. Added VC5_DIRTY_UNCOMPILED_CS flag. Reviewed-by: Eric Anholt <[email protected]>
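Why a 33rd flag forces a wider field can be shown with a small sketch. The names below (`EX_DIRTY_*`, `struct ex_context`, `ex_mark_dirty()`, `ex_is_dirty()`) and the bit positions are hypothetical and chosen only for illustration; they are not the real VC5_DIRTY_* values.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Build each flag from a 64-bit one so that bit 32 and above are valid.
 * Shifting a plain 32-bit int by 32 would be undefined behaviour. */
#define EX_DIRTY_FLAG(bit)     (UINT64_C(1) << (bit))

/* Hypothetical flags; positions are illustrative only. */
#define EX_DIRTY_BLEND         EX_DIRTY_FLAG(0)
#define EX_DIRTY_COMPTEX       EX_DIRTY_FLAG(31)
#define EX_DIRTY_UNCOMPILED_CS EX_DIRTY_FLAG(32) /* does not fit in 32 bits */

struct ex_context {
    uint64_t dirty;
};

static void ex_mark_dirty(struct ex_context *ctx, uint64_t flags)
{
    ctx->dirty |= flags;
}

static bool ex_is_dirty(const struct ex_context *ctx, uint64_t flags)
{
    return (ctx->dirty & flags) != 0;
}
```

Truncating the field to 32 bits would silently drop the flag at bit 32, so any state keyed on it would never be re-emitted.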
* tgsi_to_nir: Translate TGSI_INTERPOLATE_COLOR as INTERP_MODE_NONE | Danylo Piliaiev | 2019-09-12 | 1 | -1/+1
| Translating TGSI_INTERPOLATE_COLOR as INTERP_MODE_SMOOTH made it impossible for drivers to have flatshaded color inputs. Translate it to INTERP_MODE_NONE, which drivers interpret as smooth or flat depending on flatshading state. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=111467 Fixes: 770faf54 ("tgsi_to_nir: Improve interpolation modes.") Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
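The two-stage decision described here, translate first and let the driver resolve later, can be sketched as follows. The enums and functions are hypothetical stand-ins written for this example (prefixed `EX_`/`ex_`), not the real TGSI or NIR definitions, and the non-color mappings are assumptions included only to make the switch complete.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the TGSI and NIR interpolation enums. */
enum ex_tgsi_interp {
    EX_TGSI_CONSTANT,
    EX_TGSI_LINEAR,
    EX_TGSI_PERSPECTIVE,
    EX_TGSI_COLOR,
};

enum ex_interp_mode {
    EX_INTERP_NONE,
    EX_INTERP_SMOOTH,
    EX_INTERP_FLAT,
    EX_INTERP_NOPERSPECTIVE,
};

/* Translation step: COLOR stays undecided instead of being forced to
 * SMOOTH, so the flatshade state can still influence it later. */
static enum ex_interp_mode ex_translate(enum ex_tgsi_interp mode)
{
    switch (mode) {
    case EX_TGSI_CONSTANT:    return EX_INTERP_FLAT;
    case EX_TGSI_LINEAR:      return EX_INTERP_NOPERSPECTIVE;
    case EX_TGSI_PERSPECTIVE: return EX_INTERP_SMOOTH;
    case EX_TGSI_COLOR:       return EX_INTERP_NONE;
    }
    assert(!"unknown interpolation mode");
    return EX_INTERP_NONE;
}

/* Driver-side step: NONE resolves to flat or smooth from current state. */
static enum ex_interp_mode ex_resolve(enum ex_interp_mode mode, bool flatshade)
{
    if (mode == EX_INTERP_NONE)
        return flatshade ? EX_INTERP_FLAT : EX_INTERP_SMOOTH;
    return mode;
}
```

Had the translation returned the smooth mode for color inputs, the resolve step would have no way to honour the flatshade state, which is the behaviour the commit fixes.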
* nir/lower_point_size: assume scalar PSIZ | Iago Toral Quiroga | 2019-09-12 | 1 | -14/+3
| | | | | Reviewed-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* gallium/ttn: VARYING_SLOT_PSIZ and VARYING_SLOT_FOGC are scalar | Iago Toral Quiroga | 2019-09-12 | 1 | -0/+10
| | | | Reviewed-by: Eric Anholt <[email protected]>
* prog_to_nir: VARYING_SLOT_PSIZ is a scalar | Iago Toral Quiroga | 2019-09-12 | 1 | -3/+5
| | | | | | | v2: remove stray change (Erik Faye-Lund) Reviewed-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* egl/android: Only keep BGRA EGL configs as fallback | Lepton Wu | 2019-09-12 | 1 | -0/+11
| | | | | | | | | | | Stock Android code actually doesn't support BGRA format EGL configs. It's hard coded to use RGBA_8888 as window format for BGRA EGL configs here: https://android.googlesource.com/platform/frameworks/native/+/1eb32e2/opengl/libs/EGL/eglApi.cpp#608 So just remove it from EGL configs if RGBA is supported. Signed-off-by: Lepton Wu <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
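The fallback rule can be sketched as a simple filter. This is an illustration only: `enum ex_format`, `ex_has_format()` and `ex_filter_configs()` are hypothetical and do not correspond to the platform_android code, which works on EGL configs rather than a bare format list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical format identifiers for the example. */
enum ex_format {
    EX_FORMAT_RGBA_8888,
    EX_FORMAT_BGRA_8888,
    EX_FORMAT_RGB_565,
};

static bool ex_has_format(const enum ex_format *formats, size_t n,
                          enum ex_format want)
{
    for (size_t i = 0; i < n; i++) {
        if (formats[i] == want)
            return true;
    }
    return false;
}

/* Copy `in` to `out`, dropping BGRA whenever RGBA is also available.
 * BGRA survives only as a fallback. Returns the number of entries kept. */
static size_t ex_filter_configs(const enum ex_format *in, size_t n,
                                enum ex_format *out)
{
    assert(in != NULL && out != NULL);
    bool has_rgba = ex_has_format(in, n, EX_FORMAT_RGBA_8888);
    size_t count = 0;

    for (size_t i = 0; i < n; i++) {
        if (in[i] == EX_FORMAT_BGRA_8888 && has_rgba)
            continue;
        out[count++] = in[i];
    }
    return count;
}
```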
* egl/android: Enable HAL_PIXEL_FORMAT_RGBA_1010102 format | renchenglei | 2019-09-12 | 1 | -0/+3
| | | | | | | | | | | | The patch adds support for HAL_PIXEL_FORMAT_RGBA_1010102 on Android platform. Fixes android.media.cts.DecoderTest#testVp9HdrStaticMetadata which failed in egl due to "Unsupported native buffer format 0x2b" on Android. Reviewed-by: Tapani Pälli <[email protected]> Signed-off-by: Chenglei Ren <[email protected]>
* iris: trivial whitespace fixes | Kenneth Graunke | 2019-09-11 | 1 | -2/+2
|
* u_format: float type for R11G11B10_FLOAT/R9G9B9E5_FLOAT | Jonathan Marek | 2019-09-11 | 1 | -2/+2
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* u_format: add ETC2 to util_format_srgb/util_format_linear | Jonathan Marek | 2019-09-11 | 1 | -0/+12
| | | | | | Signed-off-by: Jonathan Marek <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* dri: Use DRM_FORMAT_* instead of defining our own copy. | Eric Anholt | 2019-09-11 | 6 | -112/+114
| | | | | | | | | | | | We have only two defines that aren't from DRM_FORMAT_*: SARGB and SABGR. Keep only those as __DRI_IMAGE_FOURCC and garbage collect the rest. While this header is also used from the X server, the X server doesn't use any __DRI_IMAGE enums. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* st/mesa: Only pause queries if there are any active queries to pause. | Kenneth Graunke | 2019-09-11 | 5 | -4/+17
| | | | | | | | | | | | | Previously, ReadPixels, PBO upload/download, and clears would call cso_save_state with CSO_PAUSE_QUERIES, causing cso_context to call pipe->set_active_query_state() twice for each operation. This can potentially cause driver work to enable/disable statistics counters. But often, there are no queries happening which need to be paused. By keeping a simple tally of active queries, we can skip this work. Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
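The tally described above can be sketched as a counter that gates the pause and resume calls. This is a simplified model under assumptions: `struct ex_ctx` and the `ex_*()` functions are hypothetical, and `driver_calls` merely counts how often a driver hook such as set_active_query_state() would have been invoked.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical context for the example. */
struct ex_ctx {
    unsigned active_queries; /* tally of queries currently running */
    unsigned driver_calls;   /* how many times the driver hook was hit */
};

static void ex_begin_query(struct ex_ctx *ctx)
{
    ctx->active_queries++;
}

static void ex_end_query(struct ex_ctx *ctx)
{
    assert(ctx->active_queries > 0);
    ctx->active_queries--;
}

/* Returns true if queries were actually paused, so that the matching
 * restore knows whether it has anything to resume. */
static bool ex_save_state(struct ex_ctx *ctx)
{
    if (ctx->active_queries == 0)
        return false;       /* nothing to pause, skip the driver call */
    ctx->driver_calls++;    /* pause */
    return true;
}

static void ex_restore_state(struct ex_ctx *ctx, bool paused)
{
    if (paused)
        ctx->driver_calls++; /* resume */
}
```

With no query active, a save/restore pair costs zero driver calls; with one active query it costs two, matching the "twice for each operation" figure in the message.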
* Fix missing dri2_load_driver on platform_drm | Jean Hertel | 2019-09-11 | 1 | -0/+15
| | | | | | Signed-off-by: Jean Hertel <[email protected]> Acked-by: Eric Engestrom <[email protected]> Reviewed-by: Adam Jackson <[email protected]>
* intel/gen11+: Enable Hardware filtering of Semi-Pipelined State in WM | Anuj Phogat | 2019-09-11 | 4 | -0/+29
| | | | | | | Initial benchmarking didn't show any performance benefits. But it might eventually. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* genxml/gen11+: Add COMMON_SLICE_CHICKEN4 register | Anuj Phogat | 2019-09-11 | 2 | -0/+10
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* egl/dri2: Refuse to add EGLConfigs with no supported surface types | Adam Jackson | 2019-09-11 | 1 | -13/+16
| | | | | | | | For example, the surfaceless platform only supports pbuffers. If the driver supports MSAA, we would still create a config, but it would have no supported surface types. That's meaningless, so don't do it. Reviewed-by: Eric Engestrom <[email protected]>
* gallium: Require LLVM >= 3.9 | Adam Jackson | 2019-09-11 | 5 | -183/+3
| | | | | | | | To go any further than this would be to break the current version of Android. Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Require LLVM >= 3.8 | Adam Jackson | 2019-09-11 | 1 | -4/+2
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>