Commit message | Author | Age | Files | Lines
* iris: Free the buffer when reading from the disk cache. | Kenneth Graunke | 2019-06-04 | 1 | -3/+8
|
* panfrost/midgard: Don't promote non-SSA to pipeline registers | Alyssa Rosenzweig | 2019-06-05 | 1 | -1/+3
| | | | | | | Fixes: 33800f4612 ("panfrost/midgard: Implement "pipeline register" prepass") Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno: Drop invalid scissor optimization. | Eric Anholt | 2019-06-04 | 1 | -7/+0
| | | | | | | We do support TF now, so it's no longer valid. Besides, if we want this optimization, we should probably have mesa/st doing it right for everyone. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Reuse glsl_get_sampler_coordinate_components(). | Eric Anholt | 2019-06-04 | 1 | -25/+5
| | | | | | | | We have the GLSL type, so we can just ask it how many coordinates there are. The GLSL function already has Vulkan cases that we'd probably want eventually. Reviewed-by: Rob Clark <[email protected]>
* freedreno: Improve the pi approximations in trig lowering. | Eric Anholt | 2019-06-04 | 1 | -2/+2
| | | | | | | | | | | | When comparing our sin/cos behavior to the closed source driver, I noticed that we were off by a bit (or, in the case of 1/2pi, 3 bits). Fixes: dEQP-GLES3.functional.shaders.random.trigonometric.vertex.52 dEQP-GLES3.functional.shaders.random.all_features.vertex.0 Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Rob Clark <[email protected]>
* ac: rename LLVM <= 7 helpers for readability | Marek Olšák | 2019-06-04 | 1 | -37/+37
| | | | Reviewed-by: Samuel Pitoiset <[email protected]>
* ac: fix a typo in ac_build_wg_scan_bottom | Marek Olšák | 2019-06-04 | 1 | -1/+1
| | | | | Cc: 19.1 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* glx: Fix error message when no driverName is available | Caio Marcelo de Oliveira Filho | 2019-06-04 | 1 | -1/+1
| | | | | | | | | | | | | | | | | | | Just provide a "(null)" literal in case driverName is NULL. In file included from ../src/glx/dri3_glx.c:76: ../src/glx/dri3_glx.c: In function ‘dri3_create_screen’: ../src/glx/dri_common.h:70:36: error: ‘%s’ directive argument is null [-Werror=format-overflow=] 70 | #define CriticalErrorMessageF(...) dri_message(_LOADER_FATAL, __VA_ARGS__) | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ../src/glx/dri3_glx.c:1002:4: note: in expansion of macro ‘CriticalErrorMessageF’ 1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName); | ^~~~~~~~~~~~~~~~~~~~~ ../src/glx/dri3_glx.c:1002:50: note: format string is defined here 1002 | CriticalErrorMessageF("failed to load driver: %s\n", driverName); | ^~ cc1: some warnings being treated as errors Reviewed-by: Kenneth Graunke <[email protected]>
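The fix described above, substituting a "(null)" literal when the name is missing, can be sketched as follows. This is a minimal illustration, not the code from the commit; `printable_name` is a hypothetical helper name introduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from Mesa): return a string that is always
 * safe to pass to a "%s" conversion, even when the input is NULL. */
static const char *printable_name(const char *name)
{
    const char *result = name ? name : "(null)";
    assert(result != NULL);
    return result;
}
```

A call such as `printf("failed to load driver: %s\n", printable_name(driverName))` then never hands a null pointer to `%s`, which is what the `-Werror=format-overflow=` diagnostic quoted above complains about.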
* virgl: resolve to correct level during texture read | Chia-I Wu | 2019-06-04 | 1 | -2/+2
| | | | | | | | | | When PIPE_TRANSFER_READ requires a resolve, we blit from the host storage to a temporary storage, and do a format conversion from the temporary storage to the guest storage. This change makes sure we convert to the correct level of the guest storage. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
* virgl: fix texture resolving with compressed formats | Chia-I Wu | 2019-06-04 | 1 | -12/+17
| | | | | | | | | util_format_translate_3d expects the source box to be aligned to the block size. When resolving, make sure the size of the staging buffer is aligned to the block size. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis <[email protected]>
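Aligning a staging-buffer dimension to the format's block size amounts to rounding up to the next multiple of the block dimension. The sketch below shows that arithmetic only; `align_to_block` is a hypothetical name, and the actual commit may use an existing Mesa utility instead.

```c
#include <assert.h>

/* Hypothetical helper: round a width or height up to a multiple of the
 * compressed format's block dimension (e.g. 4 for 4x4 block formats). */
static unsigned align_to_block(unsigned size, unsigned block)
{
    assert(block > 0);
    return (size + block - 1) / block * block;
}
```

With a 4x4 block format, a 5-texel-wide level would get an 8-texel-wide staging buffer, so the source box handed to the translation step covers whole blocks.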
* freedreno: Add printf pattern string. | Bas Nieuwenhuizen | 2019-06-04 | 1 | -1/+1
| | | | | | | Some new flag setting disallows it due to being a security risk. Fixes: c9c1e261064 "mesa: prevent common string formatting security issues" Reviewed-by: Rob Clark <[email protected]>
* Revert "vl: Enable DRM by default." | Bas Nieuwenhuizen | 2019-06-04 | 3 | -4/+4
| | | | | | | | | | Reason: meson.build:586:7: ERROR: Unknown variable "dep_libdrm". if building without x11 platform. This reverts commit 392c60928a5debbe6782ed1aa136597504bfbc5b.
* panfrost/midgard: .pos propagation | Alyssa Rosenzweig | 2019-06-04 | 1 | -8/+72
| | | | | | | | | | | | | A previous optimization converts fmax(x, 0.0) instructions to fmov.pos. This pass then propagates the .pos from the move up to the source instruction (when possible). From there, copy propagation will eliminate the move. In the future, we might prefer to do this in common NIR code like we do for saturate, as Bifrost can also benefit. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Cleanup copy propagation | Alyssa Rosenzweig | 2019-06-04 | 1 | -11/+4
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Implement "pipeline register" prepass | Alyssa Rosenzweig | 2019-06-04 | 4 | -2/+96
| | | | | | | | | | | | | | This prepass, run after scheduling but before RA, specializes to pipeline registers where possible. It walks the IR, checking whether sources are ever used outside of the immediate bundle in which they are written. If they are not, they are rewritten to a pipeline register (r24 or r25), valid only within the bundle itself. This has theoretical benefits for power consumption and register pressure (and performance by extension). While this is tested to work, it's not clear how much of a win it really is, especially without an out-of-order scheduler (yet!). Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Helpers for pipeline | Alyssa Rosenzweig | 2019-06-04 | 5 | -9/+79
| | | | | Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Refactor schedule/emit pipeline | Alyssa Rosenzweig | 2019-06-04 | 6 | -707/+744
| | | | | | | | | | | | | | | | | | | | | First, this moves the scheduler and emitter out of midgard_compile.c into their own dedicated files. More interestingly, this slims down midgard_bundle to be essentially an array of _pointers_ to midgard_instructions (plus some bundling metadata), rather than the instructions and packing themselves. The difference is critical, as it means that (within reason, i.e. as long as it doesn't affect the schedule) midgard_instructions can now be modified _after_ scheduling while having changes updated in the final binary. On a more philosophical level, this removes an IR. Previously, the IR before scheduling (MIR) was separate from the IR after scheduling (post-schedule MIR), requiring a separate set of utilities to traverse, using different idioms. There was no good reason for this, and it restricts our flexibility with the RA. So unify all the things! Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Cleanup RA (stylistic changes) | Alyssa Rosenzweig | 2019-06-04 | 1 | -16/+30
| | | | | | | Trivial. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Share MIR utilities | Alyssa Rosenzweig | 2019-06-04 | 2 | -40/+46
| | | | | | | These are more generally useful than the files they were constrained to. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Misc. cleanup for readibility | Alyssa Rosenzweig | 2019-06-04 | 2 | -15/+35
| | | | | | | | Mostly, this fixes a number of instances of lines >> 80 chars, refactoring them into something legible. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Extend RA to non-vec4 sources | Alyssa Rosenzweig | 2019-06-04 | 1 | -77/+278
| | | | | | | | | | | | | | | | | | | | This represents a major break with the former RA design. We now use conflicting register classes to represent the subdivision of Midgard's 128-bit registers into varying sizes and arrangement. We determine class based on the number of components in the instructions' masks. To support this, we include a number of helpers in the RA to allow composing swizzles and masks, such that MIR written implicitly assuming .xyzw sources can be transformed to use actual (non-aligned) sources. The net result is a marked decrease in register pressure on non-vec4-exclusive shaders. We could still be doing much better. Not implemented yet are: - Register spilling - Per-component liveness Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Set masks on ld_vary | Alyssa Rosenzweig | 2019-06-04 | 1 | -1/+3
| | | | | | | | | | | | These masks distinguish scalar/vec2/vec3 loads from the default vec4, which helps with assembly readability (since it's immediately obvious how many components are _actually_ affected, rather than doing mysterious things to an unknown number of unused components). Later in the series, this will enable smarter register allocation, as the unused components will not be interpreted abnormally. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Fix liveness analysis bugs | Alyssa Rosenzweig | 2019-06-04 | 1 | -2/+8
| | | | | | | | | This fixes liveness analysis with respect to inline constants and branching. In practice, the symptom is abnormally high register pressure. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Set int outmod for "pasted" code | Alyssa Rosenzweig | 2019-06-04 | 1 | -0/+4
| | | | | | | | | | These snippets of integer assembly are injected for various purposes. Eventually, we'll want to implement these in NIR directly. Regardless, the "default" output modifier is different between floats and ints, so let's set the right one. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Hoist some utility functions | Alyssa Rosenzweig | 2019-06-04 | 3 | -64/+71
| | | | | | | | These were static to midgard_compile.c but are more generally useful across the compiler. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* panfrost/midgard: Remove pinning | Alyssa Rosenzweig | 2019-06-04 | 2 | -27/+2
| | | | | | | | | This mechanism is only used by blend shaders, so just use a move here. Ideally, it'll be copy-propped and DCE'd away; this removes a source of considerable indirection and will simplify RA logic. Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Ryan Houdek <[email protected]>
* nir/algebraic: Simplify max(abs(a), 0.0) -> abs(a) | Alyssa Rosenzweig | 2019-06-04 | 1 | -0/+1
| | | | | | | | | | | This pattern was noticed in glmark's jellyfish scene. v2: Add inexact qualifier due to NaN behaviour. Minimal shader-db changes (slightly helped). Signed-off-by: Alyssa Rosenzweig <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
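Why the rewrite needs the inexact qualifier can be seen with a small model. The sketch below assumes a max that returns the other operand when one input is NaN, which is the rule C's `fmaxf` follows; whether a given backend's max behaves this way is an assumption. All function names here are hypothetical and written out by hand to keep the example self-contained.

```c
#include <assert.h>
#include <math.h>

/* Model of a max that returns the other operand when one is NaN
 * (assumed semantics, matching the rule C's fmaxf follows). */
static float max_ignoring_nan(float a, float b)
{
    if (isnan(a)) return b;
    if (isnan(b)) return a;
    return a > b ? a : b;
}

/* Absolute value; a NaN input passes through unchanged. */
static float abs_f(float a) { return a < 0.0f ? -a : a; }

static float original_expr(float a)   { return max_ignoring_nan(abs_f(a), 0.0f); }
static float simplified_expr(float a) { return abs_f(a); }

/* For any non-NaN input the two forms agree, since abs(a) >= 0. */
static void assert_equal_for_non_nan(float a)
{
    assert(!isnan(a));
    assert(original_expr(a) == simplified_expr(a));
}
```

Under this model `original_expr(NAN)` yields 0.0 while `simplified_expr(NAN)` yields NaN, so the two forms differ only on NaN input, which is the case the v2 note covers.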
* mesa: prevent common string formatting security issues | Mark Janes | 2019-06-04 | 1 | -0/+4
| | | | | | | | | | | | | | | | Adds a compile-time error for obvious security issues like: printf(string_var); The proposed flag is more tolerant than -Wformat-nonliteral. Specifically, it tolerates common mesa formatting like: static const char *shader_template = "really long string %d"; printf(shader_template, uniform_number); Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110833 Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
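The pattern the new compile-time check rejects, and the usual way to avoid it, can be illustrated as below. This is a generic sketch rather than code from Mesa; `format_untrusted` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: copy text into out without letting the text be
 * interpreted as a format string. Returns the snprintf result. */
static int format_untrusted(char *out, size_t size, const char *text)
{
    assert(out != NULL && text != NULL && size > 0);
    /* Passing text as the format argument would let any '%' in it drive
     * the formatter. A literal "%s" keeps the text as plain data. */
    return snprintf(out, size, "%s", text);
}
```

The same idea applies to `printf(string_var)`, which becomes `printf("%s", string_var)`. A constant template such as the `shader_template` example above stays acceptable because its contents are fixed at compile time.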
* intel/fs: Add an UNDEF instruction to avoid excess live ranges | Jason Ekstrand | 2019-06-04 | 6 | -5/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | With 8 and 16-bit types and anything where we have to use non-trivial strides registers to deal with restrictions, we end up with things that look like partial writes even though we don't care about any values in the register except those written by that instruction. This is particularly important when dealing with loops because liveness sees is_partial_write and the fact that an old version from a previous loop iteration may be valid at that point and extends all purely partially written values to the entire loop. This commit adds a new UNDEF instruction which does nothing (the generator doesn't emit anything) but which does a fake write to the register. This informs liveness that we don't care about any values before that point so it won't consider those registers to be falsely live. We can safely emit UNDEF instructions for all SSA values that come in from NIR and nearly all temporaries generated by various stages of the compiler. In particular, we need to insert UNDEF instructions when we handle region restrictions because the newly allocated registers are almost guaranteed to be partially written. No shader-db changes. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110432 Reviewed-by: Matt Turner <[email protected]>
* spirv: Update the OpenCL.std.h header | Caio Marcelo de Oliveira Filho | 2019-06-04 | 2 | -144/+339
| | | | | | | | | | | | This corresponds to commit 8b911bd2ba37677037b38c9bd286c7c05701bcda on GitHub. We previously tweaked OpenCL.std.h from upstream to be included in C code. Now upstream header can be included, however the symbol names are slightly different (include an OpenCLstd_ prefix), so this patch also fixes vtn_opencl.c to use those. Reviewed-by: Karol Herbst <[email protected]>
* radv: Use bo metadata for imported image tiling on Android. | Bas Nieuwenhuizen | 2019-06-04 | 3 | -14/+61
| | | | | | This way we handle linear images etc. correctly. Acked-by: Samuel Pitoiset <[email protected]>
* vl: Enable DRM by default. | Bas Nieuwenhuizen | 2019-06-04 | 3 | -4/+4
| | | | | | | | | | | | | | | If libdrm is found the pipe loader enables drm anyway, and that is pretty much the only extra dependency this code has. This enables creating libva display using a drm fd without having to enable the DRM (GBM really) backend of EGL, which is completely unrelated. Leaving the X11 platforms alone as they would still result in the additional inclusion of extra deps. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* anv: Advertise support for VK_EXT_fragment_shader_interlock | Jason Ekstrand | 2019-06-04 | 3 | -0/+12
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Implement SPV_EXT_fragment_shader_interlock | Jason Ekstrand | 2019-06-04 | 2 | -0/+38
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Update the headers from latest Khronos master | Jason Ekstrand | 2019-06-04 | 2 | -3/+330
| | | | | | | This corresponds to 8b911bd2ba37677037b38c9bd286c7c05701bcda in https://github.com/KhronosGroup/SPIRV-Headers. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* vulkan: Update the XML and headers to 1.1.110 | Jason Ekstrand | 2019-06-04 | 2 | -23/+456
| | | | Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* ac/nir: mark some texture intrinsics as convergent | Rhys Perry | 2019-06-04 | 1 | -0/+18
| | | | | | | | | | | | Otherwise LLVM can sink them and their texture coordinate calculations into divergent branches. v2: simplify the conditions on which the intrinsic is marked as convergent v3: only mark as convergent in FS and CS with derivative groups Cc: <[email protected]> Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: fix some compiler warnings | Rhys Perry | 2019-06-04 | 1 | -4/+4
| | | | | | | | | Fixes -Woverflow warnings with GCC 9.1.1 v2: use a cast instead of a bitwise and Signed-off-by: Rhys Perry <[email protected]> Reviewed-By: Bas Nieuwenhuizen <[email protected]>
* intel/fs: Skip registers faster when setting spill costs | Jason Ekstrand | 2019-06-04 | 1 | -2/+10
| | | | | | | | | | | | | | This might be slightly faster since we're doing one read rather than two before we decide to skip. The more important reason, however, is because no_spill prevents us from re-spilling spill registers. In the new world in which we don't re-calculate liveness every spill, we may not have valid liveness for spill registers so we shouldn't even look their live ranges up. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110825 Fixes: e99081e76d4 "intel/fs/ra: Spill without destroying the..." Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Tested-by: Tapani Pälli <[email protected]>
* radeonsi/nir: Fix type in bindless address computation | Connor Abbott | 2019-06-04 | 1 | -2/+2
| | | | | | Bindless handles in GL are 64-bit. This fixes an assert failure in LLVM. Reviewed-by: Marek Olšák <[email protected]>
* etnaviv: implement set_active_query_state(..) for hw queries | Christian Gmeiner | 2019-06-04 | 1 | -1/+10
| | | | | | | | | | Clear w/ quad uses a normal draw which adds up to OQ. st/meta uses set_active_query_state(..) to tell the driver to pause queries in such cases. Fixes spec@arb_occlusion_query@occlusion_query_meta_save piglit. Signed-off-by: Christian Gmeiner <[email protected]>
* radv: do not use gfx fast depth clears for layered depth/stencil images | Samuel Pitoiset | 2019-06-04 | 1 | -0/+1
| | | | | | | | The driver should only do fast depth clears with the graphics path when the view covers all image layers, otherwise this might corrupt layers when HTILE is enabled. Cc: 19.0 19.1 [email protected] Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac,radv: do not emit vec3 for raw load/store on SI | Samuel Pitoiset | 2019-06-04 | 4 | -8/+20
| | | | | | It's unsupported; only load/store format with vec3 is supported. Fixes: 6970a9a6ca9 ("ac,radv: remove the vec3 restriction with LLVM 9+") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* intel/compiler: Fix assertions in brw_alu3 | Sagar Ghuge | 2019-06-03 | 1 | -3/+3
| | | | | | | | | v2: Fix assertion for src1 (Ian Romanick) Fixes: 3b967e17 (intel/compiler: Avoid false positive assertions) Signed-off-by: Sagar Ghuge <[email protected]> Suggested-by: Matt Turner <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* iris: Fix SO stride units for DrawTransformFeedback | Kenneth Graunke | 2019-06-03 | 2 | -2/+2
| | | | | | | | | | | Mesa measures in DWords. The hardware also claims to measure in DWords. Except the SO_WRITE_OFFSET field is actually bits 31:2, with 1:0 MBZ. Which means that it really measures in bytes. So, convert to bytes. Without this, our offset / stride denominator was 1/4th the size it should be, leading to 4x the vertex count that we should have had. Fixes GTF-GL46.gtf40.GL3Tests.transform_feedback2.transform_feedback2_two_buffers
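The unit mismatch can be made concrete with a small calculation. This sketch assumes the write offset is in bytes and the stride is tracked in DWords, as the message above describes; `vertex_count_from_offset` is a hypothetical name, not the driver's function.

```c
#include <assert.h>

/* Hypothetical helper: derive the vertex count for DrawTransformFeedback
 * from a byte offset and a stride stored in DWords (4 bytes each). */
static unsigned vertex_count_from_offset(unsigned write_offset_bytes,
                                         unsigned stride_dwords)
{
    unsigned stride_bytes = stride_dwords * 4; /* convert DWords to bytes */
    assert(stride_bytes > 0);
    return write_offset_bytes / stride_bytes;
}
```

For example, 96 bytes written with a 3-DWord stride gives 96 / 12 = 8 vertices. Dividing by the unconverted stride gives 96 / 3 = 32, the 4x overcount the commit fixes.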
* st/glsl: make sure to propagate initialisers to driver storage | Timothy Arceri | 2019-06-04 | 5 | -27/+23
| | | | | | | | | | | | This essentially reverts 20234cfe3a20. Fixes piglit test: tests/spec/arb_get_program_binary/execution/uniform-after-restore.shader_test Fixes: 20234cfe3a20 "st/mesa: don't propagate uniforms when restoring from cache" Reviewed-by: Tapani Pälli <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110784
* spirv: Like Uniform, do nothing for UniformId | Caio Marcelo de Oliveira Filho | 2019-06-03 | 2 | -0/+3
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Implement SpvOpCopyLogical | Caio Marcelo de Oliveira Filho | 2019-06-03 | 1 | -0/+2
| | | | | | | | This is the same as SpvOpCopyObject but without the type checking, which is how vtn_composite_copy works, so we just need to hook the operation. Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Generalize OpSelect | Caio Marcelo de Oliveira Filho | 2019-06-03 | 1 | -38/+48
| | | | | | | | | | | | | SPIR-V 1.4 supports OpSelect over any composite type, and also allows scalar boolean condition for vector types -- a case which we already handled to support old GLSLang. Added a helper function to recursively perform nir_bcsel, which makes it easier to support structs. v2: Replace asserts() with vtn_fail_if(). (Jason) v3: Simplify Condition and Result types verifications. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Move OpSelect handling to a function | Caio Marcelo de Oliveira Filho | 2019-06-03 | 1 | -60/+66
| | | | | | This will make a later change easier to review. Reviewed-by: Jason Ekstrand <[email protected]>