Commit message (Author, Age, Files, Lines)
* spirv: Only copy needed components for OpSpecConstantOp (Jason Ekstrand, 2019-06-19, 1 file, -1/+6)
| | | | Reviewed-by: Karol Herbst <[email protected]>
* spirv: Use a single path for OpSpecConstantOp of OpVectorShuffle (Jason Ekstrand, 2019-06-19, 1 file, -37/+19)
| | | | | | | | | Now that nir_const_value is a scalar, there's no reason why we need multiple paths here and it's just extra paths to keep working. While we're here, we also add a vtn_fail_if check that component indices are in-bounds. Reviewed-by: Karol Herbst <[email protected]>
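The in-bounds check mentioned above can be sketched as follows. This is a minimal illustration, not the code from the commit: `scalar_const` and `shuffle_constants` are hypothetical names, and the sketch rejects every index past the end of the concatenated inputs, including any sentinel value that a real format might reserve for "undefined".

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a scalar constant value. */
typedef struct { uint32_t u32; } scalar_const;

/*
 * Select components from the concatenation of v0 and v1.
 * Returns false as soon as an index is out of bounds.
 */
bool shuffle_constants(const scalar_const *v0, size_t len0,
                       const scalar_const *v1, size_t len1,
                       const uint32_t *indices, size_t count,
                       scalar_const *out)
{
    assert(v0 != NULL && v1 != NULL && indices != NULL && out != NULL);
    for (size_t i = 0; i < count; i++) {
        uint32_t idx = indices[i];
        if (idx >= len0 + len1)
            return false; /* component index is out of bounds */
        /* One path for both inputs: pick from v0 first, then from v1. */
        out[i] = idx < len0 ? v0[idx] : v1[idx - len0];
    }
    return true;
}
```

Because every component is a scalar, a single loop covers all bit sizes and vector widths; a caller only has to check the return value.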
* spirv: Use vtn_constant_uint() for array lengths and gather components (Jason Ekstrand, 2019-06-19, 1 file, -4/+2)
| | | | Reviewed-by: Karol Herbst <[email protected]>
* spirv: Add a vtn_constant_int helper (Jason Ekstrand, 2019-06-19, 2 files, -17/+19)
| | | | Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Add a real is_integer helper (Jason Ekstrand, 2019-06-19, 3 files, -2/+10)
| | | | Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Rename is_integer to is_integer_32 (Jason Ekstrand, 2019-06-19, 14 files, -32/+33)
| | | | | | | It only accepts 32-bit integers so it should have a more descriptive name. This patch should not be a functional change. Reviewed-by: Karol Herbst <[email protected]>
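The difference between a 32-bit-only check and a check that accepts any bit size can be sketched like this. The enum and both function names are illustrative assumptions, not the actual glsl_type definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical base-type enum; the names are illustrative only. */
enum base_type {
    TYPE_UINT8, TYPE_INT8, TYPE_UINT16, TYPE_INT16,
    TYPE_UINT32, TYPE_INT32, TYPE_UINT64, TYPE_INT64,
    TYPE_FLOAT32,
    TYPE_COUNT
};

/* True only for 32-bit signed or unsigned integers. */
bool is_integer_32(enum base_type t)
{
    assert(t < TYPE_COUNT);
    return t == TYPE_UINT32 || t == TYPE_INT32;
}

/* True for an integer of any bit size. */
bool is_integer(enum base_type t)
{
    assert(t < TYPE_COUNT);
    return t >= TYPE_UINT8 && t <= TYPE_INT64;
}
```

With the narrow predicate named explicitly, a caller that only wants to know "is this an integer at all" no longer silently excludes 8-, 16-, and 64-bit types.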
* glsl/types: Ignore bit sizes in contains_integer() (Jason Ekstrand, 2019-06-19, 1 file, -1/+1)
| | | | | | | | | All of the callers for this function are looking at interpolation qualifiers and want to make sure they're declared flat. Any 64-bit integer inputs need to be flat. It also makes the function make more sense since "integer" is fairly generic. Reviewed-by: Karol Herbst <[email protected]>
* glsl/types: Handle all bit sizes in glsl_type_is_integer (Jason Ekstrand, 2019-06-19, 1 file, -1/+1)
| | | | | | | All of the callers of this function really just want to know if the type is an integer and don't care about bit size. Reviewed-by: Karol Herbst <[email protected]>
* glsl/nir_opt_access: Update uniforms correctly when only vars change (Caio Marcelo de Oliveira Filho, 2019-06-19, 1 file, -1/+13)
| | | | | | | | | | | | | | Even if only variables access flags are changed, the existing NIR infrastructure expects metadata to be explicitly preserved, so do that. Don't care about avoiding preserve to be called twice since the cost is negligible. This scenario can be triggered by dead variables, and also by other intrinsics that read the variables -- but not cause progress to be made when processing the intrinsics. Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Fix getting the sampler dim when arrays are involved (Caio Marcelo de Oliveira Filho, 2019-06-19, 1 file, -1/+2)
| | | | | | | | | | Unwrap any array in the variable type so we can get the sampler dim. This fixes piglit test spec/arb_arrays_of_arrays/execution/image_store/basic-imageStore-const-uniform-index.shader_test. Fixes: f2d0e48ddc7 "glsl/nir: Add optimization pass for access flags" Reviewed-by: Kenneth Graunke <[email protected]>
* meson: Search for execinfo.h (Jory Pratt, 2019-06-19, 4 files, -7/+7)
| | | | | | | | | | Rather than checking __GLIBC__/__UCLIBC__ macros as a proxy for execinfo.h presence, just check directly. This allows the build to work on musl. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: Heap-allocate 256K zlib buffer (Jory Pratt, 2019-06-19, 1 file, -1/+8)
| | | | | | | | | | | | | | | | The disk cache code tries to allocate a 256 Kbyte buffer on the stack. Since musl only gives 80 Kbyte of stack space per thread, this causes a trap. See https://wiki.musl-libc.org/functional-differences-from-glibc.html#Thread-stack-size (In musl-1.1.21 the default stack size has increased to 128K) [mattst88]: Original author unknown, but I think this is small enough that it is not copyrightable. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
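Moving a large scratch buffer from the stack to the heap might look like the sketch below. It is an illustration under stated assumptions: `copy_via_heap_scratch` and `SCRATCH_SIZE` are hypothetical names, and a plain chunked copy stands in for the zlib work done by the disk cache.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Same size as the buffer named in the commit title. */
#define SCRATCH_SIZE (256 * 1024)

/*
 * Copy len bytes from src to dst through a heap-allocated scratch buffer.
 * Returns the number of bytes copied, or 0 if the allocation fails.
 */
size_t copy_via_heap_scratch(const unsigned char *src, size_t len,
                             unsigned char *dst)
{
    assert(src != NULL && dst != NULL);

    /* malloc() instead of "unsigned char scratch[SCRATCH_SIZE];" on the stack. */
    unsigned char *scratch = malloc(SCRATCH_SIZE);
    if (scratch == NULL)
        return 0;

    size_t done = 0;
    while (done < len) {
        size_t chunk = len - done;
        if (chunk > SCRATCH_SIZE)
            chunk = SCRATCH_SIZE;
        memcpy(scratch, src + done, chunk);
        memcpy(dst + done, scratch, chunk);
        done += chunk;
    }

    free(scratch);
    return done;
}
```

The cost is one allocation per call and an extra failure path, which is usually acceptable when the alternative is overflowing a small thread stack.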
* anv: Fix wrong printf formatter (Kenneth Graunke, 2019-06-19, 1 file, -1/+1)
| | | | %lu is for unsigned long, %zu is for size_t. Just cast the data.
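The two usual ways to make a format string and its argument agree are sketched below. The function names are hypothetical and the commit's actual call site is not shown in this log, so this only illustrates the general rule stated above.

```c
#include <assert.h>
#include <stdio.h>

/* Option 1: use the conversion that matches size_t. */
int format_size_zu(char *buf, size_t buf_len, size_t value)
{
    assert(buf != NULL);
    return snprintf(buf, buf_len, "%zu", value);
}

/* Option 2: keep %lu and cast the argument so it matches the conversion. */
int format_size_lu(char *buf, size_t buf_len, size_t value)
{
    assert(buf != NULL);
    return snprintf(buf, buf_len, "%lu", (unsigned long)value);
}
```

Both produce the same text for values that fit in an unsigned long; passing a size_t to %lu without the cast is a type mismatch on platforms where the two types differ.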
* iris: Bail on queries for INTEL_NO_HW=1. (Kenneth Graunke, 2019-06-19, 1 file, -0/+5)
| | | | | | We don't execute any of the commands to record snapshots, so we can't actually produce a real result. We do however need to avoid waiting on a syncpt which will never be signalled. So, just return 0.
* virgl: Support VIRGL_BIND_SHARED (David Riley, 2019-06-19, 2 files, -0/+3)
| | | | | | | Support a new virgl bind type for shared buffers. Signed-off-by: David Riley <[email protected]> Reviewed-By: Gert Wollny <[email protected]>
* anv: write spirv-nir logs back to the application (Lionel Landwerlin, 2019-06-19, 1 file, -0/+35)
| | | | | | | Using the existing VK_EXT_debug_report extension. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ac/nir: Set speculatable for buffer loads where allowed (Connor Abbott, 2019-06-19, 1 file, -3/+4)
| | | | | | | | | | | | | | | | | | | | | This brings the nir path in line with the TGSI path. Totals from affected shaders: SGPRS: 2984 -> 2984 (0.00 %) VGPRS: 2792 -> 2652 (-5.01 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 247380 -> 248072 (0.28 %) bytes LDS: 0 -> 0 (0.00 %) blocks Max Waves: 121 -> 132 (9.09 %) Wait states: 0 -> 0 (0.00 %) Most of the change came from DiRT: Showdown, and came from sinking SSBO loads. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Use reorderable access flag (Connor Abbott, 2019-06-19, 1 file, -4/+12)
| | | | | | No changes with radeonsi shader-db. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add a helper to determine if an intrinsic can be reordered (Connor Abbott, 2019-06-19, 3 files, -11/+13)
| | | | | | | This is simple now, but we're going to be adding a few more conditions to this later. Reviewed-by: Timothy Arceri <[email protected]>
* st/nir: Use gl_nir_opt_access (Connor Abbott, 2019-06-19, 1 file, -0/+2)
| | | | | | Nothing uses its results yet, that will come with the following commits. Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: Add optimization pass for access flags (Connor Abbott, 2019-06-19, 4 files, -0/+324)
| | | | | | | | | | | | | | Right now, this just deduces when we can arbitrarily reorder SSBO and image loads, matching the existing logic in radeonsi's TGSI->LLVM pass. This approach can't handle some things that nir_opt_copy_prop_vars can, but it can handle images, and with GCM it lets us hoist reads outside of loops. We can also pass this information to LLVM which lets it do its own optimizations on it. This is GLSL only as I haven't tested it on Vulkan yet, and it would probably need a few changes to work there. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add reorderable memory access enum (Connor Abbott, 2019-06-19, 2 files, -1/+10)
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* nir/copy_prop_vars: Ignore volatile accesses (Connor Abbott, 2019-06-19, 1 file, -0/+13)
| | | | | | | | | The spec explicitly says that volatile writes can't be removed and volatile reads do not guarantee that the same value will still be around after the read, as if there were a barrier after each read/write. Just ignore them. Reviewed-by: Timothy Arceri <[email protected]>
* glsl/nir: Propagate access qualifiers (Connor Abbott, 2019-06-19, 2 files, -6/+59)
| | | | | | | | | | We were completely ignoring these before, except for putting them on variables. While we're here, don't set access qualifiers when converting to bindless since glsl_to_nir will already have set a more accurate qualifier that includes any qualifiers on struct members that are dereferenced. Reviewed-by: Timothy Arceri <[email protected]>
* nir: Allow qualifiers on copy_deref and image instructions (Connor Abbott, 2019-06-19, 6 files, -12/+48)
| | | | | | | | In the next commit, we'll properly handle access qualifiers on struct members by propagating them to load/store instructions, but these instructions had no way to specify the qualifier. Reviewed-by: Timothy Arceri <[email protected]>
* ac,radeonsi: Always mark buffer stores as inaccessiblememonly (Connor Abbott, 2019-06-19, 8 files, -93/+61)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | inaccessiblememonly means that it doesn't modify memory accessible via normal LLVM pointers. This lets LLVM's dead store elimination, memcpy forwarding, etc. ignore functions with this attribute. We don't represent descriptors as pointers, so this property is always true of buffer and image stores. There are plans to represent descriptors via pointers, but this just means that now nothing is inaccessiblememonly, as LLVM will then understand loads/stores via its usual alias analysis. Radeonsi was mistakenly only setting it if the driver could prove that there were no reads, and then it was cargo-culted into ac_llvm_build and ac_llvm_to_nir. Rip it out of everything. statistics with nir enabled: Totals from affected shaders: SGPRS: 152 -> 152 (0.00 %) VGPRS: 128 -> 132 (3.12 %) Spilled SGPRs: 0 -> 0 (0.00 %) Spilled VGPRs: 0 -> 0 (0.00 %) Private memory VGPRs: 0 -> 0 (0.00 %) Scratch size: 0 -> 0 (0.00 %) dwords per thread Code Size: 9324 -> 9244 (-0.86 %) bytes LDS: 2 -> 2 (0.00 %) blocks Max Waves: 17 -> 17 (0.00 %) Wait states: 0 -> 0 (0.00 %) The only difference was a manhattan31 shader. Acked-by: Timothy Arceri <[email protected]> Acked-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* egl: add missing #include (Eric Engestrom, 2019-06-19, 1 file, -0/+1)
| | | | | | | | close() is in <unistd.h> Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* radv: disable viewport clamping even if FS doesn't write Z (Samuel Pitoiset, 2019-06-19, 1 file, -3/+1)
| | | | | | | This fixes new CTS dEQP-VK.pipeline.depth_range_unrestricted.*. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: implement compressed FMASK texture reads with RADV_PERFTEST=tccompatcmask (Samuel Pitoiset, 2019-06-19, 7 files, -1/+103)
| | | | | | | | | | | This allows us to disable the FMASK decompress pass when transitioning from CB writes to shader reads. This will likely be improved and enabled by default in the future. No CTS regressions on GFX8 but a small number of multisample CTS failures on GFX9 (they look related to the small hint). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix FMASK expand with SRGB formats (Samuel Pitoiset, 2019-06-19, 1 file, -1/+2)
| | | | | | | | Found while working on DCC for MSAA. Fixes: 6b976024a87 ("radv: add support for FMASK expand") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* panfrost: Move to use ralloc for some allocations (Tomeu Vizoso, 2019-06-19, 7 files, -37/+44)
| | | | | | | | | | | | | | We have some serious leaks, so plug some and also move to ralloc to limit the lifetime of some objects to that of their parent. Lots more such work to do. For some reason, this fixes: dEQP-GLES2.functional.lifetime.attach.deleted_output.texture_framebuffer Signed-off-by: Tomeu Vizoso <[email protected]> Reviewed-by: Alyssa Rosenzweig <[email protected]>
* egl: Don't add hardware device if there is no render node v2. (Mathias Fröhlich, 2019-06-19, 1 file, -2/+2)
| | | | | | | | | | | | | Do not offer a hardware drm backed egl device if no render node is available. The current implementation will fail on this egl device. On top it issues a warning that is actually misleading. There are finally more error paths that can fail on the way to a hardware backed egl device. Fixing all of them would kind of require opening the drm device and see if there is a usable driver associated with the device. The taken approach avoids a full probe and fixes at least this kind of problem on kvm virtualization hosts I observe here. Fixes: dbb4457d985 ("egl: add EGL_EXT_device_drm support") Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* etnaviv: support GL_ARB_seamless_cubemap_per_texture (Christian Gmeiner, 2019-06-19, 3 files, -6/+10)
| | | | | | | Passes spec@amd_seamless_cubemap_per_texture@amd_seamless_cubemap_per_texture Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-By: Guido Günther <[email protected]>
* etnaviv: update headers from rnndb (Christian Gmeiner, 2019-06-19, 6 files, -22/+33)
| | | | | | Update to etna_viv commit a3bf0da. Signed-off-by: Christian Gmeiner <[email protected]>
* radeonsi: fix undefined shift in macro definition (Dave Airlie, 2019-06-19, 1 file, -1/+1)
| | | | | | Pointed out by coverity Reviewed-by: Bas Nieuwenhuizen <[email protected]>
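The log does not show which macro was changed, so the following is only a sketch of one common form of this problem: shifting the signed literal 1 into bit 31 of a 32-bit int is undefined behavior in C, while the same shift on an unsigned literal is well defined. `FIELD_BIT` and `bit_mask` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Problematic form: (1 << (n)) with n == 31 overflows a 32-bit signed int.
 * Hypothetical fixed macro: the unsigned literal makes the shift well defined.
 */
#define FIELD_BIT(n) (1u << (n))

/* Return a mask with only bit n set, for n in [0, 31]. */
uint32_t bit_mask(unsigned n)
{
    assert(n < 32);
    return FIELD_BIT(n);
}
```

Static analyzers such as Coverity typically flag the signed variant even when the generated code happens to behave as intended.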
* nouveau: fix frees in unsupported IR error paths. (Dave Airlie, 2019-06-19, 4 files, -0/+6)
| | | | | | | | This is pointless in that we won't ever hit those paths in real life, but coverity complains. Fixes: f014ae3c7cce ("nouveau: add support for nir") Reviewed-by: Ilia Mirkin <[email protected]>
* panfrost: Move clearing logic into pan_job (Rohan Garg, 2019-06-18, 3 files, -48/+68)
| | | | Reviewed-by: Alyssa Rosenzweig <[email protected]>
* virgl: fix sync issue regarding discard/unsync transfers (Chia-I Wu, 2019-06-18, 1 file, -5/+15)
| | | | | | | | | | | | | | | | | | | | | GL_MAP_INVALIDATE_BUFFER_BIT cannot be treated as GL_MAP_INVALIDATE_RANGE_BIT naively. When we run into

    ptr = glMapBufferRange(buf, 0, size, GL_WRITE_BIT|GL_MAP_INVALIDATE_BUFFER_BIT);
    memcpy(ptr, data1, size);
    glUnmapBuffer(buf);
    ptr = glMapBufferRange(buf, size, size, GL_WRITE_BIT|GL_MAP_UNSYNCHRONIZED_BIT);
    memcpy(ptr, data2, size);
    glUnmapBuffer(buf);

we never want data1 to be copy_transfer'ed. Because that would mean that data2 might overwrite valid data. Signed-off-by: Chia-I Wu <[email protected]> Reviewed-by: Alexandros Frantzis [email protected] Fixes: a22c5df0794 ("virgl: Use buffer copy transfers to avoid waiting when mapping") Reviewed-by: Emil Velikov <[email protected]>
* panfrost: Enable sRGB (Alyssa Rosenzweig, 2019-06-18, 1 file, -4/+0)
| | | | | | | Now that sRGB formats are supported for both rendering and sampling, advertise support. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Disable AFBC on sRGB buffers (Alyssa Rosenzweig, 2019-06-18, 1 file, -0/+7)
| | | | | | | | The performance impact is slightly mitigated by tiling the render target, but it's undeniably still slow compared to AFBC. Unfortunately, it doesn't look like AFBC and sRGB play nice... Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Enable sRGB fixed-function blending (Alyssa Rosenzweig, 2019-06-18, 2 files, -3/+17)
| | | | | | | | For fixed-function, we have hardware to handle sRGB so we just set a flag. For blend shaders, it's rather more involved; this is currently unimplemented. Assert it out for now; we don't need it quite yet. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Specify sRGB in the render target (Alyssa Rosenzweig, 2019-06-18, 1 file, -1/+4)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement sRGB texturing (Alyssa Rosenzweig, 2019-06-18, 1 file, -1/+1)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Add sRGB render target flag (Alyssa Rosenzweig, 2019-06-18, 2 files, -0/+2)
| | | | Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Implement tiled rendering (Alyssa Rosenzweig, 2019-06-18, 1 file, -0/+4)
| | | | | | | | We already can sample from Mali's linear/tiled encoding (the one from Utgard -- AFBC is mostly unrelated); let's be able to render to it as well. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Decode rendering block type (Alyssa Rosenzweig, 2019-06-18, 3 files, -7/+37)
| | | | | | | A mode for rendering tiled/uncompressed was noticed, so we reshuffle the MFBD render target definitions to explicitly include block type. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Refactor texture targets (Alyssa Rosenzweig, 2019-06-18, 6 files, -27/+81)
| | | | | | | | | | | | | | | | | This combines the two cmdstream bits "is_3d" and "is_not_cubemap" into a single 2-bit texture target selection, noticing it's the same as the 2-bit selection in Midgard and Bifrost texturing ops. Accordingly, we share this definition and add the missing entry for 1D/buffer textures. This requires a nontrivial (but functionally similar) refactor of all parts of the driver to use the new definitions appropriately. Theoretically, this should add support for buffer textures, but that's obviously not tested and probably wouldn't work. While doing so, we notice the sRGB enable bit, which we document and decode as well here so we don't forget about it. Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Figure out job requirements in pan_job.c (Rohan Garg, 2019-06-18, 3 files, -8/+16)
| | | | | | | | Requirements for a job should be figured out in pan_job.c v2: [Alyssa] Fix early return Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Reset job counters once the job is submitted (Rohan Garg, 2019-06-18, 2 files, -5/+4)
| | | | | | Move the reset out of frame invalidation into job submission Reviewed-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Initial implementation of panfrost_job_submit (Rohan Garg, 2019-06-18, 3 files, -5/+23)
| | | | | | | | Start fleshing out panfrost_job v2: [Alyssa: Remove unused variable, warning introduced] Reviewed-by: Alyssa Rosenzweig <[email protected]>