summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* radv: remove radv_shader_context::num_output_{clips,culls}Samuel Pitoiset2018-08-311-7/+0
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: adjust the cull dist mask in scan_shader_output_decl()Samuel Pitoiset2018-08-311-3/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: get length of the clip/cull distances array from usage maskSamuel Pitoiset2018-08-311-9/+40
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: do not recompute the output usage mask for clipdist twiceSamuel Pitoiset2018-08-311-4/+1
| | | | | | | The shader info pass takes care of this now. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: gather the output usage mask for clip/cull distances correctlySamuel Pitoiset2018-08-311-0/+8
| | | | | | | It's a special case because both are combined into a single array. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: add set_output_usage_mask() helperSamuel Pitoiset2018-08-311-17/+26
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* radv: fix passing clip/cull distances from VS to PSSamuel Pitoiset2018-08-314-1/+51
| | | | | | | | | | | | | | | | | CTS doesn't test input clip/cull distances for the fragment shader stage, which explains why this was totally broken. I wrote a simple test locally that works now. This fixes a crash with GTA V and DXVK. Note that we are exporting unused parameters from the vertex shader now, but this can't be optimized easily because we don't keep the fragment shader info... Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107477 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* egl/wayland: do not leak wl_buffer when it is lockedJuan A. Suarez Romero2018-08-312-7/+16
| | | | | | | | | | | | | | If color buffer is locked, do not set its wayland buffer to NULL; otherwise it can not be freed later. Rather, flag it in order to destroy it later on the release event. v2: instruct release event to unlock only or free wl_buffer too (Daniel) This also fixes dEQP-EGL.functional.swap_buffers_with_damage.* tests. CC: Daniel Stone <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* ac/radeonsi: fix CIK copy max sizeDave Airlie2018-08-311-1/+3
| | | | | | | | | | | | While adding transfer queues to radv, I started writing some tests, the first test I wrote fell over copying a buffer larger than this limit. Checked AMDVLK and found the correct limit. Cc: <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix regression in indirect input swizzles.Dave Airlie2018-08-311-2/+5
| | | | | | | | | This fixes: tests/spec/arb_enhanced_layouts/execution/component-layout/vs-fs-array-dvec3.shader_test since I reworked the 64-bit swizzles. Fixes: bb17ae49ee2 (gallivm: allow to pass two swizzles into fetches.) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: fix tess/gs fetchs for new swizzle.Dave Airlie2018-08-311-5/+8
| | | | | | | | I have piglit results from my machine, but I must have messed up, and not built mesa in between properly. Fixes: bb17ae49ee2 (gallivm: allow to pass two swizzles into fetches.) Reviewed-by: Marek Olšák <[email protected]>
* mesa: ignore VAO IDs equal to 0 in glDeleteVertexArraysMarek Olšák2018-08-301-0/+4
| | | | | | | | This fixes a firefox crash. Fixes: 781a78914c798dc64005b37c6ca1224ce06803fc Reviewed-by: Ian Romanick <[email protected]>
* Revert "intel/tools/aubwrite: Always use physical addresses for traces."Kenneth Graunke2018-08-302-13/+11
| | | | | | | | This reverts commit f8cfc7766016d0ff7d52953e7a992b1e77c521d0. This appears to break intel_dump_gpu for Gen9 systems - I can load them in the simulator, but nothing happens. Reverting the patch makes the simulator properly execute our commands and shaders again.
* intel/nir: Lowering image loads and stores trashes all metadataJason Ekstrand2018-08-301-2/+2
| | | | | | | | | | | This fixes the GL_ARB_fragment_shader_interlock piglit test on gen8 platforms where the lack of metadata dirtying was causing another pass to accidentally delete a much needed loop. https://bugs.freedesktop.org/show_bug.cgi?id=107745 Fixes: 37f7983bcca1 "intel/compiler: Do image load/store lowering..." Jason Ekstrand <[email protected]> writes: Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/screen: Allow modifiers on sRGB formatsJason Ekstrand2018-08-301-3/+11
| | | | | | | | | | | | | | | | | | | This effectively reverts a26693493570a9d0f0fba1be617e01ee7bfff4db which was a misguided attempt at protecting intel_query_dma_buf_modifiers from invalid formats. Unfortunately, in some internal EGL cases, we can get an SRGB format validly in this function. Rejecting such formats caused us to not allow CCS in some cases where we should have been allowing it. This regressed the performance of some SynMark tests as well as GfxBench ALU2, Tessellation and Manhattan 3.0 tests There's some question of whether or not we really should be using SRGB "fourcc" formats that aren't actually in drm_foucc.h but there's not much harm in allowing them through here. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107223 Fixes: a26693493570 "i965/screen: Return false for unsupported..." Tested-By: Eero Tamminen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl/dri2: Guard against invalid fourcc formatsJason Ekstrand2018-08-301-0/+15
| | | | | | | | | | | | We already reject attempts to import images with invalid fourcc formats but don't really guard the queries all that well. This makes us error out in any calls to eglQueryDmaBufModifiersEXT if the given format is not a valid fourcc format. We also add an assert to ensure that drivers don't advertise any non-fourcc formats. Cc: [email protected] Tested-By: Eero Tamminen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl/dri2: Add a helper for the number of planes for a FOURCC formatJason Ekstrand2018-08-301-11/+21
| | | | | | | | | This also serves as a convenient "is this a fourcc format" check as well which we'll take advantage of in the next commit. Cc: [email protected] Tested-By: Eero Tamminen <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* radv/meta: Set num_components on image_store intrinsicsJason Ekstrand2018-08-303-0/+6
| | | | | | | | | | | | Now that image load/store intrinsics are variable-width, we need to set num_components accordingly. In 15d39f474b890, both glsl_to_nir and spirv_to_nir were updated to properly set num_components but radv meta was left behind. Fixes: 15d39f474b890 "nir: Make image load/store intrinsics..." Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* gallivm: Detect VSX separately from AltivecVicki Pfau2018-08-303-19/+17
| | | | | | | | | | Previously gallivm would attempt to use VSX instructions on all systems where it detected that Altivec is supported; however, VSX was added to POWER long after Altivec, causing lots of crashes on older POWER/PPC hardware, e.g. PPC Macs. By detecting VSX separately from Altivec we can automatically disable it on hardware that supports Altivec but not VSX Signed-off-by: Vicki Pfau <[email protected]>
* nv50: bump compat glsl level to same as coreIlia Mirkin2018-08-291-1/+1
| | | | | | | Passes the compat piglits. I'm sure that there will be odd issues that aren't caught by them, but at least it should basically work. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: bump compat GLSL version to match coreIlia Mirkin2018-08-291-1/+1
| | | | | | This passes the handful of tests in piglit. Signed-off-by: Ilia Mirkin <[email protected]>
* glsl: avoid lowering texcoord array except in simple casesIlia Mirkin2018-08-291-0/+6
| | | | | | | | | With compat creeping up to geometry and tess shaders, lowering texcoord accesses/writes becomes more complicated. Since it's an optimization anyways, just avoid the complication for now. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* docs: update calendar 18.2.0-rc5 is out, extend to 18.2.0-rc6Andres Gomez2018-08-301-2/+2
| | | | Signed-off-by: Andres Gomez <[email protected]>
* st/mesa, gallium: add a workaround for No Mans SkyTimothy Arceri2018-08-306-0/+13
| | | | | | | | The spec seems clear this is not allowed but the Nvidia binary forces apps to add layout qualifiers so this works around the issue for No Mans Sky until the CTS can be sorted out. Reviewed-by: Marek Olšák <[email protected]>
* glsl: add a mechanism to allow layout qualifiers on function paramsTimothy Arceri2018-08-304-0/+25
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The spec is quite clear this is not allowed: From Section 4.4. (Layout Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers can appear in several forms of declaration. They can appear as part of an interface block definition or block member, as shown in the grammar in the previous section. They can also appear with just an interface-qualifier to establish layouts of other declarations made with that qualifier: layout-qualifier interface-qualifier ; Or, they can appear with an individual variable declared with an interface qualifier: layout-qualifier interface-qualifier declaration ;" From Section 4.10 (Memory Qualifiers) of the GLSL 4.60 spec: "Layout qualifiers cannot be used on formal function parameters, and layout qualification is not included in parameter matching." However on the Nvidia binary driver they actually fail to compile if image function params don't have a layout qualifier. This results in applications such as No Mans Sky using layout qualifiers on params. I've submitted a CTS test to expose this problem in the Nvidia driver but until that is resolved this patch will help Mesa drivers work around the issue. Reviewed-by: Marek Olšák <[email protected]>
* glsl: skip stringification in preprocessor if in unreachable branchTimothy Arceri2018-08-301-2/+4
| | | | | | | This fixes compilation of some "No Mans Sky" shaders where the stringification happens in branches intended for DX12. Reviewed-by: Ian Romanick <[email protected]>
* radv: Add missing checks in radv_get_image_format_properties.Bas Nieuwenhuizen2018-08-301-0/+19
| | | | | CC: <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallivm: allow to pass two swizzles into fetches.Dave Airlie2018-08-304-41/+79
| | | | | | | | | | | | This hijacks the top 16-bits of swizzle, to pass in the swizzle for the second channel. This fixes handling .yx swizzles of 64-bit values. This should fixup radeonsi and llvmpipe. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107524 Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: enable radeonsi_zerovram for No Mans SkyTimothy Arceri2018-08-301-0/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add radeonsi_zerovram driconfig optionTimothy Arceri2018-08-303-3/+13
| | | | | | | More and more games seem to require this so lets make it a config option. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: enable GL 4.5 in compat profileTimothy Arceri2018-08-301-2/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: enable ARB_direct_state_access in compat for GL3.1+Timothy Arceri2018-08-304-104/+114
| | | | | | | | We could enable it for lower versions of GL but this allows us to just use the existing version/extension checks that are already used by the core profile. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add a thorough clear/copy_buffer benchmarkMarek Olšák2018-08-299-153/+599
|
* radeonsi: let internal compute dispatches tune WAVES_PER_SHMarek Olšák2018-08-292-0/+9
|
* radeonsi: add TGSI_SEMANTIC_CS_USER_DATA for reading up to 4 SGPRs with TGSIMarek Olšák2018-08-296-3/+34
|
* radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA_SI for measuring DMA on SIMarek Olšák2018-08-292-0/+20
| | | | DMA on SI doesn't support the timestamp packet, so it's emulated.
* radeonsi: add SI_QUERY_TIME_ELAPSED_SDMA for measuring SDMA performanceMarek Olšák2018-08-295-2/+55
|
* radeonsi: add flag L2_STREAM for minimal cache usageMarek Olšák2018-08-293-6/+13
|
* gallium: add TGSI_MEMORY_STREAM_CACHE_POLICYMarek Olšák2018-08-294-4/+12
| | | | For internal radeonsi shaders.
* intel/compiler: Remove surface_idx from brw_image_paramJason Ekstrand2018-08-296-29/+9
| | | | | | | Now that the drivers are lowering to surface indices themselves, we no longer need to push the surface index into the shader. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Use TXS for image_size when we have a typed surfaceJason Ekstrand2018-08-295-4/+74
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965: Lower away image derefs in the driverJason Ekstrand2018-08-2910-169/+371
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the back-end compiler turn image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Add handle/index-based image intrinsicsJason Ekstrand2018-08-293-20/+82
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use a bitfield for image access qualifiersJason Ekstrand2018-08-299-33/+44
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/link,i965: Make ImageAccess four-stateJason Ekstrand2018-08-294-15/+22
| | | | | | | | | | | | | | | | | The GLSL spec allows you to set both the "readonly" and "writeonly" qualifiers on images to indicate that it can only be used with imageSize. However, we had no way of representing this int he linked shader and flagged it as GL_READ_ONLY. This is good from a "does it use this buffer?" perspective but not from a format and access lowering perspective. By using GL_NONE for if "readonly" and "writeonly" are both set, we can detect this case in the driver and handle it correctly. Nothing currently relies on the type of surface in the "readonly" + "writeonly" case but that's about to change. i965 is the only drier which uses the ImageAccess field and gl_bindless_image::access is currently unused. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Use two components for 1D array image sizesJason Ekstrand2018-08-292-29/+20
| | | | | | | | | | Having the array length component stored in .z was a small convenience for the ISL image param filling code and an annoyance in the NIR lowering code. The only convenience of treating 1D arrays like 2D arrays in the lowering code is in the address calculation code so let's put all the complexity there as well. Reviewed-by: Kenneth Graunke <[email protected]>
* isl: Use the view array length for the image sizeJason Ekstrand2018-08-291-2/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Do image load/store lowering to NIRJason Ekstrand2018-08-2910-1120/+896
| | | | | | | | | | | | | | | | | | | | | | | This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <[email protected]>
* nir/types: Add a wrapper for coordinate_componentsJason Ekstrand2018-08-292-0/+8
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Remove dead image loads in lower_input_attacnmentsJason Ekstrand2018-08-291-2/+2
| | | | | | | Dead code will get rid of them eventually but it's better if they're just gone so we guarantee they won't trip up later passes. Reviewed-by: Kenneth Graunke <[email protected]>