summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radv: Disallow indirect outputs for GS on GFX9 as well.Bas Nieuwenhuizen2017-10-231-3/+1
| | | | | | | | Since it also uses the output vector before writing to memory. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* ac/nir: Fix nir_texop_lod on GFX for 1D arrays.Bas Nieuwenhuizen2017-10-231-1/+3
| | | | | Fixes: 1bcb953e166 'radv: handle GFX9 1D textures' Reviewed-by: Dave Airlie <[email protected]>
* radv/ac/nir: only emit tess factors to storage if tes reads themDave Airlie2017-10-233-2/+4
| | | | | | | | | | Otherwise we just need to write them to the tf ring. this seems to improve the tessellation demo on Bonarie ~2190->~2230 fps Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Don't use vgpr indexing for outputs on GFX9.Bas Nieuwenhuizen2017-10-221-0/+5
| | | | | | | | Due to LLVM bugs. Fixes a bunch of dEQP-VK.glsl.indexing.* tests. Fixes: e38685cc62e 'Revert "radv: disable support for VEGA for now."' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Account for compact array index in GS input load from LDS.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | Mirrors the vram path. Fixes: d4ecc3c9299 'ac/nir: Add loading from LDS for merged GS.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't compile shaders when they are cached already.Bas Nieuwenhuizen2017-10-211-19/+23
| | | | | | | | | | When the gs_copy_shader is NULL (due to an incomplete cache), but the main shaders are found, we still do the nir, but we shouldn't compile the shaders again. For merged shaders we should also account for the missing shaders. Fixes: ce03c119ce0 'radv: Add code to compile merged shaders.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't check for max GL GS invocations.Bas Nieuwenhuizen2017-10-211-2/+0
| | | | | | | We specify 127 instead of 32 as the limit in vulkan. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* radv: Don't explicitly reference vertex shader for draw_id.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | | | With merged shaders the vertex shader may not exist. This got in because the offending patch was written before merged shaders were upstream, but committed after. Fixes: 75dfab24a2c 'radv: refactor indirect draws with radv_draw_info' Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Don't reset cmd_buffer->state.dirty.Bas Nieuwenhuizen2017-10-211-2/+0
| | | | | | | | | Otherwise for non-indexed draws we set and immediately unset RADV_CMD_DIRTY_INDEX_BUFFER. As all the set functions should clear their own bit, this is unnecessary. Fixes: 341529dbee5 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <[email protected]>
* radv: Correctly detect changed shaders for vertex descriptors.Bas Nieuwenhuizen2017-10-211-6/+6
| | | | | | | | As they were emitted after the new pipeline, the changed pipeline detection was not working anymore. Fixes: 341529dbee5 'radv: use optimal packet order for draws' Reviewed-by: Samuel Pitoiset <[email protected]>
* ac/nir: Set larged wrokgroup size for GS on GFX9.Bas Nieuwenhuizen2017-10-211-1/+1
| | | | | | | They don't take a single wave anymore and we need the barriers. Fixes: 6bc42855f92 'radv: enable GS on GFX9' Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: Take the max workgroup size of all provided shaders.Bas Nieuwenhuizen2017-10-211-1/+6
| | | | | Fixes: ffaf4d608a1 'radv: Enable tessellation shaders for GFX9.' Reviewed-by: Dave Airlie <[email protected]>
* radv: Fix pipeline cache locking issuesAlex Smith2017-10-211-7/+23
| | | | | | | | | | | | Need to lock around the whole process of retrieving cached shaders, and around GetPipelineCacheData. This fixes GPU hangs observed when creating multiple pipelines in parallel, which appeared to be due to invalid shader code being pulled from the cache. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: don't assert on device init on CannonlakeLionel Landwerlin2017-10-211-2/+4
| | | | | | | v2: Warn that support is still in alpha (Jason) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: disable stencil pma fix on Gen > 9Lionel Landwerlin2017-10-211-0/+2
| | | | | | | | This workaround isn't listed on Gen10. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* blorp: enable R32G32B32X32 blorp ccs copiesLionel Landwerlin2017-10-211-0/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* meson: Fix vc5 deps on the XML-generated headers.Eric Anholt2017-10-202-2/+2
| | | | | I typoed and was depending on v3d_xml.h (the gzipped xml)_, not on the v3d_packet_v33_pack.h that the compiler and QPU packing actually use.
* broadcom/vc5: Propagate vc4 aliasing fix to vc5.Eric Anholt2017-10-201-1/+1
| | | | See e5fea0d621af2b14cf6c5e364eeaf293db460f2a
* broadcom/vc4: Fix aliasing issueStefan Schake2017-10-201-1/+1
| | | | | | | This was causing Android clang version 3.8.256229 to miscompile, presumably due to strict aliasing. Fixes: 14dc281c1332 ("vc4: Enforce one-uniform-per-instruction after optimization.")
* meson: Add support for EGL glvndDylan Baker2017-10-201-2/+44
| | | | | | Signed-off-by: Dylan Baker <[email protected]> Tested-by: Eric Engestrom <[email protected]> Reviewed-by: Lyude Paul <[email protected]>
* meson: build libEGLDylan Baker2017-10-206-25/+216
| | | | | | | | | | | | | | | | | | This is based heavily on Daniel Stone's work for the same, rebased on master and with a number of TODO's fixed. This does not implement glvnd (which is coming in a later patch) Meson builds egl slightly differently than autotools, namely it doesn't build an intermediate shared library. It doesn't do this because meson doesn't have problems with the name of the library being dynamically generated, so the glvnd and non-glvnd code can follow the same path. v2: - Don't reuse variable (Eric E.) Signed-off-by: Dylan Baker <[email protected]> Tested-by: Eric Engestrom <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* meson: move wayland_drm_protocol generation to wayland-drmDylan Baker2017-10-202-15/+13
| | | | | | | | | | | These files are needed by both vulkan wayland-wsi and by egl wayland-wsi, since the XML file is in src/egl/wayland/wayland-drm and we can include this directory in such a way that it will be loaded before egl and vulkan this allows us to avoid multiple calls to the same generator. Signed-off-by: Dylan Baker <[email protected]> Reviewed-and-Tested-by: Eric Engestrom <[email protected]>
* nir: Print the components referenced for split or packed shader in/outs.Eric Anholt2017-10-201-1/+25
| | | | | | | | | | | | | | Having 4 variables all called "gl_in_TexCoord0@n" isn't very informative, much better to see: decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0 (VARYING_SLOT_VAR0.x, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@0 (VARYING_SLOT_VAR0.y, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@1 (VARYING_SLOT_VAR0.z, 1, 0) decl_var shader_in INTERP_MODE_NONE float gl_in_TexCoord0@2 (VARYING_SLOT_VAR0.w, 1, 0) v2: Handle arrays and structs better (by Timothy) Reviewed-by: Timothy Arceri <[email protected]>
* nir: Add a safety check that we don't remove dead I/O vars after lowering.Eric Anholt2017-10-201-4/+14
| | | | | | | | | The pass only looks at var load/store intrinsics, not input load/store intrinsics, so assert that we don't see the other type. v2: Adjust comment indentation. Reviewed-by: Timothy Arceri <[email protected]>
* radv: disable implicit sync for radv allocated bos v3Andres Rodriguez2017-10-214-1/+8
| | | | | | | | | | | | | | | | | Implicit sync kicks in when a buffer is used by two different amdgpu contexts simultaneously. Jobs that use explicit synchronization mechanisms end up needlessly waiting to be scheduled for long periods of time in order to achieve serialized execution. This patch disables implicit synchronization for all radv allocations except for wsi bos. The only systems that require implicit synchronization are DRI2/3 and PRIME. v2: mark wsi bos as RADV_MEM_IMPLICIT_SYNC v3: Add drm version check (Bas) Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: factor out radv_alloc_memoryAndres Rodriguez2017-10-212-5/+25
| | | | | | | | | This allows us to pass extra parameters to the memory allocation operation that are not defined in the vulkan spec. This is useful for internal usage. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Expose VK_EXT_global_priorityAndres Rodriguez2017-10-214-0/+5
| | | | | | | Expose the extension string as supported Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: don't skip PS/VS partial flushAndres Rodriguez2017-10-211-8/+6
| | | | | | | | | | | | This patch helps lower high priority compute latency. Found by bisecting a perf regression on computeparticles with high priority compute queues enabled. Reverting this micro-optimization doesn't seem to have any negative effect on performance on Dota2 or ssao. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: Implement VK_EXT_global_priorityAndres Rodriguez2017-10-214-8/+62
| | | | | | | | | This extension allows the caller to change a queue's system wide priority. This is useful for applications with specific latency constraints. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: hardcode shader WAVE_LIMIT to the maximum valueAndres Rodriguez2017-10-211-7/+14
| | | | | | | | | | | | | This is part of a cooperative scheduling approach used by radv. All drivers in the stack must opt-in to resource arbitration, otherwise GL based apps will be able to ignore system priorities. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: hardcode shader WAVE_LIMIT to the maximum valueAndres Rodriguez2017-10-211-9/+18
| | | | | | | | | | | | | | When WAVE_LIMIT is set, a submission will opt-in for SPI based resource scheduling. Because this mechanism is cooperative, we must ensure that all submissions have this field set, otherwise they will bypass resource arbitration. We always hardcode the field to its maximum value, instead of attempting to calculate an approximate usage. In testing, there were no benefits to using anything other than the maximum. Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* vulkan: update headers & registry to VK 1.0.63Andres Rodriguez2017-10-211-84/+180
| | | | | Signed-off-by: Andres Rodriguez <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* broadcom/vc5: Use SETMSF to handle discards.Eric Anholt2017-10-202-25/+12
| | | | | | | | A bit of spec text suggested that (like vc4) condition codes should be used for discards, and the simulator was fine with it, but the 7268 disagrees and you have to use SETMSF instead or the color comes through. Fixes glsl-fs-discard-01 and many of the interpolation-with-clipping tests.
* broadcom/vc5: Set the snorm/unorm packing functions to be lowered.Eric Anholt2017-10-201-0/+4
| | | | | | We don't have native instructions for them, so set up the lowering. Once we support the bfi instructions that get generated, they should start actually working.
* broadcom/vc5: Fix pasteo that broke vertex texturing.Eric Anholt2017-10-201-1/+1
| | | | | We weren't ever filling in the texture state record, so we'd dereference NULL from the shader.
* broadcom/vc5: Move default attribute value setup to the CSO and fix them.Eric Anholt2017-10-203-29/+23
| | | | | | | | | I was generating some stub values to bring the driver up, but fill them in properly now. We now set 1.0 or 1u as appropriate, and thanks to being in their own BO it fixes piglit failures on the 7268 (where our 4-byte alignment was insufficient). Fixes const-packHalf2x16.shader_test
* broadcom/vc5: Move most of the shader state attribute record to the CSO.Eric Anholt2017-10-204-65/+90
| | | | | This should reduce our draw-time overhead, and puts the code where it should go long term.
* broadcom/vc5: Fix build failure frm nir_shader::stage removal.Eric Anholt2017-10-201-4/+4
| | | | Fixes: 59fb59ad54d3 ("nir: Get rid of nir_shader::stage")
* i965/fs: Use align1 mode on ternary instructions on Gen10+Matt Turner2017-10-201-4/+8
| | | | | | | | | Align1 mode offers some nice features over align16, like access to more data types and the ability to use a 16-bit immediate. This patch does not start using any new features. It just emits ternary instructions in align1 mode. Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add align1 ternary instruction emission supportMatt Turner2017-10-201-55/+160
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add align1 ternary instruction disassembler supportMatt Turner2017-10-202-75/+288
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add align1 ternary instruction-word supportMatt Turner2017-10-201-0/+108
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add align1 ternary instruction support to conversion functionsMatt Turner2017-10-204-34/+101
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add align1 ternary instruction field encodingsMatt Turner2017-10-201-0/+35
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add functions to abstract access to 3src register typesMatt Turner2017-10-202-20/+23
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Rename brw_inst's functions that access the 3src register typeMatt Turner2017-10-203-18/+18
| | | | | | | | | Put hw_ in the name so that it's clear these are the hardware encodings. Similar to commit 9fb832332868 ("i965: Rename brw_inst's functions that access the register type") Reviewed-by: Scott D Phillips <[email protected]>
* i965: Rename brw_inst 3src functions in preparation for align1Matt Turner2017-10-204-86/+92
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Print subreg in units of type-size on ternary instructionsMatt Turner2017-10-201-5/+26
| | | | | | | | The instruction word contains SubRegNum[4:2] so it's in units of dwords (hence the * 4 to get it in terms of bytes). Before this patch, the subreg would have been wrong for DF arguments. Reviewed-by: Scott D Phillips <[email protected]>
* i965: Add functions for brw_reg_type <-> hw 3src typeMatt Turner2017-10-202-0/+58
| | | | Reviewed-by: Scott D Phillips <[email protected]>
* i965: Move brw_reg_type_is_floating_point to brw_reg_type.hMatt Turner2017-10-202-13/+15
| | | | | | | I'm going to call this from brw_inst.h, and I don't want to have to include all of brw_reg.h. Reviewed-by: Scott D Phillips <[email protected]>