aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: allocate the array of immediates dynamicallySamuel Pitoiset2017-01-133-13/+24
| | | | | | | | | | | | | | | Currently, we can store up to 256 immediates in a static array, but this is not always enough. Instead, allocate a dynamic array like what we currently do for temps. This fixes a segfault with dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23 No regressions found with full piglit run. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: remove some unused macros and functionsGrazvydas Ignotas2017-01-132-33/+1
| | | | | | | | | These seem unlikely to be used. Also remove irrelevant comment about SKL. v2: forgot to rebase on master Signed-off-by: Grazvydas Ignotas <[email protected]>
* anv: Avoid some resolves for samplable HiZ buffersNanley Chery2017-01-121-18/+49
| | | | | | | v2: Simplify nested ifs (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Enable sampling from HiZNanley Chery2017-01-122-4/+17
| | | | | | | v2: Restrict ISL_AUX_USAGE_HIZ to depth aspects Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Don't fast depth clear samplable HiZ buffers on BDWNanley Chery2017-01-121-0/+9
| | | | | | | | Avoid the resolves that would be required if fast depth clears were allowed for such buffers. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Add a helper to determine sampling with HiZNanley Chery2017-01-121-0/+7
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* isl/surface_state: Handle ISL_AUX_USAGE_HIZNanley Chery2017-01-121-0/+29
| | | | | | | v2: Remove redundant x/y offset asserts (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Perform HiZ resolves only on layout transitionsNanley Chery2017-01-122-56/+42
| | | | | | | | | | This is a better mapping to the Vulkan API and improves performance in all tested workloads. v2: Remove unnecessary image view aspect checks (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Disable HiZ for input attachmentsNanley Chery2017-01-122-16/+24
| | | | | | | | | v2 (Jason Ekstrand): - Add spec citation - Drop conditional Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Avoid resolves incurred by fast depth clearsNanley Chery2017-01-123-6/+23
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Prepare for transitioning to the requested final layoutNanley Chery2017-01-122-0/+6
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Store depth stencil layoutsNanley Chery2017-01-123-0/+17
| | | | | | | | | Store the current and requested depth stencil layouts so that we can perform the appropriate HiZ resolves for a given transition while recording a render pass. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Add helpers to handle depth buffer layout transitionsNanley Chery2017-01-121-0/+50
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Delete anv's HiZ op emit functionNanley Chery2017-01-123-233/+0
| | | | | | | This is no longer used. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Use the gen8 BLORP HiZ resolving functionNanley Chery2017-01-121-3/+24
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Add a gen8 HiZ op resolve functionNanley Chery2017-01-122-0/+88
| | | | | | | | | Add an entry point for resolving using BLORP's gen8 HiZ op function. v2: Manually add the aux info Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Use gen8 BLORP HiZ clearing functionsNanley Chery2017-01-122-5/+50
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp_clear: Add gen8 HiZ clearing functionsNanley Chery2017-01-122-0/+93
| | | | | | | | | | | | | Add an entry point for the optimized gen8 BLORP HiZ sequence. commit c9eaf12de20ac4143fe79d42018bdbb5a391356f fixed a bug that was unknowingly worked around by forcing additional clear rectangle alignment restrictions not specified in the PRMs. Now that the bug is no longer present, omit the additional alignment restrictions. v2: Adjust code comment about padding Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Enable HiZ support for multiple subpassesNanley Chery2017-01-123-13/+8
| | | | | | | | | We'll be using layout transitions later on in the series which can occur within and between subpasses. Turn this on now to simplify the change later. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Use ::anv_attachment_state for toggling HiZ per subpassNanley Chery2017-01-121-2/+4
| | | | | | | | | We're about to enable HiZ support for multiple subpasses. Use this field to keep track of whether or not subpass operations should treat the depth buffer as having an auxiliary HiZ buffer. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZNanley Chery2017-01-125-16/+18
| | | | | | | | | | | The helper doesn't provide additional functionality over the current infrastructure. v2: Add comment to anv_image::aux_usage (Jason Ekstrand) v3: Clarify comment for aux_usage (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Handle ISL_AUX_USAGE_HIZNanley Chery2017-01-121-1/+2
| | | | | | | | | Prevent assert failures that would occur in the next patch. v2: Don't remove asserts from blorp/blit (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/blorp: Add the BDW+ optimized HZ_OP sequence to BLORPNanley Chery2017-01-121-0/+87
| | | | | | | | | | | We'll be switching to layout-transition based resolves which can occur outside of a render pass. Add this sequence to BLORP, as using BLORP will enable emitting depth stencil state outside of a render pass (among other benefits). The depth buffer extent is ignored to enable eventual usage in VkCmdClearAttachments(). Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ac: automake: ensure that ./common is generatedEmil Velikov2017-01-131-0/+1
| | | | | | | | | | | Depending on the autoconf (or friends) version one may or may not have the ./common folder created. Thus in the latter case we'll fail to generate the file. Reviewed-by: Thierry Reding <[email protected]> Tested-by: Darren Salt <[email protected]> Reported-by: Darren Salt <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* nvc0/ir: only try to check for zero LOD if we aren't already forcing itIlia Mirkin2017-01-121-1/+1
| | | | | | | | | | There's a levelZero flag which forces texturing to pick level zero (and not consume an explicit LOD argument). This is set for MS targets, but could also be set for any other incoming instruction. As that is what determines whether a LOD argument is present, check that rather than the more indirect isMS logic. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: take extra push space into account for pushbuf_space callsIlia Mirkin2017-01-1215-56/+26
| | | | | | | | | | | | | | | | | | | | | | Ever since a long time ago when I messed around with fences, I ensure that after a PUSH_SPACE call there is enough space to write a fence out into the pushbuf. However the PUSH_SPACE macro is not all-knowing, and so sometimes we have to invoke nouveau_pushbuf_space manually with the relocs/pushes args set. If we don't take the extra allocation from PUSH_SPACE into account, then we will end up accidentally flushing when the code was not expecting a flush. This can lead to various runtime and rendering failures. The amount of extra allocation isn't that important - it has to be at least 8 based on the current nouveau_winsys.h setting, but even more won't hurt. I just rounded up to powers of 2. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99354 Cc: "12.0 13.0" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Ben Skeggs <[email protected]>
* mapi: update the asm code to support x32Grazvydas Ignotas2017-01-131-3/+28
| | | | | | | | | | | Fixes crashes when both glx-tls and asm are enabled on x32. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94512 Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=575458 Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* ac/nir: use ac_emit_fdiv throughoutNicolai Hähnle2017-01-131-22/+6
| | | | | | | | ... and eliminate emit_fdiv and nir_to_llvm_context::fpmath_md_*, which are now unused. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: use ac_build_gather_values[_extended] throughoutNicolai Hähnle2017-01-131-65/+24
| | | | | | | | ... and eliminate the non-ac copies. Mostly straight-forward search & replace. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/nir: use ac_emit_llvm_intrinsic throughoutNicolai Hähnle2017-01-131-79/+41
| | | | | | | | ... by straight-forward search & replace, and eliminate emit_llvm_intrinsic. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unused si_prepare_cube_coordsNicolai Hähnle2017-01-132-200/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* amd/common: unify cube map coordinate handling between radeonsi and radvNicolai Hähnle2017-01-136-197/+440
| | | | | | | | | | | | | | | Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations). v2: add 0.5 offset to tex coords only after derivative calculation v3: - really only touch the first three coordinates - rebase on the removal of the 1.5 --> 0.5 offset change Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only touch first three coordinates in si_prepare_cube_coordsNicolai Hähnle2017-01-131-12/+1
| | | | | | | | Sourcing coords_arg[4] is actually never correct, since bias is handled differently in tex_fetch_args anyway. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unused si_llvm_cube_to_2d_coordsNicolai Hähnle2017-01-131-28/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: restrict cube map derivative computations to the correct planeNicolai Hähnle2017-01-131-23/+107
| | | | | | | | | | | | | | | | | | | | As remarked by the comment in the original code, the old algorithm fails when (tc + deriv) points at a different cube face. Instead, simply project the derivative directly to the plane of the selected cube face. The new code is based on exactly differentiating (using the chain rule) the projection onto a plane corresponding to a fixed cube map face (which is still selected in the usual way based on the texture coordinate itself). The computations end up fairly involved, but we do save two reciprocal computations. Fixes GL45-CTS.texture_cube_map_array.sampling. v2: add 0.5 offset to tex coords only after derivative calculation v3: go back to 1.5 offset Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2) Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: communicate cube map coordinates more explicitlyNicolai Hähnle2017-01-131-33/+43
| | | | | | | v2: fix compile error that snuck in during rebase Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: move .gitignore for sid_tables.h tooGrazvydas Ignotas2017-01-131-0/+0
| | | | | | | | b838f642 "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nir/gcm: Fix a typo in a commentJason Ekstrand2017-01-121-1/+1
| | | | Reported-by: Matt Turner <[email protected]>
* nir/gcm: Rework the schedule late loopJason Ekstrand2017-01-121-5/+6
| | | | | | | | | | | | This fixes a bug in code motion that occurred when the best block is the same as the schedule early block. In this case, because we're checking (lca != def->parent_instr->block) at the top of the loop, we never get to the check for loop depth so we wouldn't move it out of the loop. This commit reworks the loop to be a simple for loop up the dominator chain and we place the (lca != def->parent_instr->block) check at the end of the loop. Reviewed-by: Matt Turner <[email protected]>
* glx: Add missing glproto dependency for gallium-xlib glxChuck Atkins2017-01-121-0/+1
| | | | | | | | Cc: [email protected] Cc: Bruce Cherniak <[email protected]> Signed-of-by: Chuck Atkins <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* ac, radeonsi: automake: add missing builddir includeEmil Velikov2017-01-122-0/+2
| | | | | | | | | | The generated file is correctly stored in the builddir as of earlier commit. Yet the commit forgot to add the respective include flag thus the compiler would error out failing to find sid_tables.h Bugzila: https://bugs.freedesktop.org/show_bug.cgi?id=99389 Fixes: d1dc22eb466 "ac: automake: rework sid_tables.h generation" Signed-off-by: Emil Velikov <[email protected]>
* radv: Call NIR passes using NIR_PASS_V.Bas Nieuwenhuizen2017-01-121-17/+7
| | | | | | | | Port of faa1edeeb7bbe9321c79587e592dce812e8caa78 "anv/pipeline: Call NIR passes using NIR_PASS_V" Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: Call nir_lower_constant_initializers.Bas Nieuwenhuizen2017-01-121-0/+13
| | | | | | | | | Port of c5d664f9dc2d281c74844cef36ecb9f5862a8f6a "anv/pipeline: Call nir_lower_constant_initializers" Signed-off-by: Bas Nieuwenhuizen <[email protected]> Cc: <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: Only call remove_dead_variables once.Bas Nieuwenhuizen2017-01-121-3/+3
| | | | | | | | Port of 43e0b0d4b255d910616c10e3e01bfec5db469e0e "anv/pipeline: Only call remove_dead_variables once" Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* st/nine: Protect dtors with mutexAxel Davy2017-01-124-19/+64
| | | | | | | | | | | | | | | | | | | | | When the flag D3DCREATE_MULTITHREAD is set, a global mutex is used to protect nine calls. However for performance reasons, AddRef and Release didn't hold the mutex, and instead used atomics. Unfortunately at item release, the item can be destroyed, and that destruction path should be protected by a mutex (at least for some objects). Without this patch, it is possible an app thread is in a dtor while another thread is making gallium nine calls. It is possible that two threads are using the same gallium pipe, which is forbiden. The problem has been made worse with csmt, because it can cause hang, since nine_csmt_process is not threadsafe. Fixes Hitman hang, and possibly others. Signed-off-by: Axel Davy <[email protected]>
* st/nine: Flush the queue at device dtorAxel Davy2017-01-121-1/+6
| | | | | | | | Flush the queue to get refcounts right, and properly release the items, instead of throwing away all pending commands. Signed-off-by: Axel Davy <[email protected]>
* st/nine: Process pending commands on ResetAxel Davy2017-01-123-0/+5
| | | | | | | | Some nine_state_* and nine_context_* functions used for Reset() require all pending commands are flushed. Signed-off-by: Axel Davy <[email protected]>
* st/nine: Flush pending commands if needed for surface9 changesAxel Davy2017-01-122-13/+32
| | | | | | | nine_context uses NineSurface9 fields, thus we need to flush pending commands using the surface before changing the fields. Signed-off-by: Axel Davy <[email protected]>
* st/nine: Rework CreatePipeSurfaceAxel Davy2017-01-122-22/+30
| | | | | | Create both surfaces in one call. Signed-off-by: Axel Davy <[email protected]>
* st/nine: Remove duplicated checksAxel Davy2017-01-122-10/+7
| | | | | | | | There is no need to check on csmt_active before calling nine_csmt_process, because the function checks already. Signed-off-by: Axel Davy <[email protected]>