path: root/src/gallium/drivers/radeonsi
Commit message | Author | Age | Files | Lines
* radeonsi: don't forget to add HTILE to the buffer list for texturing | Marek Olšák | 2017-01-20 | 1 | -6/+13
    This fixes VM faults. Discovered by Samuel Pitoiset.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
    Cc: 17.0 13.0 <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    (cherry picked from commit e490b7812cae778c61004971d86dc8299b6cd240)
* radeonsi: fix texture gather on stencil textures | Nicolai Hähnle | 2017-01-20 | 1 | -2/+16
    At least on VI, texture gather doesn't work with a 24_8 data format, so use 8_8_8_8 and a modified swizzle instead.

    A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select the X24S8 pipe format because we don't support stencil-only render targets properly. With mip-mapping this can lead to a setup where the tiling is incompatible with stencil texturing, and a flushed stencil texture is used. For the flushed stencil, a literal X24S8 is used because there were issues with an 8bpp DB->CB copy.

    Longer term, it would be good if we could get away from these workarounds, i.e. properly support an S8 format for stencil-only rendering and flushed stencil. Since stencil texturing is somewhat rare, it's not a high priority.

    Fixes GL45-CTS.texture_cube_map_array.sampling.

    Cc: 17.0 <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Acked-by: Edward O'Callaghan <[email protected]>
    (cherry picked from commit 3cd092c41508dde2e6259f09df1736911a828548)
* radeonsi: Always leave poly_offset in a valid state | Zachary Michaels | 2017-01-20 | 1 | -1/+3
    This commit makes si_update_poly_offset set poly_offset to NULL if uses_poly_offset is false. This way poly_offset either points into the currently queued rasterizer, or it is NULL.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
    Cc: "13.0 17.0" <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    (cherry picked from commit d7d32b3bfe86bd89d94d59393907bce1cb9dab7c)
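The invariant described in the poly_offset commit (the pointer is either NULL or points into the currently queued rasterizer) can be sketched in isolation. This is a minimal illustration, not the driver's code: the struct layouts, field names other than `poly_offset` and `uses_poly_offset`, and the function name `update_poly_offset` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's state objects. */
struct rasterizer_state {
    bool uses_poly_offset;
    float offset_units;
};

struct context {
    const struct rasterizer_state *queued_rs; /* currently queued rasterizer */
    const float *poly_offset;                 /* NULL or points into queued_rs */
};

/* Re-establish the invariant: poly_offset is NULL unless the queued
 * rasterizer actually uses polygon offset. */
void update_poly_offset(struct context *ctx)
{
    assert(ctx != NULL);
    const struct rasterizer_state *rs = ctx->queued_rs;

    if (rs == NULL || !rs->uses_poly_offset) {
        ctx->poly_offset = NULL; /* never leave a stale pointer behind */
        return;
    }
    ctx->poly_offset = &rs->offset_units;
}
```

With this shape, switching from a rasterizer that uses polygon offset to one that does not resets the pointer instead of leaving it aimed at the previous state object.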
* radeonsi: determine in advance which VBOs should be added to the buffer list | Marek Olšák | 2017-01-18 | 3 | -4/+11
    v2: now it should be correct

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors | Marek Olšák | 2017-01-18 | 1 | -8/+9
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: reject invalid vertex buffer indices at state creation | Marek Olšák | 2017-01-18 | 2 | -5/+6
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a global dirty mask for shader pointers | Marek Olšák | 2017-01-18 | 4 | -41/+51
    Only vertex buffers use a separate bool flag.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a bitmask-based loop in si_decompress_textures | Marek Olšák | 2017-01-18 | 3 | -7/+31
    Reviewed-by: Nicolai Hähnle <[email protected]>
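A bitmask-based loop, as named in the commit subject above, generally visits only the slots whose bits are set instead of iterating over every slot. The sketch below shows the general technique only; `bit_scan` and `visit_flagged` are hypothetical names, and nothing here is taken from si_decompress_textures itself.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: return the index of the lowest set bit and clear it. */
int bit_scan(uint32_t *mask)
{
    assert(*mask != 0);
    int i = 0;
    while (!((*mask >> i) & 1u))
        i++;
    *mask &= *mask - 1; /* clear the lowest set bit */
    return i;
}

/* Visit only the slots flagged in `mask`, recording their indices in
 * ascending order. Returns the number of slots visited. */
unsigned visit_flagged(uint32_t mask, int *visited, unsigned max)
{
    unsigned n = 0;
    while (mask && n < max)
        visited[n++] = bit_scan(&mask);
    return n;
}
```

The loop cost scales with the number of set bits, so an empty mask costs a single comparison.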
* radeonsi: skip an unnecessary mutex lock for L2 prefetches | Marek Olšák | 2017-01-18 | 1 | -5/+7
    the mutex lock is inside util_range_add.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches | Marek Olšák | 2017-01-18 | 2 | -5/+12
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE | Marek Olšák | 2017-01-18 | 2 | -10/+15
    the next commit will use it in a clever way, because the CP DMA prefetch doesn't need this.

    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the correct target machine when building shader variants | Marek Olšák | 2017-01-18 | 2 | -14/+29
    If the shader selector is created with a different context than the shader variant, we should use the calling context's target machine for the shader variant.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move shader pipe context state into a separate structure | Marek Olšák | 2017-01-18 | 2 | -14/+22
    Reviewed-by: Nicolai Hähnle <[email protected]>
* android: ac/debug: move sid_tables.h generation and IB decode to amd/common | Mauro Rossi | 2017-01-18 | 1 | -12/+3
    This patch ports the following commits to Android:
    b838f64 "ac/debug: Move sid_tables.h generation to common code."
    0ef1b4d "ac/debug: Move IB decode to common code."

    Fixes Android build errors caused by sid_tables.h, ac_debug.c and ac_debug.h having moved to amd/common.

    Tested by building nougat-x86

    Acked-by: Nicolai Hähnle <[email protected]>
    Acked-by: Emil Velikov <[email protected]>
* android: radeonsi: fix LLVMInitializeAMDGPU* functions declaration | Mauro Rossi | 2017-01-18 | 1 | -0/+2
    LLVMInitializeAMDGPU* functions need to be explicitly declared, and mesa expects them via the <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available.

    A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose; the same mechanism is also used by other llvm targets, e.g. FORCE_BUILD_ARM.

    A necessary prerequisite is to have the AMDGPU target handled accordingly in the llvm config files, i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes.

    This avoids the following build errors:

    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:129:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
            LLVMInitializeAMDGPUTargetInfo();
            ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:130:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
            LLVMInitializeAMDGPUTarget();
            ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:131:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
            LLVMInitializeAMDGPUTargetMC();
            ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:132:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
            LLVMInitializeAMDGPUAsmPrinter();
            ^

    Acked-by: Nicolai Hähnle <[email protected]>
    Acked-by: Emil Velikov <[email protected]>
* radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+ | Marek Olšák | 2017-01-17 | 1 | -2/+5
    Cc: 17.0 13.0 <[email protected]>
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add flags parameter to texture barrier | Ilia Mirkin | 2017-01-16 | 1 | -1/+1
    This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches.

    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_TGSI_FS_FBFETCH | Ilia Mirkin | 2017-01-16 | 1 | -0/+1
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix R600_DEBUG=nooptvariant | Nicolai Hähnle | 2017-01-16 | 1 | -1/+1
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Vedran Miletić <[email protected]>
* radeonsi: implement GL_FIXED vertex format | Marek Olšák | 2017-01-16 | 3 | -7/+20
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: implement 32-bit SNORM/UNORM/SSCALED/USCALED vertex formats | Marek Olšák | 2017-01-16 | 3 | -18/+90
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make fix_fetch 64-bit | Marek Olšák | 2017-01-16 | 5 | -9/+9
    v2: add u_bit_consecutive64

    Reviewed-by: Nicolai Hähnle <[email protected]>
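The v2 note mentions a helper named u_bit_consecutive64. Going by the name alone, such a helper presumably builds a 64-bit mask of consecutive set bits; the standalone sketch below illustrates that idea under that assumption. The function `bit_consecutive64` is a hypothetical name and should not be read as the Mesa implementation.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: a mask with `count` consecutive bits set,
 * starting at bit `start`. */
uint64_t bit_consecutive64(unsigned start, unsigned count)
{
    assert(start + count <= 64);
    if (count == 0)
        return 0;
    if (count == 64)
        return ~(uint64_t)0; /* shifting a 64-bit value by 64 is undefined in C */
    return (((uint64_t)1 << count) - 1) << start;
}
```

The special cases matter once a mask grows from 32 to 64 bits: a naive `(1 << count) - 1` would overflow a 32-bit `int` and is undefined for a full-width shift.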
* radeonsi: replace si_shader_context::soa by bld_base | Samuel Pitoiset | 2017-01-13 | 3 | -82/+78
    We no longer need to use lp_build_tgsi_soa_context.

    No regressions found with full piglit run.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: replace ctx->soa.outputs by ctx->outputs | Samuel Pitoiset | 2017-01-13 | 2 | -23/+26
    The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base).

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: move si_shader_context::soa::addr to si_shader_context | Samuel Pitoiset | 2017-01-13 | 3 | -11/+12
    The plan is to replace si_shader_context::soa with its parent structure (ie. bld_base).

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: allocate the array of immediates dynamically | Samuel Pitoiset | 2017-01-13 | 3 | -13/+24
    Currently, we can store up to 256 immediates in a static array, but this is not always enough. Instead, allocate a dynamic array like what we currently do for temps.

    This fixes a segfault with dEQP-GLES31.functional.ssbo.layout.random.all_shared_buffer.23

    No regressions found with full piglit run.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
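The change from a fixed 256-entry array to a dynamically sized one can be illustrated with a generic growable array. This is a sketch of the technique only: `struct imm_array`, `imm_array_push` and `imm_array_free` are hypothetical names, the four-component layout and the doubling growth policy are assumptions, and the driver's actual allocation scheme may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical growable array of immediates (4 components each). */
struct imm_array {
    float (*data)[4];
    unsigned count;
    unsigned capacity;
};

/* Append one immediate, growing the storage when it is full.
 * Returns 0 on success, -1 if the allocation fails. */
int imm_array_push(struct imm_array *arr, const float value[4])
{
    assert(arr != NULL);
    if (arr->count == arr->capacity) {
        unsigned new_cap = arr->capacity ? arr->capacity * 2 : 16;
        float (*grown)[4] = realloc(arr->data, new_cap * sizeof(*grown));
        if (!grown)
            return -1; /* the old storage is still valid */
        arr->data = grown;
        arr->capacity = new_cap;
    }
    for (int c = 0; c < 4; c++)
        arr->data[arr->count][c] = value[c];
    arr->count++;
    return 0;
}

/* Release the storage and reset the array to its empty state. */
void imm_array_free(struct imm_array *arr)
{
    free(arr->data);
    arr->data = NULL;
    arr->count = arr->capacity = 0;
}
```

Starting from a zero-initialized `struct imm_array`, pushing a 257th element simply triggers another growth step rather than writing past a fixed bound.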
* radeonsi: remove unused si_prepare_cube_coords | Nicolai Hähnle | 2017-01-13 | 2 | -200/+0
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* amd/common: unify cube map coordinate handling between radeonsi and radv | Nicolai Hähnle | 2017-01-13 | 3 | -1/+11
    Code is taken from a combination of radv (for the more basic functions, to avoid gallivm dependencies) and radeonsi (for the new and improved derivative calculations).

    v2: add 0.5 offset to tex coords only after derivative calculation
    v3:
    - really only touch the first three coordinates
    - rebase on the removal of the 1.5 --> 0.5 offset change

    Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2)
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only touch first three coordinates in si_prepare_cube_coords | Nicolai Hähnle | 2017-01-13 | 1 | -12/+1
    Sourcing coords_arg[4] is actually never correct, since bias is handled differently in tex_fetch_args anyway.

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: remove unused si_llvm_cube_to_2d_coords | Nicolai Hähnle | 2017-01-13 | 1 | -28/+0
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: restrict cube map derivative computations to the correct plane | Nicolai Hähnle | 2017-01-13 | 1 | -23/+107
    As remarked by the comment in the original code, the old algorithm fails when (tc + deriv) points at a different cube face. Instead, simply project the derivative directly to the plane of the selected cube face.

    The new code is based on exactly differentiating (using the chain rule) the projection onto a plane corresponding to a fixed cube map face (which is still selected in the usual way based on the texture coordinate itself). The computations end up fairly involved, but we do save two reciprocal computations.

    Fixes GL45-CTS.texture_cube_map_array.sampling.

    v2: add 0.5 offset to tex coords only after derivative calculation
    v3: go back to 1.5 offset

    Reviewed-by: Bas Nieuwenhuizen <[email protected]> (v2)
    Reviewed-by: Marek Olšák <[email protected]>
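As an illustration of what differentiating the projection onto a fixed face involves, consider a face whose major axis is z with z > 0. The symbols below (direction vector (x, y, z), face coordinates s and t, screen-space variable u) are chosen for this sketch and are not taken from the commit; per-face sign conventions and the scale and offset applied by the driver are omitted.

```latex
% Projection onto the plane of a fixed face with major axis z > 0:
s = \frac{x}{z}, \qquad t = \frac{y}{z}

% Quotient/chain rule for a screen-space derivative along u:
\frac{\partial s}{\partial u}
  = \frac{1}{z}\,\frac{\partial x}{\partial u}
  - \frac{x}{z^{2}}\,\frac{\partial z}{\partial u},
\qquad
\frac{\partial t}{\partial u}
  = \frac{1}{z}\,\frac{\partial y}{\partial u}
  - \frac{y}{z^{2}}\,\frac{\partial z}{\partial u}
```

Because the face is held fixed, the derivative stays in that face's plane even when the coordinate plus the derivative would land on a neighboring face.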
* radeonsi: communicate cube map coordinates more explicitly | Nicolai Hähnle | 2017-01-13 | 1 | -33/+43
    v2: fix compile error that snuck in during rebase

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: move .gitignore for sid_tables.h too | Grazvydas Ignotas | 2017-01-13 | 1 | -1/+0
    b838f642 "ac/debug: Move sid_tables.h generation to common code." moved sid_tables.h but forgot the corresponding .gitignore.

    Signed-off-by: Grazvydas Ignotas <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* ac, radeonsi: automake: add missing builddir include | Emil Velikov | 2017-01-12 | 1 | -0/+1
    The generated file is correctly stored in the builddir as of an earlier commit. Yet that commit forgot to add the respective include flag, so the compiler would error out, failing to find sid_tables.h.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99389
    Fixes: d1dc22eb466 "ac: automake: rework sid_tables.h generation"
    Signed-off-by: Emil Velikov <[email protected]>
* radeonsi: num_records is in units of stride for swizzled buffers even on VI | Nicolai Hähnle | 2017-01-12 | 1 | -2/+0
    The old setting didn't hurt, but this is cleaner.

    Reviewed-by: Marek Olšák <[email protected]>
* ac/debug: Dump indirect buffers. | Bas Nieuwenhuizen | 2017-01-09 | 1 | -3/+6
    This is for handling chained command buffers and secondary command buffers. It doesn't handle the trace id for secondary command buffers yet, but I don't think that is possible in general with just writes, as we could call a secondary command buffer multiple times.

    I think this is good enough for now, as the most useful case is the chaining when we grow an IB.

    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/debug: Move IB decode to common code. | Bas Nieuwenhuizen | 2017-01-09 | 3 | -332/+15
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* ac/debug: Move sid_tables.h generation to common code. | Bas Nieuwenhuizen | 2017-01-09 | 3 | -308/+1
    Signed-off-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: fix the Witcher 2 black transitions | Marek Olšák | 2017-01-09 | 1 | -2/+13
    v2: do it properly

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98238
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set si_shader_context::input_decls for ranged decls correctly | Marek Olšák | 2017-01-09 | 1 | -1/+4
    This has no effect because no code uses those members with ranged decls.

    Tested-by: Edmondo Tommasina <[email protected]>
    Acked-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: cleanly communicate whether si_shader_dump should check R600_DEBUG | Marek Olšák | 2017-01-09 | 5 | -13/+15
    Tested-by: Edmondo Tommasina <[email protected]>
    Acked-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add TC L2 prefetch for shaders and VBO descriptors | Marek Olšák | 2017-01-06 | 3 | -1/+50
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add CP DMA flags for greater control over synchronization | Marek Olšák | 2017-01-06 | 3 | -16/+31
    for L2 prefetch

    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: cleanly communicate which CP DMA packet is first | Marek Olšák | 2017-01-06 | 1 | -11/+21
    Reviewed-by: Edward O'Callaghan <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add HUD queries for cache flush stats | Marek Olšák | 2017-01-06 | 1 | -0/+5
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't count fast clears and prefetches into CP DMA stats | Marek Olšák | 2017-01-06 | 1 | -2/+6
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't wait for compute shaders in texture_barrier | Marek Olšák | 2017-01-06 | 1 | -2/+1
    it doesn't interact with compute shaders in any way

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: assume that a TES without POSITION precedes GS | Marek Olšák | 2017-01-06 | 1 | -1/+2
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: unduplicate VS color export code | Marek Olšák | 2017-01-06 | 1 | -9/+2
    it's exactly the same as the other ones

    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up more HAVE_LLVM #ifdefs | Marek Olšák | 2017-01-06 | 1 | -8/+11
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
    Reviewed-by: Nicolai Hähnle <[email protected]>