path: root/src/gallium
Commit message | Author | Age | Files | Lines
...
* swr: Align query results allocation | George Kyriazis | 2017-01-24 | 2 | -4/+5
  Some query results struct contents are declared as cache line aligned. Use aligned malloc, and align the whole struct, to be safe. Fixes crash when compiling with clang.
  CC: <[email protected]>
  Reviewed-by: Bruce Cherniak <[email protected]>
  (cherry picked from commit 00847e4f14dd237dfcdb2c3d15be1325a08ccf5a)
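The allocation change described above can be sketched in C as follows. This is only an illustration under stated assumptions, not the driver's code: `struct query_result`, `CACHE_LINE`, and both functions are hypothetical names, and C11 `aligned_alloc` stands in for whichever aligned allocator the driver actually uses.

```c
#include <stdint.h>
#include <stdlib.h>

#define CACHE_LINE 64 /* assumed cache line size in bytes */

/* Hypothetical query result; the aligned member raises the alignment
 * requirement (and pads the size) of the whole struct to CACHE_LINE. */
struct query_result {
    _Alignas(CACHE_LINE) uint64_t counters[8];
};

/* Plain malloc() only guarantees alignment for fundamental types, so
 * request the struct's alignment explicitly. */
struct query_result *query_result_create(void)
{
    return aligned_alloc(CACHE_LINE, sizeof(struct query_result));
}

void query_result_destroy(struct query_result *q)
{
    free(q); /* memory from aligned_alloc() is released with free() */
}
```

A compiler is allowed to assume the declared alignment when it emits loads and stores for the struct, which is one plausible way an under-aligned allocation turns into a crash with one compiler but not another.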
* swr: Prune empty nodes in CalculateProcessorTopology. | Bruce Cherniak | 2017-01-24 | 1 | -0/+9
  CalculateProcessorTopology tries to figure out system topology by parsing /proc/cpuinfo to determine the number of threads, cores, and NUMA nodes. There are some architectures where the "physical id" begins with 1 rather than 0, which was creating an empty "0" node and causing a crash in CreateThreadPool.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97102
  Reviewed-By: George Kyriazis <[email protected]>
  CC: <[email protected]>
  (cherry picked from commit b829206b0739925501bcc68233437d6d03b79795)
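The pruning step can be sketched in C as below. The real function is C++ and works on the driver's own containers; `struct node`, `num_cores`, and `prune_empty_nodes` here are hypothetical simplifications used only to show the idea of dropping nodes that ended up with no cores.

```c
#include <stddef.h>

/* Hypothetical, simplified topology entry: how many cores were
 * discovered for this node while parsing. */
struct node {
    unsigned num_cores;
};

/* Compact the array in place, dropping nodes that have no cores.
 * Returns the number of nodes that remain. */
size_t prune_empty_nodes(struct node *nodes, size_t count)
{
    size_t kept = 0;
    for (size_t i = 0; i < count; i++) {
        if (nodes[i].num_cores != 0)
            nodes[kept++] = nodes[i];
    }
    return kept;
}
```

With a "physical id" that starts at 1, the array would begin with an empty entry for node 0; after pruning, later code that assumes every node has at least one core no longer sees it.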
* r600: implement DDIV | Nicolai Hähnle | 2017-01-24 | 1 | -0/+59
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit e4f8f9a638c1ffb9b76840b088290f11f0f91813)
* r600: factor out cayman_emit_unary_double_raw | Nicolai Hähnle | 2017-01-24 | 1 | -20/+42
  We will use it for DDIV.
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit 488560cfe6ee2206f7a7f894694ebc43b419be61)
* r600: double multiply can handle only one multiply at a time | Nicolai Hähnle | 2017-01-24 | 1 | -17/+19
  It seems clear that trying to multiply two pairs of doubles would result in the temporary register getting overwritten by the second pair. So make the code more explicit.
  Tested-by: Glenn Kennard <[email protected]>
  Tested-by: James Harvey <[email protected]>
  Cc: 17.0 <[email protected]>
  (cherry picked from commit 76b02d2fe1df5351f67f53d07b37952043f0a84c)
* freedreno/a5xx: set frag shader threadsize | Rob Clark | 2017-01-24 | 1 | -2/+7
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 31daeb5bf14334bc0d39f28c9102cd15d834abfc)
* freedreno/a5xx: set fragcoordxy properly | Rob Clark | 2017-01-24 | 1 | -1/+1
  This is what the a3xx docs call IJPERSPCENTERREGID: the xy coord passed into bary.f. We were incorrectly setting both this and gl_FragCoord.xy to the same register, resulting in all sorts of hilarity.
  Fixes stk, vdrift, 0ad, and probably a bunch of others.
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 8d6af93e76bb9e592293b632b22b2b756cc0cae8)
* freedreno/a5xx: fix psize | Rob Clark | 2017-01-24 | 2 | -8/+5
  Note spritelist (POINTLIST_PSIZE) seems not to be a thing anymore on a5xx.
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 6cc93bedc15d09395ab6a92a0a129d06a8cd8ae8)
* freedreno/a5xx: srgb fix | Rob Clark | 2017-01-24 | 1 | -1/+2
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 141a4f86d6b9c0c4dbde511b741576a103f8f7ff)
* freedreno/a5xx: fix int vbos | Rob Clark | 2017-01-24 | 1 | -1/+3
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 69fbb458cf59fbab5f6675ad256a266b04d54700)
* freedreno/a5xx: fix clear for uint/sint formats | Rob Clark | 2017-01-24 | 1 | -19/+28
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 16671e970444f154ffa60d2aaadee4d065eb6103)
* freedreno/a5xx: fix cull state | Rob Clark | 2017-01-24 | 1 | -5/+5
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 4d9aa4f67d6316feea93901bf29b76a68c4333cd)
* freedreno: update generated headers | Rob Clark | 2017-01-24 | 6 | -13/+36
  Signed-off-by: Rob Clark <[email protected]>
  Cc: "17.0" <[email protected]>
  (cherry picked from commit 4c39458460075f6c1ea9e4607769513b96c6dd82)
* gallium/hud: add missing break in hud_cpufreq_graph_install() | Samuel Pitoiset | 2017-01-20 | 1 | -0/+1
  Fixes: e99b9395bef "gallium/hud: Add support for CPU frequency monitoring"
  Cc: [email protected]
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  (cherry picked from commit 383fc8e9f340e80695aca2cd585957af0e081eb9)
* radeonsi: don't forget to add HTILE to the buffer list for texturing | Marek Olšák | 2017-01-20 | 1 | -6/+13
  This fixes VM faults. Discovered by Samuel Pitoiset.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=98975
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99450
  Cc: 17.0 13.0 <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  (cherry picked from commit e490b7812cae778c61004971d86dc8299b6cd240)
* radeonsi: fix texture gather on stencil textures | Nicolai Hähnle | 2017-01-20 | 1 | -2/+16
  At least on VI, texture gather doesn't work with a 24_8 data format, so use 8_8_8_8 and a modified swizzle instead.
  A bit of background: When creating a GL_STENCIL_INDEX8 texture, we select the X24S8 pipe format because we don't support stencil-only render targets properly. With mip-mapping this can lead to a setup where the tiling is incompatible with stencil texturing, and a flushed stencil texture is used. For the flushed stencil, a literal X24S8 is used because there were issues with an 8bpp DB->CB copy.
  Longer term, it would be good if we could get away from these workarounds, i.e. properly support an S8 format for stencil-only rendering and flushed stencil. Since stencil texturing is somewhat rare, it's not a high priority.
  Fixes GL45-CTS.texture_cube_map_array.sampling.
  Cc: 17.0 <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Acked-by: Edward O'Callaghan <[email protected]>
  (cherry picked from commit 3cd092c41508dde2e6259f09df1736911a828548)
* radeonsi: Always leave poly_offset in a valid state | Zachary Michaels | 2017-01-20 | 1 | -1/+3
  This commit makes si_update_poly_offset set poly_offset to NULL if uses_poly_offset is false. This way poly_offset either points into the currently queued rasterizer, or it is NULL.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99451
  Cc: "13.0 17.0" <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  (cherry picked from commit d7d32b3bfe86bd89d94d59393907bce1cb9dab7c)
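The invariant described above (the pointer is either NULL or points into the currently queued rasterizer) can be sketched in C as follows. The structures and the `offset_state` field are hypothetical stand-ins, not the driver's real definitions.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver's structures. */
struct rasterizer {
    bool uses_poly_offset;
    int offset_state[4];
};

struct context {
    struct rasterizer *queued_rs;
    int *poly_offset; /* NULL, or points into *queued_rs */
};

void update_poly_offset(struct context *ctx)
{
    struct rasterizer *rs = ctx->queued_rs;

    if (!rs || !rs->uses_poly_offset) {
        /* Reset instead of returning early, so a pointer into a
         * previously queued rasterizer is never left behind. */
        ctx->poly_offset = NULL;
        return;
    }
    ctx->poly_offset = rs->offset_state;
}
```

Resetting on the "not used" path is what keeps the pointer from dangling once an older rasterizer object is destroyed.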
* gallivm: use #ifdef not #if for PIPE_ARCH_BIG_ENDIAN | Dave Airlie | 2017-01-20 | 1 | -1/+1
  This fixes the build on ppc/s390.
  Reviewed-by: Roland Scheidegger <[email protected]>
  Cc: "17.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
  (cherry picked from commit ef71b867ee152d8161a8c7320e89843801236249)
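Why `#ifdef` matters here can be illustrated with a small C sketch. It assumes the endianness macro is defined without a value, as feature macros often are; `DEMO_ARCH_BIG_ENDIAN` and `demo_is_big_endian` are hypothetical names, not the real ones.

```c
/* Hypothetical feature macro, defined with no value. */
#define DEMO_ARCH_BIG_ENDIAN

int demo_is_big_endian(void)
{
    /* #ifdef only asks whether the name is defined, so an empty
     * definition is fine. Writing "#if DEMO_ARCH_BIG_ENDIAN" instead
     * would expand to "#if" with no expression, which the
     * preprocessor rejects, but only on builds that define it. */
#ifdef DEMO_ARCH_BIG_ENDIAN
    return 1;
#else
    return 0;
#endif
}
```

On builds where the macro is not defined at all, `#if` quietly evaluates the unknown name as 0, which would explain a mistake of this kind going unnoticed until a big endian build.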
* radeonsi: determine in advance which VBOs should be added to the buffer list | Marek Olšák | 2017-01-18 | 3 | -4/+11
  v2: now it should be correct
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use fewer pointer dereferences in upload_vertex_buffer_descriptors | Marek Olšák | 2017-01-18 | 1 | -8/+9
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: reject invalid vertex buffer indices at state creation | Marek Olšák | 2017-01-18 | 2 | -5/+6
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a global dirty mask for shader pointers | Marek Olšák | 2017-01-18 | 4 | -41/+51
  Only vertex buffers use a separate bool flag.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use a bitmask-based loop in si_decompress_textures | Marek Olšák | 2017-01-18 | 3 | -7/+31
  Reviewed-by: Nicolai Hähnle <[email protected]>
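A bitmask-based loop of the kind named in the subject visits only the set bits of a mask instead of testing every slot. The C sketch below shows the general technique with a portable bit search; `for_each_set_bit` and its output array are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Visit only the set bits of mask, lowest first. Each visited bit
 * index is stored in visited[]; the return value is how many bits
 * were visited. */
unsigned for_each_set_bit(uint32_t mask, unsigned visited[32])
{
    unsigned n = 0;

    while (mask) {
        unsigned i = 0;
        while (!(mask & (UINT32_C(1) << i)))
            i++;              /* index of the lowest set bit */
        mask &= mask - 1;     /* clear that bit */
        visited[n++] = i;
    }
    return n;
}
```

The loop body runs once per set bit, so a mostly empty mask costs almost nothing, which is the usual motivation for replacing a fixed-count loop with this form.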
* radeonsi: skip an unnecessary mutex lock for L2 prefetches | Marek Olšák | 2017-01-18 | 1 | -5/+7
  The mutex lock is inside util_range_add.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: si_cp_dma_prepare is a no-op for L2 prefetches | Marek Olšák | 2017-01-18 | 2 | -5/+12
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add SI_CPDMA_SKIP_BO_LIST_UPDATE | Marek Olšák | 2017-01-18 | 2 | -10/+15
  The next commit will use it in a clever way, because the CP DMA prefetch doesn't need this.
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use the correct target machine when building shader variants | Marek Olšák | 2017-01-18 | 2 | -14/+29
  If the shader selector is created with a different context than the shader variant, we should use the calling context's target machine for the shader variant.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99419
  Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: move shader pipe context state into a separate structure | Marek Olšák | 2017-01-18 | 2 | -14/+22
  Reviewed-by: Nicolai Hähnle <[email protected]>
* Revert "etnaviv: Fake occlusion query capability" | Christian Gmeiner | 2017-01-18 | 1 | -3/+2
  This reverts commit b7ac0f567123c96b5cd9e3485b963a5c0a0db66a.
  This is a half-baked solution that needs some rework to fix issues with reported counter bits (GL_QUERY_COUNTER_BITS_ARB). It also accidentally enables PIPE_CAP_QUERY_TIME_ELAPSED.
  Signed-off-by: Christian Gmeiner <[email protected]>
* android: ac/debug: move sid_tables.h generation and IB decode to amd/common | Mauro Rossi | 2017-01-18 | 1 | -12/+3
  This patch is the porting to android of the following commits:
    b838f64 "ac/debug: Move sid_tables.h generation to common code."
    0ef1b4d "ac/debug: Move IB decode to common code."
  Fixes android building errors due to sid_tables.h and ac_debug.c, ac_debug.h moved to amd/common.
  Tested by building nougat-x86.
  Acked-by: Nicolai Hähnle <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* android: gallium/auxiliary: fix building error in Android 7.0 | Mauro Rossi | 2017-01-18 | 1 | -1/+1
  Conditional libLLVMCore static library dependency is added, for the case when MESA_ENABLE_LLVM is true.
  Fixes the following building error with Android 7.0:
    In file included from external/mesa/src/gallium/auxiliary/gallivm/lp_bld_misc.cpp:62:
    ...
    external/llvm/include/llvm/IR/Attributes.h:68:14: fatal error: 'llvm/IR/Attributes.inc' file not found
    #include "llvm/IR/Attributes.inc"
             ^
    1 error generated.
  Reviewed-by: Emil Velikov <[email protected]>
* android: radeonsi: fix LLVMInitializeAMDGPU* functions declaration | Mauro Rossi | 2017-01-18 | 1 | -0/+2
  LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available.
  A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose; the same mechanism is also used by other llvm targets, e.g. FORCE_BUILD_ARM.
  A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files, i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes.
  This avoids the following building errors:
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:129:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    LLVMInitializeAMDGPUTargetInfo();
    ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:130:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    LLVMInitializeAMDGPUTarget();
    ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:131:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    LLVMInitializeAMDGPUTargetMC();
    ^
    external/mesa/src/gallium/drivers/radeonsi/si_shader_tgsi_setup.c:132:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration]
    LLVMInitializeAMDGPUAsmPrinter();
    ^
  Acked-by: Nicolai Hähnle <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* android: radeon: fix LLVMInitializeAMDGPU* functions declaration | Mauro Rossi | 2017-01-18 | 1 | -0/+1
  LLVMInitializeAMDGPU* functions need to be explicitly declared and mesa expects them via <llvm-c/Target.h> header, but LLVM needs to be instructed to invoke its own LLVM_TARGET(AMDGPU) macro, or the functions will not be available.
  A new llvm cflag (-DFORCE_BUILD_AMDGPU) serves this purpose; the same mechanism is also used by other llvm targets, e.g. FORCE_BUILD_ARM.
  A necessary prerequisite is to have AMDGPU target handled accordingly in llvm config files, i.e. {Target,AsmParser,AsmPrinter}.def for llvm device build includes.
  This avoids the following building errors:
    external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:121:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' [-Werror=implicit-function-declaration]
    LLVMInitializeAMDGPUTargetInfo();
    ^
    external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:122:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' [-Werror=implicit-function-declaration]
    LLVMInitializeAMDGPUTarget();
    ^
    external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:123:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' [-Werror=implicit-function-declaration]
    LLVMInitializeAMDGPUTargetMC();
    ^
    external/mesa/src/gallium/drivers/radeon/radeon_llvm_emit.c:124:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' [-Werror=implicit-function-declaration]
    LLVMInitializeAMDGPUAsmPrinter();
    ^
  Acked-by: Nicolai Hähnle <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* nouveau: remove always false argument in nouveau_fence_new() | Emil Velikov | 2017-01-18 | 5 | -11/+6
  No point in having the extra argument considering that it's effectively unused since the function was introduced.
  Cc: Ilia Mirkin <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: correctly manage libsensors link flags | Emil Velikov | 2017-01-18 | 2 | -2/+1
  We should be using LIBS rather than the LDFLAGS variable. Furthermore, try to keep the linking to the final stage, rather than an intermediate static library.
  Cc: Steven Toth <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* etnaviv: Fake occlusion query capability | Wladimir J. van der Laan | 2017-01-18 | 1 | -2/+3
  This enables the PIPE_CAP_OCCLUSION_QUERY capability without adding an occlusion query type. This is necessary to get Mesa to report desktop GL 2.0 support (to run exciting things such as ioq3's OpenGL 2 renderer), and should be valid because exposing the capability does not guarantee that any counters are actually implemented.
  Signed-off-by: Wladimir J. van der Laan <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: add flags parameter to texture barrier | Christian Gmeiner | 2017-01-18 | 1 | -1/+1
  Fixes compile warning introduced by commit a1c848.
  Signed-off-by: Christian Gmeiner <[email protected]>
* etnaviv: handle PIPE_CAP_TGSI_FS_FBFETCH | Christian Gmeiner | 2017-01-18 | 1 | -0/+1
  Fixes compile warning introduced by commit ee3ebe.
  Signed-off-by: Christian Gmeiner <[email protected]>
* gallivm: (trivial) fix copy/paste bug with big endian code | Roland Scheidegger | 2017-01-18 | 1 | -2/+4
  Commit 8bd67a35c50e68c21aed043de11e095c284d151a introduced the use of an undefined variable on big endian archs due to a copy/paste bug.
  (compile hack tested only)
* configure.ac: Revert recent HAVE_LLVM changes. | Jose Fonseca | 2017-01-18 | 5 | -12/+12
  This reverts changes 903eb09b5fb78d47d0f8a4bdf826a113ca2aff40..1a0aa468f354f0ee94dd383cd40ae915584624aa:
    Tobias Droste (5):
      configure.ac: Rename MESA_LLVM to FOUND_LLVM
      configure.ac: Only set LLVM_LIBS if LLVM is used
      configure.ac: Only define HAVE_LLVM if LLVM is used
      configure.ac: Set and use HAVE_GALLIUM_LLVM define
      configure.ac: Don't check LLVM version in gallium_require_llvm
  They break scons build, and I'm not convinced this is the right fix. In particular changing HAVE_LLVM in the C code is something I'd rather avoid no matter what. So it's better to discuss without the pressure of broken builds.
* configure.ac: Set and use HAVE_GALLIUM_LLVM define | Tobias Droste | 2017-01-18 | 5 | -12/+12
  Gallium code used HAVE_LLVM to check if it needs to compile code for LLVM in header and source files. With the new logic HAVE_LLVM is always set. Use extra define to figure out if LLVM is used.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99010
  Signed-off-by: Tobias Droste <[email protected]>
* gallivm: Cleanup USE_MCJIT. | Jose Fonseca | 2017-01-18 | 1 | -10/+25
  Split the dual nature of the USE_MCJIT macro into a separate compile-time define and a run-time variable.
  Reviewed-by: Emil Velikov <[email protected]>
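The split described above, one build-time constant plus one run-time variable, can be sketched in C like this. All names (`DEMO_HAVE_MCJIT`, `demo_use_mcjit`, `demo_init`) are hypothetical, and the sketch shows only the general pattern, not what the commit actually does.

```c
#include <stdbool.h>

/* Build-time question: is the feature compiled in at all?
 * Hypothetical name; normally set by the build system. */
#ifndef DEMO_HAVE_MCJIT
#define DEMO_HAVE_MCJIT 1
#endif

/* Run-time question: should this process actually use it? */
bool demo_use_mcjit = false;

void demo_init(bool requested)
{
#if DEMO_HAVE_MCJIT
    demo_use_mcjit = requested;
#else
    (void)requested;          /* feature not built, request ignored */
    demo_use_mcjit = false;
#endif
}
```

Keeping the two questions under separate names avoids one identifier meaning "a constant" in some builds and "a variable" in others.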
* radeonsi: for the tess barrier, only use emit_waitcnt on SI and LLVM 3.9+ | Marek Olšák | 2017-01-17 | 1 | -2/+5
  Cc: 17.0 13.0 <[email protected]>
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* st/vdpau: remove the delayed rendering hack (v1.1) | Nayan Deshmukh | 2017-01-17 | 6 | -141/+52
  The hack was introduced to avoid an extra copy, but now with dri3 we don't need it anymore.
  v1.1: rebasing
  Signed-off-by: Nayan Deshmukh <[email protected]>
  Acked-by: Christian König <[email protected]>
* st/vdpau: use dri3 to directly send the buffer to X (v2) | Nayan Deshmukh | 2017-01-17 | 2 | -27/+33
  This avoids an extra copy which occurs in case of dri2.
  v1.1: fallback to dri2 if dri3 fails to initialize
  v2: add PIPE_BIND_SCANOUT to output buffers as they will be sent to X server directly (Michel)
  Suggested-by: Christian König <[email protected]>
  Tested-by: Andy Furniss <[email protected]>
  Signed-off-by: Nayan Deshmukh <[email protected]>
* vl/dri3: use external texture as back buffers (v4) | Nayan Deshmukh | 2017-01-17 | 2 | -17/+114
  dri3 allows us to send the handle of a texture directly to X, so this patch allows a state tracker to directly send its texture to X to be used as a back buffer, and avoids extra copying.
  v2: use clip width/height to display a portion of the surface
  v3: remove redundant variables, fix wrapping, rename variables, handle vaapi path
  v3.1: we need clip_width/height for every frame, so we don't need to maintain it for each buffer; use a global variable instead
  v4: in case of a single gpu we can cache the buffers, as applications use a constant number of buffers, and we can avoid calls to the present extension for every frame
  Reviewed and Suggested-by: Leo Liu <[email protected]>
  Acked-by: Christian König <[email protected]>
  Tested-by: Andy Furniss <[email protected]>
  Signed-off-by: Nayan Deshmukh <[email protected]>
* nv50/ir: optimize shl + and | Ilia Mirkin | 2017-01-16 | 1 | -0/+11
  Address loading can often end up as shl + shr + shl combinations. The latter two are equal shifts, which get converted into an and mask. However if the previous shl is more than the mask is trying to remove (in terms of low bits), we can just remove the and entirely.
  This reduces some large shaders by as many as 3% of instructions (out of 2K).
  total instructions in shared programs : 6495509 -> 6491076 (-0.07%)
  total gprs used in shared programs    : 954621 -> 954623 (0.00%)

            local  gpr  inst  bytes
    helped      0    0  1014   1014
    hurt        0    2     0      0

  Signed-off-by: Ilia Mirkin <[email protected]>
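The reasoning behind this optimization can be checked with a small C sketch. It is a worked illustration of the arithmetic only, not the compiler pass itself, and `and_is_redundant_after_shl` is a hypothetical helper: after `x << shift` the low `shift` bits are already zero, so a mask that clears only those bits (or fewer of them) cannot change the value.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if (x << shift) & mask == (x << shift) for every 32-bit x,
 * i.e. the mask clears only bits that the shift already zeroed. */
bool and_is_redundant_after_shl(unsigned shift, uint32_t mask)
{
    if (shift >= 32)
        return false; /* out of range for a 32-bit shift; be conservative */

    /* Bits that a shifted value could still have set. */
    uint32_t possibly_set = UINT32_MAX << shift;

    /* Redundant when the mask keeps every one of those bits. */
    return (possibly_set & ~mask) == 0;
}
```

For example, a shift by 4 followed by a mask of 0xfffffffc (the result of shr 2 + shl 2) is redundant, while a shift by 1 followed by the same mask is not, because bit 1 can still be set.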
* nvc0: enable FBFETCH with a special slot for color buffer 0 | Ilia Mirkin | 2017-01-16 | 9 | -6/+172
  We don't need to support all the color buffers for advanced blend, just cb0. For Fermi, we use the special binding slots so that we don't overlap with user textures, while Kepler+ gets a dedicated position for the fb handle in the driver constbuf.
  This logic is only triggered when a FBFETCH is actually present so it should be a no-op most of the time.
  Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add flags parameter to texture barrier | Ilia Mirkin | 2017-01-16 | 13 | -15/+23
  This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
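How a flags parameter lets a barrier distinguish the two kinds of cache can be sketched in C as below. The flag names, `struct demo_ctx`, and the counters are hypothetical and chosen for illustration; they are not the interface's real definitions.

```c
/* Hypothetical flag bits; a caller may combine them with bitwise OR. */
#define DEMO_BARRIER_SAMPLER     (1u << 0)
#define DEMO_BARRIER_FRAMEBUFFER (1u << 1)

/* Hypothetical context that just counts which caches were flushed. */
struct demo_ctx {
    unsigned sampler_flushes;
    unsigned fb_flushes;
};

void demo_texture_barrier(struct demo_ctx *ctx, unsigned flags)
{
    if (flags & DEMO_BARRIER_SAMPLER)
        ctx->sampler_flushes++;   /* regular sampler caches */
    if (flags & DEMO_BARRIER_FRAMEBUFFER)
        ctx->fb_flushes++;        /* framebuffer-reading caches */
}
```

A driver that has only one kind of cache can ignore the distinction and flush whenever any flag is set, which is why adding the parameter is cheap for drivers that do not need it.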
* gallium: add PIPE_CAP_TGSI_FS_FBFETCH | Ilia Mirkin | 2017-01-16 | 17 | -2/+20
  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>