aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* Android: r600: fix build when LLVM is disabledRob Herring2017-05-191-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | There's still an error after my recent clean-up if LLVM is not patched to enable AMDGPU target: external/mesa3d/src/amd/common/ac_llvm_util.c:38:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetInfo' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetInfo(); ^ external/mesa3d/src/amd/common/ac_llvm_util.c:39:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTarget' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTarget(); ^ external/mesa3d/src/amd/common/ac_llvm_util.c:40:2: error: implicit declaration of function 'LLVMInitializeAMDGPUTargetMC' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUTargetMC(); ^ external/mesa3d/src/amd/common/ac_llvm_util.c:41:2: error: implicit declaration of function 'LLVMInitializeAMDGPUAsmPrinter' is invalid in C99 [-Werror,-Wimplicit-function-declaration] LLVMInitializeAMDGPUAsmPrinter(); ^ We need to drop libmesa_amd_common when LLVM is disabled, however there's still a dependency on include paths for ac_binary.h. So explicitly add the include path when LLVM is disabled. Signed-off-by: Rob Herring <[email protected]> Acked-by: Nicolai Hähnle <[email protected]>
* virgl: fix virgl_bo_transfer_{put, get} box struct copyRob Herring2017-05-191-2/+12
| | | | | | | | | | | | | Commit 3dfe61ed6ec6 ("gallium: decrease the size of pipe_box - 24 -> 16 bytes") changed the size of pipe_box, but the virgl code was relying on pipe_box and drm_virtgpu_3d_box structs having the same size/layout doing a struct copy. Copy the fields one by one instead. Cc: Marek Olšák <[email protected]> Cc: Dave Airlie <[email protected]> Fixes: 3dfe61ed6ec ("gallium: decrease the size of pipe_box - 24 -> 16 bytes") Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi/gfx9: use CE RAM optimallyMarek Olšák2017-05-182-36/+134
| | | | | | | | | | | | | | On GFX9 with only 4K CE RAM, define the range of slots that will be allocated in CE RAM. All other slots will be uploaded directly. This will switch dynamically according to which slots are used by current shaders. GFX9 CE usage should now be similar to VI instead of being often disabled. Tested on VI by taking the GFX9 CE allocation codepath and setting num_ce_slots = 2 everywhere to get frequent switches between both modes. CE is still disabled on GFX9. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove CE offset alignment restrictionMarek Olšák2017-05-181-2/+1
| | | | | | | This was only needed by LOAD_CONST_RAM, which is now only used to load whole CE. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: only upload (dump to L2) those descriptors that are used by shadersMarek Olšák2017-05-184-24/+117
| | | | | | | This decreases the size of CE RAM dumps to L2, or the size of descriptor uploads without CE. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: record which descriptor slots are used by shadersMarek Olšák2017-05-185-0/+41
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: update si_ce_needed_cs_spaceMarek Olšák2017-05-181-8/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: do only 1 big CE dump at end of IBs and one reload in the preambleMarek Olšák2017-05-185-37/+37
| | | | | | | | A later commit will only upload descriptors used by shaders, so we won't do full dumps anymore, so the only way to have a complete mirror of CE RAM in memory is to do a separate dump after the last draw call. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: remove early return in si_upload_descriptorsMarek Olšák2017-05-181-3/+0
| | | | | | | | All updates of descriptors_dirty also set dirty_mask, so the return is unnecessary. The next commit will want this function to be executed even if dirty_mask == 0. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clamp indirect index to the number of declared shader resourcesMarek Olšák2017-05-184-4/+15
| | | | | | | We'll do partial uploads of descriptor arrays, so we need to clamp against what shaders declare. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: merge sampler and image descriptor lists into oneMarek Olšák2017-05-186-112/+99
| | | | | | | | | | | | Sampler slots: slot[8], .. slot[39] (ascending) Image slots: slot[7], .. slot[0] (descending) Each image occupies 1/2 of each slot, so there are 16 images in total, therefore the layout is: slot[15], .. slot[0]. (in 1/2 slot increments) Updating image slot 2n+i (i <= 1) also dirties and re-uploads slot 2n+!i. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: merge constant and shader buffers descriptor lists into oneMarek Olšák2017-05-188-132/+152
| | | | | | | | | | Constant buffers: slot[16], .. slot[31] (ascending) Shader buffers: slot[15], .. slot[0] (descending) The idea is that if we have 4 constant buffers and 2 shader buffers, we only have to upload 6 slots. That optimization is left for a later commit. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_threaded: add a fast path for unbinding shader buffersMarek Olšák2017-05-181-3/+9
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/u_threaded: add a fast path for unbinding shader imagesMarek Olšák2017-05-181-4/+10
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: get the sampler view type from inst->Texture for TG4Samuel Pitoiset2017-05-181-7/+3
| | | | | | | | | This will also magically fix this special lowering for bindless samplers. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi: store the sampler view type directly in the instructionSamuel Pitoiset2017-05-184-7/+21
| | | | | | | | | | | RadeonSI needs to do a special lowering for Gather4 with integer formats, but with bindless samplers we just can't access the index. Instead, store the return type in the instruction like the target. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi: remove some unused OPCODE macrosSamuel Pitoiset2017-05-182-200/+0
| | | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallivm: Make sure module has the correct data layout when pass manager runsTom Stellard2017-05-181-16/+18
| | | | | | | | | | | | | | | | | | | The datalayout for modules was purposely not being set in order to work around the fact that the ExecutionEngine requires that the module's datalayout matches the datalayout of the TargetMachine that the ExecutionEngine is using. When the pass manager runs on a module with no datalayout, it uses the default datalayout which is little-endian. This causes problems on big-endian targets, because some optimizations that are legal on little-endian or illegal on big-endian. To resolve this, we set the datalayout prior to running the pass manager, and then clear it before creating the ExectionEngine. This patch fixes a lot of piglit tests on big-endian ppc64. Cc: [email protected]
* ac: add radeon_info::num_{sdma,compute}_ringsNicolai Hähnle2017-05-182-4/+4
| | | | | | Vulkan needs them. Reviewed-by: Marek Olšák <[email protected]>
* ac_surface: use radeon_info from ac_gpu_infoNicolai Hähnle2017-05-182-6/+2
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move radeon_info initialization to amd/commonNicolai Hähnle2017-05-182-238/+4
| | | | | | v2: update Android.common.mk (Emil) Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move struct radeon_info to ac_gpu_info.hNicolai Hähnle2017-05-181-61/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move some aspects of sanity checking to ac_surfaceNicolai Hähnle2017-05-181-16/+0
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: add ac_compute_surface to automatically switch gfx6 vs. gfx9Nicolai Hähnle2017-05-181-4/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move the bulk of gfx9_surface_init to ac_surfaceNicolai Hähnle2017-05-181-415/+10
| | | | | | We can now merge the two *_surface_init functions. Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move the bulk of gfx6_surface_init to ac_surfaceNicolai Hähnle2017-05-181-411/+16
| | | | Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: move amdgpu_addr_create to ac_surfaceNicolai Hähnle2017-05-183-165/+1
| | | | | | | | v2: - update Android.common.mk (Emil) - rebase on top of Raven support Reviewed-by: Marek Olšák <[email protected]> (v1)
* ac/radeonsi: move surface definitions to new header ac_surface.hNicolai Hähnle2017-05-181-147/+1
| | | | Reviewed-by: Marek Olšák <[email protected]>
* vc4: Don't allocate new BOs to avoid synchronization when they're shared.Eric Anholt2017-05-171-1/+2
| | | | | | | If X11 did a software fallback to the entire screen, we would throw out the BO the screen is scanning out from and allocate a new one. Cc: [email protected]
* vc4: Drop pointless indirections around BO import/export.Eric Anholt2017-05-173-69/+49
| | | | | | I've since found them to be more confusing by adding indirections than clarifying by screening off resources from the handle/fd import/export process.
* vc4: Drop the u_resource_vtbl no-op layer.Eric Anholt2017-05-174-33/+27
| | | | | We only ever attached one vtbl, so it was a waste of space and indirections.
* gallium/radeon: use a top-of-pipe timestamp for the start of TIME_ELAPSEDMarek Olšák2017-05-171-2/+19
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTIONMarek Olšák2017-05-1717-0/+17
| | | | | | for skipping mapped-buffer checking in every GL draw call Reviewed-by: Nicolai Hähnle <[email protected]>
* swr: don't use AttributeSet with llvm >= 5Tim Rowley2017-05-171-15/+21
| | | | | | | | | | | This change fixes the build break with llvm-svn. r301981 of llvm-svn made add/remove of function attributes use AttrBuilder instead of AttributeList. Tested with llvm-3.9, llvm-4.0, llvm-svn. Reviewed-by: Bruce Cherniak <[email protected]>
* Android: correct libz dependencyChih-Wei Huang2017-05-171-1/+2
| | | | | | | | | | | | | | | | | | | Commit 6facb0c0 ("android: fix libz dynamic library dependencies") unconditionally adds libz as a dependency to all shared libraries. That is unnecessary. Commit 85a9b1b5 introduced libz as a dependency to libmesa_util. So only the shared libraries that use libmesa_util need libz. Fix Android Lollipop build by adding the include path of zlib to libmesa_util explicitly instead of getting the path implicitly from zlib since it doesn't export the include path in Lollipop. Fixes: 6facb0c0 "android: fix libz dynamic library dependencies" Signed-off-by: Chih-Wei Huang <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Rob Herring <[email protected]>
* freedreno/gmem: fix hw binning hangs with large render targetsRob Clark2017-05-163-3/+13
| | | | | | | | On all 3 gens, we have 4 bits for width and height in the VSC pipe config. And overflow results in setting width and/or height to zero which causes hangs. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix crash with atomicsRob Clark2017-05-161-2/+9
| | | | | | Atomics can have a result value. And sometimes it is even used. Signed-off-by: Rob Clark <[email protected]>
* ttn: fix dest size for some texture instructionsRob Clark2017-05-161-1/+3
| | | | | | | Some, like lod, don't return 4 components. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ttn: fix txd src sizesRob Clark2017-05-161-4/+6
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* ttn: fix txs dest sizeRob Clark2017-05-161-1/+2
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* freedreno/a5xx: remove unneeded assertRob Clark2017-05-161-3/+0
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fallback to slow-clear for z32Rob Clark2017-05-164-11/+36
| | | | | | | We probably *could* do this with blit path, but I think it would involve clobbering settings from batch->gmem (see emit_zs()). Signed-off-by: Rob Clark <[email protected]>
* etnaviv: increment the resource seqno in resource_changedPhilipp Zabel2017-05-161-5/+1
| | | | | | | | | | Just increment the resource seqno instead of setting the texture seqno to be lower by one than the resource seqno. Signed-off-by: Philipp Zabel <[email protected]> Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: clean up sampler view reference countingLucas Stach2017-05-161-3/+3
| | | | | | | | Use the proper pipe_resource_reference function instead of rolling our own. Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: apply feature overrides in one central locationLucas Stach2017-05-165-10/+19
| | | | | | | | | This way we can just test the feature bits and don't need to spread the debug overrides to all locations touching a feature. Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: allow R/B swapped surfaces to be clearedLucas Stach2017-05-161-0/+2
| | | | | | | Fixes: 7f62ffb68ad ("etnaviv: add support for rb swap") Cc: [email protected] Signed-off-by: Lucas Stach <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: stop oversizing buffer resourcesLucas Stach2017-05-161-1/+1
| | | | | | | | | | | | | | PIPE_BUFFER is a target enum, not a binding. This caused the driver to up-align the height of buffer resources, leading to largely oversizing those resources. This is especially bad, as the buffer resources used by the upload manager are already 1MB in size. Height alignment meant that those would result in 4 to 8MB big BOs. Fixes: c9e8b49b885 ("etnaviv: gallium driver for Vivante GPUs") Cc: [email protected] Signed-off-by: Lucas Stach <[email protected]> Reviewed-By: Wladimir J. van der Laan <[email protected]> Reviewed-by: Christian Gmeiner <[email protected]>
* radeonsi: extract TGSI memory/texture opcode handling into its own fileNicolai Hähnle2017-05-165-1841/+1886
| | | | | | It's about time to get the growth of si_shader.c somewhat under control. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: make const_array externally accessibleNicolai Hähnle2017-05-162-13/+15
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: make get_bounded_indirect_index externally accessibleNicolai Hähnle2017-05-162-16/+20
| | | | Reviewed-by: Marek Olšák <[email protected]>