path: root/src/amd/common
Commit message | Author | Age | Files | Lines
* ac/gpu: add code to detect if kernel supports sync objects.Dave Airlie2017-07-212-0/+10
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: rewrite shared variable handling (v2)Connor Abbott2017-07-171-87/+158
| | | | | | | | | | | | | | | | Translate the NIR variables directly to LLVM instead of lowering to a TGSI-style giant array of vec4's and then back to a variable. This should fix indirect dereferences, make shared variables more tightly packed, and make LLVM's alias analysis more precise. This should fix an upcoming Feral title, which has a compute shader that was failing to compile because the extra padding made us run out of LDS space. v2: Combine the previous two patches into one, only use this for shared variables for now until LLVM becomes smarter. Reviewed-by: Bas Nieuwenhuizen <bas@basnieuwenhuizen> Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Alex Smith <[email protected]>
* ac/gpu_info: if clock crystal frequency is 0, print an error and set 1Marek Olšák2017-07-171-0/+4
| | | | | | | | During bring-up, this is often 0. Prevent automatic disablement of ARB_timer_query and demotion of the OpenGL version to 3.2 by setting a non-zero frequency. Print an error message instead. Reviewed-by: Nicolai Hähnle <[email protected]>
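The fallback described in this entry can be sketched as follows. This is a minimal illustration, not the code from the commit: the struct `gpu_info_sketch`, the function `fixup_clock_crystal_freq`, and the wording of the error message are all hypothetical names chosen for the example.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the relevant part of the GPU info struct. */
struct gpu_info_sketch {
    uint32_t clock_crystal_freq; /* 0 means the kernel reported nothing useful */
};

/* Returns 1 if the frequency had to be patched to a non-zero value, 0 otherwise. */
static int fixup_clock_crystal_freq(struct gpu_info_sketch *info)
{
    if (info->clock_crystal_freq != 0)
        return 0;

    /* Report the problem, then pick a non-zero value so that features
     * depending on a valid frequency are not switched off automatically. */
    fprintf(stderr, "clock crystal frequency is 0, timestamps will be wrong\n");
    info->clock_crystal_freq = 1;
    return 1;
}
```

The point of the design is that a loud error plus a dummy value keeps the driver usable during bring-up, whereas a silent 0 would quietly disable timer queries.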
* ac/surface/gfx9: flags.texture currently refers to TC-compatible HTILEMarek Olšák2017-07-171-1/+3
| | | | | | This should lead to better MSAA performance on GFX9. Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: merge si_llvm_get_amdgpu_target into ac_get_llvm_targetMarek Olšák2017-07-172-8/+11
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/addrlib: fix typo in api name.Dave Airlie2017-07-171-1/+1
| | | | | | | | This fixes the misspelling of ALIGNMENTS in addrlib. Reviewed-by: Eduardo Lima Mitev <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: set cb base tile swizzles for MRT speedups (v4)Dave Airlie2017-07-172-0/+16
  This patch uses addrlib to work out the tile swizzles according to the surface index. It seems to produce the same values as amdgpu-pro for the deferred test. v2: don't apply swizzle to CMASK. The eg docs don't mention it, and we clearly don't align cmask for that. v3: disable surf index for dedicated images, as these will most likely be shared, and I don't think the metadata has space for this info in it yet. v4: update for shareable images, rename combined_swizzle to tile_swizzle. This gets the deferred demo from 730->950fps on my rx480. (dcc cmask elim predication patches get it further) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: drop setting xnackDave Airlie2017-07-091-2/+1
| | | | | | | | | | | Since radv uses compute rings and we can't know when we are setting up the shaders what ring they are to be used on, we should just use the default xnack setting. This may be suboptimal in some places, but if we hit a problem, we likely should try and address this between llvm and mesa. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: add support for using addrlib max alignment.Dave Airlie2017-07-093-2/+12
| | | | | | | | Rather than using 64k, use what addrlib returns as the base alignment for vulkan allocations. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Fix ordering of parameters for image atomic cmpswap intrinsicsAlex Smith2017-07-071-1/+1
| | | | | | | | | | The NIR parameters are ordered "compare, data", matching GLSL, but both the image and buffer LLVM intrinsics take them the other way around. This is already handled correctly for SSBO atomics. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Fixes: f4e499ec791 "radv: add initial non-conformant radv vulkan driver"
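The operand-order mismatch described in this entry can be modelled in plain C. This is only a sketch: `backend_cmpswap` and `frontend_cmpswap` are hypothetical functions that imitate the two conventions, and neither is an LLVM or NIR API.

```c
#include <stdint.h>

/* Hypothetical backend-style primitive: new value first, then the comparand. */
static uint32_t backend_cmpswap(uint32_t *mem, uint32_t data, uint32_t compare)
{
    uint32_t old = *mem;
    if (old == compare)
        *mem = data;
    return old; /* compare-and-swap returns the previous contents */
}

/* Frontend-style entry point using the GLSL order "compare, data".
 * The operands must be swapped when forwarding to the backend. */
static uint32_t frontend_cmpswap(uint32_t *mem, uint32_t compare, uint32_t data)
{
    return backend_cmpswap(mem, data, compare);
}
```

Forwarding the operands unswapped would still compile, because both have the same type, which is why this kind of bug is easy to miss.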
* radv: enable sisched toggle in perftest flags.Dave Airlie2017-07-062-2/+4
| | | | | | | | | RADV_PERFTEST=sisched to enable it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: set xnack like radeonsi does.Dave Airlie2017-07-061-1/+3
| | | | | | | Use family, but only set xnack+ for gfx9. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: create features list using snprintf.Dave Airlie2017-07-061-2/+5
| | | | | | | Just more moving code around before adding things to it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
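Taken together, the two entries above (a feature list built with snprintf, and xnack+ only for gfx9) suggest something like the sketch below. The enum, the function name, and the `+vgpr-spilling` base string are assumptions made for illustration; the actual feature string in the commits may differ.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical, heavily reduced family list; only the ordering matters here. */
enum chip_family_sketch {
    FAMILY_SKETCH_GFX8,
    FAMILY_SKETCH_GFX9
};

/* Builds a comma-separated target feature string.
 * Returns 1 if the string fit into buf, 0 if it was truncated. */
static int build_features(char *buf, size_t size, enum chip_family_sketch family)
{
    int n = snprintf(buf, size, "+vgpr-spilling%s",
                     family >= FAMILY_SKETCH_GFX9 ? ",+xnack" : "");
    return n >= 0 && (size_t)n < size;
}
```

Building the string with a format call rather than a fixed literal makes it straightforward to append further conditional features later.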
* ac/radv: change api to create target machineDave Airlie2017-07-062-3/+6
| | | | | | | | This just modifies the API to make it easier to add other flags to target machine creation. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Move VS position exports before param exports.Bas Nieuwenhuizen2017-07-051-55/+54
| | | | | | | | According to Nicolai the SX can already start work when all the position exports are done, so do those first. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: fix 64-bit shiftsConnor Abbott2017-07-031-3/+12
| | | | | | | | | NIR always makes the shift amount 32 bits, but LLVM asserts if the two sources aren't the same type. Zero-extend the shift amount to make LLVM happy. Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
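A plain C model of the fix: the 32-bit shift amount is widened with a zero-extension so both operands of the shift have the same 64-bit type. The function name `shl64` is hypothetical, and the mask with 63 is an assumption added only to keep the C expression well defined; it is not taken from the commit.

```c
#include <stdint.h>

/* Shifts a 64-bit value left by a 32-bit amount. */
static uint64_t shl64(uint64_t value, uint32_t amount)
{
    /* Zero-extend the amount to the width of the value being shifted. */
    uint64_t wide_amount = (uint64_t)amount;

    /* Shifting by 64 or more is undefined in C, so keep the low 6 bits. */
    return value << (wide_amount & 63u);
}
```

Zero-extension rather than sign-extension is the natural choice because a shift amount is never meaningfully negative.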
* ac/nir: implement 64-bit packing and unpackingConnor Abbott2017-07-031-0/+31
  We implement the split opcodes, and tell NIR to lower the original ones. The lowering to LLVM is a little more complicated, but NIR can optimize the split ones a little better, and some NIR lowering passes that we might want to use (particularly for doubles) emit the split ones. This should fix pack/unpackDouble2x32, which appears to have been broken since we enabled the Float64 capability. It will also fix pack/unpackInt2x32 when we enable the Int64 capability. Fixes: 798ae37c ("radv: Enable Float64 support.") Signed-off-by: Connor Abbott <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
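The semantics of the split pack/unpack operations can be written out in plain C. The function names below mirror the NIR opcode names, but the functions themselves are illustrative and not part of Mesa.

```c
#include <stdint.h>

/* Combine two 32-bit halves into one 64-bit value (lo in bits 0..31). */
static uint64_t pack_64_2x32_split(uint32_t lo, uint32_t hi)
{
    return ((uint64_t)hi << 32) | (uint64_t)lo;
}

/* Extract the low 32 bits. */
static uint32_t unpack_64_2x32_split_x(uint64_t v)
{
    return (uint32_t)v;
}

/* Extract the high 32 bits. */
static uint32_t unpack_64_2x32_split_y(uint64_t v)
{
    return (uint32_t)(v >> 32);
}
```

Because each split operation works on scalars instead of a two-component vector, a pack followed by an unpack of the same half is trivially foldable, which is plausibly what the message means by NIR optimizing the split forms better.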
* radv: Use v4i32 variant of llvm.SI.load.const.Bas Nieuwenhuizen2017-06-301-3/+1
| | | | | | | | | | We apparently still used v16i8 .... As radeonsi doesn't use it with LLVM version checks I don't think we need them either. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* ac/nir: remove last remnants of v16i8Dave Airlie2017-06-283-9/+3
| | | | | | | llvm doesn't need this workaround anymore. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: Use correct LLVM intrinsics for atomic ops on imageBuffersAlex Smith2017-06-281-29/+34
| | | | | | | | The buffer intrinsics should be used instead of the image ones. Signed-off-by: Alex Smith <[email protected]> Cc: <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: assert printfs will fitJames Legg2017-06-281-5/+12
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: Make intrinsic_name buffer long enoughJames Legg2017-06-281-1/+1
  When using cmpswap on an image, it was being truncated to llvm.amdgcn.image.atomic.cmpswa, with the coords type missing entirely. v2: Add stable CC CC: <[email protected]> Reviewed-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
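The truncation is easy to reproduce with snprintf, which always leaves room for the terminating NUL. In the sketch below, the 32-byte buffer size, the `v4i32` coordinate type, and the helper name `format_intrinsic_name` are assumptions for illustration; a 32-byte buffer is consistent with the 31-character truncated name quoted in the message.

```c
#include <stddef.h>
#include <stdio.h>

/* Formats an image atomic intrinsic name into buf.
 * Returns 1 if the whole name fit, 0 if snprintf had to truncate it. */
static int format_intrinsic_name(char *buf, size_t size,
                                 const char *op, const char *coords_type)
{
    int n = snprintf(buf, size, "llvm.amdgcn.image.atomic.%s.%s",
                     op, coords_type);

    /* snprintf returns the length the full string would have had,
     * so a value >= size means the output was cut short. */
    return n >= 0 && (size_t)n < size;
}
```

With a 32-byte buffer and the operation `cmpswap`, only 31 characters are stored, so the result ends in `cmpswa` and the coordinate type is lost. Checking the return value, as the "assert printfs will fit" entry does, turns this silent truncation into a detectable error.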
* ac/nir: convert emit helpers to ac_llvm_contextNicolai Hähnle2017-06-271-117/+118
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: remove unused nir_to_llvm_context::has_ddxyNicolai Hähnle2017-06-271-2/+0
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: implement nir_op_f2bNicolai Hähnle2017-06-271-0/+12
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: implement nir_op_{b2i,i2b}Nicolai Hähnle2017-06-271-0/+20
| | | | | | | Booleans in NIR are ~0 for true, b2i returns 0/1. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
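The two conversions follow directly from the representation stated in this entry (true is ~0, and b2i yields 0 or 1). A plain C model, with 32-bit values assumed and function names borrowed from the opcode names for readability:

```c
#include <stdint.h>

/* Boolean (0 or ~0) to integer (0 or 1): keep only the lowest bit. */
static uint32_t b2i(uint32_t b)
{
    return b & 1u;
}

/* Integer to boolean: any non-zero value becomes all ones. */
static uint32_t i2b(uint32_t i)
{
    return i != 0 ? ~0u : 0u;
}
```

An all-ones true value is convenient because it can be used directly as a bit mask in select-style operations.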
* ac/nir: convert type helpers to ac_llvm_contextNicolai Hähnle2017-06-271-95/+95
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/llvm: fix type of second llvm.cttz.* parameterNicolai Hähnle2017-06-271-1/+1
| | | | | | | | LLVM has required an i1 here for a long time. llvm.ctlz.* was fixed in commit edd23e06067 ("ac/llvm: fix various findMSB bugs"). Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/shader_info: fix a commentNicolai Hähnle2017-06-271-2/+6
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac: add ac_llvm_context::v8i32Nicolai Hähnle2017-06-272-0/+2
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac: add ac_llvm_context::{i,f}32_{0,1}Nicolai Hähnle2017-06-272-0/+10
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac: add ac_llvm_context::{i16, i64, f16, f64}Nicolai Hähnle2017-06-272-0/+8
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: handle primitive id input into fragment shader with no geom shaderDave Airlie2017-06-262-3/+26
| | | | | | | | | | Fixes: dEQP-VK.pipeline.framebuffer_attachment.no_attachments dEQP-VK.pipeline.framebuffer_attachment.no_attachments_ms Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: set prim_id for geometry shadersDave Airlie2017-06-262-2/+4
| | | | | | | | Noticed in passing. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: set use_prim_id for tess shaders correctly.Dave Airlie2017-06-261-3/+5
| | | | | | | | Just noticed in passing. Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radeonsi/gfx9: fix TC-compatible stencil compressionMarek Olšák2017-06-191-2/+2
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* ac/sid.h: don't use parentheses in PKT3_RELEASE_MEM definitionMarek Olšák2017-06-191-1/+1
  The parser skips the line if it contains parentheses. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* ac: parse EVENT_WRITE_EOP, RELEASE_MEM, WAIT_REG_MEM, NOWHEREMarek Olšák2017-06-192-0/+47
| | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* amd/common: fix off-by-one in sid_tables.pyNicolai Hähnle2017-06-191-1/+1
| | | | | | | The very last entry in the sid_strings_offsets table ended up missing, leading to out-of-bounds reads and potential crashes. Reviewed-by: Marek Olšák <[email protected]>
* ac: resolve conflicts introduced with "ac: remove amdgpu.h dependency"Emil Velikov2017-06-171-1/+3
| | | | | | | | | | | | | | | | | | | The commit did not add the relevant includes - in particular stdint.h and stdbool.h for the respective standard types. At the same time, the amdgpu_device_handle typedef redeclaration was off. Fixes: 81945ded0dc ("ac: remove amdgpu.h dependency") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101471 Cc: Mark Janes <[email protected]> Cc: Gregor Münch <[email protected]> Reported-by: Bas Nieuwenhuizen <[email protected]> Reported-by: Mark Janes <[email protected]> Reported-by: Gregor Münch <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Tested-by: Bas Nieuwenhuizen <[email protected]>
* ac: remove amdgpu.h dependencyEmil Velikov2017-06-162-2/+6
| | | | | | | | | | | | | | | | Add a couple of forward declarations and drop the amdgpu.h requirement. With this we can build the r300 and r600 drivers without the need for amdgpu. v2: - Add amdgpu.h include in the C file (Marek) - Add a comment about pre C11 typedef redeclaration warning (Eric) Cc: Nicolai Hähnle <[email protected]> Cc: Marek Olšák <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101189 Signed-off-by: Emil Velikov <[email protected]>
* ac/gpu: drop duplicated code line.Dave Airlie2017-06-131-1/+0
| | | | | | | | | has_hw_decode is assigned twice. Pointed out by coverity. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: remove another unused variableGrazvydas Ignotas2017-06-081-1/+0
| | | | | | | Declared by each loop already. Trivial. Signed-off-by: Grazvydas Ignotas <[email protected]>
* ac/nir: convert several ifs to a switchGrazvydas Ignotas2017-06-081-9/+11
| | | | | | | | Also solve "outinfo may be used uninitialized" warning by putting in an unreachable(). Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: mark some arguments constGrazvydas Ignotas2017-06-081-30/+31
| | | | | | | | | Most functions are only inspecting nir, so nir related arguments can be marked const. Some more can be done if/when some nir changes are accepted. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/nir: move gpr counting inside argument handling.Dave Airlie2017-06-071-10/+12
  This just moves this code in here so it's cleaner. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: assign argument param pointers in one place.Dave Airlie2017-06-071-187/+152
| | | | | | | | | Instead of having the fragile code to do a second pass, just give the pointers you want params in to the initial code, then call a later pass to assign them. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* ac/nir: consolidate setting userdata locationDave Airlie2017-06-071-28/+17
| | | | | | | | Just pass a pointer and increment inside the function, makes the code less error prone. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
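The pattern described here, passing a pointer to the running index and advancing it inside the helper, might look like the following. The struct layout and the names `userdata_loc_sketch` and `set_userdata_location` are hypothetical and only illustrate the idea.

```c
#include <stdint.h>

/* Hypothetical record of where one piece of user data lives. */
struct userdata_loc_sketch {
    uint8_t sgpr_idx;  /* first register used */
    uint8_t num_sgprs; /* number of consecutive registers */
};

/* Records the location and advances the caller's running index,
 * so callers cannot forget to bump it after each assignment. */
static void set_userdata_location(struct userdata_loc_sketch *loc,
                                  uint8_t *sgpr_idx, uint8_t num_sgprs)
{
    loc->sgpr_idx = *sgpr_idx;
    loc->num_sgprs = num_sgprs;
    *sgpr_idx += num_sgprs;
}
```

Keeping the increment next to the assignment removes the class of bugs where one call site updates the index by the wrong amount.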
* tree-wide: remove trailing backslashEric Engestrom2017-06-071-1/+1
| | | | | | | | | Simple search for a backslash followed by two newlines. If one of the newlines were to be removed, this would cause issues, so let's just remove these trailing backslashes. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* ac/surface: Fix HTILE for radv.Bas Nieuwenhuizen2017-06-061-2/+1
  We always compute HTILE size using addrlib, even when not TC compatible. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>