aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* st/va: check resource_get_info nullity in vlVaDeriveImageJulien Isorce2019-05-031-5/+8
| | | | | | | | This pipe_screen function is not implemented by all backends. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* anv,i965: Stop warning about incomplete gen11 supportJason Ekstrand2019-05-032-11/+2
| | | | | | | | Both drivers are feature-complete and should be running more-or-less at perf at this point. Drop the warning. Acked-by: Kenneth Graunke <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* nir/algebraic: Don't emit empty initializers for MSVCConnor Abbott2019-05-041-0/+4
| | | | | | | | | Just don't emit the transform array at all if there are no transforms v2: - Don't use len(array) > 0 (Dylan) - Keep using ARRAY_SIZE to make the generated C code easier to read (Jason).
* iris: Resolve textures used by the program, not merely bound texturesKenneth Graunke2019-05-031-2/+5
| | | | | | | | | st/mesa's PBO upload path binds a vertex shader that doesn't use any textures, but leaves the existing sampler views bound in place. This was tricking us into thinking the PBO destination might be bound for texturing in some cases. In Civilization VI, this fixes a false self- dependency issue that was preventing CCS_E compression on upload. Fixing this slightly improves frame times.
* meson: Don't build glsl cache_test when shader cache is disabledDylan Baker2019-05-031-12/+13
| | | | | | | v2: - Use new with_shader_cache variable instead of host_machine.system() == 'windows' Reviewed-by: Eric Anholt <[email protected]>
* tests/vma: fix build with MSVCDylan Baker2019-05-031-0/+8
| | | | Reviewed-by: Eric Anholt <[email protected]>
* glsl/tests: define ssize_t on windowsDylan Baker2019-05-031-0/+4
| | | | Reviewed-by: Eric Anholt <[email protected]>
* util/tests: Use define instead of VLADylan Baker2019-05-035-20/+25
| | | | | | | To allow the this test to be built with MSVC, which doesn't support VLAs. Reviewed-by: Eric Anholt <[email protected]>
* meson: make nm binary optionalDylan Baker2019-05-034-4/+4
| | | | | | | | | | | This makes nm not required, but used if found. In general I imagine that this means that on windows nm wont be found, and on other platforms it will. v2: - fix gbm and egl symbols check tests to only be run if nm is found - reword commit message to reflect the code change Reviewed-by: Eric Anholt <[email protected]>
* glsl: fix general_ir_test with mingwDylan Baker2019-05-031-7/+7
| | | | | | | | Somewhere down in the depths of the mingw headers 'interface' is defined, change it to iface like a similar patch did. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: always define libglapiDylan Baker2019-05-031-0/+2
| | | | | | | | This allows the identifier to be used even if shared-glapi isn't build, which simplifies a bunch of things. Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* meson: Fix missing glproto dependency for gallium-glxChuck Atkins2019-05-031-1/+1
| | | | | | Signed-off-by: Chuck Atkins <[email protected]> Cc: mesa-stable <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* radv: apply the indexing workaround for atomic buffer operations on GFX9Samuel Pitoiset2019-05-033-5/+14
| | | | | | | | | | | | | | | | | | | Because the new raw/struct intrinsics are buggy with LLVM 8 (they weren't marked as source of divergence), we fallback to the old instrinsics for atomic buffer operations only. This means we need to apply the indexing workaround for GFX9. The load/store operations still use the new LLVM 8 intrinsics. The fact that we need another workaround is painful but we should be able to clean up that a bit once LLVM 7 support will be dropped. This fixes a GPU hang with AC Odyssey and some rendering problems with Nioh. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110573 Fixes: 31164cf5f70 ("ac/nir: only use the new raw/struct image atomic intrinsics with LLVM 9+") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv: fix crash when application does not provide push constantsLionel Landwerlin2019-05-031-1/+1
| | | | | | | | | | | | Found while running Talos Principle. As far as I can tell running a draw call with a pipeline having push constants without the application having called vkCmdPushConstants gives undefined push constant values. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Cc: [email protected]
* radv: fix radv_get_aspect_format() for D+S formatsSamuel Pitoiset2019-05-031-0/+2
| | | | | | | | | | | | This restores the previous behaviour before YCBCR landed. For D+S formats, it returns the depth format. This fixes an assertion with Thrones of Britannia. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110540 Fixes: 66507cc6563 ("radv: Add single plane image views & meta operations") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/fs: Assert when brw_fs_nir sees a nir_deref_instrCaio Marcelo de Oliveira Filho2019-05-021-1/+1
| | | | | | | | | | | Since 09f1de97a76 "anv,i965: Lower away image derefs in the driver" the backend compiler is not expected to handle any derefs, so let's assert on it. This helps identifying problems when a deref is not lowered and "leaks" into the backend compiler. Reviewed-by: Jason Ekstrand <[email protected]>
* r600: implement resource_get_infoJulien Isorce2019-05-031-5/+29
| | | | | | | | Factoring code with resource_get_handle. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110443 Signed-off-by: Julien Isorce <[email protected]> Reviewed-by: Dave Airlie [email protected]
* util/bitset: fix bitset range mask calculations.Dave Airlie2019-05-031-2/+2
| | | | | | | | | | | | | | | | | | | | | | | The MASK macro is used in the RANGE macro, and it should return the pre-bitset word mask for the (b) value. i.e. BITSET_MASK(0) should be undefined since it's meaningless. BITSET_MASK(31) should give 0x7fffffff BITSET_MASK(32) should give 0xffffffff BITSET_MASK(33) should give 0x00000001 BITSET_MASK(64) should give 0xffffffff However then BITSET_RANGE ends up broken for cases where it's (b) value is the 0,32,64 value as in that case the lower mask would be 0 not 0xffffffff. This fixes the unit tests that I've added, and my code that uses bitsets. Reviewed-by: Jason Ekstrand <[email protected]> Fixes: bb38cadb1c5f2 "More GLSL code" Reviewed-by: Kristian H. Kristensen <[email protected]>
* util/tests: add basic unit tests for bitsetDave Airlie2019-05-032-0/+141
| | | | | | | The last test here currently fails as there is a bug in bitset.h Reviewed-by: Kristian H. Kristensen <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: fix lower vars to ssa for larger vector sizes.Dave Airlie2019-05-031-4/+4
| | | | | | | This has a couple of hardcoded vec4 limits in it, change them to the proper sizing to avoid future issues. Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: fix SpvOpBitSize return value.Dave Airlie2019-05-031-3/+1
| | | | | | The spir-v spec says this returns a bool. Reviewed-by: Jason Ekstrand <[email protected]>
* iris: Disable dual source blending when shader doesn't handle itKenneth Graunke2019-05-021-4/+15
| | | | | | | | | | | | | This is a port of Danylo's eca4a6548d07bbbb02a7768edb397bad7b72cfc2 which fixed the hang on i965. It fixes GPU hangs in his new Piglit test, arb_blend_func_extended-dual-src-blending-discard-without-src1. I avoided my own review feedback here, and decided to simply adjust 3DSTATE_PS_BLEND rather than BLEND_STATE_ENTRY[0]. It has never been clear to me which the hardware uses in every case. However, whacking the enable in 3DSTATE_PS_BLEND seems to be sufficient to fix the hang, and that packet is already dynamic, so it's easy to handle. I'd rather avoid making BLEND_STATE_ENTRY[0] dynamic unless I have to.
* anv: Stop including POS in FS input limitsJason Ekstrand2019-05-021-1/+1
| | | | | | | It is an input but it comes in as part of the shader payload and doesn't count towards the limits. Reviewed-by: Kenneth Graunke <[email protected]>
* nir: fix nir tex print harderRob Clark2019-05-021-6/+5
| | | | | | Fixes: 691d5a825a6 nir: rework tex instruction printing Reviewed-by: Eric Anholt <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* lima/ppir: support nir_op_ftruncErico Nunes2019-05-023-0/+14
| | | | | | | | Support nir_op_ftrunc by turning it into a mov with a round to integer output modifier. Signed-off-by: Erico Nunes <[email protected]> Reviewed-by: Qiang Yu <[email protected]>
* gbm: Improve documentation of BO importHeinrich2019-05-021-3/+5
| | | | | | | | | | | | | - Add GBM_BO_IMPORT_FD_MODIFIER to documentation of supported foreign object types - Add newline before documentation block - Improve language Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Daniel Stone <[email protected]>
* radv: only need to force emit the TCS regs on Vega10 and Raven1Samuel Pitoiset2019-05-021-2/+2
| | | | | | | | Other GFX9 chips aren't affected. Cc: "19.0" "19.1" <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* glsl: fix and clean up NV_compute_shader_derivatives supportMarek Olšák2019-05-021-54/+24
| | | | | | | - make sure compute shader derivatives are exposed for all extensions - unify duplicated code Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* st/dri: decrease input lag by syncing sooner in SwapBuffersMarek Olšák2019-05-022-11/+11
| | | | | | | | | | | | | | | | | | | | | | | | | | | | It's done by: - decrease the number of frames in flight by 1 - flush before throttling in SwapBuffers (instead of wait-then-flush, do flush-then-wait) The improvement is apparent with Unigine Heaven. Previously: draw frame 2 wait frame 0 flush frame 2 present frame 2 The input lag is 2 frames. Now: draw frame 2 flush frame 2 wait frame 1 present frame 2 The input lag is 1 frame. Flushing is done before waiting, because otherwise the device would be idle after waiting. Nine is affected because it also uses the pipe cap.
* meson: lift driver-collection out into parent build-fileErik Faye-Lund2019-05-027-23/+17
| | | | | | | | | | This way we can mark the dri_drivers and dri_link arrays as temporary, as all knowledge about them are contained in a single build-file with clearly visible limited life-span. Signed-off-by: Erik Faye-Lund <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Dylan Baker <[email protected]>
* freedreno/a6xx: smaller hammer for fb barrierRob Clark2019-05-023-0/+48
| | | | | | | We just need to do a sequence of commands to flush the cache. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/a6xx: KHR_blend_equation_advanced supportRob Clark2019-05-027-5/+96
| | | | | | | | | Wire up support to sample from the fb (and force GMEM rendering when we have fb reads). The existing GLSL IR lowering for blend_equation_advanced does the rest. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/ir3: fb read supportRob Clark2019-05-023-7/+33
| | | | | | | | Lower load_output to txf_ms_fb and add support for the new texture fetch instruction. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/drm: expose GMEM_BASE addressRob Clark2019-05-023-0/+9
| | | | | | | Needed for sampling from tile buffer (GMEM). Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: add pass to lower fb readsRob Clark2019-05-025-6/+141
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: fix lower_wpos_ytransform in load_frag_coord caseRob Clark2019-05-021-10/+11
| | | | | | | | | | | | | | | Apparently we never hit this path. Or at least haven't for a rather long time. But in either case (load_deref or load_frag_coord), we can just directly use the intrinsic's ssa dest. So stop passing the nir_variable (which would be NULL in the load_frag_coord case) around and instead just use &intr->dest.ssa. (This ofc means we need to setup the cursor to insert *after* the instruction, which seems to be another bug of the original implementation.) Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: rework tex instruction printingRob Clark2019-05-021-8/+10
| | | | | | | The extra comma at the end was annoying me. Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/ir3: add some ubo range related assertsRob Clark2019-05-023-4/+11
| | | | | | | And a comment.. since we are mixing units of bytes/dwords/vec4, hopefully this will avoid some unit confusion. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add IR3_SHADER_DEBUG flag to disable ubo loweringRob Clark2019-05-023-1/+9
| | | | | | | | | It isn't quite as simple as not running the pass, since with packed varyings we get load_ubo for block==0 (ie. the "real" uniforms). So instead run the pass normally but decline to lower anything in block > 0 Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix lowered ubo region alignmentRob Clark2019-05-021-1/+1
| | | | | | | | | | | Since we emit UBO regions INDIRECTly (ie. not copied into cmdstream but emit by EXT_SRC_ADDR) we need to keep them 4*vec4 aligned. Which the code already mostly did, except for aligning the first UBO region itself (ie. the one after block==0 which is the "real" uniforms). Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix shader variants vs UBO analysisRob Clark2019-05-021-1/+3
| | | | | | | | | | | Otherwise we zero out the state again, but all the UBO loads that we could lower are already lowered. End result is that we didn't emit the uniforms for lowered UBO access in any case where multiple shader variants are used. Fixes: 893425a607a freedreno/ir3: Push UBOs to constant file Fixes: 3c8779af325 freedreno/ir3: Enable PIPE_CAP_PACKED_UNIFORMS Signed-off-by: Rob Clark <[email protected]>
* vulkan/overlay: add TODO listLionel Landwerlin2019-05-021-0/+3
| | | | | | | Keen on having other people contribute. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* vulkan/overlay: make overriden functions staticLionel Landwerlin2019-05-021-25/+26
| | | | | | And fix the unused CmdDrawIndirect. Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: make overlay size configurableLionel Landwerlin2019-05-023-1/+18
| | | | Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: add a frame counter optionLionel Landwerlin2019-05-022-1/+5
| | | | | | | | This is useful to normalize the numbers written into the output file as those number are accumulated over a period of time and number of frames. Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: record all select metrics into output fileLionel Landwerlin2019-05-022-1/+48
| | | | | | | | | | | | | | | | The output looks something like this (csv style) : fps, frame, frame_timing(us), submit, draw_indexed, pipeline_graphics, acquire_timing(us), vert_invocations, frag_invocations, gpu_timing(ns) 480.55, 242, 501512, 247, 1444, 1204, 714, 5827272, 113043296, 121424174 467.80, 234, 500214, 234, 1412, 1176, 648, 5635680, 109436188, 117743760 424.37, 213, 501923, 213, 2130, 1704, 623, 5132448, 99657292, 105474683 472.15, 237, 501962, 237, 2370, 1896, 667, 5710752, 110924644, 122226004 411.32, 206, 500826, 206, 2060, 1648, 709, 4963776, 96491764, 95333273 458.87, 230, 501228, 230, 2300, 1840, 634, 5542080, 107758204, 123112090 475.01, 238, 501044, 238, 2380, 1904, 631, 5734848, 111477480, 122087426 471.08, 236, 500972, 236, 2360, 1888, 655, 5686656, 110498496, 114816162 Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: add a margin to the size of the windowLionel Landwerlin2019-05-021-5/+6
| | | | | | Looks a bit better. Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: add no display optionLionel Landwerlin2019-05-023-25/+50
| | | | | | | In case you're just interested in data being record to the output file. Signed-off-by: Lionel Landwerlin <[email protected]>
* vulkan/overlay: add pipeline statistic & timestamps supportLionel Landwerlin2019-05-022-55/+383
| | | | | | | | | | v2: switch to VkBase{In,Out}Structure v3: Add timestamps at begin/end of primary command buffers to estimate gpu time spent per submission (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Eric Engestrom <[email protected]> (v2)
* vulkan/overlay: record stats in command buffers and accumulate on exec/submitLionel Landwerlin2019-05-022-141/+238
| | | | | | | | | | | | This significantly reworks how numbers displayed are computed. We accumulate operations written into command buffers and add those to the device when submitted to a queue. These collected values are then used to compute per frame overlay data. We also accumulate the data over the sampling fps period to produce numbers for that period of time. Signed-off-by: Lionel Landwerlin <[email protected]>