aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: stop using lp_build_gather_valuesMarek Olšák2018-06-253-28/+25
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: clean up some #includesMarek Olšák2018-06-257-27/+4
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: clean up passing the is_monolithic flag for compilationMarek Olšák2018-06-254-23/+18
| | | | Reviewed-by: Timothy Arceri <[email protected]>
* gallium/util: Fix build error due to cast to different sizeRobert Foss2018-06-251-2/+2
| | | | | | Signed-off-by: Robert Foss <[email protected]> Reviewed-by: Tomasz Figa <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* r600/sb: give the scheduler more margin to find valid instructions groupsGert Wollny2018-06-251-3/+10
| | | | | | | | | | | | | | | For instruction sequences that change the address register with every load the current limit to bail out of the scheduler and reject the optimisation was too tight, i.e. it was expected that at least one pending instruction would be scheduled each time. Give the scheduler more margin to sort out these load sequences by allowing a number of rounds where no instruction is scheduled. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=106163 Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* r600/sb: fix rotated register in while loopGert Wollny2018-06-251-4/+8
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This patch is based on https://lists.freedesktop.org/archives/mesa-dev/2018-February/185805.html Dave Airlie: "A bunch of CTS tests led me to write tests/shaders/ssa/fs-while-loop-rotate-value.shader_test which r600/sb always fell over on. GCM seems to move some of the copies into other basic blocks, if we don't allow this to happen then it doesn't seem to schedule them badly. Everything I've read on SSA/phi copies say they have to happen in parallel, so keeping them in the same basic block seems like a good way to keep some of that property." This patch differs from the one proposed by Dave in that it only adds the NF_DONT_MOVE flag to copy_move instructions that are created by split_phi* and that are located in loops. Fixes piglit: tests/shaders/ssa/fs-while-loop-rotate-value.shader_test (no regressions in the shader set). It also fixes all failing tests from dEQP-GLES3.functional.shaders.loops.* Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* freedreno/ir3: fix deref conversion falloutRob Clark2018-06-231-13/+13
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix unused variable warningRob Clark2018-06-231-1/+0
| | | | | Fixes: cf0c7258ee0 freedreno/a5xx: MSAA Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix HW_ATOMIC_COUNTERS capRob Clark2018-06-231-1/+1
| | | | | | | | | | | This was mistakenly exposed, even though we want atomic counters to be lowered to atomic ops on an SSBO like nearly every other GPU. Which somehow recently started getting segfaults due to calling a null pipe->set_hw_atomic_buffers(). Fixes a crash in stk, and probably other things. Signed-off-by: Rob Clark <[email protected]>
* nir: Remove old-school deref chain supportJason Ekstrand2018-06-221-3/+0
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* freedreno/ir3: convert to deref instructionsRob Clark2018-06-223-53/+57
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Rework lower_locals_to_regs to use deref instructionsJason Ekstrand2018-06-221-2/+2
| | | | | | | | | | This completely reworks the pass to support deref instructions and delete support for old deref chains Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Re-enable nir_opt_copy_prop_varsJason Ekstrand2018-06-221-1/+1
| | | | | | | | | Now that it's rewritten for deref instructions, we can turn it back on. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radeonsi: Remove deref chain support in nir scan pass.Bas Nieuwenhuizen2018-06-221-30/+4
| | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* radeonsi: Add deref support to the nir scan pass.Bas Nieuwenhuizen2018-06-221-15/+59
| | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]>
* nir: Delete lower_io_typesJason Ekstrand2018-06-221-1/+0
| | | | | | | | | | It's only used by the ir3 stand-alone compiler and Rob said we could delete it. Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* broadcom/vc4: Remove deref chain support from nir_lower_txf_ms.Eric Anholt2018-06-221-1/+0
| | | | | | | Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* st,ir3,radeonsi: push lower_deref_instrs back into driverRob Clark2018-06-224-5/+3
| | | | | | | | | | | | | vc4+vc5 is not really effected by the deref chain to deref instr conversion, so it no longer needs this pass. For others, now that all the passes mesa/st uses are using deref instructions, push the lowering to deref chains back into driver. Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: remove legacy versionRob Clark2018-06-221-1/+1
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir/lower_samplers: split out _legacy version for deref chainsRob Clark2018-06-221-1/+1
| | | | | | | | | | | | | | | | | | To simplify the transition, and make things bisectable, split out a legacy copy or lower_samplers. This way the i965 and gallium drivers can independently switch over to deref instructions. Since the lower_samplers_as_deref pass is only used by gallium drivers, it can be converted in lock-step with moving the lower_deref_instrs pass, and so does not need a corresponding _legacy clone. This legacy pass will be removed in a future commit. Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel,ir3: Disable nir_opt_copy_prop_varsJason Ekstrand2018-06-221-1/+1
| | | | | | | | | | | | This pass doesn't handle deref instructions yet. Making it handle both legacy derefs and deref instructions would be painful. Since it's not important for correctness, just disable it for now. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* ttn: convert to deref instructionsRob Clark2018-06-221-39/+13
| | | | | | | | Signed-off-by: Rob Clark <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965,radv,st,ir3: Call nir_lower_deref_instrsJason Ekstrand2018-06-224-1/+8
| | | | | | | | | | | This inserts a call to nir_lower_deref_instrs at every call site of glsl_to_nir, spirv_to_nir, and prog_to_nir. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Acked-by: Rob Clark <[email protected]> Acked-by: Bas Nieuwenhuizen <[email protected]> Acked-by: Dave Airlie <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nv50/ir: only avoid spilling constrained def if a mov is addedKarol Herbst2018-06-231-2/+2
| | | | | | | | | | | | | fix spilling regression introduced by 5428066f5e this is just a minor mistake done while moving the code out into a new function. The function contained a loop which might have been terminated earlier and skipped setting noSpill to 1. After the refactoring it was always set. Fixes: 5428066f5e1ef5ea6ae04c84019f270023cfc6aa ("nv50/ir: make a copy of tex src if it's referenced multiple times") Signed-off-by: Karol Herbst <[email protected]>
* freedreno: a2xx: fix clear colorJonathan Marek2018-06-221-1/+1
| | | | | | | | the format of the CLEAR_COLOR register doesn't depend on the target format this fixes clear color when rendering to 32-bit RGBA and 16-bit targets Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix crash when freeing contextJonathan Marek2018-06-221-0/+2
| | | | | Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: fix crash on first clearJonathan Marek2018-06-221-4/+4
| | | | | | | blend can be NULL, so check for that Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: add a20xJonathan Marek2018-06-227-31/+85
| | | | | | | | | | | | | this patch adds support for a20x, which has some differences with a220: -no VGT_MAX_VTX_INDX register -no CLEAR_COLOR register -set RB_BC_CONTROL in restore (hangs without) -different CP_DRAW_INDX format tested with kmscube and glmark2 scenes, on par with a220 Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* freedreno: a2xx: increase size of the offset field in instr_fetch_vtx_tJonathan Marek2018-06-221-4/+2
| | | | | | | | The offset field is 22 bit large. 11 bits are necessary because MaxVertexAttribRelativeOffset = 2047 Signed-off-by: Jonathan Marek <[email protected]> Signed-off-by: Rob Clark <[email protected]>
* v3d: Don't forget to initialize the buffer offset of a new winsys handle.Eric Anholt2018-06-211-0/+1
|
* radeonsi: fix occlusion queries with 16x AA without FBO attachments on StoneyMarek Olšák2018-06-211-1/+9
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: handle non-clearable DCC buffers as MSAA resolve dstMarek Olšák2018-06-212-1/+6
| | | | | | | This is reproducible on Stoney, but other chips may be affected too. Cc 18.1 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radeonsi: disable DCC MSAA for 128bpp formats on StoneyMarek Olšák2018-06-211-0/+5
| | | | | Cc: 18.1 <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* freedreno/a5xx: MSAARob Clark2018-06-2114-42/+89
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headersRob Clark2018-06-218-41/+53
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: txf_ms supportRob Clark2018-06-213-7/+51
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a5xx: fix gpu hangs with large compute shadersRob Clark2018-06-211-3/+11
| | | | | | | Similar to the combined limit for VS+FS, there is an upper limit for shader size to run from internel memory. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix base_vertexRob Clark2018-06-211-0/+1
| | | | | Fixes: c366f422f0a nir: Offset vertex_id by first_vertex instead of base_vertex Signed-off-by: Rob Clark <[email protected]>
* st/dri: constify dri_fill_st_visual's screenEmil Velikov2018-06-212-2/+4
| | | | | | | As the function says - only the visual is changed. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* swr: bump minimum supported LLVM version to 5.0Juan A. Suarez Romero2018-06-212-5/+5
| | | | | | | | | | | | | | | | RADV now requires LLVM 5.0 or greater, and thus we can't build dist tarball because swr requires LLVM 4.0. Let's bump required LLVM to 5.0 in swr too. Fixes: f9eb1ef870 ("amd: remove support for LLVM 4.0") Cc: Tim Rowley <[email protected]> Cc: Emil Velikov <[email protected]> Cc: Dylan Baker <[email protected]> Cc: Eric Engestrom <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Acked-by: Bruce Cherniak <[email protected]>
* radeonsi: add a debug flag to zero vram allocationsGrazvydas Ignotas2018-06-215-0/+7
| | | | | | | | | | This allows to avoid having to see garbage in Dying Light loading screen at least, which probably expects Windows/NV behavior of all allocations being zeroed by default. Analogous to radv flag with the same name. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: use shifts for sign extensionGrazvydas Ignotas2018-06-211-2/+2
| | | | | | | | | Avoids a branch and reduces code size a tiny bit: text data bss dec hex filename 10804563 398653 2070368 13273584 ca89f0 /tmp/radeonsi_dri.so.old 10804499 398653 2070368 13273520 ca89b0 /tmp/radeonsi_dri.so Reviewed-by: Marek Olšák <[email protected]>
* r600: fix copy/paste bug for sampleMaskIn workaroundRoland Scheidegger2018-06-211-1/+1
| | | | | | | | | | | The sampleMaskIn workaround (b936f4d1ca0d2ab1e828ff6a6e617f12469687fa) tries to figure out if the shader is running at per-sample frequency, but there's a typo bug so it will only recognize per-sample linar inputs, not per-sample perspective ones. Spotted by Eric Engestrom <[email protected]> Fixes: b936f4d1ca0d2ab1e828a "r600: partly fix sampleMaskIn value"
* v3d: Fix min vs mag determination when not doing mip filtering.Eric Anholt2018-06-201-2/+10
| | | | | Fixes all 128 failing tests in dEQP-GLES3.functional.texture.filtering.*.combinations
* v3d: Track write reference to the separate stencil buffer.Eric Anholt2018-06-201-1/+16
| | | | | | | | | | | | | Otherwise, a blit from separate stencil may fail to flush the job that initialized it, or new drawing could fail to flush a blit reading from stencil. Fixes: dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_basic dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_scale dEQP-GLES3.functional.fbo.blit.depth_stencil.depth32f_stencil8_stencil_only dEQP-GLES3.functional.fbo.msaa.2_samples.depth32f_stencil8 dEQP-GLES3.functional.fbo.msaa.4_samples.depth32f_stencil8
* v3d: Add missing reference to the separate stencil buffer.Eric Anholt2018-06-201-11/+13
| | | | Noticed while debugging a missing flush of rendering in the z32f_s8 case.
* v3d: Fix return value from fence_finish.Eric Anholt2018-06-201-1/+1
| | | | | | | We needed to convert from a -errno to a boolean success value. Fixes: GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_flush GTF-GLES3.gtf.GL3Tests.sync.sync_functionality_clientwaitsync_signaled
* gallium: add scalar isa shader capChristian Gmeiner2018-06-2015-1/+32
| | | | | | | | | | | | | | | | v1 -> v2: - nv30 is _NOT_ scalar as suggested by Ilia Mirkin. - Change from a screen cap to a shader cap as suggested by Eric Anholt. - radeonsi is scalar as suggested by Marek Olšák. - Change missing ones to be scalar. v2 -> v3: - r600 prefers vec4 as suggested by Marek Olšák. Signed-off-by: Christian Gmeiner <[email protected]> Reviewed-by: Eric Anholt <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium/aux/util/u_cpu_detect.h: Fix -Wsign-compare warning in u_cpu_detect.cGert Wollny2018-06-201-1/+1
| | | | | | | | | | | | | Change the type of util_cpu_caps::nr_cpus to int because sysconfig returns a signed value, fixes: u_cpu_detect.c: In function 'util_cpu_detect': u_cpu_detect.c:317:30: warning: comparison between signed and unsigned integer expressions [-Wsign-compare] if (util_cpu_caps.nr_cpus == -1) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium/aux/util/u_debug.h: Fix "noreturn" warnings in debug modeGert Wollny2018-06-201-2/+2
| | | | | | | | | | | Only decorate function as noreturn when DEBUG is not defined, because when compiled in DEBUG mode the function actually executes an int3 and may return, fixes: u_debug.c: In function '_debug_assert_fail': u_debug.c:309:1: warning: 'noreturn' function does return Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Emil Velikov <[email protected]>