Commit message | Author | Age | Files | Lines
* spirv: Sort out the mess that is sampled image | Jason Ekstrand | 2019-11-09 | 2 | -15/+24
    This commit makes two major changes. First, we add a second case to OpLoad for sampled images which constructs a vtn_sampled_image and stashes that rather than stashing a pointer to the combined image sampler like we do for bare samplers and images. This should be more in line with how SPIR-V is intended to work and hopefully doesn't cause any weird problems.
    The second is a rework of vtn_handle_texture to assume that everything has an image but not everything has a sampler. We also add a vtn_fail_if for the case where a texture instruction requires a sampler but none is provided.
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: Add a vtn_decorate_pointer helper | Jason Ekstrand | 2019-11-09 | 1 | -26/+41
    This helper makes a duplicate copy of the pointer if any new access flags are set at this stage. This way we don't end up propagating access flags further than the actual SPIR-V decorations. In several instances where we create new pointers, we still call the decoration helper directly because no copy is needed.
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
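
The copy-on-decorate idea described above can be sketched in a few lines of C. This is only an illustration under assumptions: `ptr_sketch`, the `ACCESS_*` flags, and `decorate_pointer_sketch` are hypothetical names, not the actual vtn structures or the real vtn_decorate_pointer signature.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pointer value that carries access flags.
 * This is not the real vtn_pointer layout. */
struct ptr_sketch {
    uint32_t access;   /* bitmask of access qualifiers */
    int      base_id;  /* whatever the pointer refers to */
};

/* Hypothetical access flags, for illustration only. */
enum {
    ACCESS_COHERENT = 1u << 0,
    ACCESS_VOLATILE = 1u << 1,
};

/* Return `ptr` unchanged when `extra` adds no new flags. Otherwise return a
 * heap-allocated copy with the extra flags OR-ed in, leaving `ptr` itself
 * untouched so the flags do not propagate to other users of `ptr`.
 * Returns NULL if the allocation fails. */
static struct ptr_sketch *
decorate_pointer_sketch(struct ptr_sketch *ptr, uint32_t extra)
{
    if ((extra & ~ptr->access) == 0)
        return ptr;

    struct ptr_sketch *copy = malloc(sizeof(*copy));
    if (!copy)
        return NULL;

    *copy = *ptr;
    copy->access |= extra;
    return copy;
}
```

In this sketch the caller owns any copy that is returned and must free it; the real code would presumably use the builder's own allocator instead.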
* spirv: Remove the type from sampled_image | Jason Ekstrand | 2019-11-09 | 4 | -8/+2
    We have types on all vtn_values at this point so there's no reason to carry the redundant type information.
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* freedreno/ir3: also track # of nops for shader-db | Rob Clark | 2019-11-09 | 3 | -1/+7
    The instruction count is (mostly) a measure of what optimization passes can do, while # of nops is more an indication of how effectively the scheduler is balancing register pressure vs instruction count. So track these independently.
    (There could be opportunities to rematerialize values to reduce register pressure, swapping some nops with other alu instructions, so nothing is truly independent, but it is still useful to break these stats out.)
    Signed-off-by: Rob Clark <[email protected]>
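
As a rough illustration of tracking the two statistics separately, the following C sketch walks a flat instruction list and counts nops alongside the total. The opcode enum and struct names are hypothetical, and it assumes the total instruction count includes the nops; the real ir3 bookkeeping may differ.

```c
#include <stddef.h>

/* Hypothetical opcodes; not the real ir3 opcode enum. */
enum opc_sketch { OPC_NOP_SKETCH, OPC_ALU_SKETCH, OPC_MOV_SKETCH };

/* The two statistics reported independently. */
struct stats_sketch {
    unsigned instr_count;  /* all instructions, nops included (assumption) */
    unsigned nop_count;    /* only the nops inserted by scheduling */
};

/* Count instructions and nops in a single pass over the list. */
static struct stats_sketch
count_instrs_sketch(const enum opc_sketch *instrs, size_t n)
{
    struct stats_sketch s = { 0, 0 };

    for (size_t i = 0; i < n; i++) {
        s.instr_count++;
        if (instrs[i] == OPC_NOP_SKETCH)
            s.nop_count++;
    }
    return s;
}
```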
* freedreno/ir3: sync disasm changes from envytools | Rob Clark | 2019-11-09 | 2 | -24/+94
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
    Set flag based on actual output reg type.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix SP_FS_MRT_REG.HALF_PRECISION | Rob Clark | 2019-11-09 | 1 | -1/+1
    We should really be setting this based on the actual output register type.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: remove obsolete comment | Rob Clark | 2019-11-09 | 1 | -4/+0
    The meta PHI instruction was removed long ago. And fanin/fanout themselves do not contribute actual instructions (at least not by the time you get to sched, although they may prevent copy-propagating away a mov).
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/ra: remove ir print after livein/out | Rob Clark | 2019-11-09 | 1 | -1/+0
    The IR hasn't changed at this point, so it isn't really adding any value.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/ra: move regs_count==0 check | Rob Clark | 2019-11-09 | 1 | -9/+2
    Fold it into writes_gpr() (since an instruction that does not reference any registers by definition does not write a register). This lets us avoid having to handle this case in a few other places.
    Signed-off-by: Rob Clark <[email protected]>
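
A minimal sketch of folding the empty-register check into the predicate, so callers no longer test it themselves. The `instr_sketch` struct and its `dst_is_special` field are hypothetical simplifications; the real writes_gpr() inspects the actual destination register.

```c
#include <stdbool.h>

/* Hypothetical, simplified instruction; not the real ir3_instruction. */
struct instr_sketch {
    unsigned regs_count;     /* 0 means the instruction references no registers */
    bool     dst_is_special; /* assumed flag: dest is not a general-purpose reg */
};

/* An instruction with no registers cannot write a GPR, so the check lives
 * here instead of being repeated at every call site. */
static bool
writes_gpr_sketch(const struct instr_sketch *instr)
{
    if (instr->regs_count == 0)
        return false;

    return !instr->dst_is_special;
}
```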
* freedreno/ir3: ir3_print tweaks | Rob Clark | 2019-11-09 | 2 | -47/+102
    Handle HALF/HIGH flags in all cases, and colorize SSA src notation.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: use SSA flag on dest register too | Rob Clark | 2019-11-09 | 4 | -45/+48
    We did this in some places before, but not consistently. But it will be useful for two-pass RA, to identify which registers have already been assigned.
    While we are cleaning this up, use __ssa_src() and new __ssa_dst() helper more consistently. (If nothing else, this reduces the # of callers of ir3_reg_create() to audit that we didn't miss something.)
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split pre-coloring to its own function | Rob Clark | 2019-11-09 | 1 | -3/+12
    Signed-off-by: Rob Clark <[email protected]>
* spirv: Don't leak GS initialization to other stages | Caio Marcelo de Oliveira Filho | 2019-11-08 | 1 | -1/+2
    The stage specific fields of shader_info are in a union. We've likely been lucky that this value was either overwritten or ignored by other stages. The recent change in shader_info layout in commit 84a1a2578da ("compiler: pack shader_info from 160 bytes to 96 bytes") made this issue visible.
    Fixes: cf2257069cb ("nir/spirv: Set a default number of invocations for geometry shaders")
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
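
The hazard described above, a per-stage default written into a union that other stages share, can be illustrated with a small C sketch. The struct, field names, and stage enum here are hypothetical and much smaller than the real shader_info; the point is only that the geometry-shader default is written when the stage actually is geometry.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stage enum; not the real gl_shader_stage. */
enum stage_sketch { STAGE_VERTEX, STAGE_GEOMETRY, STAGE_FRAGMENT };

/* Hypothetical, tiny stand-in for shader_info: per-stage fields overlap. */
struct info_sketch {
    enum stage_sketch stage;
    union {
        struct { uint8_t invocations; }  gs;
        struct { uint8_t uses_discard; } fs;
    };
};

/* Writing gs.invocations unconditionally would also set fs.uses_discard,
 * because both live at the same offset in the union. Guarding on the stage
 * keeps the default from leaking into other stages. */
static void
init_info_sketch(struct info_sketch *info, enum stage_sketch stage)
{
    memset(info, 0, sizeof(*info));
    info->stage = stage;

    if (stage == STAGE_GEOMETRY)
        info->gs.invocations = 1;
}
```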
* compiler: pack shader_info from 160 bytes to 96 bytes | Marek Olšák | 2019-11-08 | 1 | -66/+66
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl/linker: pass shader_info to analyze_clip_cull_usage directly | Marek Olšák | 2019-11-08 | 1 | -16/+9
    This will be needed by the next commit.
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radeonsi/nir: fix compute shader crash due to nir_binary == NULL | Marek Olšák | 2019-11-08 | 1 | -2/+12
    This partially reverts 8b30114dda8.
    Fixes: 8b30114dda8 "radeonsi/nir: call nir_serialize only once per shader"
* radeonsi/nir: call nir_serialize only once per shader | Marek Olšák | 2019-11-08 | 1 | -21/+21
    We were calling it twice. Now serialize it first, then use the result to compute the cache key.
    Reviewed-by: Timothy Arceri <[email protected]>
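
The "serialize once, then derive the cache key from the result" pattern can be sketched as follows. Everything here is an assumption made for illustration: the serializer just copies a string, the key is a 32-bit FNV-1a hash rather than whatever hash the driver really uses, and none of the names correspond to real radeonsi or nir_serialize APIs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Counts serializer calls so the "only once" property can be observed. */
static unsigned serialize_calls_sketch;

/* Hypothetical serializer: copies the shader text into `out`, truncating
 * to `cap` bytes, and returns the number of bytes written. */
static size_t
serialize_sketch(const char *shader, uint8_t *out, size_t cap)
{
    size_t len = strlen(shader);

    serialize_calls_sketch++;
    if (len > cap)
        len = cap;
    memcpy(out, shader, len);
    return len;
}

/* 32-bit FNV-1a, used here only as a stand-in for the real cache hash. */
static uint32_t
fnv1a_sketch(const uint8_t *data, size_t len)
{
    uint32_t h = 2166136261u;

    for (size_t i = 0; i < len; i++) {
        h ^= data[i];
        h *= 16777619u;
    }
    return h;
}

/* Serialized bytes plus the key derived from them. */
struct blob_sketch {
    uint8_t  data[64];
    size_t   size;
    uint32_t key;
};

/* Serialize exactly once, then hash the stored bytes for the cache key. */
static void
serialize_and_key_sketch(const char *shader, struct blob_sketch *b)
{
    b->size = serialize_sketch(shader, b->data, sizeof(b->data));
    b->key = fnv1a_sketch(b->data, b->size);
}
```

Keeping the serialized bytes around means later consumers can reuse them instead of serializing again.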
* util: add blob_finish_get_buffer | Marek Olšák | 2019-11-08 | 2 | -0/+14
    Reviewed-by: Timothy Arceri <[email protected]>
* u_format: Fix swizzle of A1R5G5B5. | Eric Anholt | 2019-11-08 | 1 | -1/+1
    Found once I started using the generated unpack code from the Mesa side.
    Fixes: 4bbaac3782ad ("gallium: Add some more channel orderings of packed formats.")
    Reviewed-by: Erik Faye-Lund <[email protected]>
* virgl: support emulating planar image sampling | David Stevens | 2019-11-08 | 1 | -1/+6
    Mesa emulates planar format sampling with per-plane samplers. Virgl now supports this by allowing the plane index to be passed when creating a sampler view from a planar image. With this change, Mesa now passes that information to virgl.
    Signed-off-by: David Stevens <[email protected]>
    Reviewed-by: Lepton Wu <[email protected]>
* gallium/swr: Enable some ARB_gpu_shader5 extensions | Krzysztof Raszkowski | 2019-11-08 | 2 | -3/+4
    Enable / add to features.txt:
    - Enhanced textureGather.
    - Geometry shader instancing.
    - Geometry shader multiple streams.
    Reviewed-by: Jan Zielinski <[email protected]>
* gallium/swr: Fix GS invocation issues | Krzysztof Raszkowski | 2019-11-08 | 1 | -2/+7
    - Fixed setting of gl_InvocationID.
    - Fixed GS vertices output memory overflow.
    Reviewed-by: Jan Zielinski <[email protected]>
* ac: Handle invalid GFX10 format correctly in ac_get_tbuffer_format. | Timur Kristóf | 2019-11-08 | 2 | -0/+6
    It happens that some games try to access a vertex buffer without a valid format. This case was incorrectly handled by ac_get_tbuffer_format which made ACO emit an invalid instruction.
    Signed-off-by: Timur Kristóf <[email protected]>
    Cc: 19.3 <[email protected]>
    Reviewed-by: Samuel Pitoiset <[email protected]>
* panfrost: Try to evict unused BOs from the cache | Boris Brezillon | 2019-11-08 | 4 | -6/+61
    The panfrost BO cache can only grow since all newly allocated BOs are returned to the cache (unless they've been exported).
    With the MADVISE ioctl that's not a big issue because the kernel can come and reclaim this memory, but MADVISE will only be available on 5.4 kernels. This means an app can currently allocate a lot of memory without ever releasing it, leading to some situations where the OOM-killer kicks in and kills the app (or even worse, kills another process consuming more memory than the GL app) to get some of this memory back.
    Let's try to limit the amount of BOs we keep in the cache by evicting entries that have not been used for more than one second (if the app stopped allocating BOs of this size, it's likely to not allocate similar BOs in the near future).
    This solution is based on the VC4/V3D implementation.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
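
The eviction policy described above can be sketched as an oldest-first list that is trimmed whenever the cache is touched. This is a minimal sketch under assumptions: the struct and function names are hypothetical, timestamps are whole seconds passed in by the caller, and "evicting" only sets a flag where a real driver would release the buffer.

```c
#include <stddef.h>

/* Hypothetical cached buffer object; not the real panfrost_bo. */
struct bo_sketch {
    struct bo_sketch *lru_next;
    long last_used;   /* seconds, as supplied by the caller */
    int  freed;       /* set when evicted (stand-in for a real free) */
};

/* Oldest entry at the head, most recently returned at the tail. */
struct bo_cache_sketch {
    struct bo_sketch *lru_head;
    struct bo_sketch *lru_tail;
};

/* Return a BO to the cache, stamping it with the current time. */
static void
cache_put_sketch(struct bo_cache_sketch *cache, struct bo_sketch *bo, long now)
{
    bo->last_used = now;
    bo->lru_next = NULL;
    if (cache->lru_tail)
        cache->lru_tail->lru_next = bo;
    else
        cache->lru_head = bo;
    cache->lru_tail = bo;
}

/* Evict entries unused for more than one second. Because the list is
 * ordered oldest-first, the walk stops at the first entry that is still
 * fresh. Returns the number of entries evicted. */
static unsigned
cache_evict_stale_sketch(struct bo_cache_sketch *cache, long now)
{
    unsigned evicted = 0;

    while (cache->lru_head && now - cache->lru_head->last_used > 1) {
        struct bo_sketch *bo = cache->lru_head;

        cache->lru_head = bo->lru_next;
        if (!cache->lru_head)
            cache->lru_tail = NULL;
        bo->lru_next = NULL;
        bo->freed = 1;
        evicted++;
    }
    return evicted;
}
```

Running the eviction from the allocation and release paths, rather than from a timer, keeps the sketch free of threads; whether the driver does the same is not stated in the commit message.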
* panfrost: Move BO cache related fields to a sub-struct | Boris Brezillon | 2019-11-08 | 3 | -18/+21
    We will soon introduce an LRU list to evict BOs that have been unused for more than 1 second. Let's first move all BO cache fields to a sub-struct to clarify which fields are used by the BO caching logic.
    Signed-off-by: Boris Brezillon <[email protected]>
    Reviewed-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Switch base for vertex texturing on T720 | Alyssa Rosenzweig | 2019-11-08 | 1 | -11/+16
    There aren't texture pipeline registers anymore; instead, space is shared with work and ldst registers for output and input respectively. We need to shift the base registers to represent this correctly.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Pass shader stage to disassembler | Alyssa Rosenzweig | 2019-11-08 | 4 | -4/+7
    Vertex texturing behaves differently from fragment texturing on some GPUs.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Disassemble half-steps correctly | Alyssa Rosenzweig | 2019-11-08 | 1 | -3/+15
    The meaning of some bits shifts; we need to account for this to print swizzles sanely.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix printing of half-registers in texture ops | Alyssa Rosenzweig | 2019-11-08 | 1 | -35/+32
    We were using old style half-registers; let's update that to be consistent, preparing us for more disassembler changes in this area.
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* freedreno/ir3: Use regid() helper when setting up precolor regs | Kristian H. Kristensen | 2019-11-07 | 1 | -4/+4
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Turn on tessellation shaders | Kristian H. Kristensen | 2019-11-07 | 1 | -1/+13
    Wow. Very triangle. So shader.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only use merged regs and four quads for VS+FS | Kristian H. Kristensen | 2019-11-07 | 1 | -5/+15
    When other geometry stages are present, we choose two quads and no merged regs.
    Acked-by: Eric Anholt <[email protected]>
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/blitter: Save tessellation state | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+2
    We have tessellation state now.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Only set emit.hs/ds when we're drawing patches | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+3
    At least the gallium blitter helper will call us to draw with tessellation shaders set but a non-patch primitive.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno: Use bypass rendering for tessellation | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+8
    It seems like tiling could work in the Adreno architecture, but we've only ever seen bypass rendering with tessellation. For now, let's do that too.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Program state for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 4 | -34/+162
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Emit constant parameters for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 1 | -10/+84
    Assemble the information the stages need and emit the constants.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Allocate and program tessellation buffer | Kristian H. Kristensen | 2019-11-07 | 3 | -0/+44
    Tessellation needs a couple of buffers that should hold the entire output from a full VS+TCS draw call.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/a6xx: Build the right draw command for tessellation | Kristian H. Kristensen | 2019-11-07 | 3 | -4/+52
    We need to select the right primitive type, set a bit to turn on tessellation, and OR in the TES output primitive type.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Allocate const space for tessellation parameters | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+7
    The tessellation stages need size and stride of the patch layout as well as locations of attributes in the patch. The tessellation stages also use two system memory BOs and need the iovas of those.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Pre-color TCS header and primitive ID inputs | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+12
    Similar to GS, the registers are shared and not reinitialized between VS and TCS, so we need to make sure to allocate the same registers for the system values between stages.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Don't assume binning shader is always VS | Kristian H. Kristensen | 2019-11-07 | 1 | -2/+2
    In tessellation mode, the TES is (probably) the binning shader.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Setup inputs and outputs for tessellation stages | Kristian H. Kristensen | 2019-11-07 | 1 | -7/+52
    Similar to GS, some inputs are reused when we chsh from VS to TCS or TES to GS, so we need to make sure we set up the right inputs and make the shared system values outputs so they don't get clobbered.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement TCS synchronization intrinsics | Kristian H. Kristensen | 2019-11-07 | 2 | -0/+41
    We add two new IR3 specific nir intrinsics that map to the new condend and endpatch instructions.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Implement tess coord intrinsic | Kristian H. Kristensen | 2019-11-07 | 1 | -0/+12
    Our lowering pass made the z component unused by replacing its uses by 1 - x - y. The intrinsic implementation then just needs to return the x and y components.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
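
The substitution mentioned above follows from barycentric coordinates summing to one, so the third component can be rebuilt from the first two. A minimal C sketch, with a hypothetical struct and function name rather than the actual NIR lowering code:

```c
/* Hypothetical three-component tess coord; illustration only. */
struct tess_coord_sketch {
    float x, y, z;
};

/* Rebuild the full coordinate from x and y alone. For triangle domains
 * the components sum to 1, so z = 1 - x - y and the hardware-provided z
 * is never needed. */
static struct tess_coord_sketch
lower_tess_coord_sketch(float x, float y)
{
    struct tess_coord_sketch tc = { x, y, 1.0f - x - y };
    return tc;
}
```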
* freedreno/ir3: End TES with chsh when using GS | Kristian H. Kristensen | 2019-11-07 | 1 | -1/+3
    When we have both TES and GS, the TES needs to chain to the GS with chmask and chsh, just like the VS does to either TCS or GS.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add new synchronization opcodes | Kristian H. Kristensen | 2019-11-07 | 5 | -1/+15
    There are two new opcodes in use in tessellation control shaders: category 0, opcodes 13 and 15. unk13 is a kill type of instruction that terminates threads where !p0.x and is used to narrow down a patch wavefront to just thread 0. Then, once thread 0 has written the tess levels, it issues unk15, which might signal the TE that another patch has been fully written.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Extend geometry lowering pass to handle tessellation | Kristian H. Kristensen | 2019-11-07 | 3 | -8/+520
    VS and TCS pass varyings the same way as VS and GS do. TCS then writes the entire patch to a system memory BO and TES eventually reads back from the BO once the TE starts generating vertices. TES outputs vertices the same way as VS and GS, except when there's a GS as well, in which case TES passes varyings to GS the same way the VS would.
    In addition, the TCS needs a little bit of control flow massaging so that it only runs for valid invocations, and it needs a couple of unknown instructions to synchronize with the TE.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: Add tessellation field to shader key | Kristian H. Kristensen | 2019-11-07 | 3 | -1/+51
    Whether we're tessellating and which primitives the TES outputs affect the entire pipeline, so let's add a field to the key to track that.
    Signed-off-by: Kristian H. Kristensen <[email protected]>
    Acked-by: Eric Anholt <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>