path: root/src/gallium/drivers/nouveau/codegen/nv50_ir.h
Commit message | Author | Age | Files | Lines
* nvc0/ir: use the combined tid special register | Rhys Perry | 2018-07-07 | 1 | -0/+1
  total instructions in shared programs : 5804448 -> 5804690 (0.00%)
  total gprs used in shared programs    : 670065 -> 670065 (0.00%)
  total shared used in shared programs  : 548832 -> 548832 (0.00%)
  total local used in shared programs   : 21068 -> 21068 (0.00%)

              local   shared   gpr   inst   bytes
      helped      0        0     0      5       5
      hurt        0        0     0    191     191

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
* nvc0: add support for bindless textures on kepler+ | Ilia Mirkin | 2018-01-07 | 1 | -0/+1
  This keeps a list of resident textures (per context), and dumps that
  list into the active buffer list when submitting. We also treat bindless
  texture fetches slightly differently, wrt the meaning of indirect, and
  not requiring the SAMPLER file to be used.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add precise field to Instruction | Karol Herbst | 2017-07-21 | 1 | -0/+2
  v4: initialize field with NULL

  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Pierre Moreau <[email protected]>
* nv50/ir: Remove unused translation methods | Pierre Moreau | 2017-05-07 | 1 | -1/+0
  This code was merged commented out, and has stayed that way ever since.

  Signed-off-by: Pierre Moreau <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Add SV_LANEMASK_* system values. | Boyan Ding | 2017-04-13 | 1 | -0/+5
  v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin)

  Signed-off-by: Boyan Ding <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: remove unused swizzle field in ValueRef | Ilia Mirkin | 2017-04-09 | 1 | -1/+0
  The nv50 ir is scalar. Perhaps this was from some early attempts to
  integrate the simd aspects of nv30. However at this point it's entirely
  unused.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: make it possible to have the flags def in def0 | Ilia Mirkin | 2017-02-09 | 1 | -1/+1
  There's all kinds of logic that doesn't like there being holes in defs
  or srcs lists. Avoid them. This also fixes the sched logic for maxwell.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bit | Ilia Mirkin | 2017-02-09 | 1 | -0/+1
  Note that this is not available for SM20/SM30.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for emitting partial min/max ops for int64 | Ilia Mirkin | 2017-02-09 | 1 | -0/+4
  These operations allow you to compute min/max on arbitrary-width
  integers, 32 bits at a time. Note that the low/med ops implicitly set
  the condition code, and the med/high ops implicitly consume it.

  Signed-off-by: Ilia Mirkin <[email protected]>
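The word-at-a-time idea in this message can be illustrated on the host. The following is only a sketch under stated assumptions: it models the condition code as a `bool` that carries "a is smaller so far" from the low word to the higher words, handles unsigned values only, and picks the winner after the last word. The names `U64Parts`, `lessWide`, `minU64` and `maxU64` are hypothetical, and none of this claims to match the exact hardware semantics of the partial ops.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical container: a 64-bit unsigned value split into 32-bit words.
struct U64Parts {
    uint32_t lo;
    uint32_t hi;
};

// Compare two little-endian multi-word unsigned integers, 32 bits at a time.
// The bool plays the role of the condition code: the first ("low") step only
// produces it, and every later ("med"/"high") step consumes and updates it.
static inline bool lessWide(const uint32_t *a, const uint32_t *b, unsigned words)
{
    assert(words >= 1);
    bool cc = a[0] < b[0];
    for (unsigned i = 1; i < words; ++i)
        cc = a[i] < b[i] || (a[i] == b[i] && cc);
    return cc;
}

// 64-bit min/max built from the two-word comparison above.
static inline U64Parts minU64(U64Parts a, U64Parts b)
{
    const uint32_t aw[2] = { a.lo, a.hi };
    const uint32_t bw[2] = { b.lo, b.hi };
    return lessWide(aw, bw, 2) ? a : b;
}

static inline U64Parts maxU64(U64Parts a, U64Parts b)
{
    const uint32_t aw[2] = { a.lo, a.hi };
    const uint32_t bw[2] = { b.lo, b.hi };
    return lessWide(aw, bw, 2) ? b : a;
}
```

Because `lessWide` takes a word count, the same loop covers wider integers; the middle iterations correspond to the "med" steps that both consume and set the flag.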
* nv50/ir: add preliminary support for SHLADD | Samuel Pitoiset | 2016-09-29 | 1 | -0/+1
  This instruction is available since SM20 (Fermi) and allows computing
  (a << b) + c in one shot. In some situations, IMAD should be replaced by
  SHLADD when its multiplier is a power of 2, and ADD+SHL should be
  replaced by SHLADD as well.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
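The replacement described above rests on the identity a * 2^k + c == (a << k) + c. A minimal host-side sketch of that reasoning follows; `shladd`, `asShift` and `madOrShladd` are hypothetical names used for illustration and are not functions from the nouveau code base. Arithmetic is unsigned 32-bit and wraps modulo 2^32.

```cpp
#include <cassert>
#include <cstdint>

// Reference semantics of SHLADD as stated in the message: (a << b) + c.
// Assumption: the shift amount is below 32.
static inline uint32_t shladd(uint32_t a, unsigned b, uint32_t c)
{
    assert(b < 32);
    return (a << b) + c;
}

// If m is a power of two, store log2(m) in *shift and return true.
static inline bool asShift(uint32_t m, unsigned *shift)
{
    if (m == 0 || (m & (m - 1)) != 0)
        return false;
    unsigned s = 0;
    while ((m >> s) != 1)
        ++s;
    *shift = s;
    return true;
}

// Compute a * m + c, taking the shift-and-add form when m is a power of two
// and falling back to the plain multiply-add otherwise.
static inline uint32_t madOrShladd(uint32_t a, uint32_t m, uint32_t c)
{
    unsigned s;
    if (asShift(m, &s))
        return shladd(a, s, c);
    return a * m + c;
}
```

Both paths return the same value for every input; the sketch only shows when the shift form is applicable, not how the optimizer decides that it is profitable.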
* nvc0/ir: don't dual-issue ops that depend or interfere with each other | Karol Herbst | 2016-09-03 | 1 | -0/+4
  Signed-off-by: Karol Herbst <[email protected]>
  Reviewed-by: Tobias Klausmann <[email protected]>
  [imirkin: rewrite to split up the helpers and move more logic to target]
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for BGRA8 images | Ilia Mirkin | 2016-07-18 | 1 | -0/+3
  This is useful for pbo downloads, which are now accelerated with images.
  BGRA8 is a moderately common format to do that in.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Add support for SV_WORK_DIM | Hans de Goede | 2016-07-02 | 1 | -0/+1
  Add support for SV_WORK_DIM for nvc0 and nve4.

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: add support for SULDP -> SULDB conversion | Ilia Mirkin | 2016-04-26 | 1 | -0/+71
  This will allow converting surface formats without adding an extra call
  to our lib.

  [hakzsam: make use of this for GK104]
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add OP_BUFQ for buffers query | Samuel Pitoiset | 2016-04-26 | 1 | -0/+1
  TGSI RESQ allows both images and buffers, but we have to make a
  distinction between these two types of resources in our lowering pass.

  Introducing OP_BUFQ, which is a fake opcode, makes it possible to
  implement OP_SUQ for surfaces.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: codegen: Use FILE_MEMORY_BUFFER for buffers | Hans de Goede | 2016-04-20 | 1 | -0/+1
  Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only
  apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for
  OpenCL global buffers.

  This commit changes the buffer code to use FILE_MEMORY_BUFFER at the
  ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for
  use with OpenCL global buffers.

  Note that after lowering, buffer accesses use the FILE_MEMORY_GLOBAL
  register file.

  Tested with piglit on a gf119 and a gk107:

  ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader
  [9/9] pass: 9 /
  ./piglit run -o shader -t '.*arb_compute_shader.*' results/shader
  [20/20] skip: 4, pass: 16 |

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: emit VOTE instruction | Samuel Pitoiset | 2016-02-28 | 1 | -0/+4
  Changes from v2:
   - add missing NOT modifier for GK110/GM107

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add lock/unlock subops for load/store | Samuel Pitoiset | 2016-02-21 | 1 | -0/+2
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add SUQ op by reading the info from driver constbuf | Ilia Mirkin | 2016-01-29 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add ARB_shader_draw_parameters support | Ilia Mirkin | 2015-12-30 | 1 | -0/+3
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for TGSI_SEMANTIC_HELPER_INVOCATION | Ilia Mirkin | 2015-11-12 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add support for TXQS tgsi opcode | Ilia Mirkin | 2015-09-13 | 1 | -2/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: support different unordered_set implementations | Chih-Wei Huang | 2015-08-20 | 1 | -4/+4
  If built with the C++11 standard, use std::unordered_set.

  Otherwise, if built on an old Android version with stlport, use
  std::tr1::unordered_set with a wrapper class.

  Otherwise use std::tr1::unordered_set.

  Signed-off-by: Chih-Wei Huang <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
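The selection described above can be pictured as a compile-time switch. This is a sketch only: the `sketch` namespace and `insertOnce` are hypothetical, the stlport wrapper branch is left out, and testing `__cplusplus` is an assumption about how the standard level is detected rather than the condition the commit actually uses.

```cpp
#include <cassert>

#if __cplusplus >= 201103L
// C++11 or newer: the standard container is available.
#include <unordered_set>
namespace sketch { using std::unordered_set; }
#else
// Older standards: fall back to the TR1 container (requires a TR1 library).
#include <tr1/unordered_set>
namespace sketch { using std::tr1::unordered_set; }
#endif

// Callers name sketch::unordered_set and never see which branch was taken.
// Returns true if v was newly inserted, false if it was already present.
static inline bool insertOnce(sketch::unordered_set<int> &seen, int v)
{
    bool inserted = seen.insert(v).second;
    assert(seen.count(v) == 1);
    return inserted;
}
```

Keeping the switch in one place means the rest of the code needs no conditionals of its own.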
* nvc0/ir: kepler can't do indirect shader input/output loads directly | Ilia Mirkin | 2015-07-23 | 1 | -0/+1
  There's a special AL2P instruction (called AFETCH in nv50 ir) which
  computes a "physical" value to be used with indirect addressing with
  ALD.

  Fixes
    tcs-input-array-*-index-rd
    tcs-output-array-*-index-wr
  varying-indexing tessellation tests on Kepler.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: tess factors are now sysvals, adapt codegen to expect that | Ilia Mirkin | 2015-07-23 | 1 | -1/+2
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64 | Ilia Mirkin | 2015-02-20 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: use unordered_set instead of list to keep track of var uses | Tobias Klausmann | 2014-07-08 | 1 | -3/+4
  The set of variable uses does not need to be ordered in any way, and
  removing/adding elements is a fairly common operation in various
  optimization passes.

  This shortens runtime of piglit test fp-long-alu to ~22s from ~4h

  Signed-off-by: Tobias Klausmann <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
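The trade-off behind this change can be sketched with two toy trackers. `UseSketch`, `ListTracked` and `SetTracked` are hypothetical stand-ins, not the real IR classes; the sketch assumes C++11 for `std::unordered_set`.

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <unordered_set>

// Hypothetical stand-in for one recorded use of a value.
struct UseSketch {
    int id;
};

// List-based tracking: removing one use scans the list, so cost grows with
// the number of recorded uses.
struct ListTracked {
    std::list<UseSketch *> uses;
    void add(UseSketch *u) { assert(u); uses.push_back(u); }
    void remove(UseSketch *u) { uses.remove(u); }
    std::size_t count() const { return uses.size(); }
};

// Set-based tracking: insert and erase hash the pointer, which is constant
// time on average. Iteration order is unspecified, which is acceptable
// because the uses do not need to be ordered.
struct SetTracked {
    std::unordered_set<UseSketch *> uses;
    void add(UseSketch *u) { assert(u); uses.insert(u); }
    void remove(UseSketch *u) { uses.erase(u); }
    std::size_t count() const { return uses.size(); }
};
```

One behavioural difference to keep in mind: the set stores each pointer at most once, while the list would happily hold duplicates.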
* nvc0: add maxwell (sm50) compiler backend | Ben Skeggs | 2014-05-15 | 1 | -0/+6
  The big missing part here is proper sched data calculations, but
  hopefully the chosen placeholder will be sufficient for now.

  Passes piglit as well as GK107 does.

  Signed-off-by: Ben Skeggs <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump sched data member to 32-bits | Ben Skeggs | 2014-05-15 | 1 | -1/+1
  SM50 backend requires 21 bits per instruction, not 8.

  Signed-off-by: Ben Skeggs <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: change texture offsets to ValueRefs, allow nonconst | Ilia Mirkin | 2014-04-28 | 1 | -1/+2
  This allows us to have non-constant offsets for textureGatherOffset and
  textureGatherOffsets.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for new bitfield manipulation opcodes | Ilia Mirkin | 2014-04-28 | 1 | -0/+2
  This adds support for:
    IBFE, UBFE, BFI, LSB, IMSB, UMSB, BREV, POPC

  Which are all required for ARB_gs5 support.

  Signed-off-by: Ilia Mirkin <[email protected]>
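For readers unfamiliar with these opcodes, the following host-side sketch models a few of them after the GLSL built-ins they serve (bitfieldExtract, bitfieldInsert, findLSB, bitfieldReverse, bitCount). The function names are hypothetical, the code assumes offset + bits <= 32, and it describes the mathematical result only, not the encoding or corner-case behaviour of the hardware instructions.

```cpp
#include <cassert>
#include <cstdint>

// Mask with the lowest `bits` bits set; handles bits == 32 without
// shifting by the full word width.
static inline uint32_t lowMask(unsigned bits)
{
    return bits >= 32 ? 0xffffffffu : ((1u << bits) - 1u);
}

// UBFE-like: extract `bits` bits starting at `offset`, zero-extended.
static inline uint32_t ubfe(uint32_t v, unsigned offset, unsigned bits)
{
    assert(offset + bits <= 32);
    if (bits == 0)
        return 0;
    return (v >> offset) & lowMask(bits);
}

// BFI-like: replace `bits` bits of `base` at `offset` with the low bits
// of `insert`.
static inline uint32_t bfi(uint32_t base, uint32_t insert, unsigned offset, unsigned bits)
{
    assert(offset + bits <= 32);
    if (bits == 0)
        return base;
    const uint32_t mask = lowMask(bits) << offset;
    return (base & ~mask) | ((insert << offset) & mask);
}

// LSB-like: index of the lowest set bit, or -1 if no bit is set.
static inline int lsb(uint32_t v)
{
    for (int i = 0; i < 32; ++i)
        if (v & (1u << i))
            return i;
    return -1;
}

// BREV-like: reverse the order of all 32 bits.
static inline uint32_t brev(uint32_t v)
{
    uint32_t r = 0;
    for (int i = 0; i < 32; ++i)
        r |= ((v >> i) & 1u) << (31 - i);
    return r;
}

// POPC-like: number of set bits.
static inline unsigned popc(uint32_t v)
{
    unsigned n = 0;
    for (; v != 0; v &= v - 1)
        ++n;
    return n;
}
```

IBFE differs from UBFE only in sign-extending the extracted field, and IMSB/UMSB are the most-significant-bit counterparts of LSB; they are omitted here to keep the sketch short.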
* nvc0/ir: add support for SAMPLEMASK sysval | Ilia Mirkin | 2014-04-26 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for PIPE_CAP_SAMPLE_SHADING | Ilia Mirkin | 2014-04-26 | 1 | -0/+7
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: add support for PIPE_CAP_SAMPLE_SHADING | Ilia Mirkin | 2014-04-26 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: enable texture query lod | Ilia Mirkin | 2014-04-07 | 1 | -0/+1
  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: add support for texelFetch'ing MS textures, ARB_texture_multisample | Ilia Mirkin | 2014-01-27 | 1 | -0/+8
  Creates two areas in the AUX constbuf:
   - Sample offsets for MS textures
   - Per-texture MS settings

  When executing a texelFetch with a MS sampler, looks up that texture's
  settings and adjusts the parameters given to the texfetch instruction.

  With this change, all the ARB_texture_multisample piglits pass, so turn
  on PIPE_CAP_TEXTURE_MULTISAMPLE.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix PFETCH and add RDSV to get VSTRIDE for GPs | Christoph Bumiller | 2014-01-27 | 1 | -0/+1
* Move nv30, nv50 and nvc0 to nouveau. | Johannes Obermayr | 2013-09-11 | 1 | -0/+1197
  It is planned to ship openSUSE 13.1 with -shared libs.

  nouveau.la, nv30.la, nv50.la and nvc0.la are currently LIBADDs in all
  nouveau related targets.

  This change makes it possible to easily build one shared libnouveau.so
  which is then LIBADDed.

  Also dlopen will be faster for one library instead of three and build
  time on -jX will be reduced.

  Whitespace fixes were requested by 'git am'.

  Signed-off-by: Johannes Obermayr <[email protected]>
  Acked-by: Christoph Bumiller <[email protected]>
  Acked-by: Ian Romanick <[email protected]>