summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: fix retrieving query results into buffer for timestampsIlia Mirkin2016-04-221-5/+15
| | | | | | | | | The timestamps are stored in a funny place, and even though they are a 64-bit result, are not stored with is64bit. Account for that when retrieving the query result into a resource. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2" <[email protected]>
* gallium: add bool return to pipe_context::end_queryNicolai Hähnle2016-04-213-3/+6
| | | | | | | | | Even when begin_query succeeds, there can still be failures in query handling. For example for radeon, additional buffers may have to be allocated when queries span multiple command buffers. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*Marek Olšák2016-04-222-2/+2
| | | | Acked-by: Jose Fonseca <[email protected]>
* gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*Marek Olšák2016-04-229-32/+32
| | | | | | | | Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
* gk110/ir: make use of IMUL32I for all immediatesSamuel Pitoiset2016-04-201-1/+1
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* gk110/ir: do not overwrite def value with zero for EXCH opsSamuel Pitoiset2016-04-201-6/+15
| | | | | | | | | | This is only valid for other atomic operations (including CAS). This fixes an invalid opcode error from dmesg. While we are it, make sure to initialize global addr to 0 for other atomic operations. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: avoid tex read fault from compute shaders on GK110Samuel Pitoiset2016-04-201-0/+3
| | | | | | | | | | | | After some investigation, it seems like that disabling the UNK02C4 command avoid a read fault with texelFetch() from a compute shader. I have no clue on what this method actually does, but this avoid the GPU to hang with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nouveau: codegen: Add support for OpenCL global memory buffersHans de Goede2016-04-201-2/+10
| | | | | | | | | | | | | | | Add support for OpenCL global memory buffers, note this has only been tested with regular load and stores and likely needs more work for e.g. atomic ops. Tested with piglet on a gf119 and a gk107: ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.*arb_compute_shader.*' results/shader [20/20] skip: 4, pass: 16 | Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: codegen: Use FILE_MEMORY_BUFFER for buffersHans de Goede2016-04-206-5/+13
| | | | | | | | | | | | | | | | | | | | | | Some of the lowering steps we currently do for FILE_MEMORY_GLOBAL only apply to buffers, making it impossible to use FILE_MEMORY_GLOBAL for OpenCL global buffers. This commits changes the buffer code to use FILE_MEMORY_BUFFER at the ir_from_tgsi and lowering steps, freeing use of FILE_MEMORY_GLOBAL for use with OpenCL global buffers. Note that after lowering buffer accesses use the FILE_MEMORY_GLOBAL register file. Tested with piglet on a gf119 and a gk107: ./piglit run -o shader -t '.*arb_shader_storage_buffer_object.*' results/shader [9/9] pass: 9 / ./piglit run -o shader -t '.*arb_compute_shader.*' results/shader [20/20] skip: 4, pass: 16 | Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* Revert "nv50/ra: `isinf()` is in namespace `std` since C++11."Jose Fonseca2016-04-191-4/+0
| | | | | | This reverts commit f525db6358fbaa7b4296d2e6484e0b1ae703ac78. It was superseeded by commit 649704f1f7c9e1d0990d34a76154b2eb656bee42.
* nvc0: do not break the universe on GK110+Samuel Pitoiset2016-04-141-0/+1
| | | | | | | | I removed that return 0 by mistake. Ooops. Fixes: 6e23fd4 ("nvc0: allow to use compute support on GM200") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to use compute support on GM200Samuel Pitoiset2016-04-143-2/+5
| | | | | | | | | This works like a charm but please not that NVF0_COMPUTE have to be set because compute support is still not enabled by default on GK110+. This will require more testing to make sure it won't break the 3D state. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ra: `isinf()` is in namespace `std` since C++11.Pierre Moreau2016-04-131-0/+4
| | | | | | | | | | This fixes a compile error while building Nouveau with C++11 enabled (and glibc >= 2.23). This happens if SWR is enabled, as it forces C++11. Signed-off-by: Pierre Moreau <[email protected]> Signed-off-by: Jose Fonseca <[email protected]> https://bugs.freedesktop.org/show_bug.cgi?id=94907
* gallium: Add capability for ARB_robust_buffer_access_behavior.Bas Nieuwenhuizen2016-04-123-0/+3
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add pipe_context::set_active_query_state for pausing queriesMarek Olšák2016-04-123-0/+18
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nv30: Add missing PIPE_SHADER_CAP_INTEGERS to get_shader_param()Hans de Goede2016-04-121-0/+1
| | | | | | | | Add missing PIPE_SHADER_CAP_INTEGERS for frag shaders to nv30_screen_get_shader_param(). Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: handle the case where there are no framebuffer attachmentsIlia Mirkin2016-04-094-11/+44
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: support sending string markers down into the command streamIlia Mirkin2016-04-094-2/+52
| | | | | | | This should hopefully make it a little easier to debug with GL applications like glretrace and looking at command streams. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add invalidate_resource support for buffer resourcesIlia Mirkin2016-04-097-2/+51
| | | | | | | Provide a callback to reallocate the underlying storage of a resource so that it is not bound to any existing fences. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: do not try to attach JOIN ops to ATOMSamuel Pitoiset2016-04-071-1/+1
| | | | | | | | | | | This might result in an INVALID_OPCODE dmesg error in case a join is attached to an atomic operation. Spotted with arb_shader_image_load_store-host-mem-barrier on GK104. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENTEdward O'Callaghan2016-04-073-0/+3
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+Ilia Mirkin2016-04-043-2/+64
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: add OP_SELP emission, used in DSQRT loweringIlia Mirkin2016-04-021-0/+30
| | | | | | | | The current DSQRT lowering code emits an OP_SELP, so we have to handle its emission. This will eventually go away, but no harm supporting this op. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: we can't load local memory directly into an outputIlia Mirkin2016-04-021-1/+2
| | | | | | | | | | | This fixes piglit tests like tests/spec/glsl-1.10/execution/variable-indexing/vs-output-array-float-index-wr.shader_test and related ones. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: fix envyas variants when building the code libSamuel Pitoiset2016-04-021-2/+2
| | | | | | | nvc0 and nve4 have been respectively replaced by gf100 and gk104. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: distinguish between shader IR in get_compute_paramBas Nieuwenhuizen2016-04-022-0/+2
| | | | | | | | | | | | | For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nvc0: enable compute shaders on GK104 and GM107+Samuel Pitoiset2016-04-011-1/+2
| | | | | | | | | Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevent using compute. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump the maximum number of UBOs for compute on KeplerSamuel Pitoiset2016-04-012-3/+0
| | | | | | | | The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: do not lower shared+atomics on GM107+Samuel Pitoiset2016-04-011-7/+10
| | | | | | | | For Maxwell, the ATOMS instruction can be used to perform atomic operations on shared memory instead of this load/store lowering pass. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add atomics support on shared memory for KeplerSamuel Pitoiset2016-04-012-1/+108
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix wrong pred emission for ld lock on GK104Samuel Pitoiset2016-04-011-1/+4
| | | | | | | | This fixes 84b9b8f (nvc0/ir: add missing emission of locked load predicate). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for compute UBOs on KeplerSamuel Pitoiset2016-04-012-1/+57
| | | | | | | | Make sure to avoid out of bounds access in presence of indirect array indexing by loading the size from the driver constant buffer. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add indirect compute support on KeplerSamuel Pitoiset2016-04-011-34/+77
| | | | | | | | | | The grid size is stored as three 32-bits integers in the indirect buffer but the launch descriptor uses a 32-bits integer for both griddim_y and griddim_z like this (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reduce likelihood of collision for real buffers on KeplerSamuel Pitoiset2016-04-011-2/+2
| | | | | | | | | | | Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: store ubo info to the driver constbuf on KeplerSamuel Pitoiset2016-04-014-1/+30
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind user uniforms for compute on KeplerSamuel Pitoiset2016-04-012-27/+55
| | | | | | | | | | | Uniform buffer objects will be sticked to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind shader buffers for compute on KeplerSamuel Pitoiset2016-04-012-3/+39
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind driver cb for compute on c7[] for KeplerSamuel Pitoiset2016-04-014-45/+37
| | | | | | | | | | | | Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supportedIlia Mirkin2016-03-312-0/+18
| | | | | | | vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Check for valid insn instead of def sizePierre Moreau2016-03-311-2/+2
| | | | | | | | | | | | | This fixes a null pointer dereference during the register allocation pass, if a function had arguments. Functions arguments get a definition from the function itself, a definition which is therefore not linked to any instruction. If a value ends up having a definition but no linked instruction, the register allocation pass doesn't need to consider whether that value is generated by an instruction that can only handle "short" registers (on nv50). Signed-off-by: Pierre Moreau <[email protected]>
* nvc0/ir: move load/store lowering pass to handleLDST()Samuel Pitoiset2016-03-292-54/+61
| | | | | | | Having all this code in a big switch is not really a good pratice. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use a different offset for buffers and surfacesSamuel Pitoiset2016-03-294-28/+74
| | | | | | | | | | To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: make sure to disable fetches from previously-set VBOs when blittingIlia Mirkin2016-03-281-0/+2
| | | | | | | We disable the vertex attributes, but also disable the VBO fetch details as well, just in case. Not known to fix anything. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: disable primitive restart and index bias during blitsIlia Mirkin2016-03-281-0/+11
| | | | | | | | | | | | | | | | Back in the dawn of time, we used to do immediate uploads for the vertex data, and all was well. However Maxwell dropped support for immediate vertex data, so we started feeding in a VBO (in all cases). But we forgot to disable some things that apply in such cases, specifically primitive restart and index bias. The latter was causing WoW and other Blizzard games trouble as they use a pattern where they draw with a base vertex (aka index bias), followed by texture uploads (aka blits, internally). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Cc: "11.1 11.2" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Karol Herbst <[email protected]>
* nvc0/ir: fix picking of coordinates from tex instruction for textureGradIlia Mirkin2016-03-281-1/+11
| | | | | | | | | | | On Fermi, there's an argument in front of the coords that combines array and indirect handle, while on Kepler the array and the indirect handle are separate (and in front of the coords). We were previously only accounting for the array bit of it, if there were an indirect access it wouldn't be counted in the formula. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: saturate depth writesIlia Mirkin2016-03-281-1/+4
| | | | | | | | | | | Apparently there's no post-FS clamping logic, so we have to do this by hand. The depth will never be outside of the 0..1 range, even on floating point zeta buffers, so this should be safe. Fixes dEQP-GLES3.functional.fbo.depth.*clamp.* which tests writing invalid values on various zeta buffer formats. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: silence unhandled TGSI_PROPERTY_NEXT_SHADER infoSamuel Pitoiset2016-03-241-0/+3
| | | | | | | | radeonsi uses this property to make the best decision about which shader to compile, but this is not currently used by our codegen. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: make sure to delete samplers used by compute shadersSamuel Pitoiset2016-03-211-1/+1
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nouveau: codegen: Do not silently fail in handeLOAD / handleSTORE / handleATOMHans de Goede2016-03-211-9/+18
| | | | | | | | | handeLOAD / handleSTORE / handleATOM can only handle TGSI_FILE_BUFFER and TGSI_FILE_MEMORY. Make things fail explictly when another register-file is used in these functions. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v2)
* nouveau: codegen: Disable more old resource handling codeHans de Goede2016-03-211-3/+12
| | | | | | | | | | | | | | Commit c3083c7082 ("nv50/ir: add support for BUFFER accesses") disabled / commented out some of the old resource handling code, but not all of it. Effectively all of it is dead already, if we ever enter the old code paths in handeLOAD / handleSTORE / handleATOM we will get an exception due to trying to access the now always zero-sized resources vector. Disable all the dead code. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> (v2)