summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nv50/ir: check for origin insn in findOriginForTestWithZeroPierre Moreau2017-03-091-0/+2
| | | | | | | | Function arguments do not have an "origin" instruction, causing a NULL-pointer dereference without this check. Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: s/uint/enum pipe_render_cond_flag/ for set_render_condition()Brian Paul2017-03-083-3/+3
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer()Brian Paul2017-03-083-3/+6
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium: s/unsigned/enum pipe_shader_type/ for pipe_screen::get_shader_param()Brian Paul2017-03-083-3/+6
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* gallium/util: replace pipe_mutex_unlock() with mtx_unlock()Timothy Arceri2017-03-072-2/+2
| | | | | | | | | | pipe_mutex_unlock() was made unnecessary with fd33a6bcd7f12. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_unlock(\([^)]*\)):mtx_unlock(\&\1):g' {} \; Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_mutex_lock() with mtx_lock()Timothy Arceri2017-03-072-2/+2
| | | | | | | | | | replace pipe_mutex_lock() was made unnecessary with fd33a6bcd7f12. Replaced using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_lock(\([^)]*\)):mtx_lock(\&\1):g' {} \; Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_mutex_destroy() with mtx_destroy()Timothy Arceri2017-03-072-2/+2
| | | | | | | | | | pipe_mutex_destroy() was made unnecessary with fd33a6bcd7f12. Replace was done with: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_destroy(\([^)]*\)):mtx_destroy(\&\1):g' {} \; Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_mutex_init() with mtx_init()Timothy Arceri2017-03-072-2/+2
| | | | | | | | | | pipe_mutex_init() was made unnecessary with fd33a6bcd7f12. Replace was done using: find ./src -type f -exec sed -i -- \ 's:pipe_mutex_init(\([^)]*\)):(void) mtx_init(\&\1, mtx_plain):g' {} \; Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: replace pipe_mutex with mtx_tTimothy Arceri2017-03-072-2/+2
| | | | | | pipe_mutex was made unnecessary with fd33a6bcd7f12. Reviewed-by: Marek Olšák <[email protected]>
* nvc0: take extra pushbuf space into account for pushbuf_space callsIlia Mirkin2017-03-041-2/+2
| | | | | | | | | | | | | | | | | See detailed explanation of why this is needed in commit eb60a89bc3a. This spot was missed/overlooked. Basically as a result of the fact that BEGIN_* ends up calling PUSH_SPACE, which in turn adds an extra 8 to the requested amount, we have to be mindful of that when doing bare nouveau_pushbuf_space calls. Reportedly this fixes some crashes when replaying a hitman trace taken on radeonsi. Fixes: eb60a89bc3a ("nouveau: take extra push space into account for pushbuf_space calls") Cc: "13.0 17.0" <[email protected]> Reported-by: Karol Herbst <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: increase alignment to 256 for texture buffers on fermiIlia Mirkin2017-03-041-1/+3
| | | | | | | | | | When binding as textures, the alignment can be 16. However when binding as an image, the address has to be aligned to 256. (Also when binding as an RT, but that can't happen with GL or current gallium APIs.) Reported-by: Roy Spliet <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Samuel Pitoiset <[email protected]>
* gallium: remove PIPE_CAP_USER_INDEX_BUFFERSMarek Olšák2017-02-253-3/+0
| | | | | | | | all drivers support it Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Brian Paul <[email protected]> Tested-by: Brian Paul <[email protected]> (VMware driver only)
* nvc0: use PascalB for most Pascal boardsBen Skeggs2017-02-212-1/+9
| | | | Signed-off-by: Ben Skeggs <[email protected]>
* gallium: remove TGSI_OPCODE_CLAMPMarek Olšák2017-02-181-10/+0
| | | | | | | Not used and not widely supported. Use MIN+MAX instead. Reviewed-by: Dave Airlie <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: set pipe_context uploaders in drivers (v3)Marek Olšák2017-02-143-0/+32
| | | | | | | | | | | | | | | Notes: - make sure the default size is large enough to handle all state trackers - pipe wrappers don't receive transfer calls from stream_uploader, because pipe_context::stream_uploader points directly to the underlying driver's stream_uploader (to keep it simple for now) v2: add error handling to nv50, nvc0, noop v3: set const_uploader Reviewed-by: Nicolai Hähnle <[email protected]> Tested-by: Edmondo Tommasina <[email protected]> (v1) Tested-by: Charmaine Lee <[email protected]>
* nvc0: disable linked tsc mode in compute launch descriptorIlia Mirkin2017-02-132-2/+6
| | | | | | | | | | Empirically, this makes things work. Presumably this was originally copied from the blob, which does make use of linked tsc mode. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99532 Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nv50,nvc0: use alternate samplers for stencilIlia Mirkin2017-02-121-3/+3
| | | | | | | | The blob uses these, and it fixes a bunch of dEQP stencil sampling tests involving border colors. Probably the Z-based samplers work somehow differently wrt border colors when using the stencil swizzle. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: set the render condition in the compute objectIlia Mirkin2017-02-111-2/+10
| | | | | | | Fixes GL45-CTS.compute_shader.conditional-dispatching Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gm107/ir: fix address offset bitfield for ATOMSIlia Mirkin2017-02-111-1/+1
| | | | | | | Fixes GL45-CTS.compute_shader.atomic-case1 on Maxwell Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nv50/ir: convert an ATOM.EXCH without a destination into a storeIlia Mirkin2017-02-111-0/+5
| | | | | | | | | | On SM35 there does not appear to be a way to emit a ATOM.EXCH with a null destination. This should be functionally equivalent to a plain store however, so just do that. Fixes GL45-CTS.compute_shader.atomic-case2 on SM35. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix 64-bit integer query buffer writesIlia Mirkin2017-02-113-20/+37
| | | | | | | The former logic just plain didn't work at all. We need to write the subsequent dword to the next buffer location. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: return a register when retrieving thread id sysvalIlia Mirkin2017-02-111-1/+1
| | | | | | | | | | We have logic to short-circuit such retrievals to zero. However "zero" was an immediate, and some logic expected to get registers (to later be propagated). Fix this by using loadImm. Fixes GL45-CTS.gpu_shader5.images_array_indexing Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: add missing break after DSSGIlia Mirkin2017-02-111-0/+1
| | | | | | Recently broken during int64 addition. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix ubo max clamp, reset file indexIlia Mirkin2017-02-091-1/+3
| | | | | | | | | | | We just increased the max UBO, so we should also increase the clamp that we do for robustness. Similarly, as we're including the fileIndex in the new indirect value, we should reset fileIndex to 0 so that it is not added in a second time. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nv50/ir: always return 0 when trying to read thread id along unit dimIlia Mirkin2017-02-094-5/+17
| | | | | | | | | | | | | Many many many compute shaders only define a 1- or 2-dimensional block, but then continue to use system values that take the full 3d into account (like gl_LocalInvocationIndex, etc). So for the special case that a dimension is exactly 1, we know that the thread id along that axis will always be 0, so return it as such and allow constant folding to fix things up. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0/ir: fix robustness guarantees for constbuf loads on kepler+ computeIlia Mirkin2017-02-091-25/+22
| | | | | | | | | | | | | Kepler and up unfortunately only support up to 8 constbufs. We work around this by loading from constbufs as if they were storage buffers. However we were not consistently applying limits to loads from these buffers. Make sure to do the same thing we do for storage buffers. Fixes GL45-CTS.robust_buffer_access_behavior.uniform_buffer Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nvc0: increase number of ubo binding pointsIlia Mirkin2017-02-091-3/+2
| | | | | | | | | | | | | Apparently GL 4.5 requires 14 of these (there's a "*" in the spec, but it's unclear what it refers to). We need to expose an extra binding point for the "program parameters", which means this must be 15. Remove the last vestige of the "use c14 for immediates" idea. Fixes GL45-CTS.shading_language_420pack.binding_uniform_block_array Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nvc0: expose int64Ilia Mirkin2017-02-091-1/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: make it possible to have the flags def in def0Ilia Mirkin2017-02-095-12/+15
| | | | | | | There's all kinds of logic that doesn't like there being holes in defs or srcs lists. Avoid them. This also fixes the sched logic for maxwell. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for 64-bit shift lowering on SM20/SM30Ilia Mirkin2017-02-091-6/+62
| | | | | | | Unfortunately there is no SHF.L/SHF.R instruction pre-SM35. So we have to do a bit more work to get the job done. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for all the new int64 tgsi opcodesIlia Mirkin2017-02-096-5/+302
| | | | | | | | | | | | A few thoughts: - Some of that LegalizeSSA logic should really live much earlier and be subject to the likes of DCE and other useful passes - Some of the "lowering" done in from_tgsi should be done later so that proper optimization might be done. However this all works and the above can be improved upon later. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Split 64-bit integer MAD/MUL operationsPierre Moreau2017-02-091-0/+116
| | | | | | | Hardware does not support 64-bit integers MAD and MUL operations, so we need to transform them in 32-bit operations. Signed-off-by: Pierre Moreau <[email protected]>
* nvc0/ir: add a "high" subop for shifts, emit shf.l/shf.r for 64-bitIlia Mirkin2017-02-093-3/+74
| | | | | | Note that this is not available for SM20/SM30. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix SET and SLCT emissionIlia Mirkin2017-02-092-0/+6
| | | | | | | | We were never emitting a .X flag for consuming condition code on SET, and weren't emitting a signed type for SLCT comparison. Discovered while working on int64 logic. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for emitting partial min/max ops for int64Ilia Mirkin2017-02-094-1/+14
| | | | | | | | | | These operations allow you to compute min/max on arbitrary-width integers, 32 bits at a time. Note that the low/med ops implicitly set the condition code, and the med/high ops implicitly consume it. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add separate PIPE_CAP_INT64_DIVMODIlia Mirkin2017-02-093-0/+3
| | | | | | | | | | | Nouveau does not currently have logic to implement this as a library function. Even though such a library could be written, there's no big advantage to do it that way for now given that int64 is a very uncommon use-case. Allow a driver to expose INT64 without supporting division and modulo operations. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: turn PIPE_SHADER_CAP_DOUBLES into a screen capabilityNicolai Hähnle2017-02-023-5/+3
| | | | | | | | | | | | | | | | | | | Make the cap consistent with PIPE_CAP_INT64. Aside from the hypothetical case of using draw for vertex shaders (and actually caring about doubles...), every implementation supports doubles either nowhere or everywhere. Also, st/mesa didn't even check the cap correctly in all supported shader stages. While at it, add a missing LLVM version check for 64-bit integers in radeonsi. This is conservative: judging by the log, LLVM 3.8 might be sufficient, but there are probably bugs that have been fixed since then. v2: fix clover (Marek) Reviewed-by: Marek Olšák <[email protected]>
* nouveau: remove explicit __STDC_FORMAT_MACROS defineEmil Velikov2017-01-271-1/+0
| | | | | | | | Already handled by the build. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* gallium: Add integer 64 capabilityDave Airlie2017-01-273-0/+3
| | | | | | | | | v1.1: move to using a normal CAP. (Marek) v2: fill in the cap everywhere Signed-off-by: Dave Airlie <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nv50: add support for MUL_ZERO_WINS propertyIlia Mirkin2017-01-235-2/+11
| | | | | | | This is simply keyed off the vertex shader, as that's guaranteed to be present in any pipeline. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for MUL_ZERO_WINS propertyIlia Mirkin2017-01-234-9/+25
| | | | | | | | This sets the dnz flag on all the relevant multiplication operations. At emission time, this will only be supported by nvc0+, so nv50 will need a different solution. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_TGSI_MUL_ZERO_WINSIlia Mirkin2017-01-233-0/+3
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Axel Davy <[email protected]>
* nouveau: remove always false argument in nouveau_fence_new()Emil Velikov2017-01-185-11/+6
| | | | | | | | | No point in having the extra argument considering that it's effectively unused since the function was introduced. Cc: Ilia Mirkin <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: optimize shl + andIlia Mirkin2017-01-161-0/+11
| | | | | | | | | | | | | | | | | Address loading can often end up as shl + shr + shl combinations. The latter two are equal shifts, which get converted into an and mask. However if the previous shl is more than the mask is trying to remove (in terms of low bits), we can just remove the and entirely. This reduces some large shaders by as many as 3% of instructions (out of 2K). total instructions in shared programs : 6495509 -> 6491076 (-0.07%) total gprs used in shared programs : 954621 -> 954623 (0.00%) local gpr inst bytes helped 0 0 1014 1014 hurt 0 2 0 0 Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: enable FBFETCH with a special slot for color buffer 0Ilia Mirkin2017-01-169-6/+172
| | | | | | | | | | | | We don't need to support all the color buffers for advanced blend, just cb0. For Fermi, we use the special binding slots so that we don't overlap with user textures, while Kepler+ gets a dedicated position for the fb handle in the driver constbuf. This logic is only triggered when a FBFETCH is actually present so it should be a no-op most of the time. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add flags parameter to texture barrierIlia Mirkin2017-01-162-2/+2
| | | | | | | | This is so that we can differentiate between flushing any framebuffer reading caches from regular sampler caches. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_TGSI_FS_FBFETCHIlia Mirkin2017-01-163-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: true up exposing of the HW_METRIC_QUERY_GROUP for maxwellIlia Mirkin2017-01-161-2/+2
| | | | | | | This had been updated in one place but not the other. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: handle new DDIV op which will be used for double divisionsIlia Mirkin2017-01-161-0/+3
| | | | | | | The existing lowering is in place to lower that to RCP + MUL, or fancier things down the line if necessary. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: emit FMZ flag when requested on FFMAIlia Mirkin2017-01-151-0/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>