summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: add support for VOTE tgsi opcodesIlia Mirkin2016-06-065-24/+77
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_TGSI_VOTE for when the VOTE ops are allowedIlia Mirkin2016-06-063-0/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nv50/ir: use round toward 0 when converting doubles to integersSamuel Pitoiset2016-06-061-1/+3
| | | | | | | | | | | | Like floats, we should use the round toward 0 mode instead of the nearest one (which is the default) for doubles to integers. This fixes all arb_gpu_shader_fp64 piglits which convert doubles to integers (16 tests). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.2 12.0" <[email protected]>
* nv50,nvc0: fix BGR10_A2UI vertex formatIlia Mirkin2016-06-051-1/+1
| | | | | | | | This is mostly academic as this is not reachable from GL, which only has the packed RGB10_A2UI vertex format. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: do not clear surfaces bins in the validate functionSamuel Pitoiset2016-06-052-5/+2
| | | | | | | | | We should not call nouveau_bufctx_reset() inside a validate function. This only affects Fermi where images are aliased between 3D and CP. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: re-validate images after launching a grid on FermiSamuel Pitoiset2016-06-051-0/+3
| | | | | | | | | | | | | | | | | | | | | Images invalidation is a bit weird on Fermi and there is already a hack which forces invalidating all images when launching a computer shader to help in fixing 3D<->CP interaction. However, we need to re-validate images for compute because nvc0_compute_invalidate_surfaces() will destroy the previous binding. This is not really good for performance purposes but this might be improved later. This fixes the following piglits: - spec/arb_compute_shader/execution/basic-uniform-access - spec/arb_compute_shader/execution/mutiple-texture-reading - spec/arb_compute_shader/execution/multiple-workgroups - spec/glsl-4.30/execution/built-in-functions/cs-* (207 tests) Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: reduce overhead from always marking images dirtyIlia Mirkin2016-06-041-9/+36
| | | | | | | | | | We would revalidate images when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the images. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: reduce overhead from always marking buffers dirtyIlia Mirkin2016-06-041-6/+20
| | | | | | | | | | We would revalidate buffers when anything was touched at all. Which is unfortunate, since the state tracker does not use CSO's to reduce the workload. So instead implement a protocol to ensure that something has changed before revalidating all the SSBOs. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: fix memory barrier flag handlingIlia Mirkin2016-06-041-9/+16
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: mark bound buffer range validIlia Mirkin2016-06-043-0/+9
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nvc0: mark buffer texture range valid for shader imagesSamuel Pitoiset2016-06-033-0/+31
| | | | | | | | Loosely based on radeonsi (Thanks to Nicolai). Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: 12.0 <[email protected]>
* nv50/ir: fix error finding free element in bitset in some situationsIlia Mirkin2016-05-311-0/+6
| | | | | | | | | | | | | | This really only hits for bitsets with a size of a multiple of 32. We can end up with pos = -1 as a result of the ffs, which we in turn decide is a valid position (since we fall through the loop and i == 1, we end up adding 32 to it, so end up returning 31 again). Up until recently this was largely unreachable, as the register file sizes were all 63 or 255. However with the advent of compute shaders which can restrict the number of registers, this can now happen. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "12.0" <[email protected]>
* nv50/ir: print relevant file's bitset when showing RA infoIlia Mirkin2016-05-311-4/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix spilling predicates to registersIlia Mirkin2016-05-301-0/+4
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2 12.0" <[email protected]>
* nvc0/ir: limit max number of regs based on availability in SMIlia Mirkin2016-05-302-2/+4
| | | | | | | | | This effectively limits registers to 32 and 64 for fermi and kepler when 1024 threads are used, but allows the full amount to be used with smaller thread sizes. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: record number of threads in a compute shaderIlia Mirkin2016-05-305-2/+10
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: Add missing handling of U64/S64 in inlinesPierre Moreau2016-05-301-1/+3
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix emission of predicate spill to registerIlia Mirkin2016-05-301-1/+2
| | | | | | The lane mask only applies to real mov's, while here we're using PSET. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix some compute texture validation bits on keplerIlia Mirkin2016-05-303-2/+7
| | | | | | | | | (a) Make sure to update the TIC in case of an updated buffer address (b) Mark newly-inactive textures dirty so that we update the handle in set_tex_handles. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: push offset down to driverStanimir Varbanov2016-05-301-0/+6
| | | | | | | | | | | | | Push offset down to drivers when importing dmabuf. This is needed to more fully support EGL_EXT_image_dma_buf_import when a non-zero offset is specified. Tesing has been done for freedreno, and compile tested following gallium drivers: nouveau,svga,virgl,r600,r300,radeonsi,swrast,i915,ilo Signed-off-by: Stanimir Varbanov <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* nv50,nvc0: fix the max_vertices=0 caseIlia Mirkin2016-05-293-2/+4
| | | | | | | This is apparently legal. Drop any emit/restarts, and pass a 1 to the hardware. Signed-off-by: Ilia Mirkin <[email protected]>
* gk110/ir: fix unspilling of predicates from registersIlia Mirkin2016-05-281-0/+28
| | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96258 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2 11.1" <[email protected]>
* nvc0: remove outdated surfaces validation code for GK104Samuel Pitoiset2016-05-281-70/+0
| | | | | | | | | This code was used for validating surfaces with compute but now we use pipe_image_view instead. Anyway, surfaces support should be re-introduced properly once OpenCL happens. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: do not always invalidate 3D CBs when using computeSamuel Pitoiset2016-05-281-8/+17
| | | | | | | | | Constant buffers are aliased between 3D and CP on Fermi, but we should only invalidate them when a compute shader actually uses CBs and not all the time after a lauching grid. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: enable GL 4.3 on kepler/fermiDave Airlie2016-05-281-1/+1
| | | | Signed-off-by: Dave Airlie <[email protected]>
* nvc0/ir: handle a load's reg result not being used for locked variantsIlia Mirkin2016-05-263-11/+45
| | | | | | | | | | | | | | For a load locked, we might not use the first result but the second result is the predicate result of the locking. In that case the load splitting logic doesn't apply (which is designed for splitting 128-bit loads). Instead we take the predicate and move it into the first position (as having a dead result in first def's position upsets all sorts of things including RA). Update the emitters to deal with this as well. Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Dave Airlie <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0/ir: avoid generating illegal instructions for compute constbuf loadsIlia Mirkin2016-05-261-1/+2
| | | | | | | | | | | | For user-supplied constbufs, fileIndex is 0. In that case, when we subtract 1, we'll end up loading from constbuf offset -16. This is illegal, and there are asserts to avoid it. Normally we'd just DCE it, but no point in generating the instructions if they're not going to be used. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: invalidate textures/samplers between 3D and CP on FermiSamuel Pitoiset2016-05-262-0/+27
| | | | | | | | | | | Like constant buffers, samplers and textures are aliased on Fermi and we need to invalidate the state when switching from 3D to CP and vice versa. This fixes rendering issues in the UE4 demos. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to monitor MP perf counters with compute shadersSamuel Pitoiset2016-05-262-19/+55
| | | | | | | | | | | | | | | | | | To read out MP perf counters we use a compute shader and need to upload input data like a 64-bits addr used to store the values and a sequence ID for synchronization. Currently, this input data is uploaded as user uniforms which means that it's sticked to c0[], but if a compute shader from a real application is used, monitoring those performance counters will just overwrite some data and miserably crash. Instead, sticking the 64-bits addr and the sequence into the driver constant buffer seems like much better and will allow to monitor counters with GL 4.3 apps. Tested on GF119 and GK110, but should not hurt anything on GK104. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add note about where the viewport mask would goIlia Mirkin2016-05-261-0/+1
| | | | | | | Not piping this all the way through yet, but no better place to note this down. This will can be used with NV_viewport_array2. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: enable 32 textures on kepler+Ilia Mirkin2016-05-262-3/+3
| | | | | | | | | For fermi, this likely will require use of linked tsc mode. However on bindless architectures, we can have as many as we want. As it stands, the AUX_TEX_INFO has 32 teture handles reserved. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: add descriptions for hardware perf counters/metricsSamuel Pitoiset2016-05-252-70/+353
| | | | | | | | | The GALLIUM_HUD does not yet expose a description for each events, but this might be useful for developers who want to have a long description of hw perf counters directly in the source code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: expose robust buffer accessIlia Mirkin2016-05-231-1/+1
| | | | | | | We apparently pass all the relevant CTS tests. There are probably some shortcomings, but they can be addressed down the line. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: Add a pipe cap for whether primitive restart works for patches.Kenneth Graunke2016-05-233-0/+3
| | | | | | | | | | | | | | | Some hardware supports primitive restart on patch primitives, and other hardware does not. Modern GL and ES include a query for this feature; adding a capability bit will allow us to answer it. As far as I know, AMD hardware does not support this feature, while NVIDIA and Intel hardware does. However, most Gallium drivers do not appear to support tessellation shaders yet. So, I've enabled it for nvc0 and disabled it everywhere else. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: do not invalidate compute constbufs on KeplerSamuel Pitoiset2016-05-231-4/+6
| | | | | | | | Constbufs are only aliased on Fermi and this will reduce the number of flushes when we switch between 3d and compute. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv30: don't assert when running out of registersIlia Mirkin2016-05-222-3/+1
| | | | | | | | | | This happens with dEQP tests. The code doesn't at all protect against this condition, so while unhandled, this is an expected situation. Also avoid using more than the first 16 registers for nv3x vertex programs. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: allow allocating non-object-backed buffersIlia Mirkin2016-05-221-4/+1
| | | | | | | On nv30, for example, there is no hardware index buffer support. So all of those will be created entirely in user memory. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix indirect access for imagesSamuel Pitoiset2016-05-221-8/+14
| | | | | | | | | | | | When the array doesn't start at 0 we need to account for su->tex.r. While we are at it, make sure to avoid out of bounds access by masking the index. This fixes GL45-CTS.shading_language_420pack.binding_image_array. Signed-off-by: Samuel Pitoiset <[email protected]> Reported-by: Dave Airlie <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv30: reset the stencil mask when fast-clearingIlia Mirkin2016-05-221-1/+6
| | | | | | | | Apparently the stencil mask applies to clears on nv30/nv40. Reset it to 0xff before doing a stencil clear. This fixes gl-1.0-readpixsanity and a number of other piglit tests. Signed-off-by: Ilia Mirkin <[email protected]>
* nv30,nv50: add PIPE_SHADER_CAP_PREFERRED_IR supportIlia Mirkin2016-05-222-6/+12
| | | | | | The mesa state tracker has recently started to query this. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix setting of tess_mode in various situationsIlia Mirkin2016-05-221-4/+14
| | | | | | | | This fixes a lot of INVALID_VALUE errors reported by the card when running dEQP tests. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nv50/ir: fix prog info initIlia Mirkin2016-05-221-3/+1
| | | | | | | | | Left over from the pre-mainline tess support. Adapt to use the new defines. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0/ir: return 0 for gl_TessCoord.z for non-triangles modesIlia Mirkin2016-05-221-0/+4
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: expose GLSL version 420 on GF100Samuel Pitoiset2016-05-211-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: enable ARB_shader_image_load_store on GF100Samuel Pitoiset2016-05-211-0/+3
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add a lowering pass for surfaces on FermiSamuel Pitoiset2016-05-212-0/+117
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add emission for SULDB and SUSTxSamuel Pitoiset2016-05-211-2/+44
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add emission for OP_SULEASamuel Pitoiset2016-05-211-0/+58
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix tex constraints for surface coords on FermiSamuel Pitoiset2016-05-211-0/+6
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: use moveSources to condense sourcesIlia Mirkin2016-05-211-6/+1
| | | | | | | This makes sure that rIndirectSrc and other things stay updated. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>