summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nv50,nvc0: limit the y-tiling of 3d textures to the first level's tilingIlia Mirkin2015-04-063-10/+13
| | | | | | | | | | | | | | We limit y-tiling to 0x20 when depth is involved. However the function is run for each miplevel, and the hardware expects miplevel 0 to have the highest tiling settings. Perform the y-tiling limit on all levels of a 3d texture, not just the ones that have depth. Fixes: texelFetch fs sampler3D 98x129x1-98x129x9 Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Nick Tenney <[email protected]> # GT216 Cc: "10.4 10.5" <[email protected]>
* nv50: allocate more offset space for occlusion queriesIlia Mirkin2015-04-041-5/+5
| | | | | | | | | | | | | | | Commit 1a170980a09 started writing to q->data[4]/[5] but kept the per-query space at 16, which meant that in some cases we would write past the end of the buffer. Rotate by 32, like nvc0 does. This ensures that we always have 32 bytes in front of us, and the data writes will go within the allocated space. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89679 Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Nick Tenney <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]> Cc: "10.4 10.5" <[email protected]>
* nv50/ir: avoid folding immediates into imad operationsIlia Mirkin2015-04-021-1/+2
| | | | | | | | Commit 09ee907266 added logic to fold immediates into mad operations, but the emission code is only there for fmad. Only allow it on float types. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix imad emission when dst == src2Ilia Mirkin2015-04-021-1/+1
| | | | | | | Commit fb63df22151f added 4-byte mad support, but only supported emission for floats. Disable it for ints for now. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: synchronize "scratch runout" destruction with the command streamMarcin Ślusarz2015-03-312-19/+37
| | | | | | | | | | | | | | | | | | | | | When nvc0_push_vbo calls nouveau_scratch_done it does not mean scratch buffers can be freed immediately. It means "when hardware advances to this place in the command stream the scratch buffers can be freed". To fix it, just postpone scratch runout destruction after current fence is signalled. The bug existed for a very long time. Nobody noticed, because "scratch runout" code path is rarely executed. Fixes hang at the very beginning of first mission in "Serious Sam 3" on nve7/gk107. It manifested as: nouveau E[ PFIFO][0000:01:00.0] read fault at 0x000a9e0000 [PTE] from GR/GPC0/PE_2 on channel 0x007f853000 [Sam3[17056]] Cc: "10.4 10.5" <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir/gk110: fix offset flag position for TXD opcodeIlia Mirkin2015-03-271-0/+1
| | | | | Cc: "10.4 10.5" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: take postFactor into account when doing peephole optimizationsIlia Mirkin2015-03-271-4/+8
| | | | | | | | | | Multiply operations can have a post-factor on them, which other ops don't support. Only perform the peephole optimizations when there is no post-factor involved. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89758 Cc: "10.4 10.5" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/util: remove u_linkageIlia Mirkin2015-03-262-2/+0
| | | | | | | | Does not appear to be used in tree. Coverity spotted some errors in the bitmask stuff, but the whole thing appears to be unused. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* Revert "nv50,nvc0: remove bogus 64_FLOAT formats"Ilia Mirkin2015-03-231-0/+5
| | | | | | | | | | | | | | | This reverts commit 20346808cf4f1ee4f320afaf18f94043fb146f2e. The conversion is actually done since these are the *B macro variants and no vtx format is supplied, which makes them go through the translate module. This restores the following piglit tests to passing: draw-vertices user gl-2.0-vertexattribpointer Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: implement get_device_vendor() for existing driversGiuseppe Bilotta2015-03-231-0/+7
| | | | | | | | | The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* gallium: add FMA and DFMA opcodes (v3)Marek Olšák2015-03-163-0/+4
| | | | | | | | | Needed by ARB_gpu_shader5. v2: select DMAD for FMA with double precision v3: add and select DFMA Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix wrong max value for driver queriesSamuel Pitoiset2015-03-091-4/+3
| | | | | | | | | | | The maximum value of a Gallium HUD's panel is automatically adjusted when the current value is greater than the max. If we set the pipe_query_driver_info::max_value to UINT64_MAX, the maximum value is never adjusted and this results in a flat line instead of a pretty curve which is correctly scaled. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: Fix build, invalid extern "C" around header inclusion.Mark Janes2015-03-061-2/+0
| | | | | | | | | | | | A previous patch to fix header inclusion within extern "C" neglected to fix the occurences of this pattern in nouveau files. When the helper to detect this issue was pushed to master, it broke the build for the nouveau driver. This patch fixes the nouveau build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=89477 Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]>
* nv50,nvc0: remove bogus 64_FLOAT formatsIlia Mirkin2015-03-061-5/+0
| | | | | | | There is no HW support for these and the VBO pusher doesn't know about them. No need to, either, since the st will be lowering them to 2x32. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: enable double supportIlia Mirkin2015-02-201-2/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: remove merge/split pairs to allow normal propagation to occurIlia Mirkin2015-02-201-0/+30
| | | | | | | | | | Because the TGSI interface creates merges for each instruction source and then splits them back out, there are a lot of unnecessary merge/split pairs which do essentially nothing. The various modifier/etc propagation doesn't know how to walk though those, so just remove them when they're unnecessary. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add support for new TGSI double opcodesIlia Mirkin2015-02-201-0/+236
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: handle zero and negative sqrt argumentsIlia Mirkin2015-02-201-2/+14
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: no instruction can load a double immediateIlia Mirkin2015-02-201-0/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix lowering of RSQ/RCP/SQRT/MOD to work with F64Ilia Mirkin2015-02-205-16/+40
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: fix F2F flipped stype/dtype flagsIlia Mirkin2015-02-201-2/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: fix DSET boolean float flagIlia Mirkin2015-02-201-0/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gm107/ir: fix DMUL opcode encodingIlia Mirkin2015-02-201-3/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission of dadd/dmul/dmad opcodesIlia Mirkin2015-02-201-3/+77
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add emission of dadd/dmul/dmad opcodes, fix minmaxIlia Mirkin2015-02-201-3/+63
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: add new double-related shader caps to all the gettersIlia Mirkin2015-02-202-0/+10
| | | | | | | | Missed a few drivers in the earlier changes, this should fix up all the ones that print unknown caps or don't have a default statement. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nv50: add PIPELINE_STATISTICS query support, based on nvc0Ilia Mirkin2015-02-192-2/+27
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Nick Tenney <[email protected]>
* gallium: add shader cap for dldexp/dfracexp supportIlia Mirkin2015-02-191-0/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add a cap to enable double rounding opcodesIlia Mirkin2015-02-191-0/+4
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* gallium: add interface and state tracker support for GL_AMD_pinned_memoryMarek Olšák2015-02-173-0/+3
| | | | | | v2: add alignment restrictions to docs, fix indentation in headers Reviewed-by: Christian König <[email protected]>
* nvc0: allow holes in xfb target listsIlia Mirkin2015-02-142-4/+13
| | | | | | | | Tested with a modified xfb-streams test which outputs to streams 0, 2, and 3. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.4 10.5" <[email protected]>
* nvc0: bail out of 2d blits with non-A8_UNORM alpha formatsIlia Mirkin2015-02-141-2/+5
| | | | | | | | This fixes the teximage-colors uploads with GL_ALPHA format and non-GL_UNSIGNED_BYTE type. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.4 10.5" <[email protected]>
* nv50,nvc0: Mark PIPE_QUERY_TIMESTAMP_DISJOINT as ready immediatelyTiziano Bacocco2015-02-102-0/+6
| | | | | | | | | Without this when an application issues that query, it would try to wait the result from the gpu, and since no query has been actually issued, it will wait forever. Signed-off-by: Tiziano Bacocco <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Fold IMM into MADRoy Spliet2015-02-101-0/+53
| | | | | | | | | | | | | Add a specific optimisation pass for NV50 to check whether SRC0 or SRC1 is a MOV dst, IMM. If so: fold the IMM in and try to drop the MOV. Must be done post-RA because it requires that SDST == SSRC2. V2: improve readability and add comments to clarify decisions V3: Remove redundant code... compiler already attempts to put the IMM in SSRC1 Signed-off-by: Roy Spliet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Add emit support for MAD IMM formatRoy Spliet2015-02-101-0/+8
| | | | | | | | | But don't enable generation of it in the opProperties, because we can't guarantee the SDST==SRC2 constraint until after register assignment. We'll add a post-RA folding pass to utilise this. Signed-off-by: Roy Spliet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Add support for MAD 4-byte opcodeRoy Spliet2015-02-102-8/+6
| | | | | | | | | | | | Add emission rules for negative and saturate flags for MAD 4-byte opcodes, and get rid of some of the constraints. Obviously tested with a wide variety of shaders. V2: Document MAD as supported short form V3: Split up IMM from short-form modifiers Signed-off-by: Roy Spliet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: change the way float face is returnedIlia Mirkin2015-02-102-4/+6
| | | | | | | | | | | | The old way made it impossible for the optimizer to reason about what was going on. The new way is the same number of instructions (the neg gets folded into the cvt) but enables the optimizer to be cleverer if comparing to a constant (most common case). [The optimizer is presently not sufficiently clever to work this out, but it could relatively easily be made to be. The old way would have required significant complexity to work out.] Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: Add MULTISAMPLE_Z_RESOLVE capAxel Davy2015-02-063-0/+3
| | | | | | | | | | | | | | | | Resolving a multisampled depth texture into a single sampled texture is supported on >= SM4.1 hw. It is possible some previous hw support it. The ability was tested on radeonsi and nvc0. Apparently is is also supported for radeon >= r700. This patch adds the MULTISAMPLE_Z_RESOLVE cap and add it to the drivers. It is advertised for drivers for which it is sure the ability is supported. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Axel Davy <[email protected]>
* gallium: add a cap to determine whether the driver supports offset_clampIlia Mirkin2015-02-023-0/+3
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Glenn Kennard <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* nvc0: add name to magic numberIlia Mirkin2015-01-051-2/+2
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: regenerate rnndb headersIlia Mirkin2015-01-0517-837/+1157
| | | | | | | | | | | | | | | The headers hadn't been regenerated in a long time and had seen a number of manual modifications. A few changes: - remove nvc0_2d entirely, use the nv50 header which has the nvc0 values too - remove 3ddefs, it's identical to the nv50 file - move macros out into a separate file Also the upstream rnndb changed the overall chip naming convention; this was fixed up manually in the generated files until a better solution is determined. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: regenerate rnndb headersIlia Mirkin2015-01-0511-358/+451
| | | | | | | | | | The headers hadn't been regenerated in a long time, and there were a few minor divergences. Among other things, rnndb has changed naming to G80/etc, for now I've not tackled switching that over and manually replaced the nvidia codenames back to the chip ids. However no other modifications of the headergen'd headers was done. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: enable texture compressionTobias Klausmann2015-01-052-3/+26
| | | | | | | | | Compression seems to be supported for only some formats. Enable it for those. Previously this was disabled for everything despite the code looking like it was actually enabled. Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: enable sat modifier for OP_SUBIlia Mirkin2015-01-051-1/+1
| | | | | | | SUB is handled the same as ADD, so no reason not to allow a saturate modifier on it. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Add sat modifier for mulRoy Spliet2015-01-052-1/+7
| | | | | Signed-off-by: Roy Spliet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: avoid doing work inside of an assertIlia Mirkin2015-01-052-2/+4
| | | | | | | | assert is compiled out in release builds - don't put logic into it. Note that this particular instance is only used for vp debugging and is normally compiled out. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: fix texture offsets in release buildsIlia Mirkin2015-01-052-2/+4
| | | | | | | | | | assert's get compiled out in release builds, so they can't be relied upon to perform logic. Reported-by: Pierre Moreau <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Roy Spliet <[email protected]> Cc: "10.2 10.3 10.4" <[email protected]>
* nv50/ir: Fold sat into madRoy Spliet2015-01-011-1/+1
| | | | | | | | | The mad instruction emitter already supported the saturate modifier, but the ModifierFolding pass never tried folding cvt sat operations in for NV50. Signed-off-by: Roy Spliet <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: fold MAD when one of the multiplicands is constIlia Mirkin2015-01-011-0/+23
| | | | | | | | | | Fold MAD dst, src0, immed, src2 (or src0/immed swapped) when - immed = 0 -> MOV dst, src2 - immed = +/- 1 -> ADD dst, src0, src2 These types of MAD patterns were observed in some st/nine shaders. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: set vertex id base to index_biasIlia Mirkin2014-12-305-7/+35
| | | | | | | | | | | | | | Fixes the piglits which check that gl_VertexID includes the base vertex offset: arb_draw_indirect-vertexid elements gl-3.2-basevertex-vertexid Note that this leaves out the original G80, for which this will continue to fail. It could be fixed by passing a driver constbuf value in, but that's beyond the scope of this change. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.3 10.4" <[email protected]>