aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
...
* nv50/ir: fix combineLd/St to update existing records as necessaryIlia Mirkin2017-06-261-0/+8
| | | | | | | | | | | | | | | | | | | | | Previously the logic would decide that the record is kept, which translates into keep = false in the caller, which meant that these passes did not run. While it's right that keep = false which means that a new record does not need to be added, we do still have to perform the usual list maintenance. It's easiest to do this pre-merge rather than post. The lowering that clip/cull distance passes produce triggers this bug in TCS (since reading outputs is done differently in other stages), but it should be possible to achieve it with the right sequence of regular reads/writes. Fixes: KHR-GL45.cull_distance.functional Fixes: generated_tests/spec/arb_tessellation_shader/execution/tes-input/tes-input-gl_ClipDistance.shader_test Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nv50/ir: adjust overlapping logic to take fileIndex-relative offsetsIlia Mirkin2017-06-261-1/+5
| | | | | | | | | | | | | | | If the fileIndex is different, that means they are in logically different spaces. However if there's also a relative offset, then they could end up pointing at the same spot again. Also add a note about potential for multiple buffers to overlap even if they're at different file indexes. However that's potentially lowered away by the point that this logic hits. Not known to fix any specific application or test. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: VFETCH is also considered a load for MemoryOptIlia Mirkin2017-06-261-1/+1
| | | | | | | | This has no effect since in practice this will only play for memory-backed files, for which VFETCH will never happen. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: remove IDX from bufctx immediately, to avoid conflicts with clearIlia Mirkin2017-06-262-9/+10
| | | | | | | | | | | | The idxbuf could linger, and when a clear happened, which also uses the 3d bufctx, we could get an error trying to access it. This fixes spurious crashes/errors in CTS tests. Fixes: 61d8f3387d ("nv50,nvc0: clear index buffer bufctx bin unconditionally") Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nv50/ir: fetch indirect sources BEFORE the op that uses themIlia Mirkin2017-06-261-19/+32
| | | | | | | | | | All the BuildUtil helpers just insert the operation into the current BB. So we have to take care that any fetchSrc() operations happen before the operation whose setIndirect() it goes into. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: [email protected]
* nv50/ir: Properly fold constants in SPLIT operationPierre Moreau2017-06-251-3/+4
| | | | | | | Fixes: b7d9677d ("nv50/ir: constant fold OP_SPLIT") Cc: [email protected] Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix transfer of larger rectangles with DmaCopy on gk104 and upBen Skeggs2017-06-201-9/+32
| | | | | | | | | | | | | | | | By treating the rectangles as 1cpp, we can run up against some internal copy engine limits and trigger a MEM2MEM_RECT_OUT_OF_BOUNDS error check at launch time. This commit enables the REMAP hardware, which allows us to specify both the component size and number of components for a transfer. We're then able to pass in the real width/nblocksx values and not hit the limits. There's a couple of "supported" CPPs in the list that we can't actually hit, but are there simply because they're possible. Signed-off-by: Ben Skeggs <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: copy engine surface params are only relevant for tiled surfacesBen Skeggs2017-06-201-18/+19
| | | | | | | | | Aside from reducing pushbuf usage in some situations, this commit should have no other effect, and is just to make it somewhat obvious that those methods have zero effect on linear surfaces. Signed-off-by: Ben Skeggs <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_CAP_BINDLESS_TEXTURESamuel Pitoiset2017-06-143-0/+3
| | | | | | | | | Whether bindless texture operations are supported by the underlying driver. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* util: Port nir_array functionality to u_dynarrayThomas Helland2017-06-074-4/+4
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* gallium: Add missing includesThomas Helland2017-06-072-0/+2
| | | | | | | | These will need to be in place to avoid regressions when removing these includes from the u_dynarray Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nvc0: Add support for ARB_post_depth_coverageLyude2017-06-028-1/+15
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: Add a cap to check if the driver supports ARB_post_depth_coverageLyude2017-06-023-0/+3
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: disable BGRA8 images on FermiLyude2017-06-021-5/+14
| | | | | | | | | | | | BGRA8 image stores on Fermi don't work, which results in breaking PBO downloads, such that they always return 0x0. Discovered this through a glamor bug, and confirmed it does indeed break a good number of piglit tests such as spec/arb_pixel_buffer_object/pbo-read-argb8888 Fixes: 8e7893eb53213 ("nvc0: add support for BGRA8 images") Signed-off-by: Lyude <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nvc0: Clean up unnecessary includes from gallium/auxiliary/vl/Rhys Kidd2017-06-011-3/+0
| | | | | Signed-off-by: Rhys Kidd <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: support for GP10BAlexandre Courbot2017-05-301-0/+1
| | | | | | | | GP10B uses the same 3D class as GP100. Signed-off-by: Alexandre Courbot <[email protected]> Acked-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nouveau: drop Android 4.4 and earlier supportRob Herring2017-05-252-33/+3
| | | | | | | | | Support for Android 4.4 and earlier has already been removed from mesa. Remove this remaining piece from nouveau, too. Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Signed-off-by: Rob Herring <[email protected]>
* util/disk_cache: add new driver_flags param to cache keysTimothy Arceri2017-05-231-1/+1
| | | | | | | | | This will be used for things such as adding driver specific environment variables to the key. Allowing us to set environment vars that change the shader and not have the driver ignore them if it finds existing shaders in the cache. Reviewed-by: Eduardo Lima Mitev <[email protected]>
* nv50,nvc0: clear index buffer bufctx bin unconditionallyIlia Mirkin2017-05-202-5/+3
| | | | | | | | | The previous condition was to clear it out if it had previously been set, not what's in the current draw. That information is gone now, so just clear it unconditionally. Fixes: 330d0607e ("gallium: remove pipe_index_buffer and set_index_buffer") Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: fix vtxbuf cleanupIlia Mirkin2017-05-201-1/+1
| | | | | | | Use a user-buffer-aware cleanup function. Fixes: c24c3b94ed ("gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytes") Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: SHLADD's middle source must be an immediateIlia Mirkin2017-05-201-0/+2
| | | | | | | | The instruction encodings only allow for immediates. Don't try to replace a zero (which is dumb to have in that op in any case) with RZ. Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gallium: add PIPE_CAP_ALLOW_MAPPED_BUFFERS_DURING_EXECUTIONMarek Olšák2017-05-173-0/+3
| | | | | | for skipping mapped-buffer checking in every GL draw call Reviewed-by: Nicolai Hähnle <[email protected]>
* nv50/ir: Report wrong prog types using proper varPierre Moreau2017-05-131-1/+1
| | | | | | | | | | | | Coverity caught the use of the uninitialised variable `type`. However, it was `info->type`, which is initialised, which was meant to be used. CID: 1406000 Reported-by: Ilia Mirkin <[email protected]> Fixes: b490ca9a387d ("nv50/ir: Fail if encountering unknown shader type") Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* Android: push driver build details to driver makefilesRob Herring2017-05-111-0/+5
| | | | | | | | | | | | | src/gallium/targets/dri/Android.mk contains lots of conditional for individual drivers. Let's move these details into the individual driver makefiles. In the process, align the make driver conditionals with automake (i.e. HAVE_GALLIUM_*). Signed-off-by: Rob Herring <[email protected]> [Emil Velikov: add the radeon winsys for radeonsi] Signed-off-by: Emil Velikov <[email protected]>
* Android: remove remaining explicit libcxx includesRob Herring2017-05-111-1/+0
| | | | | | | | | | | Explicitly including libcxx includes is not necessary at least on Android M and later. It appears that libc++ was made the default in commit "Make libc++ the default STL." in Android build system post L. However, if L support is still needed, using "LOCAL_CXX_STL=libc++" is the preferred way. Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEXMarek Olšák2017-05-103-0/+3
| | | | | | | The next patch will use it. This is really for svga and GL2-level drivers. Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: remove pipe_index_buffer and set_index_bufferMarek Olšák2017-05-1018-209/+107
| | | | | | | | | | | | | | pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
* gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytesMarek Olšák2017-05-101-8/+8
| | | | For faster initialization of non-indirect draws.
* gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytesMarek Olšák2017-05-1013-60/+62
|
* nv50/ir: Replace NV50_PROGRAM_IR_* by PIPE_SHADER_IR_*Pierre Moreau2017-05-075-10/+7
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Remove unused translation methodsPierre Moreau2017-05-072-10/+3
| | | | | | | This code was merged commented out, and has stayed that way ever since. Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Free target if we failed to create a programPierre Moreau2017-05-071-1/+3
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Fail if encountering unknown shader typePierre Moreau2017-05-071-2/+2
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: add a missing assertion in emitISCADD()Samuel Pitoiset2017-05-011-0/+2
| | | | | | | For consistency, similar to the other emitters. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: Enable compute support for PascalBoyan Ding2017-04-273-4/+7
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Add new launch descriptor format for GP100Boyan Ding2017-04-272-34/+197
| | | | | | | | | | v2: Also handle the the new format in indirect dispatch Use compute class check instead of chipset check Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Fix index of unk fields in nve4_cp_launch_descBoyan Ding2017-04-271-2/+2
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Fix indentation of maxwell compute class definitionsBoyan Ding2017-04-271-2/+2
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: disable the TGSI merge registers passSamuel Pitoiset2017-04-262-2/+4
| | | | | | | | | | | | | | | | shader-db results on GK106 (Thanks Karol): total instructions in shared programs : 3931608 -> 3929463 (-0.05%) total gprs used in shared programs : 481255 -> 479014 (-0.47%) total local used in shared programs : 27481 -> 27381 (-0.36%) total bytes used in shared programs : 36031256 -> 36011120 (-0.06%) local gpr inst bytes helped 14 1471 1309 1309 hurt 1 88 384 384 Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERSSamuel Pitoiset2017-04-263-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: Add support for setting viewport index/layer from VS/TESIlia Mirkin2017-04-204-7/+27
| | | | | | | | | | | | | This enables support on GM200+ for: - GL_AMD_vertex_shader_layer - GL_AMD_vertex_shader_layer_viewport_index - GL_ARB_shader_viewport_layer_array Signed-off-by: Ilia Mirkin <[email protected]> [lyude: add relnotes/TES cap] Signed-off-by: Lyude <[email protected]> [imirkin: move relnotes to right place, add features.txt] Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Only store viewport in scratch register for GPLyude2017-04-201-0/+1
| | | | | | | | | EMIT only applies to geometry shaders. For everything else, we want to export the viewport normally. Signed-off-by: Lyude <[email protected]> Reviewed-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: fold u_trim_pipe_prim call from st/mesa to driversMarek Olšák2017-04-201-0/+5
| | | | | | | Most drivers don't need it and shouldn't need it because it can't be used in some cases (indirect draws, primitive restart, count from streamout). Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_TGSI_TES_LAYER_VIEWPORTNicolai Hähnle2017-04-143-0/+3
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nvc0: Enable ARB_shader_ballot on Kepler+Boyan Ding2017-04-131-1/+2
| | | | | | | | readInvocationARB() and readFirstInvocationARB() need SHFL.IDX instruction which is introduced in Kepler. Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*Boyan Ding2017-04-131-0/+31
| | | | | | | v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*Boyan Ding2017-04-131-0/+27
| | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Add SV_LANEMASK_* system values.Boyan Ding2017-04-135-0/+25
| | | | | | | v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Allow 0/1 immediate value as source of OP_VOTEBoyan Ding2017-04-133-11/+60
| | | | | | | | | | | | | | | | | | | | Implementation of readFirstInvocationARB() on nvidia hardware needs a ballotARB(true) used to decide the first active thread. This expressed in gm107 asm as (supposing output is $r0): vote any $r0 0x1 0x1 To model the always true input, which corresponds to the second 0x1 above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and "not 0x1" in the src field respectively. v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: (Ilia Mirkin) Make the handling more symmetric with predicate version in gm107 Use i->getSrc(s) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: Emit OP_SHFLBoyan Ding2017-04-131-0/+56
| | | | | | | | | v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: Check the range of immediate in OP_SHFL (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>