summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nv50/ir: Report wrong prog types using proper varPierre Moreau2017-05-131-1/+1
| | | | | | | | | | | | Coverity caught the use of the uninitialised variable `type`. However, it was `info->type`, which is initialised, which was meant to be used. CID: 1406000 Reported-by: Ilia Mirkin <[email protected]> Fixes: b490ca9a387d ("nv50/ir: Fail if encountering unknown shader type") Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* Android: push driver build details to driver makefilesRob Herring2017-05-111-0/+5
| | | | | | | | | | | | | src/gallium/targets/dri/Android.mk contains lots of conditional for individual drivers. Let's move these details into the individual driver makefiles. In the process, align the make driver conditionals with automake (i.e. HAVE_GALLIUM_*). Signed-off-by: Rob Herring <[email protected]> [Emil Velikov: add the radeon winsys for radeonsi] Signed-off-by: Emil Velikov <[email protected]>
* Android: remove remaining explicit libcxx includesRob Herring2017-05-111-1/+0
| | | | | | | | | | | Explicitly including libcxx includes is not necessary at least on Android M and later. It appears that libc++ was made the default in commit "Make libc++ the default STL." in Android build system post L. However, if L support is still needed, using "LOCAL_CXX_STL=libc++" is the preferred way. Signed-off-by: Rob Herring <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* gallium: add PIPE_CAP_CAN_BIND_CONST_BUFFER_AS_VERTEXMarek Olšák2017-05-103-0/+3
| | | | | | | The next patch will use it. This is really for svga and GL2-level drivers. Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: remove pipe_index_buffer and set_index_bufferMarek Olšák2017-05-1018-209/+107
| | | | | | | | | | | | | | pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
* gallium: separate indirect stuff from pipe_draw_info - 80 -> 56 bytesMarek Olšák2017-05-101-8/+8
| | | | For faster initialization of non-indirect draws.
* gallium: decrease the size of pipe_vertex_buffer - 24 -> 16 bytesMarek Olšák2017-05-1013-60/+62
|
* nv50/ir: Replace NV50_PROGRAM_IR_* by PIPE_SHADER_IR_*Pierre Moreau2017-05-075-10/+7
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Remove unused translation methodsPierre Moreau2017-05-072-10/+3
| | | | | | | This code was merged commented out, and has stayed that way ever since. Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Free target if we failed to create a programPierre Moreau2017-05-071-1/+3
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Fail if encountering unknown shader typePierre Moreau2017-05-071-2/+2
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: add a missing assertion in emitISCADD()Samuel Pitoiset2017-05-011-0/+2
| | | | | | | For consistency, similar to the other emitters. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: Enable compute support for PascalBoyan Ding2017-04-273-4/+7
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Add new launch descriptor format for GP100Boyan Ding2017-04-272-34/+197
| | | | | | | | | | v2: Also handle the the new format in indirect dispatch Use compute class check instead of chipset check Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Fix index of unk fields in nve4_cp_launch_descBoyan Ding2017-04-271-2/+2
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Fix indentation of maxwell compute class definitionsBoyan Ding2017-04-271-2/+2
| | | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: disable the TGSI merge registers passSamuel Pitoiset2017-04-262-2/+4
| | | | | | | | | | | | | | | | shader-db results on GK106 (Thanks Karol): total instructions in shared programs : 3931608 -> 3929463 (-0.05%) total gprs used in shared programs : 481255 -> 479014 (-0.47%) total local used in shared programs : 27481 -> 27381 (-0.36%) total bytes used in shared programs : 36031256 -> 36011120 (-0.06%) local gpr inst bytes helped 14 1471 1309 1309 hurt 1 88 384 384 Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* gallium: add PIPE_SHADER_CAP_TGSI_SKIP_MERGE_REGISTERSSamuel Pitoiset2017-04-263-0/+4
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nvc0: Add support for setting viewport index/layer from VS/TESIlia Mirkin2017-04-204-7/+27
| | | | | | | | | | | | | This enables support on GM200+ for: - GL_AMD_vertex_shader_layer - GL_AMD_vertex_shader_layer_viewport_index - GL_ARB_shader_viewport_layer_array Signed-off-by: Ilia Mirkin <[email protected]> [lyude: add relnotes/TES cap] Signed-off-by: Lyude <[email protected]> [imirkin: move relnotes to right place, add features.txt] Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Only store viewport in scratch register for GPLyude2017-04-201-0/+1
| | | | | | | | | EMIT only applies to geometry shaders. For everything else, we want to export the viewport normally. Signed-off-by: Lyude <[email protected]> Reviewed-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: fold u_trim_pipe_prim call from st/mesa to driversMarek Olšák2017-04-201-0/+5
| | | | | | | Most drivers don't need it and shouldn't need it because it can't be used in some cases (indirect draws, primitive restart, count from streamout). Reviewed-by: Brian Paul <[email protected]>
* gallium: add PIPE_CAP_TGSI_TES_LAYER_VIEWPORTNicolai Hähnle2017-04-143-0/+3
| | | | | Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* nvc0: Enable ARB_shader_ballot on Kepler+Boyan Ding2017-04-131-1/+2
| | | | | | | | readInvocationARB() and readFirstInvocationARB() need SHFL.IDX instruction which is introduced in Kepler. Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Implement TGSI_OPCODE_BALLOT and TGSI_OPCODE_READ_*Boyan Ding2017-04-131-0/+31
| | | | | | | v2: Check if each channel is masked in TGSI_OPCODE_BALLOT (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Implement TGSI_SEMANTIC_SUBGROUP_*Boyan Ding2017-04-131-0/+27
| | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Add SV_LANEMASK_* system values.Boyan Ding2017-04-135-0/+25
| | | | | | | v2: Add name strings in nv50_ir_print.cpp (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Allow 0/1 immediate value as source of OP_VOTEBoyan Ding2017-04-133-11/+60
| | | | | | | | | | | | | | | | | | | | Implementation of readFirstInvocationARB() on nvidia hardware needs a ballotARB(true) used to decide the first active thread. This expressed in gm107 asm as (supposing output is $r0): vote any $r0 0x1 0x1 To model the always true input, which corresponds to the second 0x1 above, we make OP_VOTE accept immediate value 0/1 and emit "0x1" and "not 0x1" in the src field respectively. v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: (Ilia Mirkin) Make the handling more symmetric with predicate version in gm107 Use i->getSrc(s) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: Emit OP_SHFLBoyan Ding2017-04-131-0/+56
| | | | | | | | | v2: Make sure that asImm() is not NULL (Samuel Pitoiset) v3: Check the range of immediate in OP_SHFL (Ilia Mirkin) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Emit OP_SHFLBoyan Ding2017-04-131-0/+53
| | | | | | | | | | | | | v2: (Samuel Pitoiset) Add an assertion to check if the target is Kepler Make sure that asImm() is not NULL v3: (Ilia Mirkin) Check the range of immediate value of OP_SHFL Use the new setPDSTL API Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Properly handle a "split form" of predicate destinationBoyan Ding2017-04-131-2/+13
| | | | | | | | | | | | | | GF100's ISA encoding has a weird form of predicate destination where its 3 bits are split across whole the instruction. Use a dedicated setPDSTL function instead of original defId which is incorrect in this case. v2: (Ilia Mirkin) Change API of setPDSTL() to handle cases of no output Fix setting of the highest bit in setPDSTL() Cc: [email protected] Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: Emit third src 'bound' and optional predicate output of SHFLBoyan Ding2017-04-132-9/+29
| | | | | | | | v2: Emit the original hard-coded 0x1c03 when OP_SHFL is used in gm107's lowering (Samuel Pitoiset) Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: when mapping a persistent buffer, synchronize on former xfersIlia Mirkin2017-04-111-4/+2
| | | | | | | | | If the buffer is being used, we should wait for those uses to be complete before returning the map. Fixes: GL45-CTS.direct_state_access.buffers_functional Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nvc0: increase texture buffer object alignment to 256 for pre-GM107Ilia Mirkin2017-04-111-1/+1
| | | | | | | | | | | | | We currently don't pass the low byte of the address via the surface info, so in order to work with images, these have to implicitly be aligned to 256. The proprietary driver also doesn't go out of its way to provide lower alignment. Fixes GL45-CTS.texture_buffer.texture_buffer_texture_buffer_range Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50/ir: remove unused swizzle field in ValueRefIlia Mirkin2017-04-091-1/+0
| | | | | | | | The nv50 ir is scalar. Perhaps this was from some early attempts to integrate the simd aspects of nv30. However at this point it's entirely unused. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: enable ARB_shader_clock on nv50 and nvc0Boyan Ding2017-04-092-2/+2
| | | | | | | v2: Also enable support on nv50 Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Handle TGSI_OPCODE_CLOCKBoyan Ding2017-04-091-0/+7
| | | | | | Signed-off-by: Boyan Ding <[email protected]> [imirkin: make zero mov non-fixed] Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: Emit SV_CLOCK system valueBoyan Ding2017-04-091-0/+1
| | | | | Signed-off-by: Boyan Ding <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix overwriting of offset register with interpolateAtOffsetIlia Mirkin2017-04-071-2/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* nvc0/ir: fix LSB/BFE/BFI implementationsIlia Mirkin2017-04-071-8/+11
| | | | | | | | | Overwriting the src register is a very bad idea - it logically maps onto the TGSI registers, and so is effectively overwriting the source values. Reported-by: Boyan Ding <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected]
* gallium: add PIPE_CAP_TGSI_BALLOTNicolai Hähnle2017-04-053-0/+3
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium: add sparse buffer interface and capabilityNicolai Hähnle2017-04-053-0/+3
| | | | | | | v2: - explain the resource_commit interface in more detail Reviewed-by: Marek Olšák <[email protected]>
* nv30: fp/rast may be null when validating fb/scissor due to clearIlia Mirkin2017-04-021-5/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fragprog may not be set when e.g. clearingIlia Mirkin2017-04-021-2/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50: don't assume a rast is set when validating for clearsIlia Mirkin2017-04-022-3/+7
| | | | | | | | Clears can happen before a rast is set, which can in turn cause scissors and fragprog to be validated. Make sure that we handle this case. Reported-by: Andrew Randrianasulu <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: also do PostRaLoadPropagation for FMAKarol Herbst2017-03-312-1/+2
| | | | | | | | | | | | | | | | | Helps Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3934925 -> 3934327 (-0.02%) total gprs used in shared programs : 481563 -> 481563 (0.00%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36061888 -> 36056504 (-0.01%) local gpr inst bytes helped 0 0 228 228 hurt 0 0 0 0 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: add LIMM form of madKarol Herbst2017-03-312-11/+26
| | | | | | | | | | | v2: renamed commit reordered modifiers add assert(dst == src2) v3: reordered modifiers again v5: no rounding bit for limms Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add LIMM form of madKarol Herbst2017-03-312-18/+34
| | | | | | | | | | v2: renamed commit reordered modifiers add assert(dst == src2) v3: removed wrong neg mod emission Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: implement mad post ra folding for nvc0+Karol Herbst2017-03-311-4/+47
| | | | | | | | | | | | | | | | | | | | | | | | | | changes for GpuTest /test=pixmark_piano /benchmark /no_scorebox /msaa=0 /benchmark_duration_ms=60000 /width=1024 /height=640: score: 1026 -> 1045 changes for shader-db: total instructions in shared programs : 3943335 -> 3934925 (-0.21%) total gprs used in shared programs : 481563 -> 481563 (0.00%) total local used in shared programs : 27469 -> 27469 (0.00%) total bytes used in shared programs : 36139384 -> 36061888 (-0.21%) local gpr inst bytes helped 0 0 3587 3587 hurt 0 0 0 0 v2: removed TODO reorderd to show changes without RA modification removed stale debugging print() call v3: remove predicate checks enable only for gf100 ISA Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: restructure and rename postraconstantfolding passKarol Herbst2017-03-311-58/+63
| | | | | | | | | | | we might want to add more folding passes here, so make it a bit more generic v2: leave the comment and reword commit message v4: rename it to PostRaLoadPropagation Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: also do ConstantFolding for FMAKarol Herbst2017-03-311-0/+1
| | | | | | | | | | | | | | | | | Helps mainly Feral-ported games, due to their use of fma() shader-db changes: total instructions in shared programs : 3941587 -> 3940749 (-0.02%) total gprs used in shared programs : 481511 -> 481460 (-0.01%) total local used in shared programs : 27469 -> 27481 (0.04%) total bytes used in shared programs : 36123344 -> 36115776 (-0.02%) local gpr inst bytes helped 2 48 243 243 hurt 2 3 32 32 Signed-off-by: Karol Herbst <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>