summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: organize screen capsIlia Mirkin2014-06-191-61/+51
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: remove vport_int hack and instead use the usual state validationIlia Mirkin2014-06-193-11/+3
| | | | | | | | Commit ad4dc772 fixed an issue with the viewport not being restored correctly. However it's rather hackish and confusing. Instead just mark the viewport dirty and let the viewport validation take care of it. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: Remove NV50_SEMANTIC_VIEWPORTINDEXTobias Klausmann2014-06-162-2/+1
| | | | | | | Use TGSI_SEMANTIC_VIEWPORT_INDEX for the last consumer. Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: implement multiple viewports/scissors, enable ARB_viewport_arrayTobias Klausmann2014-06-167-63/+113
| | | | | | Signed-off-by: Tobias Klausmann <[email protected]> [imirkin: mark things dirty on ctx switch, 3d blit] Reviewed-by: Ilia Mirkin <[email protected]>
* nv50: make sure to mark first scissor dirty after blitIlia Mirkin2014-06-161-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix bfind emissionIlia Mirkin2014-06-071-1/+1
| | | | | | | | There is a short-immediate version as well, but it should never end up getting used since it would have gotten folded earlier. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix emitting constbuf file indexIlia Mirkin2014-06-071-2/+3
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: emit saturate flag on fadd when neededIlia Mirkin2014-06-061-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix slct emissionIlia Mirkin2014-06-061-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix interp mode emissionIlia Mirkin2014-06-061-1/+2
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix ISAD emission with register argsIlia Mirkin2014-06-061-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: fix quadon opcode emissionIlia Mirkin2014-06-061-1/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nvc0: don't bother trying to set up compute for gk110+Ilia Mirkin2014-06-061-3/+3
| | | | | | | | | | The nouveau fw currently prints a bunch of errors. No point in seeing those all the time, esp since compute doesn't really work in the first place. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2" <[email protected]>
* gk110: add in forgotten code for gk110 isaIlia Mirkin2014-06-061-0/+13
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2" <[email protected]>
* gk110/ir: emit texbar the same way that the blob doesIlia Mirkin2014-06-061-1/+1
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Ben Skeggs <[email protected]> Cc: "10.2" <[email protected]>
* nvc0/ir: Handle OP_POPCNT when folding constant expressionsTobias Klausmann2014-06-061-0/+13
| | | | | | Signed-off-by: Tobias Klausmann <[email protected]> [imirkin: make sure to only fold 1-arg popcnt in opnd] Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Handle OP_BFIND when folding constant expressionsTobias Klausmann2014-06-061-0/+17
| | | | | Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: Handle reverse subop for OP_EXTBF when folding constant expressionsTobias Klausmann2014-06-061-2/+6
| | | | | Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: clear subop when folding constant expressionsTobias Klausmann2014-06-061-0/+1
| | | | | | | | | Some operations (e.g. OP_MUL/OP_MAD/OP_EXTBF) might have a subop set. After folding, make sure that it is cleared Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.1 10.2" <[email protected]>
* gallium: create TGSI_PROPERTY to disable viewport and clippingChristoph Bumiller2014-06-023-0/+4
| | | | | | Marek v2: add a cap Signed-off-by: Marek Olšák <[email protected]>
* nvc0/ir: use SM35 ISA with GK20AAlexandre Courbot2014-05-273-7/+12
| | | | | | | | GK20A is mostly compatible with GK104, but uses the SM35 ISA. Use the GK110 path when this chip is detected. Signed-off-by: Alexandre Courbot <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add GK20A 3D classAlexandre Courbot2014-05-272-1/+9
| | | | | | | | | GK20A is mostly compatible with GK104, but features a new 3D class. Add it to the relevant header and use it when GK20A is detected. Signed-off-by: Alexandre Courbot <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: implement clear_bufferTobias Klausmann2014-05-261-0/+141
| | | | | | | Provide an accelerated path for ARB_clear_buffer_object Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: revert mistaken logic to collapse color outputs to the beginningIlia Mirkin2014-05-261-9/+4
| | | | | | | | | | In commit af38ef907, I added a "fix" to color outputs not being assigned correctly when sample mask was being output. This was totally wrong -- the color indices (i.e. "si" values) were the ones that were wrong. Undo that hunk. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Emil Velikov <[email protected]>
* nv50: count wrapped textures towards the tex_obj countJoakim Sindholt2014-05-231-0/+2
| | | | | | | But don't count their size towards the allocated memory, since that belongs to whoever created it. Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: assert that we have vertex elements stateChristoph Bumiller2014-05-231-0/+1
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use PRIxPTR for sizeof()Christoph Bumiller2014-05-231-1/+1
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: allow 15,16,30 bpp display formatsChristoph Bumiller2014-05-231-4/+4
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: handle guard band definesChristoph Bumiller2014-05-232-4/+16
| | | | | [imirkin: moved default case out of switch] Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir/tgsi: optimize KILChristoph Bumiller2014-05-231-0/+5
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50/ir: fix lowering of predicated instructions (without defs)Christoph Bumiller2014-05-231-1/+4
| | | | | | | | Note that predicated instructions with defs are still not supported because transformation to SSA doesn't handle them yet. Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50/ir/opt: fix constant folding with saturate modifierChristoph Bumiller2014-05-231-1/+3
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50/ir/tgsi: TGSI_OPCODE_POW replicates its resultChristoph Bumiller2014-05-231-1/+5
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50,nvc0: set constbufs dirty on pipe context switchChristoph Bumiller2014-05-232-0/+5
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50: setup scissors on clear_render_target/depth_stencilChristoph Bumiller2014-05-231-2/+18
| | | | | | [imirkin: add logic to also clear the "regular" scissors] Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50,nvc0: always pull out bufctx on context destructionChristoph Bumiller2014-05-232-9/+7
| | | | | Reviewed-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]>
* nv50,nvc0: fix 3d blits with mipmap levelsIlia Mirkin2014-05-212-11/+19
| | | | | | | | | | Make sure to normalize the z coordinates as well as the x/y ones when there are mipmaps present. Fixes 3d mipmap generation, which now uses the blit path. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Ben Skeggs <[email protected]>
* nv50/ir: fix constant folding for OP_MUL subop HIGHIlia Mirkin2014-05-211-4/+43
| | | | | | | | | | | | These instructions can come in either through IMUL_HI/UMUL_HI TGSI opcodes, or from OP_DIV constant folding. Also make sure that the constant foldings which delete the original instruction still get counted as having done something. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.1 10.2" <[email protected]> Reviewed-by: Ben Skeggs <[email protected]>
* nv50/ir: fix s32 x s32 -> high s32 multiply logicIlia Mirkin2014-05-212-11/+82
| | | | | | | | | | | | | | | | | Retrieving the high 32 bits of a signed multiply is rather annoying. It appears that the simplest way to do this is to compute the absolute value of the arguments, and perform a u32 x u32 -> u64 operation. If the arguments' signs differ, then negate the result. Since there is no u64 support in the cvt instruction, we have the perform the 2's complement negation "by hand". This logic can come into use by the IMUL_HI instruction (very unlikely to be seen), as well as from constant folding of division by a constant. Fixes dolphin's divisions by 255. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.1 10.2" <[email protected]> Reviewed-by: Ben Skeggs <[email protected]>
* nv50/ir: fix integer mul lowering for u32 x u32 -> high u32Ilia Mirkin2014-05-181-3/+4
| | | | | | | | | | | | UNION appears to expect that all of its sources are conditionally defined. Otherwise it inserts an unpredicated mov instruction which overwrites the desired result. This fixes tests that use UMUL_HI, and much less directly, unsigned integer division by a constant, which uses this functionality in a peephole pass. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.1 10.2" <[email protected]> Reviewed-by: Ben Skeggs <[email protected]>
* nv50/ir: make sure that texprep/texquerylod's args get coalescedIlia Mirkin2014-05-181-0/+2
| | | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.2" <[email protected]> Reviewed-by: Ben Skeggs <[email protected]>
* nvc0: enable support for maxwell boardsBen Skeggs2014-05-155-19/+48
| | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add maxwell (sm50) compiler backendBen Skeggs2014-05-1516-5/+3588
| | | | | | | | | | The big missing part here is proper sched data calculations, but hopefully the chosen placeholder will be sufficient for now. Passes piglit as well as GK107 does. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: maxwell isa has no per-instruction join modifierBen Skeggs2014-05-154-19/+23
| | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: replace immd 0 with $rLASTGPR for emit/restart opcodesBen Skeggs2014-05-151-0/+1
| | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: move nvc0 lowering pass class definitions into headerBen Skeggs2014-05-153-106/+136
| | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump sched data member to 32-bitsBen Skeggs2014-05-151-1/+1
| | | | | | | SM50 backend requires 21 bits per instruction, not 8. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use vertex arrays for eng3d blitBen Skeggs2014-05-151-31/+64
| | | | | | | Maxwell doesn't have immediate-mode. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: restrict "constant vbo" logic to fermi/kepler classesBen Skeggs2014-05-151-1/+1
| | | | | Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: replace some vb->stride checks with constant_vbo insteadBen Skeggs2014-05-151-3/+3
| | | | | | | | Maxwell no longer has the methods to set constant attributes, and we'll want to be treating stride 0 vtxbufs the same as for stride > 0. Signed-off-by: Ben Skeggs <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>