path: root/src/gallium/drivers/nouveau
Commit message | Author | Age | Files | Lines
* nv50: fix alphatest for non-blendable formats | Ilia Mirkin | 2016-07-16 | 14 | -11/+118

  The hardware can only do alphatest when using a blendable format. This means that the various *16 norm formats didn't work with alphatest. It appears that Talos Principle uses such formats, as well as alpha tests, for some internal renders, which made them incorrect. This does not appear to affect the final renders, but in a different game it easily could.

  The approach we take is that when alphatests are enabled and a non-blendable format is used (which we anticipate is rare), we insert code into the shader to perform the comparison and discard. Once inserted, that code lives in the shader forever, and we re-upload it each time the function changes with a fixed-up compare. To avoid re-uploading too often, if we switch back to a blendable format, the test is (effectively) disabled and the hw alphatest functionality is used.

  Signed-off-by: Ilia Mirkin <[email protected]>
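The comparison the commit describes inserting into the shader can be sketched on the CPU side as follows. This is only an illustration: `AlphaFunc` and `alphaTestPasses` are hypothetical names, not the driver's, and the real fix emits shader IR rather than calling a C++ function.

```cpp
#include <cassert>

// Hypothetical mirror of the usual alpha-test compare functions.
enum class AlphaFunc { Never, Less, Equal, LEqual, Greater, NotEqual, GEqual, Always };

// Returns true when the fragment survives the test; a shader would
// discard the fragment when this returns false.
inline bool alphaTestPasses(AlphaFunc func, float alpha, float ref)
{
    switch (func) {
    case AlphaFunc::Never:    return false;
    case AlphaFunc::Less:     return alpha <  ref;
    case AlphaFunc::Equal:    return alpha == ref;
    case AlphaFunc::LEqual:   return alpha <= ref;
    case AlphaFunc::Greater:  return alpha >  ref;
    case AlphaFunc::NotEqual: return alpha != ref;
    case AlphaFunc::GEqual:   return alpha >= ref;
    case AlphaFunc::Always:   return true;
    }
    assert(false && "unknown alpha function");
    return true;
}
```

Switching the function changes only the compare, which is consistent with the commit's note that the shader is re-uploaded "with a fixed-up compare" when the function changes.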
* nv50/ir: add missing string for SV_WORK_DIM | Samuel Pitoiset | 2016-07-14 | 1 | -0/+1

  Fixes: 2aa1197 ("nouveau: Add support for SV_WORK_DIM")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Hans de Goede <[email protected]>
* nvc0: initial support for GP100 GPUs | Ben Skeggs | 2016-07-12 | 4 | -5/+15

  Signed-off-by: Ben Skeggs <[email protected]>
* nvc0: use a define for the driver constant buffer size | Samuel Pitoiset | 2016-07-11 | 7 | -17/+17

  This might avoid mistakes if the size is bumped in the future.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix the driver cb size when draw parameters are used | Samuel Pitoiset | 2016-07-11 | 1 | -2/+2

  The size of the driver constant buffer for each stage should be 2048 and not 512 because it has been increased recently for buffers/images. While we are at it, do the same change for indirect draws.

  This fixes all ARB_shader_draw_parameters tests on GM107.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: 12.0 <[email protected]>
* nvc0/ir: fix images indirect access on Fermi | Samuel Pitoiset | 2016-07-11 | 1 | -0/+7

  This fixes the following piglits:
  arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index
  arb_arrays_of_arrays-basic-imagestore-mixed-const-non-const-uniform-index2

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: 12.0 <[email protected]>
* nvc0/ir: remove unused resource info loading helpers | Samuel Pitoiset | 2016-07-08 | 2 | -28/+0

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: refactor the surfaces info loading logic | Samuel Pitoiset | 2016-07-08 | 2 | -82/+44

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: move the shift left op inside loadTexHandle() | Samuel Pitoiset | 2016-07-08 | 1 | -8/+6

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: rename NVE4_SU_INFO_XXX to NVC0_SU_INFO_XXX | Samuel Pitoiset | 2016-07-05 | 1 | -49/+49

  While we are at it, fix a typo inside the comment which describes what those constants are for.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: reset the base offset for indirect images accesses | Samuel Pitoiset | 2016-07-05 | 1 | -2/+4

  In the presence of an indirect image access, the base offset should be zeroed because the stride will be computed twice. This is a pretty rare situation but it can happen when tex.r > 0.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: "11.2 12.0" <[email protected]>
* gm107/ir: fix sign bit emission for FADD32I | Samuel Pitoiset | 2016-07-05 | 1 | -3/+6

  When emitting OP_SUB, the sign bit for FADD and FADD32I is not at the same position. It's at position 45 for FADD but 51 for FADD32I.

  This fixes the following piglit test:
  tests/spec/arb_fragment_program/fdo30337b.shader_test

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
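A minimal sketch of the distinction, assuming a 64-bit instruction word. Only the bit positions 45 and 51 come from the commit message; `AddForm`, `signBitPos` and `setSubSignBit` are hypothetical names and do not reflect the actual emitter code.

```cpp
#include <cassert>
#include <cstdint>

enum class AddForm { FADD, FADD32I };

// Bit positions as stated in the commit message.
constexpr unsigned signBitPos(AddForm form)
{
    return form == AddForm::FADD ? 45u : 51u;
}

// Sets the sign bit used when OP_SUB is emitted as an add.
inline uint64_t setSubSignBit(uint64_t insn, AddForm form)
{
    const unsigned pos = signBitPos(form);
    assert(pos < 64);
    return insn | (uint64_t{1} << pos);
}
```

Using the FADD position for the FADD32I form would flip an unrelated bit of the encoding, which is the kind of mistake the commit describes fixing.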
* nv30: Fix "array subscript is below array bounds" compiler warning | Hans de Goede | 2016-07-02 | 1 | -2/+1

  gcc6 does not like the trick where we point to one entry before the array start and then start a while with a pre-increment.

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
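The pattern in question can be sketched as below. The table and function are made up for illustration and are not the nv30 code; the comment shows the older idiom and the loop shows one way to avoid forming a pointer before the array start.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative zero-terminated table (hypothetical data).
static const int kSampleTable[] = {1, 2, 3, 0, 9};

// Sums entries up to, but not including, the first zero entry.
// The older idiom would be:
//     const int *p = table - 1;
//     while (*++p) sum += *p;
// which forms a pointer one entry before the array, the trick gcc6 warns about.
inline int sumUntilZero(const int *table, std::size_t count)
{
    assert(table != nullptr);
    int sum = 0;
    for (std::size_t i = 0; i < count && table[i] != 0; ++i)
        sum += table[i];
    return sum;
}
```

Indexing from zero keeps every pointer the loop forms inside the array.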
* nouveau: Fix a couple of "foo may be used uninitialized" compiler warnings | Hans de Goede | 2016-07-02 | 2 | -3/+3

  These are all new false positives with gcc6.

  In nouveau_compiler.c: gcc6 no longer assumes that passing a pointer to a variable into a function initialises that variable.

  In nv50_ir_from_tgsi.cpp, op and mode are not set if there are 0 enabled dst channels. This never happens, but gcc cannot know this.

  Signed-off-by: Hans de Goede <[email protected]>
  Acked-by: Ilia Mirkin <[email protected]>
* nouveau: Fix gcc6 / c++11 auto_ptr deprecation compiler warnings | Hans de Goede | 2016-07-02 | 1 | -0/+4

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nouveau: Add support for SV_WORK_DIM | Hans de Goede | 2016-07-02 | 8 | -12/+29

  Add support for SV_WORK_DIM for nvc0 and nve4.

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argument | Hans de Goede | 2016-07-02 | 3 | -4/+4

  This brings it in line with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO.

  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: fix up image support for allowing multiple samples | Ilia Mirkin | 2016-07-01 | 7 | -49/+108

  Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages; this just hooks it all up.

  Signed-off-by: Ilia Mirkin <[email protected]>
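The "scale up, then add the sample offset" step might look like the sketch below. The 2x2 sample grid and the mapping of sample index to column and row are assumptions chosen for illustration; the commit message does not describe the hardware's sample layout, and `Coord` and `sampleCoord` are hypothetical names.

```cpp
#include <cassert>

struct Coord {
    unsigned x;
    unsigned y;
};

// Assumption: 4 samples per pixel stored as a 2x2 block, with sample s
// at column (s & 1) and row (s >> 1) inside that block.
inline Coord sampleCoord(Coord pixel, unsigned sample)
{
    assert(sample < 4);
    return Coord{pixel.x * 2 + (sample & 1), pixel.y * 2 + (sample >> 1)};
}
```

For example, under this assumed layout sample 3 of pixel (3, 5) lands at storage coordinate (7, 11).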
* nv30: go back to not using viewport validate function for swtnl | Ilia Mirkin | 2016-07-01 | 2 | -1/+16

  The output of draw requires a null viewport transform, which the regular code is ill-equipped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw).

  Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: fix viewport clipping settings to be based on viewport, not rt | Ilia Mirkin | 2016-07-01 | 2 | -17/+11

  This fixes a ton of "*clip*" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: print EMIT subops in debug mode | Samuel Pitoiset | 2016-06-29 | 1 | -0/+9

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print RSQ/RCP subops in debug mode | Samuel Pitoiset | 2016-06-29 | 1 | -0/+10

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print PIXLD subops in debug mode | Samuel Pitoiset | 2016-06-29 | 1 | -0/+9

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print SHFL subops in debug mode | Samuel Pitoiset | 2016-06-29 | 1 | -0/+9

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* gm107/ir: make sure that flagsDef is set when emitting setcond | Samuel Pitoiset | 2016-06-28 | 1 | -1/+1

  Relying on the existence of a second destination when emitting a setcond flag is dangerous, because this doesn't mean that the flag has been correctly set. Instead rely on flagsDef, as emitX() does for flagsSrc.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
* gm107/ir: add missing setcond flags for LOP variants | Samuel Pitoiset | 2016-06-28 | 1 | -0/+2

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
* gm107/ir: make use of LOP32I for all immediates | Samuel Pitoiset | 2016-06-28 | 1 | -1/+1

  LOP can only encode 19-bit immediates.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
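To illustrate why a 19-bit field is not enough for arbitrary immediates, here is a small sketch. It assumes the short form holds a plain unsigned 19-bit field, which the commit message does not spell out, and `fitsInBits`, `LopForm` and `pickLopForm` are hypothetical names. The commit itself sidesteps the question by using the 32-bit immediate form for all immediates.

```cpp
#include <cassert>
#include <cstdint>

// True when imm can be stored in an unsigned field of the given width.
inline bool fitsInBits(uint32_t imm, unsigned bits)
{
    assert(bits > 0);
    return bits >= 32 || (imm >> bits) == 0;
}

enum class LopForm { Short19, Full32 };

// Picks the narrowest form that can represent the immediate.
inline LopForm pickLopForm(uint32_t imm)
{
    return fitsInBits(imm, 19) ? LopForm::Short19 : LopForm::Full32;
}
```

A value such as 0x12345678 does not fit in 19 bits, so emitting it through the short form would silently drop its upper bits.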
* gm107/ir: make use of MOV32I for all immediates | Samuel Pitoiset | 2016-06-27 | 1 | -2/+1

  MOV can only encode 19-bit immediates. This is similar to the previous fix I did for IMUL.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
* nvc0: update "derived" state function names | Ilia Mirkin | 2016-06-26 | 1 | -8/+8

  derived_1/2/etc aren't too informative. Instead name them based on the state they're derived from.

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: provide support for unscaled poly offset units | Ilia Mirkin | 2016-06-26 | 3 | -3/+26

  On at least Kepler hardware, the units differ based on RT format. Emit a properly scaled value for Z16 depth buffers vs other formats, to help out st/nine.

  Signed-off-by: Ilia Mirkin <[email protected]>
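A format-dependent scale could be sketched like this. The factors 2^16 and 2^24 are assumptions picked to match a 16-bit versus a 24-bit depth range; the commit message only says that the value differs for Z16, and `DepthFormat` and `scaledOffsetUnits` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>

enum class DepthFormat { Z16, Other };

// Assumed scale factors: 2^16 for Z16, 2^24 for everything else.
inline float scaledOffsetUnits(float units, DepthFormat fmt)
{
    assert(std::isfinite(units));
    const float scale = (fmt == DepthFormat::Z16) ? 65536.0f : 16777216.0f;
    return units * scale;
}
```

The point of the sketch is only that the same API-level unit value has to be multiplied by a different constant depending on the bound depth format.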
* gm107/ir: make use of IMUL32I for all immediates | Samuel Pitoiset | 2016-06-26 | 1 | -1/+1

  IMUL can only encode 19-bit immediates. This is similar to d30768025a2283d4cc57930b784798bf278969da which fixed the same thing for the GK110 emitter.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: <[email protected]>
* gallium: Add a cap for offset_units_unscaled | Axel Davy | 2016-06-25 | 3 | -0/+3

  D3D9 has a different behaviour for depth bias.

  For OGL/D3D1X, the depth bias unit is the minimal resolvable value for the depth buffer, which depends on the format (and has different behaviour for float depth buffers).

  For D3D9, the depth bias unit is 1.0f.

  Signed-off-by: Axel Davy <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* nvc0: when mapping directly, provide accurate xfer info + start | Ilia Mirkin | 2016-06-24 | 1 | -5/+12

  We were ignoring the incoming box parameters, and were providing totally bogus stride/layer stride, and other bits, for when a non-full-surface map was requested.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Tested-by: Samuel Pitoiset <[email protected]>
  Cc: <[email protected]>
* Remove wrongly repeated words in comments | Giuseppe Bilotta | 2016-06-23 | 2 | -2/+2

  Clean up misrepetitions ('if if', 'the the' etc) found throughout the comments. This has been done manually, after grepping case-insensitively for duplicate if, is, the, then, do, for, an, plus a few other typos corrected in fly-by

  v2:
  * proper commit message and non-joke title;
  * replace two 'as is' followed by 'is' to 'as-is'.

  v3:
  * 'a integer' => 'an integer' and similar (originally spotted by Jason Ekstrand, I fixed a few other similar ones while at it)

  Signed-off-by: Giuseppe Bilotta <[email protected]>
  Reviewed-by: Chad Versace <[email protected]>
* nv50,nvc0: fix start_instance in manual push path | Ilia Mirkin | 2016-06-21 | 2 | -10/+22

  The start instance is applied as an offset into the buffer directly, ignoring the divisor, not as an instance id offset that respects the divisor.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.2 12.0" <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
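The two interpretations contrasted in the message can be written out as follows. The function names are hypothetical and the sketch works on element indices rather than on the driver's actual push buffers.

```cpp
#include <cassert>
#include <cstdint>

// Element fetched from an instanced vertex buffer: start_instance is added
// after the divide, so it offsets into the buffer and ignores the divisor.
inline uint32_t instancedElement(uint32_t instance_id, uint32_t divisor,
                                 uint32_t start_instance)
{
    assert(divisor > 0);
    return instance_id / divisor + start_instance;
}

// The interpretation the commit rules out: start_instance treated as an
// instance id offset that is then divided by the divisor.
inline uint32_t instancedElementWrong(uint32_t instance_id, uint32_t divisor,
                                      uint32_t start_instance)
{
    assert(divisor > 0);
    return (instance_id + start_instance) / divisor;
}
```

With a divisor of 4 and a start instance of 3, the first instance reads element 3 under the first rule but element 0 under the second, so the two only agree when the divisor is 1.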
* gallium: make image_view const | Rob Clark | 2016-06-20 | 1 | -2/+2

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make constant_buffer const | Rob Clark | 2016-06-20 | 3 | -3/+3

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: make shader_buffers const | Rob Clark | 2016-06-20 | 1 | -3/+3

  Be consistent with the rest of the "set_xyz" state interfaces.

  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: don't make use of push hint if there are no non-const user vbos | Ilia Mirkin | 2016-06-19 | 1 | -1/+3

  This makes the check match up what we do on nv50 as well - there's no point in switching over to the push path if everything's in managed buffers. This can happen when a shader uses a vertex without an enabled array - we end up passing it a constant attribute.

  This also has the effect of "fixing" some flickering in Talos. I have no idea why. I've stared at the push logic forwards, backwards, and sideways. By always forcing the push path (which is slow), the flickering also goes away, but other rendering is still wrong (specifically draw 383068 as identified in the bug). However by not switching over to the push path, draw 383068 is correct.

  Note that other flickering remains in Talos, like the red/green walls/floors. This takes care of the shadow flickering though.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90513
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "12.0" <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* gk104/ir: fix tex use generation to be more careful about eliding uses | Ilia Mirkin | 2016-06-19 | 2 | -12/+27

  If we have a loop, instructions before the tex might be added as tex uses, and those may in fact dominate all other uses of the tex results. This however doesn't mean that we don't need a texbar after the tex.

  Only check if uses dominate each other if they are dominated by the tex.

  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96565
  Fixes: 7752bbc44 (gk104/ir: simplify and fool-proof texbar algorithm)
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.2 12.0" <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50: add support for GL_EXT_window_rectangles | Ilia Mirkin | 2016-06-18 | 7 | -5/+74

  Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add support for GL_EXT_window_rectangles | Ilia Mirkin | 2016-06-18 | 7 | -5/+72

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* gallium: add PIPE_CAP_MAX_WINDOW_RECTANGLES to all drivers | Ilia Mirkin | 2016-06-18 | 3 | -0/+3

  This says how many window rectangles are supported by the implementation, although it may not exceed PIPE_MAX_WINDOW_RECTANGLES.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* nv50/ir: add missing strings for some recent sysvals | Samuel Pitoiset | 2016-06-18 | 1 | -0/+3

  This is pretty useful for debugging purposes and those should not be omitted.

  Fixes: 517a93b3 ("nvc0: add ARB_shader_draw_parameters support")
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: make Graph destructor virtual | Stephan Bergmann | 2016-06-13 | 1 | -1/+1

  Avoid ASan new-delete-type-mismatch when Function::domTree is created as DominatorTree in Function::convertToSSA but destroyed only as base Graph in ~Function.

  Reviewed-by: Ilia Mirkin <[email protected]>
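The general C++ rule behind this fix can be shown with a small stand-alone sketch. `GraphBase` and `DomTree` are stand-ins invented for the example, not the nv50_ir classes.

```cpp
#include <cassert>
#include <memory>

// Stand-in base class. The virtual destructor makes deletion through a
// base pointer run the derived destructor as well.
struct GraphBase {
    virtual ~GraphBase() = default;
};

// Stand-in derived class that records when its destructor runs.
struct DomTree : GraphBase {
    explicit DomTree(int *counter) : destroyed(counter) { assert(counter); }
    ~DomTree() override { ++*destroyed; }
    int *destroyed;
};

// Creates a derived object, destroys it via a base pointer, and reports
// how many times the derived destructor ran.
inline int destroyThroughBase()
{
    int destroyed = 0;
    {
        std::unique_ptr<GraphBase> g(new DomTree(&destroyed));
    }
    return destroyed;
}
```

Without `virtual` on the base destructor, deleting the object through `GraphBase*` is undefined behaviour, and tools such as ASan report it as a new/delete type mismatch.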
* nvc0/ir: clamp the UBO index for compute on Kepler | Samuel Pitoiset | 2016-06-13 | 1 | -1/+9

  We already check that the address is not "too far", but we should also clamp the UBO index in order to avoid looking at the wrong place in the driver cb. This is a pretty rare situation though.

  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: "12.0" <[email protected]>
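Clamping an index before using it to address a table can be sketched as below. The function name and the idea of passing the UBO count as a parameter are assumptions for illustration; the actual change is made in the shader IR and its limit is not stated in the message.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Clamps index to the last valid slot so a lookup never reads past the
// per-UBO entries (hypothetical helper, not the driver's code).
inline uint32_t clampUboIndex(uint32_t index, uint32_t ubo_count)
{
    assert(ubo_count > 0);
    return std::min(index, ubo_count - 1);
}
```

An out-of-range index then resolves to the last slot instead of to unrelated data stored after the UBO entries.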
* Android: move libdrm settings to top-level Android.common.mk | Rob Herring | 2016-06-13 | 1 | -1/+1

  Fix warnings like these due to HAVE_LIBDRM being inconsistently defined:

  external/libdrm/include/drm/drm.h:839:30: warning: redefinition of typedef 'drm_clip_rect_t' is a C11 feature [-Wtypedef-redefinition]
  typedef struct drm_clip_rect drm_clip_rect_t;

  HAVE_LIBDRM needs to be set project wide to fix this. This change also harmlessly links libdrm with everything, but simplifies the makefiles a bit.

  Signed-off-by: Rob Herring <[email protected]>
  Acked-by: Emil Velikov <[email protected]>
* nv50: reinstate dedicated constbuf push path | Ilia Mirkin | 2016-06-11 | 5 | -29/+50

  This was disabled due to occasionally incorrect behavior when trying to upload data. It later became apparent that nvc0 also had a similar but slightly different issue, which was resolved in commit e50c01d5. This takes the same logic as nvc0 and applies it to nv50 (which has somewhat different interfaces).

  Unfortunately I did not note down precisely what was broken with UBOs when removing the support from nv50, but I've tested a bunch of local traces, and none of them appear to regress.

  This should hopefully improve performance when UBOs are used, but this was not directly verified.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50: enable indirect addressing of fragment shader inputs | Ilia Mirkin | 2016-06-11 | 2 | -1/+2

  Signed-off-by: Ilia Mirkin <[email protected]>
* gk104/ir: fix conditions for adding a texbar | Ilia Mirkin | 2016-06-07 | 1 | -4/+6

  Sometimes a register source can actually be double- or even quad-wide. We must make sure that the inserted texbars take that width into account.

  Based on an earlier patch by Samuel Pitoiset.

  Signed-off-by: Ilia Mirkin <[email protected]>
  Reviewed-by: Samuel Pitoiset <[email protected]>
  Cc: "12.0 11.2" <[email protected]>
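Taking register width into account amounts to an interval overlap test rather than a comparison of starting register ids. The sketch below assumes registers are numbered consecutively and that a wide value occupies `count` adjacent registers; `RegRange` and `rangesOverlap` are hypothetical names.

```cpp
#include <cassert>

// A value occupying `count` consecutive registers starting at `first`.
struct RegRange {
    unsigned first;
    unsigned count;
};

// True when the two register ranges share at least one register.
inline bool rangesOverlap(RegRange a, RegRange b)
{
    assert(a.count > 0 && b.count > 0);
    return a.first < b.first + b.count && b.first < a.first + a.count;
}
```

For example, a quad-wide source starting at register 2 covers registers 2 to 5 and therefore overlaps a texture result written to register 4, even though the two starting ids differ.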