aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau/nvc0/nvc0_program.c
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: Add support for ARB_post_depth_coverageLyude2017-06-021-0/+1
| | | | | Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: Replace NV50_PROGRAM_IR_* by PIPE_SHADER_IR_*Pierre Moreau2017-05-071-1/+1
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: enable FBFETCH with a special slot for color buffer 0Ilia Mirkin2017-01-161-0/+6
| | | | | | | | | | | | We don't need to support all the color buffers for advanced blend, just cb0. For Fermi, we use the special binding slots so that we don't overlap with user textures, while Kepler+ gets a dedicated position for the fb handle in the driver constbuf. This logic is only triggered when a FBFETCH is actually present so it should be a no-op most of the time. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: avoid reading out of bounds when getting bogus so infoIlia Mirkin2016-10-191-2/+5
| | | | | | | | | The state tracker tries to attach the info to the wrong shader. This is easy enough to protect against. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Cc: 12.0 13.0 <[email protected]>
* nvc0: dump program binary only when NV50_PROG_DEBUG is setSamuel Pitoiset2016-10-071-1/+1
| | | | | | | | When the chipset is forced with NV50_PROG_CHIPSET, we actually only want to output the binary if NV50_PROG_DEBUG is also enabled. Otherwise, this pollutes the shader-db output. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: dump program binary when chipset has been forcedSamuel Pitoiset2016-10-051-0/+5
| | | | | | | | Currently, program binaries are only dumped at upload time, but when the chipset has been forced via NV50_PROG_CHIPSET we might want to show the generated code, especially with shaderdb. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: allow to force compiling programs in debug buildSamuel Pitoiset2016-09-261-9/+10
| | | | | | | | | This adds a new envvar called NV50_PROG_CHIPSET which allows to compile shaders with a different target, especially useful for shader-db. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to resize the code segment dynamicallySamuel Pitoiset2016-09-011-1/+24
| | | | | | | | | | | | | When an application uses a ton of shaders, we need to evict them when the code segment is full but this is not really a good solution if monster shaders are used because code eviction will happen a lot. To avoid this, it seems better to dynamically resize the code segment area after each eviction. The maximum size is arbitrary fixed to 8MB which should be enough. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: re-upload currently bound shaders after code evictionSamuel Pitoiset2016-09-011-0/+27
| | | | | | | | | | | | | This fixes a very old issue which happens when the code segment size is full. A bunch of real applications like Tomb Raider, F1 2015, Elemental, hit that issue because they use a ton of shaders. In this case, all shaders are evicted (for freeing space) but all currently bound shaders also need to be re-uploaded and SP_START_ID have to be updated accordingly. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: refactor the program upload processSamuel Pitoiset2016-09-011-30/+57
| | | | | | | | | | This refactoring will help for fixing the "out of code space" eviction issue because we will need to reupload the code for all currently bound shaders but it's slightly different than uploading a new fresh code. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: remove an attempt at uploading all IMMD into a CBSamuel Pitoiset2016-08-311-17/+0
| | | | | | | | | | | This has never been used because info->immd.bufSize is always 0 and anyways this is an experimental code which has never been completed. This gets rid of some unused code in the program validation process. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix up TCP header on GM107+Samuel Pitoiset2016-07-271-0/+9
| | | | | | | | | | The number of outputs patch (limited to 255) has moved in the TCP header, but blob seems to also set the old position. Also, the high 8-bits are now located inbetween the min/max parallel output read address at position 20. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50: fix alphatest for non-blendable formatsIlia Mirkin2016-07-161-1/+2
| | | | | | | | | | | | | | | | | | | | The hardware can only do alphatest when using a blendable format. This means that the various *16 norm formats didn't work with alphatest. It appears that Talos Principle uses such formats, as well as alpha tests, for some internal renders, which made them be incorrect. However this does not appear to affect the final renders, but in a different game it easily could. The approach we take is that when alphatests are enabled and a suitable format is used (which we anticipate is the vast minority of the time), we insert code into the shader to perform the comparison and discard. Once inserted, that code lives in the shader forever, and we re-upload it each time the function changes with a fixed-up compare. To avoid re-uploading too often, if we switch back to a blendable format, the test is (effectively) disabled and the hw alphatest functionality is used. Signed-off-by: Ilia Mirkin <[email protected]>
* nouveau: Add support for SV_WORK_DIMHans de Goede2016-07-021-1/+1
| | | | | | | | Add support for SV_WORK_DIM for nvc0 and nve4. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: Make NVC0_CB_AUX_GRID_INFO take an index argumentHans de Goede2016-07-021-1/+1
| | | | | | | | | This brings it inline with the other macros like NVC0_CB_AUX_UBO_INFO and NVC0_CB_AUX_TEX_INFO. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: fix up image support for allowing multiple samplesIlia Mirkin2016-07-011-12/+8
| | | | | | | | | Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: fix the max_vertices=0 caseIlia Mirkin2016-05-291-1/+1
| | | | | | | This is apparently legal. Drop any emit/restarts, and pass a 1 to the hardware. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: add note about where the viewport mask would goIlia Mirkin2016-05-261-0/+1
| | | | | | | Not piping this all the way through yet, but no better place to note this down. This will can be used with NV_viewport_array2. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: fix setting of tess_mode in various situationsIlia Mirkin2016-05-221-4/+14
| | | | | | | | This fixes a lot of INVALID_VALUE errors reported by the card when running dEQP tests. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: bind images on fragment and compute shaders for FermiSamuel Pitoiset2016-05-211-6/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add support for cull distancesTobias Klausmann2016-05-151-2/+3
| | | | | | | | | | | Cull distances are just a special case of clip distances as far as the hardware is concerned. Make sure that the relevant "planes" are enabled, and flip the clip mode to cull for those. Signed-off-by: Tobias Klausmann <[email protected]> [imirkin: add enables on nvc0, add nv50 support] Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
* nvc0: fix gl_SampleMaskIn computationIlia Mirkin2016-05-111-0/+1
| | | | | | | | | | | | | | | | | | | | | The SAMPLEMASK semantic should only return the bits set covered by the current invocation. However we were always retrieving the covmask, which returns the covered samples of the whole pixel. When not doing per-sample invocation, this is precisely what we want. However when doing per-sample invocation, we have to select the sampleid'th bit and only return that. Furthermore, this means that we have to have a 1:1 correlation for invocations and samples. This fixes most dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* tests. A few failures remain due to disagreements about nr_samples==1 logic as well as what happens with MSAA x2 RTs when the shading fraction is 0.5. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: generalize interp fixups to be able to fixup anythingIlia Mirkin2016-05-111-6/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: bind images on 3D shaders for KeplerSamuel Pitoiset2016-04-261-1/+3
| | | | | | | Similar to surfaces validation for compute shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind images on compute shaders for KeplerSamuel Pitoiset2016-04-261-1/+3
| | | | | | | | | | | Old surfaces validation code will be removed once images are completely done for Fermi/Kepler, that explains why I only disable it for now. This also introduces nvc0_get_surface_dims() which computes correct dimensions regarding the given target. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*Marek Olšák2016-04-221-1/+1
| | | | Acked-by: Jose Fonseca <[email protected]>
* nvc0: handle the case where there are no framebuffer attachmentsIlia Mirkin2016-04-091-0/+7
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: store ubo info to the driver constbuf on KeplerSamuel Pitoiset2016-04-011-0/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind shader buffers for compute on KeplerSamuel Pitoiset2016-04-011-3/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind driver cb for compute on c7[] for KeplerSamuel Pitoiset2016-04-011-6/+5
| | | | | | | | | | | | Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: use a different offset for buffers and surfacesSamuel Pitoiset2016-03-291-3/+5
| | | | | | | | | | To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: avoid using magic numbers for the uniform_bo offsetsSamuel Pitoiset2016-03-191-6/+6
| | | | | | | | | | | | | | | Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This sounds like to be related to textures. Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be enough for now. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: replace resInfoCBSlot by auxCBSlotSamuel Pitoiset2016-03-191-3/+1
| | | | | | | | | Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>
* nvc0: bind shader buffers for compute on FermiSamuel Pitoiset2016-02-211-0/+1
| | | | | | | | This is loosely based on 3D. Shader buffers are bound on c15 (the driver constbuf) at offset 0x200. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind driver constbuf for compute on FermiSamuel Pitoiset2016-02-211-0/+2
| | | | | | | | | | | Changes from v3: - add new validation state for COMPUTE driver constbuf Changes from v2: - always bind the driver consts even if user params come in via clover Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add support for BUFFER accessesIlia Mirkin2016-01-291-0/+3
| | | | | | | This largely leaves the existing image logic alone. When image support is added this will have to be harmonized somehow. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: use a face sysval to avoid the useless back-and-forth conversionIlia Mirkin2016-01-081-1/+0
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: Set winding order regardless of domain.Kenneth Graunke2015-12-301-2/+4
| | | | | | | | | | Quads need to respect winding order, too - not just triangles. Fixes rendering in GFXBench 4.0's tessellation benchmark. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.0 11.1" <[email protected]>
* nvc0: add ARB_shader_draw_parameters supportIlia Mirkin2015-12-301-1/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium/drivers/nouveau: Make use of ARRAY_SIZE macroEdward O'Callaghan2015-12-061-1/+1
| | | | | Signed-off-by: Edward O'Callaghan <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* nv50,nvc0: provide debug messages with shader compilation statsIlia Mirkin2015-11-051-1/+7
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: adapt to new method for passing in cull/clip distance masksIlia Mirkin2015-10-291-6/+4
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: do upload-time fixups for interpolation parametersIlia Mirkin2015-10-291-1/+24
| | | | | | | | | | | | | | | | | Unfortunately flatshading is an all-or-nothing proposition on nvc0, while GL 3.0 calls for the ability to selectively specify explicit interpolation parameters on gl_Color/gl_SecondaryColor which would override the flatshading setting. This allows us to fix up the interpolation settings after shader generation based on rasterizer settings. While we're at it, we can add support for dynamically forcing all (non-flat) shader inputs to be interpolated per-sample, which allows st/mesa to not generate variants for these. Fixes the remaining failing glsl-1.30/execution/interpolation piglits. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: always emit a full shader colormaskIlia Mirkin2015-09-081-1/+1
| | | | | | | | | | | | | | Indications are that if the colormask indicates a single bit set on fermi, that value will always be read from $r0 instead of a potentially higher register (if e.g. green is set). Not to upset the counting logic, always set the header up with a full color mask for each RT. Such a situation can basically only ever happen with generated blit shaders. Fixes the following piglit on Fermi (Kepler is unaffected): fbo-stencil blit GL_DEPTH32F_STENCIL8 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.6 11.0" <[email protected]>
* nvc0: bind a fake tess control program when there isn't one availableIlia Mirkin2015-08-171-0/+17
| | | | | | | | | | | Apparently this is necessary in order for tess factors to work in a tess eval program without a tess control program bound. Probably because it uses the fake program's shader header to work out the number of patch constants. Fixes vs-tes-tessinner-tessouter-inputs Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: per-patch vars are in a separate address spaceIlia Mirkin2015-07-241-11/+7
| | | | | | | | | | | | | There's no need to attempt to avoid overlapping generic i/o with patch i/o. By the same token, we can't merge patch and non-patch loads/stores. This fixes at least the tes-both-input-array-*-index-rd tessellation variable-indexing tests. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: cleanup private enums that have graduated to galliumIlia Mirkin2015-07-231-2/+0
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0/ir: mark varyings as per-patch based on semantic nameIlia Mirkin2015-07-231-4/+2
| | | | | | Also add proper handling for PATCH semantics Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: TESSCOORD comes in as a sysval, not an inputIlia Mirkin2015-07-231-9/+10
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: preliminary tess supportIlia Mirkin2015-07-231-19/+10
| | | | | | | Uncomment the various functionality that was already there and add in obvious missing bits that parallel vp/gp/fp functionality. Signed-off-by: Ilia Mirkin <[email protected]>