summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau
Commit message (Collapse)AuthorAgeFilesLines
* nv50/ir: avoid asserts when the state tracker feeds us bogus inputsIlia Mirkin2016-05-151-12/+48
| | | | | | | | | | | | | | | | | | INTERP is defined (by me) to have to have a INPUT source. However the state tracker does not always obey this. This happens due to varying packing logic introducing additional mov's which can't always be undone. Instead of just giving up, we instead try harder to find the original input. This won't always be possible, for example with indirect accesses. There's not much we can (easily) do about that though. This fixes the remaining interpolateAt* failures in dEQP: dEQP-GLES31.functional.shaders.multisample_interpolation.interpolate_at* some of which were asserting due to INTERP_* being passed a non-input. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: don't try to go through the push path for indirect drawsIlia Mirkin2016-05-151-1/+2
| | | | | | | | | | | | | | This fixes dEQP-GLES31.functional.draw_indirect.draw_elements_indirect.*.default_attribute These tests were causing a const vbo to be set up, and were small enough draws that the logic was trying to go via the push path (which emits data directly into the cmd stream rather than uploading a user vbo). Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0/ir: make sure to align the second arg of TXD to 4, as we do for TEXIlia Mirkin2016-05-151-0/+14
| | | | | | | | | | | | | This was handled in handleTEX(), however the way the logic works, those extra arguments aren't added on by then, so it did nothing. Instead we must duplicate that bit here. GK110 appears to complain about MISALIGNED_GPR, however it's reasonable to believe that GK104 has the same requirements. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95403 Signed-off-by: Ilia Mirkin <[email protected]> Cc: [email protected] Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: add support for cull distancesTobias Klausmann2016-05-1510-11/+33
| | | | | | | | | | | Cull distances are just a special case of clip distances as far as the hardware is concerned. Make sure that the relevant "planes" are enabled, and flip the clip mode to cull for those. Signed-off-by: Tobias Klausmann <[email protected]> [imirkin: add enables on nvc0, add nv50 support] Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tobias Klausmann <[email protected]>
* gallium: Add a pipe cap for arb_cull_distanceTobias Klausmann2016-05-143-0/+3
| | | | | | | | | This lets us safely enable or disable the extension as needed Signed-off-by: Tobias Klausmann <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0: fix indentation in nvc0_invalidate_resource_storage()Samuel Pitoiset2016-05-121-52/+52
| | | | | | | Trivial. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: save some CPU cycles in nvc0_context_unreference_resources()Samuel Pitoiset2016-05-121-8/+6
| | | | | | | | This reduces the number of loop iterations for invalidating buffers and images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: invalidate texture buffers for computeSamuel Pitoiset2016-05-121-3/+8
| | | | | | | This is a pretty rare situation but this can happen though. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix gl_SampleMaskIn computationIlia Mirkin2016-05-1110-5/+94
| | | | | | | | | | | | | | | | | | | | | The SAMPLEMASK semantic should only return the bits set covered by the current invocation. However we were always retrieving the covmask, which returns the covered samples of the whole pixel. When not doing per-sample invocation, this is precisely what we want. However when doing per-sample invocation, we have to select the sampleid'th bit and only return that. Furthermore, this means that we have to have a 1:1 correlation for invocations and samples. This fixes most dEQP-GLES31.functional.shaders.sample_variables.sample_mask_in.* tests. A few failures remain due to disagreements about nr_samples==1 logic as well as what happens with MSAA x2 RTs when the shading fraction is 0.5. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: generalize interp fixups to be able to fixup anythingIlia Mirkin2016-05-1110-60/+71
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: enable compute support by default on GK110+Samuel Pitoiset2016-05-101-15/+3
| | | | | | | | | | Compute support seems to be pretty stable now, and according to piglit it doesn't seem to break 3D state. As a side effect, this will expose ARB_compute_shader on GK110/GK208. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: silence unsupported TGSI_PROPERTY_CS_FIXED_BLOCK_*Samuel Pitoiset2016-05-091-0/+5
| | | | | | | We don't need them for compute shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: unreference images when the context is destroyedSamuel Pitoiset2016-05-061-0/+4
| | | | | | | Like other resources, we need to unreference all images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau/video: properly detect the decoder class for availability checksIlia Mirkin2016-05-041-8/+17
| | | | | | | | | | | The kernel is now more strict with the class ids it exposes, so we need to check the G98 and MCP89 classes as well as the GT215 class. This effectively caused us to decide there were no decoding capabilities on newer kernel for VP3 chips. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95251 Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2" <[email protected]>
* nvc0: compute a percentage for metric-achieved_occupancySamuel Pitoiset2016-05-031-4/+4
| | | | | | | metric-issue_slot_utilization and metric-branch_efficiency are already computed as percentages. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: display some performance metrics with a percentageSamuel Pitoiset2016-05-031-3/+3
| | | | | | This makes more sense for them. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: store the driver query type for performance metricsSamuel Pitoiset2016-05-031-18/+22
| | | | | | | | This will allow to use percentages for some metrics because the Gallium HUD doesn't allow to display floating point numbers and 0 is printed instead. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: fix exposing of metric-issue_slots for SM21/SM30Samuel Pitoiset2016-05-031-2/+22
| | | | | | | | | This is most likely a copy-paste error when I reworked this area few weeks ago. For SM20, metric-issue_slots is equal to inst_issued because there is only one pipeline, so the metric is not exposed there. Signed-off-by: Samuel Pitoiset <[email protected]> Reported-by: Karol Herbst <[email protected]>
* nv50,nvc0: re-bind old compute state after reading MP perf countersSamuel Pitoiset2016-05-022-0/+4
| | | | | | | | This might be useful to avoid breaking the current compute state when monitoring MP perf counters because we use a compute kernel to read out those counters. This has been initially suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]>
* nvc0: stick compute kernel arguments into uniform_boSamuel Pitoiset2016-04-295-26/+10
| | | | | | | | | | | | | Having one buffer object for input kernel arguments coming from clover and an other one for OpenGL user uniforms is unnecessary. Using the uniform_bo object for both GL/CL uniforms avoids to declare a new BO. This only affects compute programs but it should not hurt anything because the states are dirtied and data will get reuploaded. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Hans de Goede <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: codegen: LOAD: Take src swizzle into accountHans de Goede2016-04-271-2/+6
| | | | | | | | | | | | | | | | | | | | | | The llvm TGSI backend uses pointers in registers and does things like: LOAD TEMP[0].y, MEMORY[0], TEMP[0] Expecting the data at address TEMP[0].x to get loaded to TEMP[0].y. But this will cause the data at TEMP[0].x + 4 to be loaded instead. This commit adds support for a swizzle suffix for the 1st source operand, which allows using: LOAD TEMP[0].y, MEMORY[0].xxxx, TEMP[0] And actually getting the desired behavior Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: codegen: LOAD: Do not call fetchSrc(1) if the address is immediateHans de Goede2016-04-271-2/+3
| | | | | | | | | | "off" later gets set to NULL when the address is immediate, so move the fetchSrc(1) call to the non-immediate branch of the if-else. This brings handleLOAD's offset handling inline with how it is done in handleSTORE. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nouveau: codegen: LOAD: Always use component 0 when getting the addressHans de Goede2016-04-271-1/+3
| | | | | | | | | | LOAD loads upto 4 components from the specified resource starting at the passed in x value of the 2nd source operand, the y, z and w components of the address should not be used. Signed-off-by: Hans de Goede <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: Remove every double semi-colonJakob Sinclair2016-04-262-2/+2
| | | | | | Signed-off-by: Jakob Sinclair <[email protected]> Reviewed-by: Ian Romanick <[email protected]> Reviewed-by: Chad Versace <[email protected]>
* nvc0: expose GLSL version 420 on GK110Samuel Pitoiset2016-04-261-1/+1
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: enable ARB_shader_image_load_store on GK110Samuel Pitoiset2016-04-261-1/+1
| | | | | | | This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission for VSHLSamuel Pitoiset2016-04-261-0/+58
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission for OP_SUEAU, OP_SUBFM and OP_SUCLAMPSamuel Pitoiset2016-04-261-0/+87
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission for OP_SULDB and OP_SUSTxSamuel Pitoiset2016-04-261-0/+155
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission for OP_MADSPSamuel Pitoiset2016-04-261-0/+23
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gk110/ir: add emission for OP_PERMTSamuel Pitoiset2016-04-261-0/+12
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: expose GLSL version 420 on GK104Samuel Pitoiset2016-04-261-0/+2
| | | | | | | Other chipsets will be added later. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: enable ARB_shader_image_load_store on GK104Ilia Mirkin2016-04-261-0/+2
| | | | | | | This exposes 8 images for all shader types. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: inform users that 3D images are not fully supportedSamuel Pitoiset2016-04-263-5/+10
| | | | | | | | | 3D images are a bit more complicated to implement and will probably requires a bunch of headaches and we don't care for now because they do not seem to be really used by apps. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reduce GL_MAX_3D_TEXTURE_SIZE to 2048 on Kepler+Samuel Pitoiset2016-04-261-1/+1
| | | | | | | | | The blob sets it to 2048 and using 4096 reports an INVALID_DATA error with RT_ARRAY_MODE when z is 4096. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0/ir: check that the image format doesn't mismatchSamuel Pitoiset2016-04-263-2/+25
| | | | | | | | | | This re-uses NVE4_SU_INFO_CALL which is not used anymore because we don't use our lib for format conversions. While we are at it, add a todo for image buffers because there are some robustness-related issues to fix. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: prevent out of bounds when no images are boundSamuel Pitoiset2016-04-261-2/+19
| | | | | | | | | | Checking if the image address is not 0 should be enough to prevent read faults. To improve robustness, make sure that the destination value of atomic operations is correctly initialized in case the instruction is not performed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: add indirect support for images on KeplerSamuel Pitoiset2016-04-261-12/+28
| | | | | | | | This fixes arb_shader_image_load_store-indexing and arb_shader_image_load_store-max-images. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix 1D arrays images for KeplerSamuel Pitoiset2016-04-261-2/+9
| | | | | | | For 1D arrays, the array index is stored in the Z component. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix cube images for KeplerSamuel Pitoiset2016-04-261-5/+5
| | | | | | | Like 2d array images, the z-dimension needs to be clamped. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add support for SULDP -> SULDB conversionIlia Mirkin2016-04-265-45/+293
| | | | | | | | | | This will allow to convert surface formats without adding an extra call to our lib. [hakzsam: make use of this for GK104] Signed-off-by: Samuel Pitoiset <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* nv50/ir: make use of OP_SUQ for surfaces querySamuel Pitoiset2016-04-264-11/+71
| | | | | | | | | This implements RESQ for surfaces which comes from imageSize() GLSL bultin. As the dimensions are sticked into the driver constant buffer, this only has to be lowered with loads. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v2)
* nv50/ir: add OP_BUFQ for buffers querySamuel Pitoiset2016-04-266-7/+23
| | | | | | | | | | TGSI RESQ allows both images and buffers but we have to make a distinction between these two type of resources in our lowering pass. Introducing OP_BUFQ which is a fake operand will allow to implement OP_SUQ for surfaces. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: enable early fragment test with explicit user controlSamuel Pitoiset2016-04-261-0/+3
| | | | | | | | | | | | | | This feature can be enabled in two ways: as an optimization and by explicit user control (with OpenGL 4.2 or ARB_shader_image_load_store). This makes use of the recent TGSI_PROPERTY_FS_EARLY_DEPTH_STENCIL to force early fragment tests when needed. This fixes a bunch of dEQP-GLES31.functional.image_load_store.early_fragment_tests.* tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: fix constraints for OP_SUSTx on KeplerSamuel Pitoiset2016-04-261-1/+3
| | | | | | | | Destination type is actually always 32-bits, so typeSizeof() returns 4 and no sources are condensed. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: re-introduce TGSI lowering pass for imagesSamuel Pitoiset2016-04-261-3/+94
| | | | | | | | This is loosely based on the previous lowering pass wrote by calim four years ago. I did clean the code and fixed some issues. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: add support for TGSI image declarationsSamuel Pitoiset2016-04-261-1/+22
| | | | | | | | Old and dead resource code will be removed once images are completely done. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add missing glMemoryBarrier bitsSamuel Pitoiset2016-04-261-1/+8
| | | | | | | | This fixes a bunch of subtests of arb_shader_image_load_store-host-mem-barrier. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1)
* nvc0: enable RGB10_A2UI format on GK104Samuel Pitoiset2016-04-261-3/+3
| | | | | | | | | No clue why this was not enabled by default before, maybe because the SULDP conversion was wrong. Anyway, this helps in fixing all rgb10_a2ui piglit tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: shift address with blocksize for image buffersSamuel Pitoiset2016-04-261-0/+4
| | | | | | | This fixes a bunch of dEQP image buffers related tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>