summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/nouveau/nvc0
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: add missing glMemoryBarrier bitsSamuel Pitoiset2016-04-261-1/+8
| | | | | | | | This fixes a bunch of subtests of arb_shader_image_load_store-host-mem-barrier. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> (v1)
* nvc0: enable RGB10_A2UI format on GK104Samuel Pitoiset2016-04-261-3/+3
| | | | | | | | | No clue why this was not enabled by default before, maybe because the SULDP conversion was wrong. Anyway, this helps in fixing all rgb10_a2ui piglit tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: shift address with blocksize for image buffersSamuel Pitoiset2016-04-261-0/+4
| | | | | | | This fixes a bunch of dEQP image buffers related tests. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix address offset when images have multiple levelsSamuel Pitoiset2016-04-261-0/+1
| | | | | | | This fixes arb_shader_image_load_store-level. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind images on 3D shaders for KeplerSamuel Pitoiset2016-04-262-2/+31
| | | | | | | Similar to surfaces validation for compute shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind images on compute shaders for KeplerSamuel Pitoiset2016-04-264-28/+110
| | | | | | | | | | | Old surfaces validation code will be removed once images are completely done for Fermi/Kepler, that explains why I only disable it for now. This also introduces nvc0_get_surface_dims() which computes correct dimensions regarding the given target. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reserve an area for surfaces info in the driver constbufSamuel Pitoiset2016-04-266-12/+12
| | | | | | | | | | | | | To process surfaces coordinates from the codegen part, and because some information like the format is not always available (eg. when writeonly is used), we have to stick some surfaces data in the driver constbuf. This is especially true for OpenCL because we don't know the format at shader compile time. This bumps the size of each shader area from 1K to 2K. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add preliminary support for imagesSamuel Pitoiset2016-04-265-2/+74
| | | | | | | | | | | | This implements set_shader_images() and resource invalidation for images. As OpenGL requires at least 8 images, we are going to expose this minimum value even if this might be raised for Kepler, but this limit is mainly for Fermi because the hardware only accepts 8 images. Based on original patch by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump the amount of shared memory per MP on MaxwellSamuel Pitoiset2016-04-261-1/+11
| | | | | | | | According to the CUDA compute capability version, GM10x can expose 64KB of shared memory while GM20x can use 96KB. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: fix retrieving query results into buffer for timestampsIlia Mirkin2016-04-221-5/+15
| | | | | | | | | The timestamps are stored in a funny place, and even though they are a 64-bit result, are not stored with is64bit. Account for that when retrieving the query result into a resource. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.2" <[email protected]>
* gallium: add bool return to pipe_context::end_queryNicolai Hähnle2016-04-211-1/+2
| | | | | | | | | Even when begin_query succeeds, there can still be failures in query handling. For example for radeon, additional buffers may have to be allocated when queries span multiple command buffers. Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallium: use PIPE_SHADER_* everywhere, remove TGSI_PROCESSOR_*Marek Olšák2016-04-221-1/+1
| | | | Acked-by: Jose Fonseca <[email protected]>
* gallium: merge PIPE_SWIZZLE_* and UTIL_FORMAT_SWIZZLE_*Marek Olšák2016-04-222-10/+10
| | | | | | | | Use PIPE_SWIZZLE_* everywhere. Use X/Y/Z/W/0/1 instead of RED, GREEN, BLUE, ALPHA, ZERO, ONE. The new enum is called pipe_swizzle. Acked-by: Jose Fonseca <[email protected]>
* nvc0: avoid tex read fault from compute shaders on GK110Samuel Pitoiset2016-04-201-0/+3
| | | | | | | | | | | | After some investigation, it seems like that disabling the UNK02C4 command avoid a read fault with texelFetch() from a compute shader. I have no clue on what this method actually does, but this avoid the GPU to hang with basic-texelFetch.shader_test without introducing any compute-related regressions. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: do not break the universe on GK110+Samuel Pitoiset2016-04-141-0/+1
| | | | | | | | I removed that return 0 by mistake. Ooops. Fixes: 6e23fd4 ("nvc0: allow to use compute support on GM200") Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: allow to use compute support on GM200Samuel Pitoiset2016-04-142-2/+4
| | | | | | | | | This works like a charm but please not that NVF0_COMPUTE have to be set because compute support is still not enabled by default on GK110+. This will require more testing to make sure it won't break the 3D state. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium: Add capability for ARB_robust_buffer_access_behavior.Bas Nieuwenhuizen2016-04-121-0/+1
| | | | | | Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* gallium: add pipe_context::set_active_query_state for pausing queriesMarek Olšák2016-04-121-0/+6
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* nvc0: handle the case where there are no framebuffer attachmentsIlia Mirkin2016-04-094-11/+44
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: support sending string markers down into the command streamIlia Mirkin2016-04-092-1/+26
| | | | | | | This should hopefully make it a little easier to debug with GL applications like glretrace and looking at command streams. Signed-off-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add invalidate_resource support for buffer resourcesIlia Mirkin2016-04-092-1/+2
| | | | | | | Provide a callback to reallocate the underlying storage of a resource so that it is not bound to any existing fences. Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: Add PIPE_CAP_FRAMEBUFFER_NO_ATTACHMENTEdward O'Callaghan2016-04-071-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Add PIPE_CAP to determine if the GL extension 'GL_ARB_framebuffer_no_attachments' shall be supported. The driver is required to support 'PIPE_FORMAT_NONE' via its 'is_format_supported()' callback in order to determine the MSAA modes the hardware supports so that values requested from the application using 'GL_ARB_framebuffer_no_attachments' may be quantized to what the hardware expects. V.2: Fix doc for a more detailed description of the PIPE_CAP and the corresponding GL constant. V.3: Renamed and repurposed once again. V.4: Remove CAP from cap_mapping array. [airlied: fix damaged whitespace] Signed-off-by: Edward O'Callaghan <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* nvc0: add hardware ETC2 and ASTC support on GK20A and GM107+Ilia Mirkin2016-04-042-2/+18
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: distinguish between shader IR in get_compute_paramBas Nieuwenhuizen2016-04-021-0/+1
| | | | | | | | | | | | | For radeonsi, native and TGSI use different compilers and this results in different limits for different IR's. The set we strictly need for radeonsi is only the MAX_BLOCK_SIZE and MAX_THREADS_PER_BLOCK params, but I added a few others as shader related that seemed like they would also typically depend on the compiler. Signed-off-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Dave Airlie <[email protected]>
* nvc0: enable compute shaders on GK104 and GM107+Samuel Pitoiset2016-04-011-1/+2
| | | | | | | | | Compute support on GK110 is still unstable for weird reasons, but this can be fixed later as the NVF0_COMPUTE envvar prevent using compute. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bump the maximum number of UBOs for compute on KeplerSamuel Pitoiset2016-04-012-3/+0
| | | | | | | | The maximum number of uniform blocks (MAX_COMPUTE_UNIFORM_BLOCKS) per compute program must be at least 12. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: add indirect compute support on KeplerSamuel Pitoiset2016-04-011-34/+77
| | | | | | | | | | The grid size is stored as three 32-bits integers in the indirect buffer but the launch descriptor uses a 32-bits integer for both griddim_y and griddim_z like this (z << 16) | y. To make it work, the 16 high bits of griddim_y are overwritten by griddim_z. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: reduce likelihood of collision for real buffers on KeplerSamuel Pitoiset2016-04-011-2/+2
| | | | | | | | | | | Reduce likelihood of collision with real buffers by placing the hole at the top of the 4G area. This fixes some indirect draw+compute tests with large buffers. Suggested by Ilia Mirkin. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: store ubo info to the driver constbuf on KeplerSamuel Pitoiset2016-04-013-1/+29
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind user uniforms for compute on KeplerSamuel Pitoiset2016-04-012-27/+55
| | | | | | | | | | | Uniform buffer objects will be sticked to the driver constant buffer like buffers because the launch descriptor only allows 8 CBs. Input kernel parameters for OpenCL are still uploaded to screen->parm which is bound on c0, but this will be changed later with a new series. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind shader buffers for compute on KeplerSamuel Pitoiset2016-04-012-3/+39
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0: bind driver cb for compute on c7[] for KeplerSamuel Pitoiset2016-04-014-45/+37
| | | | | | | | | | | | Instead of using the screen->parm buffer object which will be removed, upload auxiliary constants to uniform_bo to be consistent regarding what we already do for Fermi. This breaks surfaces support (for compute only) but this will be properly re-introduced later for ARB_shader_image_load_store. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: add PIPE_BIND_LINEAR support to is_format_supportedIlia Mirkin2016-03-311-0/+9
| | | | | | | vdpau has recently come to rely on this, so make sure to check it properly. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: use a different offset for buffers and surfacesSamuel Pitoiset2016-03-291-3/+5
| | | | | | | | | | To not overwrite buffers and surfaces information, we need to use a different offset in the driver constant buffer. Currently, OP_SUQ is only supported for buffers but this will be slightly updated for images support. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: make sure to disable fetches from previously-set VBOs when blittingIlia Mirkin2016-03-281-0/+2
| | | | | | | We disable the vertex attributes, but also disable the VBO fetch details as well, just in case. Not known to fix anything. Signed-off-by: Ilia Mirkin <[email protected]>
* nvc0: disable primitive restart and index bias during blitsIlia Mirkin2016-03-281-0/+11
| | | | | | | | | | | | | | | | Back in the dawn of time, we used to do immediate uploads for the vertex data, and all was well. However Maxwell dropped support for immediate vertex data, so we started feeding in a VBO (in all cases). But we forgot to disable some things that apply in such cases, specifically primitive restart and index bias. The latter was causing WoW and other Blizzard games trouble as they use a pattern where they draw with a base vertex (aka index bias), followed by texture uploads (aka blits, internally). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526 Cc: "11.1 11.2" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Tested-by: Karol Herbst <[email protected]>
* nvc0: make sure to delete samplers used by compute shadersSamuel Pitoiset2016-03-211-1/+1
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: avoid using magic numbers for the uniform_bo offsetsSamuel Pitoiset2016-03-198-44/+73
| | | | | | | | | | | | | | | Instead make use of constants to improve readability. The first 32 bytes of the driver constant buffer are unknown... This doesn't seem to be used in the codegen part, but if the texBindBase offset is shifted from 0x20 to 0x00, this breaks the universe for really weird reasons. This sounds like to be related to textures. Anyway, name this NVC0_CB_AUX_UNK_INFO and add a todo should be enough for now. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nv50,nvc0: replace resInfoCBSlot by auxCBSlotSamuel Pitoiset2016-03-191-3/+1
| | | | | | | | | Having two different variables for the driver constant buffer slot is confusing and really useless. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]> Acked-by: Pierre Moreau <[email protected]>
* nv50,nvc0: Set only NEW_CP_GLOBALS upon bindingPierre Moreau2016-03-131-1/+1
| | | | | Signed-off-by: Pierre Moreau <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nv50,nvc0: handle SQRT lowering inside the driverIlia Mirkin2016-03-131-1/+1
| | | | | | | | | | | | | | | | | | | | | First off, st/mesa lowers DSQRT incorrectly (it uses CMP to attempt to find out whether the input is less than 0). Secondly the current approach (x * rsq(x)) behaves poorly for x = inf - a NaN is produced instead of inf. Instead we switch to the less accurate rcp(rsq(x)) method - this behaves nicely for all valid inputs. We still don't do this for DSQRT since the RSQ/RCP ops are *really* inaccurate, and don't even have Newton-Raphson steps right now. Eventually we should have a separate library function for DSQRT that does it more precisely (and perhaps move this lowering to the post-opt phase). This fixes a number of dEQP precision tests that were expecting better behavior for infinite inputs. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Tested-by: Samuel Pitoiset <[email protected]>
* nvc0: fix blit triangle size to fully cover FB's > 8192x8192Ilia Mirkin2016-03-131-4/+4
| | | | | | | | | | | | | | The idea is that a single triangle will cover the whole area being drawn, allowing the blit shader to do its work. However the max fb size is 16384x16384, which means that the triangle we draw needs to be twice that in order to cover the whole area fully. Increase the size of the triangle to 32768x32768. This fixes a number of dEQP tests that were failing because a blit was involved which would miss some of the resulting texture. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "11.1 11.2" <[email protected]>
* nvc0: add support for TGSI FMA opsIlia Mirkin2016-03-101-1/+2
| | | | | | | | This will allow the nouveau backend to not try and split up ops that are fused in GLSL. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* nvc0: expose SM35 perf counters to AMD_performance_monitorSamuel Pitoiset2016-03-101-2/+2
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: add driver metrics for SM35 (GK110)Samuel Pitoiset2016-03-101-1/+20
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: add MP performance counters for SM35 (GK110)Samuel Pitoiset2016-03-103-17/+204
| | | | | | | | | Because compute support is not enabled by default for these chipsets, NVF0_COMPUTE=1 needs to be used, along with GALLIUM_HUD to enable performance counters. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: explode config of Kepler hardware SM eventsSamuel Pitoiset2016-03-101-78/+477
| | | | | | | | This is really verbose but most of the configuration will be reused for SM35 (GK110). Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: rework the driver metrics infrastructureSamuel Pitoiset2016-03-103-157/+172
| | | | | | | This follows the same design as MP perf counters. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* nvc0: rework the MP counters infrastructureSamuel Pitoiset2016-03-104-268/+243
| | | | | | | This mainly improves how we define the different list of queries. Signed-off-by: Samuel Pitoiset <[email protected]> Acked-by: Ilia Mirkin <[email protected]>
* gallium: add CAPs returning PCI device locationMarek Olšák2016-03-091-0/+4
| | | | Reviewed-by: Brian Paul <[email protected]>