summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers
Commit message (Collapse)AuthorAgeFilesLines
* nvc0: fix up image support for allowing multiple samplesIlia Mirkin2016-07-017-49/+108
| | | | | | | | | Basically we just have to scale up the coordinates and then add the relevant sample offset. The code to handle this was already largely present from Christoph's earlier attempts to pipe images through back in the dark ages, this just hooks it all up. Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: go back to not using viewport validate function for swtnlIlia Mirkin2016-07-012-1/+16
| | | | | | | | | The output of draw requires a null viewport transform, which the regular code is ill-equiped to do. Reinstate the original settings in the render path, and add setting of the viewport clip polygon based on fb width/height (as that is all taken care of by draw). Signed-off-by: Ilia Mirkin <[email protected]>
* nv30: fix viewport clipping settings to be based on viewport, not rtIlia Mirkin2016-07-012-17/+11
| | | | | | | This fixes a ton of "*clip*" dEQP GLES2 tests, as well as triangle-guardband-viewport in piglit. Signed-off-by: Ilia Mirkin <[email protected]>
* swr: Refactor checks for compiler feature flagsChuck Atkins2016-06-301-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | Encapsulate the test for which flags are needed to get a compiler to support certain features. Along with this, give various options to try for AVX and AVX2 support. Ideally we want to use specific instruction set feature flags, like -mavx2 for instance instead of -march=haswell, but the flags required for certain compilers are different. This allows, for AVX2 for instance, GCC to use -mavx2 -mfma -mbmi2 -mf16c while the Intel compiler which doesn't support those flags can fall back to using -march=core-avx2. This addresses a bug where the Intel compiler will silently ignore the AVX2 instruction feature flags and then potentially fail to build. v2: Pass preprocessor-check argument as true-state instead of false-state for clarity. v3: Reduce AVX2 define test to just __AVX2__. Additional defines suchas __FMA__, __BMI2__, and __F16C__ appear to be inconsistently defined w.r.t thier availability. v4: Fix C++11 flags being added globally and add more logic to swr_require_cxx_feature_flags Cc: <[email protected]> Reviewed-by: Tim Rowley <[email protected]> Tested-by: Tim Rowley <[email protected]> Signed-off-by: Chuck Atkins <[email protected]>
* svga: use SVGA3D_vgpu10_BufferCopy() for buffer copiesBrian Paul2016-06-301-4/+28
| | | | | | | | | | So that we do copies host-side rather than in the guest with map/memcpy. Tested with piglit arb_copy_buffer-subdata-sync test and new arb_copy_buffer-intra-buffer-copy test. Reviewed-by: Charmaine Lee <[email protected]> Acked-by: Roland Scheidegger <[email protected]>
* svga: add SVGA3D_vgpu10_BufferCopy()Brian Paul2016-06-302-0/+30
| | | | | Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: flush buffers when mapping for readingBrian Paul2016-06-301-13/+24
| | | | | | | | | | | | | | | With host-side buffer copies (via SVGA3D_vgpu10_BufferCopy()) we have to make sure any pending map-write operations are completed before reading if the buffer is dirty. Otherwise the ReadbackSubResource operation could get stale data from the host buffer. This allows the piglit arb_copy_buffer-subdata-sync test to pass when we start using the SVGA3D_vgpu10_BufferCopy command. v2: check the sbuf->dirty flag in the outer conditional, per Charmaine. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: enable ARB_copy_image extension in the driverNeha Bhende2016-06-301-1/+2
| | | | | | Reviewed-by: Brian Paul <[email protected]> Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: try blitting with copy region in more casesBrian Paul2016-06-301-1/+7
| | | | | | | | | | We previously could do blits with util_resource_copy_region() when doing 'loose' format checking. Also do blits with util_resource_copy_region() when the blit src/dst formats (not the underlying resources) exactly match. Needed for GL_ARB_copy_image. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: use copy_region_vgpu10() for region copies when possibleBrian Paul2016-06-301-4/+37
| | | | | | | v2: remove extra svga_define_texture_level() call, per Charmaine. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: use vgpu10 CopyRegion command when possibleNeha Bhende2016-06-301-2/+147
| | | | | | | | | Do texture->texture copies host-side with this command when possible. Use the previous software fallback otherwise. Reviewed-by: Brian Paul <[email protected]> Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: set render target flag for snorm surfacesBrian Paul2016-06-301-0/+10
| | | | | | | | | We don't normally support rendering to SNORM surfaces, but with GL_ARB_copy_image we can copy to them if we treat them as typeless and use a UNORM surface view. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: add new svga_format_is_uncompressed_snorm() helperBrian Paul2016-06-302-0/+24
| | | | | Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: adjust sampler view format for RGBXBrian Paul2016-06-301-1/+5
| | | | | | | | We previously handled the case of a RGBX sampler view of a RGBA surface. Add the reverse case too. For GL_ARB_copy_image. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: adjust render target view format for RGBXBrian Paul2016-06-301-1/+13
| | | | | | | | For GL_ARB_copy_image we may be asked to create an RGBA view of a RGBX surface. Use an RGBX view format for that case. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: don't advertise support for R32G32B32_UINT/SINT surface formatsNeha Bhende2016-06-301-2/+2
| | | | | | | | | | | | | | | | | | We want to be able to copy between different 32-bit, 3-channel surface formats for GL_ARB_copy_image but since we don't support R32G32B32_FLOAT for textures (it's not blendable and wouldn't work for render to texture) we can't support 32-bit, 3-channel integer formats. The state tracker will choose 4-channel formats instead. Fixes the piglit arb_copy_image-format test for several cases. Note: This change may need to be revisited if/when the texture_view exension is enabled in driver. Reviewed-by: Brian Paul <[email protected]> Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: use untyped surface formats in most casesBrian Paul2016-06-301-4/+7
| | | | | | | | | This allows us to do copies between different, but compatible, surface formats such as RGBA8_UNORM, RGBA8_SINT, RGBA8_UINT, etc. for GL_ARB_copy_image. Acked-by: Roland Scheidegger <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* svga: Fix failures caused in fedora 24Neha Bhende2016-06-303-6/+15
| | | | | | | | | | | SVGA_3D_CMD_DX_GENRATE_MIPMAP & SVGA_3D_CMD_DX_SET_PREDICATION commands are not presents in fedora 24 kernel module. Because of this reason application like supertuxkart are not running. v2: Add few comments and code modifications suggested by Brian P. Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Charmaine Lee <[email protected]>
* radeon/uvd: fix a h265 context size bugsonjiang2016-06-291-0/+3
| | | | | | | Signed-off-by: sonjiang <[email protected]> Cc: "12.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeon/uvd: seperate uvd context buffer from DPBsonjiang2016-06-291-9/+97
| | | | | | | Signed-off-by: sonjiang <[email protected]> Cc: "12.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
* radeon uvd add uvd fw version for amdgpusonjiang2016-06-291-0/+1
| | | | | | | Signed-off-by: sonjiang <[email protected]> Cc: "12.0" <[email protected]> Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Christian König <[email protected]>
* nv50/ir: print EMIT subops in debug modeSamuel Pitoiset2016-06-291-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print RSQ/RCP subops in debug modeSamuel Pitoiset2016-06-291-0/+10
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print PIXLD subops in debug modeSamuel Pitoiset2016-06-291-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* nv50/ir: print SHFL subops in debug modeSamuel Pitoiset2016-06-291-0/+9
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]>
* gallium/radeon: remove zombie textures kept alive by DCC stat gatheringMarek Olšák2016-06-291-12/+27
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: don't re-create queries for DCC stat gatheringMarek Olšák2016-06-293-5/+10
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: assume X11 DRI3 can use at most 5 back buffersMarek Olšák2016-06-291-1/+5
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: separate DCC starts as disabled (ps_draw_ratio = 0)Marek Olšák2016-06-291-9/+10
| | | | | | | | | | | | | | DRI3: - Only slows clears can enable it for the first frame. - A good PS/draw ratio can enable it for other frames. DRI2: - Only slows clears can enable it for a frame. - Page-flipped color buffers are unref'd at the end of each frame, so it can't be enabled in any other way. - Relying on slow clears is sufficient for our synthetic benchmarks. Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: R600_DEBUG=nodccfb disables separate DCCMarek Olšák2016-06-293-1/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add and use r600_texture_referenceMarek Olšák2016-06-295-10/+13
| | | | | | Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Vedran Miletić <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a HUD query for PS draw ratio stats from separate DCCMarek Olšák2016-06-294-0/+8
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add a heuristic enabling DCC for scanout surfaces (v2)Marek Olšák2016-06-296-4/+338
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | DCC for displayable surfaces is allocated in a separate buffer and is enabled or disabled based on PS invocations from 2 frames ago (to let queries go idle) and the number of slow clears from the current frame. At least an equivalent of 5 fullscreen draws or slow clears must be done to enable DCC. (PS invocations / (width * height) + num_slow_clears >= 5) Pipeline statistic queries are always active if a color buffer that can have separate DCC is bound, even if separate DCC is disabled. That means the window color buffer is always monitored and DCC is enabled only when the situation is right. The tracking of per-texture queries in r600_common_context is quite ugly, but I don't see a better way. The first fast clear always enables DCC. DCC decompression can disable it. A later fast clear can enable it again. Enable/disable typically happens only once per frame. The impact is expected to be negligible because games usually don't have a high level of overdraw. DCC usually activates when too much blending is happening (smoke rendering) or when testing glClear performance and CMASK isn't supported (Stoney). v2: rename stuff, add assertions Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add state setup for a separate DCC bufferMarek Olšák2016-06-294-5/+41
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: always calculate DCC info even if it's not used immediatelyMarek Olšák2016-06-291-1/+2
| | | | | | for a later use Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: unreference framebuffer state with set_framebuffer_stateMarek Olšák2016-06-293-4/+6
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium/radeon: add flag R600_QUERY_HW_FLAG_BEGIN_RESUMESMarek Olšák2016-06-292-1/+4
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't advertise multisample shader imagesMarek Olšák2016-06-291-0/+3
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable distributed tess on multi-SE parts onlyMarek Olšák2016-06-294-2/+7
| | | | | | | ported from Vulkan Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: set optimal VGT_HS_OFFCHIP_PARAMMarek Olšák2016-06-295-14/+49
| | | | | | | ported from Vulkan Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: enable CU0 in each SE for LS-HS executionMarek Olšák2016-06-291-2/+1
| | | | | | | Offchip-only tessellation allows this. Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: use conformant line rasterizationMarek Olšák2016-06-294-5/+29
| | | | | | | | | | AA lines are not completely correct (see TODO), but everything else should be. + 3 linestipple piglits Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* svga: force direct map for transfering multiple slicesCharmaine Lee2016-06-281-15/+24
| | | | | | | | | | | | | With commit fb9fe35, we start using transfer_inline_write for memcpy of TexSubImage. But SurfaceDMA command does not work well with texture array. This patch forces direct map when transfering multiple slices of a texture array. Fixes piglit regression "texelFetch fs sampler1DArray" Tested with MTT piglit, glretrace, conform. Reviewed-by: Sinclair Yeh <[email protected]>
* svga: whitespace, line wrapping fixes in svga_surface.cBrian Paul2016-06-281-11/+16
|
* gm107/ir: make sure that flagsDef is set when emitting setcondSamuel Pitoiset2016-06-281-1/+1
| | | | | | | | | | | Rely on the existence of a second destination when emitting a setcond flag is dangerous, because this doesn't mean that the flag has been correctly set. Instead rely on flagsDef like what emitX() does for flagsSrc. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* radeonsi: set PA_SU_SMALL_PRIM_FILTER_CNTL register on PolarisMarek Olšák2016-06-282-0/+11
| | | | | | | This was missing. Cc: 12.0 <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* radeon/vce: use vce structure for vce 52 firmwareBoyuan Zhang2016-06-285-98/+517
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* radeon/vce: add vce structuresBoyuan Zhang2016-06-281-0/+297
| | | | | | Signed-off-by: Boyuan Zhang <[email protected]> Reviewed-by: Christian König <[email protected]> Reviewed-by: Leo Liu <[email protected]>
* gm107/ir: add missing setcond flags for LOP variantsSamuel Pitoiset2016-06-281-0/+2
| | | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>
* gm107/ir: make use of LOP32I for all immediatesSamuel Pitoiset2016-06-281-1/+1
| | | | | | | | LOP only allows to emit 19-bits immediates. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Ilia Mirkin <[email protected]> Cc: <[email protected]>