aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/r600
Commit message (Collapse)AuthorAgeFilesLines
* r600/radeonsi: implement new float comparison instructionsRoland Scheidegger2013-08-151-10/+8
| | | | | | | Also use ordered comparisons for old cmp instructions. Tested-by: Michel Dänzer <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* vl: Add support for max level query v2Rico Schüller2013-08-141-0/+2
| | | | | | | | | This patch adds the level query support to the video decoders and uses some more reasonable defaults. v2: (ck) add commit message Reviewed-by: Christian König <[email protected]>
* r600g/sb: use MULADD workaround on R7xx for MULADD_IEEEVadim Girlin2013-08-141-1/+2
| | | | | | | | | | | | Looks like the same issue that was seen with MULADD in trans slot on R7xx also affects MULADD_IEEE (maybe all OP3 instructions and MULADD is just a most frequently used?). So the workaround is to not allow affected instructions to be placed into the trans slot. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=67927 Signed-off-by: Vadim Girlin <[email protected]> Cc: "9.2" <[email protected]>
* r600g/sb: Dump correct value for CND.Vinson Lee2013-08-041-1/+1
| | | | | | | Fixes "Copy-paste error" reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Vadim Girlin <[email protected]>
* r600g: honour semantic index in fragment color exportsChristoph Bumiller2013-08-021-5/+5
| | | | Signed-off-by: Marek Olšák <[email protected]>
* r600g/compute: Added missing address space checking of kernel parametersJonathan Charest2013-07-301-3/+2
| | | | | | | | | | | To have non-static buffers in local memory, it is necessary to pass them as arguments to the kernel. For r600, the correct lds size must be set to the SQ_LDS_ALLOC register. The correct size is the clover size plus the size reported by the compiler. Reviewed-by: Tom Stellard <[email protected]>
* gallium: Add PIPE_CAP_ENDIANNESSTom Stellard2013-07-221-0/+2
| | | | | | Cc: [email protected] [ Francisco Jerez: Fix "PIPE_ENDIAN_SMALL" in the documentation, define PIPE_ENDIAN_NATIVE. ]
* r600g: use WAIT_3D_IDLE before using CP DMAMarek Olšák2013-07-182-0/+2
| | | | I broke this with 7948ed1250cae78ae1b22dbce4ab23aceacc6159 for r700 at least.
* r600g/sb: improve alu packing on caymanVadim Girlin2013-07-172-15/+89
| | | | | | | | | | | | | | | | Scheduler/register allocator in r600-sb was developed and optimized on evergreen (VLIW-5) hardware, so currently it's not optimal for VLIW-4 chips. This patch should improve performance on cayman gpus due to better alu packing, but also it tends to increase register usage, so overall positive effect on performance has to be proven by real benchmarks yet. Some results with bfgminer kernel on cayman: source bytecode: 60 gprs, 3905 alu groups, sbcl before the patch: 45 gprs, 4088 alu groups, sbcl with this patch: 55 gprs, 3474 alu groups. Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix handling of new multislot instructions on caymanVadim Girlin2013-07-173-5/+6
| | | | | | | Ex-scalar instructions that became multislot on cayman do replicate result to all channels - handle them similar to DOT4. Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix debug dump code in schedulerVadim Girlin2013-07-171-4/+5
| | | | | | Update the stale debug code for other changes related to debug output. Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix initial register allocationVadim Girlin2013-07-171-0/+1
| | | | | | | | | | Mark values that are members of the 'same register' constraint as preallocated in ra_init pass, this will prevent incorrect reallocation in scheduler in some cases. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713 Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: move chip & class name functions to sb_contextVadim Girlin2013-07-174-53/+55
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: fix handling of PS in source bytecode on caymanVadim Girlin2013-07-171-0/+5
| | | | | | | | | Actually PS doesn't make sense for cayman and isn't even mentioned in cayman docs, but llvm backend currently uses it in bytecode and, assuming that hw seems to be mostly ok with it, this will allow sb to parse such source bytecode correctly. Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: Initialize ra_checker member variables.Vinson Lee2013-07-171-1/+1
| | | | | | Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]>
* r600g/sb: Initialize ra_constraint::cost.Vinson Lee2013-07-131-1/+1
| | | | | | Fixes "Uninitialized scalar field" reported by Coverity. Signed-off-by: Vinson Lee <[email protected]>
* r600g: don't use the CB/DB CP COHER logic on r6xxAlex Deucher2013-07-121-2/+10
| | | | | | | | | There are hw bugs. Flush and inv event is sufficient. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66837 Signed-off-by: Alex Deucher <[email protected]>
* tgsi: rename the TGSI fragment kill opcodesBrian Paul2013-07-121-7/+7
| | | | | | | | | | | | | | | | | | | | | TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The later was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes: TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0) TGSI_OPCODE_KILP -> KILL (unconditional kill) Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <[email protected]>
* radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2Christian König2013-07-121-4/+15
| | | | | | | | | | | UVD 2.x doesn't support hardware decoding of MPEG2, just use shader based decoding for those chipsets. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450 v2: fix interlacing as well Signed-off-by: Christian König <[email protected]>
* r600g: x/y coordinates must be divided by block dim in dma blitChristoph Bumiller2013-07-112-4/+16
| | | | | | | Note: this is a candidate for the 9.1 branch. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
* r600g/sb: Fix Android build v2Chih-Wei Huang2013-07-124-7/+8
| | | | | | | Add the sb CXX files to the Android Makefile and also stop using some c++11 features. v2 (Vadim Girlin): use &bc[0] instead of bc.begin()
* r600g/sb: improve math optimizations v2Vadim Girlin2013-07-1111-47/+435
| | | | | | | | | | | | | | | | This patch adds support for some math optimizations that are generally considered unsafe, that's why they are currently disabled for compute shaders. GL requirements are less strict, so they are enabled for for GL shaders by default. In case of any issues with applications that rely on higher precision than guaranteed by GL, 'sbsafemath' option in R600_DEBUG allows to disable them. v2 - always set proper src vector size for transformed instructions - check for clamp modifier in the expr_handler::fold_assoc Signed-off-by: Vadim Girlin <[email protected]>
* r600g: improve the mechanism for recognizing an empty CSMarek Olšák2013-07-083-3/+8
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: explicitly flush caches for streamout-based buffer copying & clearingMarek Olšák2013-07-081-0/+13
| | | | | | | It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <[email protected]>
* r600g: only flush the caches that need to be flushed during CP DMA operationsMarek Olšák2013-07-083-32/+117
| | | | | | | This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <[email protected]>
* r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flagsMarek Olšák2013-07-085-27/+52
| | | | | | | also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <[email protected]>
* r600g: adjust flush flags (v3)Alex Deucher2013-07-086-7/+42
| | | | | | | | | | | | | 1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't call buffer_wait in buffer_mmap_sync_with_ringsMarek Olšák2013-07-081-2/+1
| | | | | | | | The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't read back the MSAA depth buffer if the read flag is not setMarek Olšák2013-07-081-8/+8
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: don't flush the context in texture_transfer_mapMarek Olšák2013-07-081-5/+0
| | | | | | the winsys does this automatically Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix texture offset computation for mapped MSAA depth buffersMarek Olšák2013-07-082-16/+14
| | | | | | | | | It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix color resolve for RGBX8 and RGBX16 integer formatsMarek Olšák2013-07-081-2/+2
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: enable fast MSAA color clear for array/3D/cube texturesMarek Olšák2013-07-081-4/+3
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: implement fast MSAA color clear for integer texturesMarek Olšák2013-07-081-9/+12
| | | | | | | this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <[email protected]>
* r600/uvd: fix check for UVD 2.xChristian König2013-07-081-1/+1
| | | | Signed-off-by: Christian König <[email protected]>
* mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependenciesMarek Olšák2013-07-021-1/+0
| | | | | | Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <[email protected]>
* r600g: implement fast color clears for MSAA on evergreen+Grigori Goronzy2013-07-013-2/+79
| | | | | | | | | | | | | | | | | | | | | | Allows MSAA colorbuffers, which have a CMASK automatically and don't need any further special handling, to be fast cleared. Instead of clearing the buffer, set the clear color and the CMASK to the cleared state. Fast clear is used only when all bound colorbuffers fulfill certain conditions: a CMASK is required, we have to be able to create a clear color value for the format and the texture mustn't contain multiple images. Technically, it should be possible to support array textures and cubemaps if all images are attached to the framebuffer, but this does not appear to be common. v2: fix fast clear check v3: Marek: - disable fast clear with 128-bit formats, which are unsupported - set tex->dirty_level_mask in r600_clear, so that the driver knows the resource must be decompressed/expanded - return early from r600_clear if there's nothing else to do Signed-off-by: Marek Olšák <[email protected]>
* r600g/compute: disable unused colorbuffer slotsMarek Olšák2013-07-011-1/+12
| | | | | Reviewed-by: Alex Deucher <[email protected]> Tested-by: Tom Stellard <[email protected]>
* r600g: Fix buildTom Stellard2013-06-281-2/+2
| | | | | Broken since 2840bec56f79347b95dec5458b20d4a46d1aa445 when opencl is disabled.
* r600g/compute: Accept LDS size from the LLVM backendTom Stellard2013-06-285-20/+44
| | | | | | And allocate the correct amount before dispatching the kernel. Tested-by: Aaron Watry <[email protected]>
* r600g/compute: Move compute_shader_create() function into evergreen_compute.cTom Stellard2013-06-283-37/+23
| | | | Tested-by: Aaron Watry <[email protected]>
* gallium: add condition parameter to render_conditionRoland Scheidegger2013-06-184-2/+8
| | | | | | | | | | | | | For conditional rendering this makes it possible to skip rendering if either the predicate is true or false, as supported by d3d10 (in fact previously it was sort of implied skip rendering if predicate is false for occlusion predicate, and true for so_overflow predicate). There's no cap bit for this as presumably all drivers could do it trivially (but this patch does not implement it for the drivers using true hw predicates, nvxx, r600, radeonsi, no change is expected for OpenGL functionality). Reviewed-by: Jose Fonseca <[email protected]>
* gallium: replace bswap_32 calls with util_bswap32Jonathan Gray2013-06-173-6/+6
| | | | | | | | | byteswap.h and bswap_32 aren't portable, replace them with calls to gallium's util_bswap32 as suggested by Mark Kettenis. Lets these files build on OpenBSD. Signed-off-by: Jonathan Gray <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* r600g: upsample and downsample MSAA resources for transfersMarek Olšák2013-06-131-79/+141
| | | | | | | | | | | We did downsample (=resolve) MSAA resources to make ReadPixels work with MSAA GLX visuals, which was enough for read-only color-only transfers. This commit makes write color transfers and depth-stencil transfers work in a similar manner. It does downsampling in transfer_map and upsampling in transfer_unmap. Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: make clearing independent of the colorbuffer formatMarek Olšák2013-06-131-2/+1
| | | | | | | There isn't any difference between 32_FLOAT and 32_*INT in vertex fetching. Both of them don't do any format conversion. Reviewed-by: Brian Paul <[email protected]>
* gallium/u_blitter: make clearing independent of the number of bound colorbuffersMarek Olšák2013-06-131-1/+1
| | | | | | We can use the fragment shader TGSI property WRITES_ALL_CBUFS. Reviewed-by: Brian Paul <[email protected]>
* r600g/sb: fix broken assertVadim Girlin2013-05-311-1/+1
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* gallium: Add support for multiple viewportsZack Rusin2013-05-253-9/+15
| | | | | | | | | | | | Gallium supported only a single viewport/scissor combination. This commit changes the interface to allow us to add support for multiple viewports/scissors. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: José Fonseca<[email protected]> Reviewed-by: Brian Paul <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* r600g/sb: handle more cases for folding in gvn passVadim Girlin2013-05-282-28/+118
| | | | Signed-off-by: Vadim Girlin <[email protected]>
* r600g/sb: improve folding for SETccVadim Girlin2013-05-271-8/+98
| | | | Signed-off-by: Vadim Girlin <[email protected]>