path: root/src/gallium/drivers/r600/r600_asm.c
Commit message | Author | Age | Files | Lines
* r600g: Implement GL_ARB_draw_indirect for EG/CM | Glenn Kennard | 2015-02-24 | 1 | -1/+1
    Requires Evergreen/Cayman and radeon kernel module 2.41.0 or newer.

    Expected piglit fails due to hardware limitations:
    * arb_draw_indirect-draw-arrays-prim-restart
      Restarts not applied for DrawArrays commands
    * arb_draw_indirect-vertexid
      Base vertex offset is not included in vertex id

    Marek: bump vgt_state num_dw by 3 (= space needed for one register write)

    Signed-off-by: Glenn Kennard <[email protected]>
    Signed-off-by: Marek Olšák <[email protected]>
* r600g: fix abs() support on ALU 3 source operands instructions | Xavier Bouchoux | 2015-02-06 | 1 | -0/+6
    Since alu does not support abs() modifier on source operands, spill
    and apply the modifiers to a temp register when needed.

    Signed-off-by: Xavier Bouchoux <[email protected]>
    Reviewed-by: Glenn Kennard <[email protected]>
* r600g: Implement sm5 UBO/sampler indexing | Glenn Kennard | 2014-10-28 | 1 | -8/+50
    Caveat: Shaders using UBO/sampler indexing will not be optimized by
    SB, due to SB not currently supporting the necessary CF_INDEX_[01]
    index registers.

    Signed-off-by: Glenn Kennard <[email protected]>
* Eliminate several cases of multiplication in arguments to calloc | Carl Worth | 2014-09-03 | 1 | -1/+1
    In commit 32f2fd1c5d6088692551c80352b7d6fa35b0cd09, several calls to
    _mesa_calloc(x) were replaced with calls to calloc(1, x). This is
    strictly equivalent to what the code was doing previously.

    But for cases where "x" involves multiplication, now that we are
    explicitly using the two-argument calloc, we can do one step better
    and replace:

        calloc(1, A * B);

    with:

        calloc(A, B);

    The advantage of the latter is that calloc will detect any overflow
    that would have resulted from the multiplication and will fail the
    allocation, (whereas the former would return a small allocation). So
    this fix can change potentially exploitable buffer overruns into
    segmentation faults.

    Reviewed-by: Matt Turner <[email protected]>
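The overflow argument in the calloc commit above can be illustrated with a minimal C sketch. The helper names (`wrapped_bytes`, `alloc_unchecked`, `alloc_checked`, `overflowing_count`) are hypothetical and do not come from Mesa; the sketch also assumes a C library whose calloc rejects a count/size pair whose product overflows, which is what the commit message relies on.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Number of bytes actually requested when the product wraps around. */
static size_t wrapped_bytes(size_t count, size_t elem_size)
{
    return count * elem_size; /* unsigned arithmetic wraps modulo SIZE_MAX + 1 */
}

/* Old pattern: the product is computed before calloc ever sees it,
 * so an overflow silently turns into a tiny allocation. */
static void *alloc_unchecked(size_t count, size_t elem_size)
{
    return calloc(1, count * elem_size);
}

/* New pattern: calloc receives both factors and can refuse the request. */
static void *alloc_checked(size_t count, size_t elem_size)
{
    return calloc(count, elem_size);
}

/* Hypothetical helper: a count for which count * elem_size overflows
 * and wraps to a small non-zero value (between elem_size and
 * 2 * elem_size - 1). */
static size_t overflowing_count(size_t elem_size)
{
    assert(elem_size >= 2); /* keeps SIZE_MAX / elem_size + 2 in range */
    return SIZE_MAX / elem_size + 2;
}
```

With `n = overflowing_count(4)`, `alloc_unchecked(n, 4)` asks for only `wrapped_bytes(n, 4)` bytes and normally succeeds, while `alloc_checked(n, 4)` is expected to fail, so a later write past the small buffer becomes a NULL dereference instead of a heap overrun.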
* r600g: switch SNORM conversion to DX and GLES behavior | Marek Olšák | 2014-07-28 | 1 | -1/+0
    it also matches GL 4.2

    further discussion:
    http://lists.freedesktop.org/archives/mesa-dev/2013-August/042680.html

    Cc: [email protected]
* r600g: HW bug workaround for TGSI_OPCODE_BREAKC | Christoph Bumiller | 2014-06-02 | 1 | -0/+1
    Signed-off-by: Marek Olšák <[email protected]>
* r600g: Use util_cpu_to_le32() instead of bswap32() on big-endian systems | Tom Stellard | 2014-02-24 | 1 | -1/+1
    Reviewed-by: Michel Dänzer <[email protected]>
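The commit above swaps an unconditional byte swap for a CPU-to-little-endian conversion. The C sketch below shows the difference. It is not Mesa's implementation of util_cpu_to_le32(); the names `bswap32_sketch`, `host_is_little_endian`, `cpu_to_le32_sketch` and `first_byte_in_memory` are made up for illustration.

```c
#include <stdint.h>
#include <string.h>

/* Unconditional byte swap: reverses the four bytes of v. */
static uint32_t bswap32_sketch(uint32_t v)
{
    return ((v & 0x000000ffu) << 24) |
           ((v & 0x0000ff00u) << 8)  |
           ((v >> 8) & 0x0000ff00u)  |
           (v >> 24);
}

/* Runtime probe: on a little-endian host the low byte is stored first. */
static int host_is_little_endian(void)
{
    const uint32_t probe = 1;
    uint8_t first;
    memcpy(&first, &probe, 1);
    return first == 1;
}

/* CPU-to-little-endian: a no-op on little-endian hosts and a byte swap
 * on big-endian hosts, so the in-memory layout is the same everywhere. */
static uint32_t cpu_to_le32_sketch(uint32_t v)
{
    return host_is_little_endian() ? v : bswap32_sketch(v);
}

/* First byte in memory after conversion; always the low byte of v. */
static uint8_t first_byte_in_memory(uint32_t v)
{
    uint32_t le = cpu_to_le32_sketch(v);
    uint8_t b;
    memcpy(&b, &le, 1);
    return b;
}
```

The conversion function states the intent (produce little-endian data for the GPU) instead of hard-coding a swap that is only correct behind a big-endian check.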
* r600g: add support for geom shaders to r600/r700 chipsets (v2) | Dave Airlie | 2014-02-05 | 1 | -1/+1
    This is my first attempt at enabling r600/r700 geometry shaders,
    the basic tests pass on both my rv770 and my rv635,

    It requires this kernel patch:
    http://www.spinics.net/lists/dri-devel/msg52745.html

    v2: address Alex comments.

    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: initial support for geometry shaders on evergreen (v2) | Vadim Girlin | 2014-02-05 | 1 | -1/+1
    This is Vadim's initial work with a few regression fixes squashed in.

    v2: (airlied)
    fix regression in glsl-max-varyings - need to use vs and ps_dirty
    fix regression in shader exports from rebasing.
    whitespace fixing.
    v2.1: squash fix assert

    Signed-off-by: Vadim Girlin <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g/bc: add support for indexed memory writes. | Dave Airlie | 2014-02-05 | 1 | -2/+7
    It looks like we need these for geom shaders in the future.

    Signed-off-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: move barrier and end_of_program bits from output to cf struct (v2) | Vadim Girlin | 2014-02-05 | 1 | -11/+13
    v2: fix regression on r600 NOP instructions.

    Signed-off-by: Vadim Girlin <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Signed-off-by: Dave Airlie <[email protected]>
* r600g: Removed unnecessary positivity check for unsigned int variable. | Siavash Eliasi | 2014-01-31 | 1 | -1/+1
    Signed-off-by: Marek Olšák <[email protected]>
* r600g: Add support for PIPE_FORMAT_R11G11B10_FLOAT vertex elements | Fredrik Höglund | 2013-11-07 | 1 | -0/+6
    Reviewed-by: Dave Airlie <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: move the low-level buffer functions for multiple rings to drivers/radeon | Marek Olšák | 2013-09-29 | 1 | -1/+1
    Also slightly optimize r600_buffer_map_sync_with_rings.
* r600g: move some debug options to drivers/radeon | Marek Olšák | 2013-09-29 | 1 | -3/+3
* r600g: move streamout state to drivers/radeon | Marek Olšák | 2013-08-31 | 1 | -4/+4
    This streamout state code will be used by radeonsi.

    There are new structures r600_common_context and r600_common_screen.
    What is inherited by what is shown here:

      pipe_context -> r600_common_context -> r600_context
      pipe_screen -> r600_common_screen -> r600_screen

    The common structures reside in drivers/radeon. Currently they only
    contain enough functionality to be able to handle streamout.
    Eventually I'd like the whole pipe_screen implementation to be shared
    and some of the context stuff too.

    This is quite big, but most changes are because of the new structures
    and the fact r600_write_value is replaced by radeon_emit.

    Thanks to Tom Stellard for fixing the build for r600g/compute.

    Reviewed-by: Michel Dänzer <[email protected]>
    Reviewed-by: Christian König <[email protected]>
    Tested-by: Tom Stellard <[email protected]>
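The inheritance chain named in the streamout commit above (pipe_context -> r600_common_context -> r600_context) is the usual C idiom of embedding the base structure as the first member. The sketch below uses heavily simplified stand-in structures. Every field, the member name `b`, and all three helper functions are assumptions for illustration, not the real Mesa definitions.

```c
#include <stddef.h>

/* Simplified stand-ins; the real structures have many more members. */
struct pipe_context {
    int placeholder_state;          /* hypothetical field */
};

struct r600_common_context {
    struct pipe_context b;          /* base must be the first member */
    unsigned streamout_enabled;     /* hypothetical shared state */
};

struct r600_context {
    struct r600_common_context b;   /* base must be the first member */
    unsigned r600_only_state;       /* hypothetical driver-only state */
};

/* Downcasts are valid because each base sits at offset 0. */
static struct r600_common_context *to_common(struct pipe_context *ctx)
{
    return (struct r600_common_context *)ctx;
}

static struct r600_context *to_r600(struct pipe_context *ctx)
{
    return (struct r600_context *)ctx;
}

/* Shared code only touches the common part, so another driver that
 * embeds r600_common_context could call it unchanged. */
static void common_enable_streamout(struct pipe_context *ctx)
{
    to_common(ctx)->streamout_enabled = 1;
}
```

A driver context is passed around as `&rctx.b.b`, which is how code living in a shared directory can serve more than one driver.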
* r600g: enable SB backend by default | Vadim Girlin | 2013-08-30 | 1 | -1/+2
    Signed-off-by: Vadim Girlin <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
    Reviewed-by: Christian König <[email protected]>
* gallium: replace bswap_32 calls with util_bswap32 | Jonathan Gray | 2013-06-17 | 1 | -2/+2
    byteswap.h and bswap_32 aren't portable, replace them with calls to
    gallium's util_bswap32 as suggested by Mark Kettenis. Lets these
    files build on OpenBSD.

    Signed-off-by: Jonathan Gray <[email protected]>
    Reviewed-by: Michel Dänzer <[email protected]>
* r600g: cleanup MSAA texture support checking | Marek Olšák | 2013-05-15 | 1 | -3/+3
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: use old shader disassembler by default | Vadim Girlin | 2013-05-03 | 1 | -6/+7
    New disassembler is not completely isolated yet from further
    processing in r600g/sb that is not required for printing the dump,
    so it has higher probability to fail in case of any unexpected
    features in the bytecode.

    This patch adds "sbdisasm" flag for R600_DEBUG that allows to use
    new disassembler in r600g/sb for shader dumps when shader
    optimization is not enabled.

    If shader optimization is enabled, new disassembler is used by
    default.

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g: plug in optimizing backend | Vadim Girlin | 2013-04-30 | 1 | -0/+11
    Optimization is enabled with "R600_DEBUG=sb".

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g: fix valgrind warning on Cayman | Marek Olšák | 2013-04-10 | 1 | -1/+1
    Warning: "Conditional jump or move depends on uninitialised value(s)".
* r600g/llvm: Add support for native isa for pre EG | Vincent Lejeune | 2013-04-08 | 1 | -1/+5
    This fixes bug 62756:
    https://bugs.freedesktop.org/show_bug.cgi?id=62756#c12
* r600g/llvm: Do not override llvm provided stack_size | Vincent Lejeune | 2013-04-03 | 1 | -1/+2
* r600g: don't reserve more stack space than required v5 | Vadim Girlin | 2013-04-02 | 1 | -4/+39
    Reduced stack size allows to run more threads in some cases,
    improving performance for the shaders that use stack (that is, for
    the shaders with control flow instructions). E.g. with unigine-based
    apps.

    v4: implement exact computation taking into account wavefront size
    v5: add cases for RV620, RS880

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g/llvm: Add support for cf_alu native encode | Vincent Lejeune | 2013-04-01 | 1 | -1/+1
* r600g: dump vertex elements state along with the fetch shader | Marek Olšák | 2013-03-11 | 1 | -0/+8
* r600g: remove bytecode dumping | Marek Olšák | 2013-03-11 | 1 | -239/+0
    Reviewed-by: Tom Stellard <[email protected]>
* r600g: use a single env var R600_DEBUG, disable bytecode dumping | Marek Olšák | 2013-03-11 | 1 | -10/+1
    Only the disassembler is used to dump shaders.

    Here's a few examples how to use R600_DEBUG.

    Log compute info:
      R600_DEBUG=compute

    Dump all shaders:
      R600_DEBUG=fs,vs,gs,ps,cs

    Dump pixel shaders only:
      R600_DEBUG=ps

    Disable Hyper-Z:
      R600_DEBUG=nohyperz

    Disable the LLVM backend:
      R600_DEBUG=nollvm

    Or use any combination of the above, or print all options:
      R600_DEBUG=help

    Reviewed-by: Tom Stellard <[email protected]>
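The comma-separated R600_DEBUG values listed in the commit above map naturally onto a bitmask. The following C sketch shows one way such a string could be parsed. The flag names are taken from the commit message, but the enum, the table and `parse_debug_flags` are hypothetical and are not the parser Mesa actually uses.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical bit assignments for the options named in the commit. */
enum {
    DBG_COMPUTE  = 1u << 0,
    DBG_FS       = 1u << 1,
    DBG_VS       = 1u << 2,
    DBG_GS       = 1u << 3,
    DBG_PS       = 1u << 4,
    DBG_CS       = 1u << 5,
    DBG_NOHYPERZ = 1u << 6,
    DBG_NOLLVM   = 1u << 7
};

struct debug_flag {
    const char *name;
    uint32_t bit;
};

static const struct debug_flag debug_flags[] = {
    { "compute", DBG_COMPUTE }, { "fs", DBG_FS }, { "vs", DBG_VS },
    { "gs", DBG_GS }, { "ps", DBG_PS }, { "cs", DBG_CS },
    { "nohyperz", DBG_NOHYPERZ }, { "nollvm", DBG_NOLLVM }
};

/* Turns "fs,vs,ps" into a bitmask; unknown or empty tokens are ignored. */
static uint32_t parse_debug_flags(const char *value)
{
    uint32_t mask = 0;

    if (value == NULL)
        return 0;

    while (*value != '\0') {
        size_t len = strcspn(value, ","); /* length of the current token */
        size_t i;

        for (i = 0; i < sizeof(debug_flags) / sizeof(debug_flags[0]); i++) {
            if (strlen(debug_flags[i].name) == len &&
                strncmp(value, debug_flags[i].name, len) == 0)
                mask |= debug_flags[i].bit;
        }

        value += len;
        if (*value == ',')
            value++; /* skip the separator */
    }
    return mask;
}
```

A caller would typically pass `getenv("R600_DEBUG")` and test individual bits, for example `mask & DBG_PS` before dumping a pixel shader.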
* r600g: Check comp_mask before merging export instructions | Vincent Lejeune | 2013-03-03 | 1 | -0/+1
    Fixes a llvm uncovered (rare) bug where consecutive exports were
    merged even if they have incompatible mask.
* r600g: fix check_and_set_bank_swizzle for cayman | Vadim Girlin | 2013-03-03 | 1 | -7/+3
    Tested-by: Vincent Lejeune <vljn at ovi.com>
    Reviewed-by: Vincent Lejeune <vljn at ovi.com>
* r600g: implement shader disassembler v3 | Vadim Girlin | 2013-02-01 | 1 | -2/+434
    R600_DUMP_SHADERS environment var now allows to choose dump method:
      0 (default) - no dump
      1 - full dump (old dump)
      2 - disassemble
      3 - both

    v2: fix output for burst_count > 1
    v3: use more human-readable output for kcache data in CF_ALU_xxx
        clauses, improve output for ALU_EXTENDED, other minor fixes

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g: use tables with ISA info v3 | Vadim Girlin | 2013-02-01 | 1 | -913/+173
    v3: added some flags including condition codes for ALU,
        fixed issue with CF reverse lookup (overlapping ranges of
        CF_ALU_xxx and other CF instructions)
        rebased on current master

    Signed-off-by: Vadim Girlin <[email protected]>
* r600g: Add ar_chan member to struct r600_bytecode | Tom Stellard | 2013-01-28 | 1 | -0/+2
    r600_bytecode::ar_chan stores the register channel for the value that
    will be loaded into the AR register. At the moment, this field is
    only used by the LLVM backend. The default backend always sets
    ar_chan = 0.
* r600g: More robust checks for MOVA_INT instructions | Tom Stellard | 2013-01-28 | 1 | -8/+35
* r600g: add multi ring support with dma as first second ring v4 | Jerome Glisse | 2013-01-28 | 1 | -2/+1
    We keep track of ring emission order in a stack; whenever we need to
    flush, we empty the stack in FIFO order. There are a few helper
    functions for bo mapping and other ring activities that make sure
    the ring stack is properly flushed and submitted.

    v2: fix st flush path, and other flush paths, to properly flush all
        rings if necessary
    v3: - improve name of ring helpers
        - make sure that each time a cs is going to be written it ends
          up at the top of the stack, to avoid any issue such as:
            STACK[0] = dma (with bo A,B)
            STACK[1] = gfx (with bo C,D)
          Now if code tries to emit a dma command relative to bo C or D,
          it will start writing the cmd stream into the cs, and once it
          reaches the point where it adds a relocation it will flush.
          At that point the cs will have cmds that don't have proper
          relocations in the relocation buffer and the kernel will just
          refuse to run.
    v4: - Drop the stack idea, as it turns out there is no way to use it
          or benefit from it. Any time the driver starts a command on
          another ring, it always needs to flush the previous ring. So
          make the code simpler by not using a stack.

    Signed-off-by: Jerome Glisse <[email protected]>
* r600g/llvm: tgsi to llvm emits stream output intrinsics. | Vincent Lejeune | 2013-01-18 | 1 | -0/+2
    Reviewed-by: Tom Stellard <[email protected]>
* r600g: texture buffer object + glsl 1.40 enable support (v2) | Dave Airlie | 2013-01-11 | 1 | -1/+1
    This adds TBO support to r600g, and with GLSL 1.40 enabled, we now
    get 3.1 core profiles advertised for r600g.

    The r600/700 implementation is a bit different from the evergreen
    one, as r6/7 hw lacks vertex fetch swizzles. So we implement it by
    passing 5 constants per sampler to the shader, the shader uses the
    first 4 as masks for each component and the 5th as the alpha value
    to OR in.

    Now TXQ is also broken so we have to pass a constant for the buffer
    size, on evergreen we just pass this, on r6/7 we pass it as the 6th
    element in the const info buffer.

    v1.1: drop return as DDX doesn't use a texture type
    v2: add r600/700 support.

    Signed-off-by: Dave Airlie <[email protected]>
* r600g: Fix memory leak in r600_bytecode_add_vtx. | Vinson Lee | 2013-01-09 | 1 | -0/+1
    Fixes resource leak defect reported by Coverity.

    Signed-off-by: Vinson Lee <[email protected]>
* radeon/winsys: move radeon family/class identification to winsys | Jerome Glisse | 2013-01-07 | 1 | -2/+3
    Upcoming async dma support rely on winsys knowing about GPU families.

    Signed-off-by: Jerome Glisse <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
    Reviewed-by: Marek Olšák <[email protected]>
* r600g: suballocate memory for fetch shaders from a large buffer | Marek Olšák | 2012-12-12 | 1 | -12/+14
    Fetch shaders are usually destroyed at the context destruction by
    the state tracker, so we can put them all in a large buffer without
    wasting memory.

    This reduces the number of relocations sent to the kernel a little
    bit.

    Tested-by: Aaron Watry <[email protected]>
    Reviewed-by: Alex Deucher <[email protected]>
* r600g: fix pre eg export with llvm | Vincent Lejeune | 2012-11-08 | 1 | -1/+1
    Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
    Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: make tgsi-to-llvm generates store.pixel* intrinsic for fs | Vincent Lejeune | 2012-11-02 | 1 | -0/+17
    Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: implement texturing with 8x MSAA compressed surfaces for Evergreen | Marek Olšák | 2012-10-29 | 1 | -2/+8
    The 2x and 4x MSAA cases are completely broken. The lfdptr
    instruction returns garbage there.

    The 8x MSAA case is broken on Cayman, though at least the result
    looks somewhat correct.

    Only the 8x MSAA case works on Evergreen and is enabled.
* r600g: force bank_swizzle if already set | Vincent Lejeune | 2012-10-24 | 1 | -0/+2
    Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: move shader structures into r600_shader.h | Marek Olšák | 2012-10-12 | 1 | -0/+1
* r600g: atomize fetch shader | Marek Olšák | 2012-10-10 | 1 | -32/+31
    The state object is actually a buffer, it's literally a buffer
    containing the shader code.

    Reviewed-by: Jerome Glisse <[email protected]>
* r600g: fix instance divisor on Cayman | Marek Olšák | 2012-09-27 | 1 | -19/+35
    Not sure if this is the best way to fix it.

    NOTE: This is a candidate for the stable branches.
* r600g: Use LOOP_START_DX10 for loops | Tom Stellard | 2012-09-19 | 1 | -1/+7
    LOOP_START_DX10 ignores the LOOP_CONFIG* registers, so it is not
    limited to 4096 iterations like the other LOOP_* instructions.
    Compute shaders need to use this instruction, and since we aren't
    optimizing loops with the LOOP_CONFIG* registers for pixel and
    vertex shaders, it seems like we should just use it for everything.

    Reviewed-by: Marek Olšák <[email protected]>
* radeon/llvm: Emit ISA for ALU instructions in the R600 code emitter | Michal Sciubidlo | 2012-09-19 | 1 | -0/+43
    Signed-off-by: Tom Stellard <[email protected]>