path: root/src/gallium/drivers/r600/r600_pipe.h
Commit message | Author | Age | Files | Lines
* r600g/sb: use source bytecode in case of optimization errors | Vadim Girlin | 2013-04-30 | 1 | -0/+1
|
* r600g: plug in optimizing backend | Vadim Girlin | 2013-04-30 | 1 | -0/+7
| Optimization is enabled with "R600_DEBUG=sb".
|
| Signed-off-by: Vadim Girlin <[email protected]>
* r600/uvd: stop advertising MPEG4 on UVD 2.x chips v2 | Christian König | 2013-04-26 | 1 | -0/+3
| That is just not supported by the hardware.
|
| v2: fix compare
|
| Signed-off-by: Christian König <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* winsys/radeon: consolidate tracing into winsys v2 | Jerome Glisse | 2013-04-25 | 1 | -9/+2
| This moves the tracing timeout and printing into winsys and adds a debug
| environment variable for it (R600_DEBUG=trace_cs).
|
| Lots of files are touched because of winsys API changes.
|
| v2: Do not write lockup file if ib uniq id does not match last one
|
| Signed-off-by: Jerome Glisse <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* r600g: Add evergreen_emit_cs_constant_buffers() v2 | Tom Stellard | 2013-04-25 | 1 | -1/+10
| v2:
|   - Bump R600_NUM_ATOMS
|
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* r600g: use CP DMA for buffer clears on evergreen+ | Alex Deucher | 2013-04-24 | 1 | -0/+3
| Lighter weight than using streamout. Only evergreen and newer ASICs
| support embedded data as src with CP DMA.
|
| Reviewed-by: Jerome Glisse <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Alex Deucher <[email protected]>
* r600g: initialize CMASK and HTILE with the GPU using streamout | Marek Olšák | 2013-04-23 | 1 | -0/+7
| This fixes a crash when a resource cannot be mapped to the CPU's address
| space because it's too big.
|
| This puts a global pipe_context in r600_screen, which is guarded by a mutex,
| so that we can use pipe_context when there isn't one around.
| Hopefully our multi-context support is solid.
|
| Reviewed-by: Alex Deucher <[email protected]>
|
| NOTE: This is a candidate for the 9.1 branch.
* r600g: implement pipeline statistics query | Marek Olšák | 2013-04-16 | 1 | -0/+1
|
* r600g: add a debug flag for printing virtual addresses of resources | Marek Olšák | 2013-04-16 | 1 | -0/+1
|
* r600g: add a query returning the amount of time spent during bo_map sync. | Marek Olšák | 2013-04-16 | 1 | -0/+1
|
* radeon/uvd: add UVD implementation v5 | Christian König | 2013-04-11 | 1 | -0/+12
| Just everything you need for UVD with r600g and radeonsi.
|
| v2: move UVD code to radeon subdir, clean up build system additions,
|     remove an unused SI function, disable tiling on SI for now.
| v3: some minor indentation fix and rebased
| v4: dpb size calculation fixed
| v5: implement proper fall-back in case the kernel doesn't support UVD,
|     based on patches from Andreas Boll but cleaned up a bit more.
|
| Signed-off-by: Christian König <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* gallium: PIPE_COMPUTE_CAP_IR_TARGET - allow drivers to specify a processor v2 | Tom Stellard | 2013-04-05 | 1 | -0/+2
| This target string now contains four values instead of three. The old
| processor field (which was really being interpreted as arch) has been split
| into two fields: processor and arch. This allows drivers to pass a more
| detailed description of the hardware to compiler frontends.
|
| v2:
|   - Adapt to libclc changes
|
| Reviewed-by: Francisco Jerez <[email protected]>
* r600g: add a driver query returning the amount of requested VRAM and GTT memory | Marek Olšák | 2013-03-26 | 1 | -0/+2
|
* r600g: add a driver query returning the number of draw_vbo calls | Marek Olšák | 2013-03-26 | 1 | -0/+6
| between begin_query and end_query
* r600g: add debug options disabling various copy-buffer-related features | Marek Olšák | 2013-03-11 | 1 | -0/+3
| This will be invaluable for debugging and bug reports.
* r600g: remove r600.h, move the stuff elsewhere (mostly to r600_pipe.h) | Marek Olšák | 2013-03-11 | 1 | -6/+101
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: remove r600_hw_context_priv.h, move the stuff to r600_pipe.h | Marek Olšák | 2013-03-11 | 1 | -0/+11
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: remove deprecated state management code | Marek Olšák | 2013-03-11 | 1 | -8/+0
| It's nice to see so much code that did pretty much nothing go away.
|
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize pixel shader | Marek Olšák | 2013-03-11 | 1 | -1/+8
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize vertex shader | Marek Olšák | 2013-03-11 | 1 | -3/+11
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: inline r600_pipe_shader function | Marek Olšák | 2013-03-11 | 1 | -4/+4
| also change names of other functions, so that they make sense
|
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: use a single env var R600_DEBUG, disable bytecode dumping | Marek Olšák | 2013-03-11 | 1 | -1/+14
| Only the disassembler is used to dump shaders.
|
| Here are a few examples of how to use R600_DEBUG.
|
| Log compute info:
|   R600_DEBUG=compute
|
| Dump all shaders:
|   R600_DEBUG=fs,vs,gs,ps,cs
|
| Dump pixel shaders only:
|   R600_DEBUG=ps
|
| Disable Hyper-Z:
|   R600_DEBUG=nohyperz
|
| Disable the LLVM backend:
|   R600_DEBUG=nollvm
|
| Or use any combination of the above, or print all options:
|   R600_DEBUG=help
|
| Reviewed-by: Tom Stellard <[email protected]>
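The comma-separated form of R600_DEBUG described above can be illustrated with a small parser. This is only a sketch: the flag bits, the `dbg_options` table and `parse_debug_flags()` are hypothetical names chosen for illustration, not the driver's actual definitions, and the driver may well use a shared Gallium helper instead.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; the real driver defines its own values. */
enum {
    DBG_COMPUTE  = 1u << 0,
    DBG_PS       = 1u << 1,
    DBG_VS       = 1u << 2,
    DBG_NOHYPERZ = 1u << 3,
    DBG_NOLLVM   = 1u << 4
};

struct dbg_option {
    const char *name;
    uint32_t bit;
};

/* Names taken from the commit message above; the mapping is an assumption. */
static const struct dbg_option dbg_options[] = {
    { "compute",  DBG_COMPUTE },
    { "ps",       DBG_PS },
    { "vs",       DBG_VS },
    { "nohyperz", DBG_NOHYPERZ },
    { "nollvm",   DBG_NOLLVM }
};

/* Turn a string such as "ps,nohyperz" into a bitmask.
 * Unknown names are silently ignored; NULL yields 0. */
uint32_t parse_debug_flags(const char *str)
{
    uint32_t flags = 0;
    size_t i;

    if (!str)
        return 0;

    while (*str) {
        size_t len = strcspn(str, ",");   /* length of the current token */

        for (i = 0; i < sizeof(dbg_options) / sizeof(dbg_options[0]); i++) {
            if (strlen(dbg_options[i].name) == len &&
                strncmp(str, dbg_options[i].name, len) == 0)
                flags |= dbg_options[i].bit;
        }

        str += len;
        if (*str == ',')
            str++;                        /* skip the separator */
    }
    return flags;
}
```

A caller would typically pass the result of `getenv("R600_DEBUG")` (from `<stdlib.h>`) and test individual bits afterwards.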
* r600g: cleanup #include recursion between r600_pipe.h and evergreen_compute.h | Marek Olšák | 2013-03-11 | 1 | -1/+0
| Reviewed-by: Tom Stellard <[email protected]>
* r600g: don't require dword alignment with CP DMA for buffer transfers | Marek Olšák | 2013-03-01 | 1 | -0/+1
| which is a leftover from the days when we used streamout to copy buffers
|
| Tested-by: Andreas Boll <[email protected]>
* r600g: unify vgt states | Marek Olšák | 2013-03-01 | 1 | -6/+0
| The states were split because we thought it caused a hardlock. Now we know
| the hardlock was caused by something else and has since been fixed.
|
| Tested-by: Andreas Boll <[email protected]>
* r600g: atomize streamout enabling | Marek Olšák | 2013-03-01 | 1 | -9/+20
| This doesn't fix any issue we know of, but there indeed is a weak spot
| in draw_vbo where streamout can fail.
|
| After streamout is enabled, the need_cs_space call can flush the context,
| which causes the streamout to be disabled right after it was enabled
| and bad things happen.
|
| One way to fix it is to atomize the beginning part, so that no context flush
| can happen between streamout enabling and the first drawing.
|
| Tested-by: Andreas Boll <[email protected]>
* r600g: workaround hyperz lockup on evergreen | Jerome Glisse | 2013-02-28 | 1 | -1/+3
| This workaround disables hyperz if writes to the zbuffer are disabled. Somehow
| using hyperz when not writing to the zbuffer triggers a GPU lockup. See:
|
| https://bugs.freedesktop.org/show_bug.cgi?id=60848
|
| Candidate for 9.1
|
| Signed-off-by: Jerome Glisse <[email protected]>
* gallium/util: add helper util_max_layer from r600g | Marek Olšák | 2013-02-26 | 1 | -16/+0
|
* r600g: use tables with ISA info v3 | Vadim Girlin | 2013-02-01 | 1 | -0/+2
| v3: added some flags including condition codes for ALU,
|     fixed issue with CF reverse lookup (overlapping ranges of CF_ALU_xxx
|     and other CF instructions)
|     rebased on current master
|
| Signed-off-by: Vadim Girlin <[email protected]>
* r600g: add cs memory usage accounting and limit it v3 | Jerome Glisse | 2013-01-31 | 1 | -0/+28
| We are now seeing cs that can go over the vram+gtt size. To avoid
| failing, flush early any cs that goes over 70% of (gtt+vram) usage. 70%
| is used to allow some fragmentation.
|
| The idea is to compute a gross estimate of the memory requirement of
| each draw call. After each draw call, memory will be precisely
| accounted. So the uncertainty is only on the current draw call.
| In practice this gave a very good estimate (+/- 10% of the target
| memory limit).
|
| v2: Remove leftovers from the testing version, remove useless NULL
|     checking. Improve commit message.
| v3: Add comment to code on memory accounting precision
|
| Signed-off-by: Jerome Glisse <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
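The accounting scheme described above (estimate before the draw, precise accounting after it, early flush at 70% of vram+gtt) might look roughly like the following. This is a sketch under stated assumptions: `struct cs_mem_tracker`, `cs_needs_flush()` and `cs_account_draw()` are hypothetical names, and resetting `used` to zero merely stands in for actually submitting the command stream.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping structure, not the driver's own. */
struct cs_mem_tracker {
    uint64_t vram_size;   /* total VRAM in bytes */
    uint64_t gtt_size;    /* total GTT in bytes */
    uint64_t used;        /* bytes precisely accounted for the current cs */
};

/* True when adding `estimate` bytes would push the cs past 70% of vram+gtt. */
bool cs_needs_flush(const struct cs_mem_tracker *t, uint64_t estimate)
{
    uint64_t limit = (t->vram_size + t->gtt_size) / 10 * 7;

    return t->used + estimate > limit;
}

/* Called once per draw: flush early based on the gross estimate,
 * then account the precise amount once the draw has been emitted. */
void cs_account_draw(struct cs_mem_tracker *t, uint64_t estimate, uint64_t actual)
{
    if (cs_needs_flush(t, estimate))
        t->used = 0;      /* stand-in for flushing the cs */

    t->used += actual;
}
```

Only the draw currently being emitted relies on the estimate; every earlier draw in the cs has already been replaced by its exact figure.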
* r600g: real fix for non 3.8 kernel | Jerome Glisse | 2013-01-28 | 1 | -3/+5
| Signed-off-by: Jerome Glisse <[email protected]>
* r600g: fix segfault with old kernel [tag: 9.1-branchpoint] | Jerome Glisse | 2013-01-28 | 1 | -1/+1
| Old kernels do not have dma support; the patches pushed were missing some of
| the checks needed to not use dma.
|
| Signed-off-by: Jerome Glisse <[email protected]>
* r600g: add async for staging buffer upload v2 | Jerome Glisse | 2013-01-28 | 1 | -0/+9
| v2: Add virtual address to dma src/dst offset for cayman
|
| Signed-off-by: Jerome Glisse <[email protected]>
* r600g: add multi ring support with dma as first second ring v4 | Jerome Glisse | 2013-01-28 | 1 | -6/+34
| We keep track of ring emission order in a stack; whenever we need to
| flush, we empty the stack in FIFO order. There are a few helper
| functions for bo mapping and other ring activities that will make sure
| that the ring stack is properly flushed and submitted.
|
| v2: fix st flush path, and other flush paths, to properly flush all rings
|     if necessary
| v3: - improve name of ring helpers
|     - make sure that each time a cs is going to be written it ends up
|       at the top of the stack, to avoid any issue such as:
|         STACK[0] = dma (with bo A,B)
|         STACK[1] = gfx (with bo C,D)
|       Now if code tries to emit a dma command relative to bo C or D, it
|       will start writing the cmd stream into the cs, and once it reaches
|       the point where it adds a relocation it will flush. At that point
|       the cs will have cmds that don't have a proper relocation in the
|       relocation buffer, and the kernel will just refuse to run it.
| v4: - Drop the stack idea, as it turns out there is no way to use it
|       or benefit from it. Any time the driver starts a command on
|       another ring, it always needs to flush the previous ring. So make
|       the code simpler by not using a stack.
|
| Signed-off-by: Jerome Glisse <[email protected]>
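The v4 rule above (starting a command on another ring always flushes the previous ring) can be sketched as a tiny state machine. All names here (`enum ring_type`, `struct ring_state`, `ring_flush()`, `ring_emit()`) are hypothetical and only illustrate the idea; the counter exists purely to make the behaviour observable.

```c
/* Hypothetical ring identifiers. */
enum ring_type { RING_NONE, RING_GFX, RING_DMA };

struct ring_state {
    enum ring_type current;   /* ring that owns the pending commands */
    unsigned pending;         /* commands queued but not yet submitted */
    unsigned flushes;         /* number of submissions, for illustration */
};

/* Submit whatever is pending on the current ring. */
void ring_flush(struct ring_state *s)
{
    if (s->pending) {
        s->pending = 0;
        s->flushes++;
    }
}

/* Queue one command on `ring`, flushing the previous ring first
 * whenever the target ring changes. */
void ring_emit(struct ring_state *s, enum ring_type ring)
{
    if (s->current != ring) {
        ring_flush(s);
        s->current = ring;
    }
    s->pending++;
}
```

With this rule there is never more than one ring holding unsubmitted commands, which is why no stack of rings is needed.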
* r600g: texture buffer object + glsl 1.40 enable support (v2) | Dave Airlie | 2013-01-11 | 1 | -1/+9
| This adds TBO support to r600g, and with GLSL 1.40 enabled, we now get
| 3.1 core profiles advertised for r600g.
|
| The r600/700 implementation is a bit different from the evergreen one,
| as r6/7 hw lacks vertex fetch swizzles. So we implement it by passing
| 5 constants per sampler to the shader, the shader uses the first 4 as
| masks for each component and the 5th as the alpha value to OR in.
|
| Now TXQ is also broken so we have to pass a constant for the buffer size,
| on evergreen we just pass this, on r6/7 we pass it as the 6th element
| in the const info buffer.
|
| v1.1: drop return as DDX doesn't use a texture type
| v2: add r600/700 support.
|
| Signed-off-by: Dave Airlie <[email protected]>
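The r6xx/r7xx scheme above (four per-component masks plus a fifth value ORed into alpha) can be mimicked on the CPU to show the arithmetic. This is a sketch only: `tbo_apply_format_constants()` is a hypothetical name, the real work happens in shader code, and the assumption that the masks are applied with a bitwise AND on raw 32-bit components is mine.

```c
#include <stdint.h>

/* Apply the five per-sampler constants to one fetched texel.
 * consts[0..3] are masks for the x, y, z and w components,
 * consts[4] is the value ORed into the alpha (w) component. */
void tbo_apply_format_constants(uint32_t texel[4], const uint32_t consts[5])
{
    int i;

    for (i = 0; i < 4; i++)
        texel[i] &= consts[i];

    texel[3] |= consts[4];
}
```

For a format without an alpha channel, for example, the fourth mask could be zero and the fifth constant `0x3f800000` (the bit pattern of 1.0f), so the shader always sees an alpha of one.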
* r600g: uniform buffer object support | Dave Airlie | 2013-01-11 | 1 | -1/+1
| This adds 12 more constant buffers for use as UBOs, along with adding
| relative constant fetching for 2D indices.
|
| This with GLSL 1.40 enabled passes all the same tests as softpipe on
| my evergreen system.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: implement buffer copying using CP DMA for R7xx, Evergreen, Cayman | Marek Olšák | 2013-01-08 | 1 | -2/+1
| R6xx doesn't work - the issue seems to be with flushing (sometimes
| the destination buffer contains garbage). There are no hangs, so we're good.
|
| R7xx doesn't seem to have any alignment restriction despite our initial
| thinking. Everything just works.
|
| Reviewed-by: Alex Deucher <[email protected]>
* r600g/radeon/winsys: indentation cleanup | Jerome Glisse | 2013-01-07 | 1 | -1/+1
| Signed-off-by: Jerome Glisse <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Marek Olšák <[email protected]>
* r600g: don't suspend TIME_ELAPSED queries during flushing | Marek Olšák | 2012-12-21 | 1 | -12/+4
| According to the GL spec, the result should be equivalent to comparing
| two timestamps.
* r600g: add cs tracing infrastructure for lockup pin pointing | Jerome Glisse | 2012-12-20 | 1 | -0/+16
| It's a build-time option: you need to set R600_TRACE_CS to 1, and it
| will print to stderr all cs along with the cs trace point value, which
| gives the last offset into a cs processed by the GPU.
|
| Signed-off-by: Jerome Glisse <[email protected]>
* r600g: add htile support v16 | Jerome Glisse | 2012-12-20 | 1 | -9/+17
| htile is used for HiZ and HiS support and fast Z/S clears.
| This commit just adds the htile setup and Fast Z clear.
| We don't take full advantage of HiS with that patch.
|
| v2  really use fast clear, still random issue with some tiles;
|     need to try more flush combinations, fix depth/stencil
|     texture decompression
| v3  fix random issue on r6xx/r7xx
| v4  rebase on top of latest mesa, disable CB export when clearing
|     htile surface to avoid wasting bandwidth
| v5  resummarize htile surface when uploading z value. Fix z/stencil
|     decompression, the custom blitter with custom dsa is no longer
|     needed.
| v6  Reorganize render control/override update mechanism, fixing more
|     issues in the process.
| v7  Add nop after depth surface base update to work around some htile
|     flushing issue. Force htile to 8x8 on r6xx/r7xx as other combinations
|     have issues. Do not enable hyperz when flushing/uncompressing
|     depth buffer.
| v8  Fix htile surface, preload and prefetch setup. Only set preload
|     and prefetch on htile surface clear like fglrx. Record depth
|     clear value per level. Support several levels for the htile
|     surface. First depth clear can't be a fast clear.
| v9  Fix comments, properly account new register in emit function,
|     disable fast zclear if clearing different layers of a texture
|     array to different values
| v10 Disable hyperz for texture arrays, making the test simpler. Force
|     db_misc_state update when no depth buffer is bound. Remove
|     unused variable, rename depth_clearstencil to depth_clear.
|     Don't allocate htile surface for flushed depth. Something
|     broke the cliprect change; this needs to be investigated.
| v11 Rebase on top of newer mesa
| v12 Rebase on top of newer mesa
| v13 Rebase on top of newer mesa, htile surface needs to be initialized
|     to zero; somehow special casing the first clear to not use fast clear
|     and thus initialize the htile surface with a proper value does not
|     work in all cases.
| v14 Use resource not texture for htile buffer, making the htile buffer
|     size computation easier and simpler. Disable preload on evergreen
|     as it's still troublesome in some cases
| v15 Cleanup some comments and remove some leftovers
| v16 Define name for bit 20 of CP_COHER_CNTL
|
| Signed-off-by: Pierre-Eric Pelloux-Prayer <[email protected]>
| Signed-off-by: Alex Deucher <[email protected]>
| Signed-off-by: Jerome Glisse <[email protected]>
* r600g: add assertions to prevent creation of invalid surfaces | Marek Olšák | 2012-12-20 | 1 | -0/+16
|
* r600g: suballocate memory for fetch shaders from a large buffer | Marek Olšák | 2012-12-12 | 1 | -0/+6
| Fetch shaders are usually destroyed at the context destruction by the state
| tracker, so we can put them all in a large buffer without wasting memory.
|
| This reduces the number of relocations sent to the kernel a little bit.
|
| Tested-by: Aaron Watry <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* r600g: suballocate memory for the STRMOUT_BUFFER_FILLED_SIZE register | Marek Olšák | 2012-12-12 | 1 | -0/+2
| Instead of having a 4-byte buffer for each streamout target, we suballocate
| each dword from a 4K buffer.
|
| This further reduces the overall number of relocations.
|
| Tested-by: Aaron Watry <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
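Suballocating dwords from one 4K buffer, as described above, amounts to handing out successive 4-byte offsets. The following is a minimal sketch with hypothetical names (`struct dword_suballocator`, `suballoc_dword()`); returning `UINT32_MAX` when the buffer is exhausted is an assumption made for illustration, and the driver presumably moves on to a fresh buffer instead.

```c
#include <stdint.h>

#define SUBALLOC_BUFFER_SIZE 4096u   /* one 4K backing buffer */

/* Hypothetical bump allocator over a single backing buffer. */
struct dword_suballocator {
    uint32_t next_offset;            /* byte offset of the next free dword */
};

/* Return the byte offset of a fresh 4-byte slot,
 * or UINT32_MAX when the 4K buffer is full. */
uint32_t suballoc_dword(struct dword_suballocator *a)
{
    uint32_t offset;

    if (a->next_offset + 4 > SUBALLOC_BUFFER_SIZE)
        return UINT32_MAX;

    offset = a->next_offset;
    a->next_offset += 4;
    return offset;
}
```

Because all 1024 slots live in the same buffer, every streamout target that uses one shares a single relocation entry rather than adding its own.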
* winsys/radeon: don't use BIND flags, add a flag for the cache bufmgr instead | Marek Olšák | 2012-12-12 | 1 | -1/+1
|
* r600g: fix ARB_map_buffer_alignment with unaligned offsets and staging buffers | Marek Olšák | 2012-11-22 | 1 | -0/+2
|
* r600g: add initial cube map array support (v2) | Dave Airlie | 2012-11-10 | 1 | -1/+6
| This contains the evergreen support.
|
| Support is possible on rv670 upwards and the code in here should work,
| but it doesn't and I haven't debugged it to figure out why.
|
| Beyond just adding support for the cube map array sampling, r600 resinfo
| isn't conformant with the GL specification, which states the number of
| layers should be returned for the textureSize, so we have to track in an
| external constant buffer the layers for each sampler if we need them in
| the shader.
|
| v2: only update the sampler constants if the sampler views have changed,
|     as suggested by Marek.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: clarify const buffer numbering and handling | Dave Airlie | 2012-11-09 | 1 | -1/+7
| For cube map arrays I'll need another driver private constant buffer,
| and looking forward to UBOs. So clean up with some defines, that
| can be modified when adding cube map array and ubos later.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: add in-place DB decompression and texturing with DB tiling | Marek Olšák | 2012-11-06 | 1 | -0/+1
| The decompression is done in-place and only the compressed tiles are
| decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
|
| The texture unit is programmed to use non-displayable tiling and depth
| ordering of samples, so that it can fetch the texture in the native DB format.
|
| The latest version of the libdrm surface allocator is required for stencil
| texturing to work. The old one didn't create the mipmap tree correctly.
| We need a separate mipmap tree for stencil, because the stencil mipmap
| offsets are not really depth offsets/4.
|
| There are still some known bugs, but this should save some memory and it also
| improves performance a little bit in Lightsmark (especially with low
| resolutions; tested with Radeon HD 5000).
|
| The DB->CB copy is still used for transfers.
|
| Reviewed-by: Jerome Glisse <[email protected]>
* r600g: avoid shader needing too many gpr to lockup the gpu v2 | Jerome Glisse | 2012-10-31 | 1 | -1/+1
| On r6xx/r7xx, shader resource management needs to make sure that the
| shader does not go over the gpr register limit. Each specific
| asic has a maximum number of registers that can be split between shader
| stages. For each stage the shader must not use more registers than the
| limit programmed.
|
| v2: Print an error message when discarding a draw. Don't add another
|     boolean to the context structure, but rather propagate the discard
|     boolean through the call chain.
|
| Signed-off-by: Jerome Glisse <[email protected]>