summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/r600
Commit message (Collapse)AuthorAgeFilesLines
* r600g: fix broken streamout if streamout_begin caused a context flushMarek Olšák2012-11-231-2/+6
| | | | | | | This fixes graphics corruption in the case where the DISCARD_RANGE flag is used to map a buffer. NOTE: This is a candidate for the stable branches.
* r600g: fix ARB_map_buffer_alignment with unaligned offsets and staging buffersMarek Olšák2012-11-223-3/+8
|
* r600g: use LINEAR_ALIGNED tiling for 1D array textures and if height0 <= 3Marek Olšák2012-11-131-1/+3
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: untiled window-system buffers should be LINEAR_ALIGNEDMarek Olšák2012-11-131-1/+1
| | | | | | though I guess the DDX allocates them as LINEAR_GENERAL Reviewed-by: Alex Deucher <[email protected]>
* r600g: use LINEAR_ALIGNED tiling for 1D texturesMarek Olšák2012-11-131-1/+2
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: use LINEAR_ALIGNED tiling for staging textures, reorder the codeMarek Olšák2012-11-131-6/+10
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove redundant parameter in r600_init_surfaceMarek Olšák2012-11-131-6/+4
|
* r600g: fix printk warningsDave Airlie2012-11-101-4/+4
| | | | | | | | | | | | | Brian reported seeing: r600_texture.c: In function ‘r600_texture_create_object’: r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’ r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’ r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’ this should wrap over them fine. Signed-off-by: Dave Airlie <[email protected]>
* r600g: add initial cube map array support (v2)Dave Airlie2012-11-109-15/+238
| | | | | | | | | | | | | | | | | | | | | This contains the evergreen support. Support is possible on rv670 upwards and the code in here should work, but it doesn't and I haven't debugged it to figure out why. Beyond just adding support for the cube map array sampling, r600 resinfo isn't conformant with the GL specification, which states the number of layers should be returned for the textureSize, so we have to track in an external constant buffer the layers for each sampler if we need them in the shader. v2: only update the sampler constants if the sampler views have changed, as suggested by Marek. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* r600g: fix lod bias/explicit lod with cube maps.Dave Airlie2012-11-091-8/+20
| | | | | | | | | | | | While developing cube map array support I found that we didn't support this properly, also piglit didn't test for it at all. I've submitted a test to piglit to check for this, and this fixes explicit lod and lod bias with cube maps. NOTE: This is a candidate for the 9.0 branch. Signed-off-by: Dave Airlie <[email protected]>
* r600g: clarify const buffer numbering and handlingDave Airlie2012-11-094-4/+10
| | | | | | | | | For cube map arrays I'll need another driver private constant buffer, and looking forward to UBOs. So clean up with some defines, that can be modified when adding cube map array and ubos later. Signed-off-by: Dave Airlie <[email protected]>
* r600g: fix pre eg export with llvmVincent Lejeune2012-11-081-1/+1
| | | | | Reviewed-by: Alex Deucher <alexander.deucher at amd.com> Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g/compute: fix call to r600_bytecode_initAlex Deucher2012-11-071-1/+2
| | | | Signed-off-by: Alex Deucher <[email protected]>
* r600g: add in-place DB decompression and texturing with DB tilingMarek Olšák2012-11-068-79/+216
| | | | | | | | | | | | | | | | | | | | | The decompression is done in-place and only the compressed tiles are decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F. The texture unit is programmed to use non-displayable tiling and depth ordering of samples, so that it can fetch the texture in the native DB format. The latest version of the libdrm surface allocator is required for stencil texturing to work. The old one didn't create the mipmap tree correctly. We need a separate mipmap tree for stencil, because the stencil mipmap offsets are not really depth offsets/4. There are still some known bugs, but this should save some memory and it also improves performance a little bit in Lightsmark (especially with low resolutions; tested with Radeon HD 5000). The DB->CB copy is still used for transfers. Reviewed-by: Jerome Glisse <[email protected]>
* r600g: make tgsi-to-llvm generates store.pixel* intrinsic for fsVincent Lejeune2012-11-025-12/+127
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: re-enable handling of DISCARD_RANGE, improving performanceMarek Olšák2012-11-011-2/+0
| | | | | | It seems to work for me now. Even the graphics corruption is gone. This also boosts performance in Reaction Quake.
* r600g: fix abysmal performance in Reaction QuakeMarek Olšák2012-11-012-21/+24
| | | | | | | | | | | | | The problem was we set VRAM|GTT for relocations of STATIC resources. Setting just VRAM increases the framerate 4 times on my machine. I rewrote the switch statement and adjusted the domains for window framebuffers too. NOTE: This is a candidate for the stable branches. Reviewed-by: Alex Deucher <[email protected]> Reviewed-by: Jerome Glisse <[email protected]>
* r600g: avoid shader needing too many gpr to lockup the gpu v2Jerome Glisse2012-10-313-34/+62
| | | | | | | | | | | | | | On r6xx/r7xx shader resource management need to make sure that the shader does not goes over the gpr register limit. Each specific asic has a maxmimum register that can be split btw shader stage. For each stage the shader must not use more register than the limit programmed. v2: Print an error message when discarding draw. Don't add another boolean to context structure, but rather propagate the discard boolean through the call chain. Signed-off-by: Jerome Glisse <[email protected]>
* r600g: use SQ_VTX_SEMANTIC_CLEAR to clear the semantic registersMarek Olšák2012-10-314-99/+11
| | | | Reviewed-by: Alex Deucher <[email protected]>
* gallium: expose ARB_map_buffer_alignment on RadeonMarek Olšák2012-10-311-0/+3
| | | | | | | | Reviewed-by: Brian Paul <[email protected]> v2: update relnotes-9.1 v3: use align_malloc and align_free for malloced buffers in r300g v4: document the new CAP in the docs
* r600g: use better sample positions for 8x MSAAMarek Olšák2012-10-312-12/+12
| | | | | Taken from the intel driver. The sample positions are actually a solution to the 8 queens puzzle. It gives more accurate and smoother AA.
* gallium: add start_slot parameter to set_vertex_buffersMarek Olšák2012-10-312-29/+28
| | | | | | | | | | | | | | | | | | | | | This allows updating only a subrange of buffer bindings. set_vertex_buffers(pipe, start_slot, count, NULL) unbinds buffers in that range. Binding NULL resources unbinds buffers too (both buffer and user_buffer must be NULL). The meta ops are adapted to only save, change, and restore the single slot they use. The cso_context can save and restore only one vertex buffer slot. The clients can query which one it is using cso_get_aux_vertex_buffer_slot. It's currently set to 0. (the Draw module breaks if it's set to non-zero) It should decrease the CPU overhead when using a lot of meta ops, but the drivers must be able to treat each vertex buffer slot as a separate state (only r600g does so at the moment). I can imagine this also being useful for optimizing some OpenGL use cases. Reviewed-by: Brian Paul <[email protected]>
* r600g: tgsi-to-llvm emits right input intrinsicsVincent Lejeune2012-10-302-20/+64
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: implement texturing with 8x MSAA compressed surfaces for EvergreenMarek Olšák2012-10-2911-37/+256
| | | | | | | | | | The 2x and 4x MSAA cases are completely broken. The lfdptr instruction returns garbage there. The 8x MSAA case is broken on Cayman, though at least the result looks somewhat correct. Only the 8x MSAA case works on Evergreen and is enabled.
* r600g: advertise 32 streamout vec4 outputsMarek Olšák2012-10-261-1/+1
| | | | to match the varying limit.
* r600g: advertise 32 fragment shaders inputs, not 34Marek Olšák2012-10-261-4/+1
|
* r600g: split cayman common state out into a shared functionAlex Deucher2012-10-263-16/+35
| | | | | | | | | And use it for compute. This should improve compute support on cayman. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g: emit some additional regs on caymanAlex Deucher2012-10-261-0/+54
| | | | | | | | | These are common to both evergreen and cayman, but were not emitted on cayman. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g: there are 16 const buffer size regs for each shader stageAlex Deucher2012-10-261-2/+19
| | | | | | | | we were previously only setting 8 of them. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g: rework evergreen_init_common_regs()Alex Deucher2012-10-261-40/+33
| | | | | | | | | Move gfx specific bits out as the code is shared with compute. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g/compute: always CONTEXT_CONTROL packet at start of CSAlex Deucher2012-10-262-0/+10
| | | | | | | | | | It's required. The CP uses this to properly allocate new contexts. Also do a CS partial flush since we are updating CONFIG regs which are single state. Signed-off-by: Alex Deucher <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* r600g: force bank_swizzle if already setVincent Lejeune2012-10-242-0/+4
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: rewrite tgsi-to-llvm load-input to handle fragcoordVincent Lejeune2012-10-242-42/+84
| | | | Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g: Remove special handling of PRED_SET* insructions for LLVM 3.2Tom Stellard2012-10-191-0/+2
| | | | | The 3.2 version of the backend now sets all the correct fields for PRED_SET* instructions.
* gallium: remove unused data pointer from pipe_transferMarek Olšák2012-10-182-2/+0
| | | | Reviewed-by: Brian Paul <[email protected]>
* r600g: Fix segfault in r600_compute_global_transfer_map()Tom Stellard2012-10-161-1/+1
| | | | | | This segfault was caused by commit 369e46888904c6d379b8b477d9242cff1608e30e, however it is my fault for not testing the patch while it was on the list.
* r600g: Fix build with --enable-openclTom Stellard2012-10-161-0/+1
|
* r600g: drop useless switch statementAndreas Boll2012-10-151-94/+7
| | | | Reviewed-by: Alex Deucher <[email protected]>
* r600g: emit the border color only when it's neededMarek Olšák2012-10-154-4/+24
| | | | That depends on the texture wrap modes and filtering.
* r600g: cleanup create_sampler_state functionsMarek Olšák2012-10-153-57/+46
| | | | | | | - stopped using util_color - reformatted to occupy less characters per line. - used memcpy for the border color - used pipe_color_union in the state structure
* r600g: move shader structures into r600_shader.hMarek Olšák2012-10-129-25/+30
|
* r600g: implement MSAA resolving for 8-bit and 16-bit integer formatsMarek Olšák2012-10-121-3/+41
| | | | by changing the format to NORM.
* r600g: put user indices in the command stream for small index countsMarek Olšák2012-10-112-14/+28
| | | | | This improves performance a little bit if there are lots of small indexed draw commands.
* r600g: inline r600_translate_index_bufferMarek Olšák2012-10-114-65/+22
|
* gallium: unify transfer functionsMarek Olšák2012-10-117-218/+170
| | | | | | | | | | | | | | "get_transfer + transfer_map" becomes "transfer_map". "transfer_unmap + transfer_destroy" becomes "transfer_unmap". transfer_map must create and return the transfer object and transfer_unmap must destroy it. transfer_map is successful if the returned buffer pointer is not NULL. If transfer_map fails, the pointer to the transfer object remains unchanged (i.e. doesn't have to be NULL). Acked-by: Brian Paul <[email protected]>
* r600g: move SQ_GPR_RESOURCE_MGMT_1 into new config_stateMarek Olšák2012-10-103-14/+22
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: move DB_SHADER_CONTROL into db_misc_stateMarek Olšák2012-10-106-36/+27
| | | | | | | Also update the register value in more appropriate places than r600_update_derived_state. Reviewed-by: Jerome Glisse <[email protected]>
* r600g: emit PS_PARTIAL_FLUSH at the beginning of CSMarek Olšák2012-10-102-0/+12
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize depth-stencil-alpha stateMarek Olšák2012-10-108-50/+28
| | | | Reviewed-by: Jerome Glisse <[email protected]>
* r600g: atomize rasterizer stateMarek Olšák2012-10-107-173/+126
| | | | Reviewed-by: Jerome Glisse <[email protected]>