path: root/src/gallium/drivers
Commit message | Author | Age | Files | Lines
* r300/compiler: Avoid generating MOV instructions for invalid IMM swizzles v2 | Tom Stellard | 2012-11-16 | 1 | -4/+349
| If an instruction reads from a constant register that contains immediates using an invalid swizzle, we can avoid generating MOV instructions to fix up the swizzle by loading the immediates into a different constant register that can be read using a valid swizzle. This only affects r300 and r400 cards.
|
| For example:
|
| CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
| MAD temp[4].xy, const[0].xy__, const[1].xz__, input[0].xy__;
|
| ========== Before this change would be lowered to: =========
|
| CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
| MOV temp[0].x, const[1].x___;
| MOV temp[0].y, const[1]._z__;
| MAD temp[4].xy, const[0].xy__, temp[0].xy__, input[0].xy__;
|
| ========== After this change is lowered to: ===============
|
| CONST[1] = { -3.5000 3.5000 2.5000 1.5000 }
| CONST[2] = { 0.0000 -3.5000 2.5000 0.0000 }
| MAD temp[4].xy, const[0].xy__, const[2].yz__, input[0].xy__;
|
| ============================================================
|
| This change reduces one of the Lightsmark shaders from 133 to 91 instructions.
|
| v2:
| - Fix crash caused by swizzles with only inline constants.
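The repacking step in the example above can be sketched in C. This is only an illustration of the idea, not the r300 compiler's code: the function name, the SWZ_UNUSED marker and the integer swizzle encoding are all hypothetical, and the choice of which new swizzle is valid on the hardware is assumed to be made elsewhere.

```c
#include <assert.h>
#include <stdbool.h>

#define SWZ_UNUSED (-1) /* hypothetical marker for a channel the instruction does not read ("_") */

/*
 * Build a new 4-component constant so that reading it with new_swz yields
 * the same per-channel values as reading old_const with old_swz.
 * Swizzles are encoded as component indices 0..3 (x, y, z, w) or SWZ_UNUSED.
 * Returns false if two channels would need different values in the same
 * component of the new constant.
 */
bool repack_immediates(const float old_const[4], const int old_swz[4],
                       const int new_swz[4], float new_const[4])
{
    bool written[4] = { false, false, false, false };

    for (int i = 0; i < 4; i++)
        new_const[i] = 0.0f; /* unused components are left as 0 */

    for (int chan = 0; chan < 4; chan++) {
        if (old_swz[chan] == SWZ_UNUSED)
            continue;
        assert(new_swz[chan] >= 0 && new_swz[chan] < 4);

        float value = old_const[old_swz[chan]];
        int dst = new_swz[chan];

        if (written[dst] && new_const[dst] != value)
            return false; /* conflicting values for one component */
        new_const[dst] = value;
        written[dst] = true;
    }
    return true;
}
```

With the CONST[1] values from the commit message, the old swizzle xz__ and the new swizzle yz__, this sketch fills the new constant the same way as CONST[2] in the example.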
* radeonsi: clean up some magic numbers | Alex Deucher | 2012-11-16 | 1 | -1/+2
| Signed-off-by: Alex Deucher <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: emit PA_SC_RASTER_CONFIG | Alex Deucher | 2012-11-16 | 1 | -0/+11
| Use per-ASIC golden values. Programming this register doesn't seem to be strictly necessary on SI, but programming it incorrectly leads to rendering issues or reduced performance, so program the golden values explicitly to avoid potential problems down the road.
|
| Signed-off-by: Alex Deucher <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: remove new asserts and replace with warnings | Alex Deucher | 2012-11-15 | 1 | -2/+6
| Fixes piglit regressions.
|
| Signed-off-by: Alex Deucher <[email protected]>
* radeonsi: cleanup si_db() | Alex Deucher | 2012-11-15 | 2 | -12/+12
| Clean up a few magic numbers and rework the code a bit.
|
| Signed-off-by: Alex Deucher <[email protected]>
| Reviewed-by: Christian König <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: assert the CB format is valid (v2) | Alex Deucher | 2012-11-15 | 1 | -2/+3
| Assert that the CB format is valid and default to the INVALID hw format rather than ~0U when the format doesn't match for non-debug builds.
|
| v2: use INVALID hw format rather than ~0U
|
| Signed-off-by: Alex Deucher <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: assert that the DB format is valid (v2) | Alex Deucher | 2012-11-15 | 1 | -8/+5
| Assert that the DB format is valid and default to the INVALID hw format rather than ~0U when the format doesn't match for non-debug builds.
|
| v2: use INVALID hw format rather than ~0U
|
| Signed-off-by: Alex Deucher <[email protected]>
| Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: Set STENCILOPVAL fields to 1. | Michel Dänzer | 2012-11-14 | 1 | -2/+4
| This is necessary for backwards compatibility with pre-SI for stencil.
|
| Fixes a number of stencil related piglit tests, and real apps using stencil.
|
| Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Bump SI_PM4_MAX_DW. | Michel Dänzer | 2012-11-14 | 1 | -1/+1
| Fixes assertion failure with Mesa demo glsl/samplers.
|
| Signed-off-by: Michel Dänzer <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: Handle TGSI TXL opcode. | Michel Dänzer | 2012-11-14 | 1 | -0/+7
| Signed-off-by: Michel Dänzer <[email protected]>
| Reviewed-by: Tom Stellard <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* radeonsi: Handle TGSI TXB opcode. | Michel Dänzer | 2012-11-14 | 1 | -0/+7
| Signed-off-by: Michel Dänzer <[email protected]>
| Reviewed-by: Tom Stellard <[email protected]>
| Reviewed-by: Alex Deucher <[email protected]>
* r600g: use LINEAR_ALIGNED tiling for 1D array textures and if height0 <= 3 | Marek Olšák | 2012-11-13 | 1 | -1/+3
| Reviewed-by: Alex Deucher <[email protected]>
* r300g: don't call buffer_unmap in draw functions | Marek Olšák | 2012-11-13 | 1 | -11/+0
| It's been a no-op anyway.
* r300g: fix crash since the set_vertex_buffers(start_slot) change | Marek Olšák | 2012-11-13 | 1 | -6/+7
|
* r600g: untiled window-system buffers should be LINEAR_ALIGNED | Marek Olšák | 2012-11-13 | 1 | -1/+1
| though I guess the DDX allocates them as LINEAR_GENERAL
|
| Reviewed-by: Alex Deucher <[email protected]>
* r600g: use LINEAR_ALIGNED tiling for 1D textures | Marek Olšák | 2012-11-13 | 1 | -1/+2
| Reviewed-by: Alex Deucher <[email protected]>
* r600g: use LINEAR_ALIGNED tiling for staging textures, reorder the code | Marek Olšák | 2012-11-13 | 1 | -6/+10
| Reviewed-by: Alex Deucher <[email protected]>
* r600g: remove redundant parameter in r600_init_surface | Marek Olšák | 2012-11-13 | 1 | -6/+4
|
* gallivm,draw,llvmpipe: use base ptr + mip offsets instead of mip pointers | Roland Scheidegger | 2012-11-12 | 6 | -38/+83
| This might have a slight overhead but handling mip offsets more like the width (and image) strides should make some things easier (mip level being just part of the offset calculation) later.
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
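The base pointer plus per-level offset scheme can be sketched as follows. This is a simplified illustration under stated assumptions, not llvmpipe's layout code: the struct and function names are hypothetical, levels are packed back to back, and any tile or alignment padding a real driver would add is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LEVELS 16

/* Hypothetical layout record: one allocation, one byte offset per mip level. */
struct miptree_layout {
    size_t mip_offsets[MAX_LEVELS];
    size_t total_size;
    unsigned num_levels;
};

/* Size of a dimension at a given mip level, never smaller than 1. */
static unsigned minify(unsigned size, unsigned level)
{
    unsigned s = size >> level;
    return s ? s : 1;
}

/* Fill in the offset of every level inside a single tightly packed buffer. */
void compute_mip_offsets(struct miptree_layout *layout,
                         unsigned width0, unsigned height0,
                         unsigned bytes_per_pixel, unsigned num_levels)
{
    size_t offset = 0;

    assert(num_levels <= MAX_LEVELS);
    layout->num_levels = num_levels;

    for (unsigned level = 0; level < num_levels; level++) {
        unsigned w = minify(width0, level);
        unsigned h = minify(height0, level);

        layout->mip_offsets[level] = offset;
        offset += (size_t)w * h * bytes_per_pixel;
    }
    layout->total_size = offset;
}

/* A level's address is just the base pointer plus that level's offset. */
const unsigned char *mip_level_ptr(const unsigned char *base_ptr,
                                   const struct miptree_layout *layout,
                                   unsigned level)
{
    assert(level < layout->num_levels);
    return base_ptr + layout->mip_offsets[level];
}
```

Storing offsets rather than pointers means the mip level becomes one more term in the address calculation, alongside the row and image strides.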
* llvmpipe: always allocate whole miptrees not individual levels | Roland Scheidegger | 2012-11-12 | 2 | -60/+81
| This is preparation work for using mip level offsets + base_ptr for texture sampling instead of per-mip pointers.
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: José Fonseca <[email protected]>
* radeonsi: Implement alpha testing in pixel shader. | Michel Dänzer | 2012-11-12 | 6 | -38/+52
| Signed-off-by: Michel Dänzer <[email protected]>
* radeonsi: Initialize uses_kill boolean from TGSI info. | Michel Dänzer | 2012-11-12 | 1 | -0/+1
| Fixes discarded pixels incorrectly updating the depth buffer.
|
| Signed-off-by: Michel Dänzer <[email protected]>
* r600g: fix printk warnings | Dave Airlie | 2012-11-10 | 1 | -4/+4
| Brian reported seeing:
| r600_texture.c: In function ‘r600_texture_create_object’:
| r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
| r600_texture.c:468:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
| r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 3 has type ‘uint64_t’
| r600_texture.c:485:12: warning: format ‘%llu’ expects type ‘long long unsigned int’, but argument 4 has type ‘uint64_t’
|
| this should wrap over them fine.
|
| Signed-off-by: Dave Airlie <[email protected]>
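Warnings of this kind appear when uint64_t is not the same type as unsigned long long on the build target. The commit message does not say which fix was applied; one portable option in standard C is the PRIu64 macro from <inttypes.h>, sketched here with a hypothetical helper name and message format.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>
#include <string.h>

/*
 * Format two 64-bit values without assuming that uint64_t is
 * unsigned long long. PRIu64 expands to the correct conversion
 * specifier for the platform.
 */
int format_size_align(char *buf, size_t len, uint64_t size, uint64_t alignment)
{
    return snprintf(buf, len, "size=%" PRIu64 " alignment=%" PRIu64,
                    size, alignment);
}
```

An explicit cast of each argument to unsigned long long with the original %llu specifier is another common way to silence the warning.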
* softpipe: fix unused variable warning. | Dave Airlie | 2012-11-10 | 1 | -1/+1
| Signed-off-by: Dave Airlie <[email protected]>
* gallium: fix unused cap warnings in drivers for cube map array cap. | Dave Airlie | 2012-11-10 | 3 | -0/+3
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: add initial cube map array support (v2) | Dave Airlie | 2012-11-10 | 9 | -15/+238
| This contains the evergreen support. Support is possible on rv670 upwards and the code in here should work, but it doesn't and I haven't debugged it to figure out why.
|
| Beyond just adding support for the cube map array sampling, r600 resinfo isn't conformant with the GL specification, which states the number of layers should be returned for the textureSize, so we have to track in an external constant buffer the layers for each sampler if we need them in the shader.
|
| v2: only update the sampler constants if the sampler views have changed, as suggested by Marek.
|
| Reviewed-by: Marek Olšák <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* trace: Support geometry shaders. | José Fonseca | 2012-11-09 | 1 | -115/+71
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* galahad: Support geometry shader / stream-output methods. | José Fonseca | 2012-11-09 | 1 | -82/+110
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* softpipe: Fix rgb_dst_factor == PIPE_BLENDFACTOR_SRC_ALPHA_SATURATE. | José Fonseca | 2012-11-09 | 1 | -3/+3
| We must multiply the factor against the destination, not the source.
|
| NOTE: Candidate for the stable branches.
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
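For reference, the SRC_ALPHA_SATURATE factor for the RGB channels is f = min(As, 1 - Ad). When it is used as the destination factor, it scales the destination color, which is the point of the fix. A minimal sketch for one channel, with hypothetical function names that are not softpipe's:

```c
#include <assert.h>

static float minf(float a, float b)
{
    return a < b ? a : b;
}

/* RGB blend factor for SRC_ALPHA_SATURATE: f = min(As, 1 - Ad). */
float src_alpha_saturate_factor(float src_alpha, float dst_alpha)
{
    return minf(src_alpha, 1.0f - dst_alpha);
}

/*
 * Destination term of the blend equation for one RGB channel when the
 * destination factor is SRC_ALPHA_SATURATE: the factor multiplies the
 * destination color, not the source color.
 */
float blend_dst_term(float dst_color, float src_alpha, float dst_alpha)
{
    return dst_color * src_alpha_saturate_factor(src_alpha, dst_alpha);
}
```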
* softpipe: Handle adjacency primitives. | José Fonseca | 2012-11-09 | 1 | -0/+46
| Not fully tested.
|
| Based on diagrams from http://msdn.microsoft.com/en-us/library/windows/desktop/bb205124.aspx#Primitive_Adjacency
|
| v2: Fix based on Brian's feedback.
|
| Reviewed-by: Brian Paul <[email protected]>
| Reviewed-by: Roland Scheidegger <[email protected]>
* svga: Fix memory leak in svga_buffer_transfer_map. | Vinson Lee | 2012-11-08 | 1 | -0/+2
| Fixes resource leak defect reported by Coverity.
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* softpipe: add ARB_texture_cube_map_array support (v1.1) | Dave Airlie | 2012-11-09 | 4 | -14/+171
| This adds support to the softpipe texture sampler and tgsi exec.
|
| In order to handle the extra input to the texture sampling, I've had to expand the interfaces to take a c1 value for storing the texture compare value for the TEX2 case.
|
| v1.1: add comments (Brian)
|
| Reviewed-by: Brian Paul <[email protected]>
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: fix lod bias/explicit lod with cube maps. | Dave Airlie | 2012-11-09 | 1 | -8/+20
| While developing cube map array support I found that we didn't support this properly, and that piglit didn't test for it at all.
|
| I've submitted a test to piglit to check for this, and this commit fixes explicit lod and lod bias with cube maps.
|
| NOTE: This is a candidate for the 9.0 branch.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: clarify const buffer numbering and handling | Dave Airlie | 2012-11-09 | 4 | -4/+10
| For cube map arrays I'll need another driver-private constant buffer, and UBOs are also on the horizon. So clean up with some defines that can be modified when adding cube map array and UBO support later.
|
| Signed-off-by: Dave Airlie <[email protected]>
* r600g: fix pre eg export with llvm | Vincent Lejeune | 2012-11-08 | 1 | -1/+1
| Reviewed-by: Alex Deucher <alexander.deucher at amd.com>
| Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* r600g/compute: fix call to r600_bytecode_init | Alex Deucher | 2012-11-07 | 1 | -1/+2
| Signed-off-by: Alex Deucher <[email protected]>
* svga: Ensure vb_transfer in svga_swtnl_draw_vbo in initialized. | Vinson Lee | 2012-11-06 | 1 | -1/+1
| Fixes an uninitialized pointer read defect reported by Coverity.
|
| Signed-off-by: Vinson Lee <[email protected]>
| Reviewed-by: Brian Paul <[email protected]>
* r600g: add in-place DB decompression and texturing with DB tiling | Marek Olšák | 2012-11-06 | 8 | -79/+216
| The decompression is done in-place and only the compressed tiles are decompressed. Note: R6xx-R7xx can do that only with Z16 and Z32F.
|
| The texture unit is programmed to use non-displayable tiling and depth ordering of samples, so that it can fetch the texture in the native DB format.
|
| The latest version of the libdrm surface allocator is required for stencil texturing to work. The old one didn't create the mipmap tree correctly. We need a separate mipmap tree for stencil, because the stencil mipmap offsets are not really depth offsets/4.
|
| There are still some known bugs, but this should save some memory and it also improves performance a little bit in Lightsmark (especially with low resolutions; tested with Radeon HD 5000).
|
| The DB->CB copy is still used for transfers.
|
| Reviewed-by: Jerome Glisse <[email protected]>
* trace: Prevent segfault when passing NULL to set_vertex_buffers. | José Fonseca | 2012-11-05 | 2 | -15/+23
| State tracker now passes NULL buffer array to unbind buffers.
* galahad: Prevent segfault when passing NULL to set_vertex_buffers. | José Fonseca | 2012-11-05 | 1 | -1/+1
| State tracker now passes NULL buffer array to unbind buffers.
* nv50,nvc0: expose ARB_map_buffer_alignment | Lucas Stach | 2012-11-04 | 4 | -6/+8
| All HW buffers (including suballocated ones) are already aligned. Just make sure that the initial sysram buffers are properly aligned too.
* r600g: make tgsi-to-llvm generates store.pixel* intrinsic for fs | Vincent Lejeune | 2012-11-02 | 6 | -12/+130
| Reviewed-by: Tom Stellard <thomas.stellard at amd.com>
* radeonsi: Implement support for vertex shader samplers. | Michel Dänzer | 2012-11-02 | 2 | -22/+60
| Signed-off-by: Michel Dänzer <[email protected]>
* r600g: re-enable handling of DISCARD_RANGE, improving performance | Marek Olšák | 2012-11-01 | 1 | -2/+0
| It seems to work for me now. Even the graphics corruption is gone.
|
| This also boosts performance in Reaction Quake.
* r600g: fix abysmal performance in Reaction Quake | Marek Olšák | 2012-11-01 | 2 | -21/+24
| The problem was we set VRAM|GTT for relocations of STATIC resources. Setting just VRAM increases the framerate 4 times on my machine.
|
| I rewrote the switch statement and adjusted the domains for window framebuffers too.
|
| NOTE: This is a candidate for the stable branches.
|
| Reviewed-by: Alex Deucher <[email protected]>
| Reviewed-by: Jerome Glisse <[email protected]>
* llvmpipe: Obey back writemask. | José Fonseca | 2012-10-31 | 1 | -2/+8
| Tested with a modified glean tstencil2 test.
|
| NOTE: This is a candidate for stable branches.
|
| Reviewed-by: Brian Paul <[email protected]>
* r600g: avoid shader needing too many gpr to lockup the gpu v2 | Jerome Glisse | 2012-10-31 | 3 | -34/+62
| On r6xx/r7xx, shader resource management needs to make sure that the shader does not go over the GPR register limit. Each specific ASIC has a maximum number of registers that can be split between shader stages. For each stage, the shader must not use more registers than the limit programmed.
|
| v2: Print an error message when discarding draw. Don't add another boolean to context structure, but rather propagate the discard boolean through the call chain.
|
| Signed-off-by: Jerome Glisse <[email protected]>
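The per-stage check described here might look roughly like the sketch below. All names, the two-stage enum and the message text are hypothetical; the real driver covers more stages and derives the limits from the ASIC's register configuration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical subset of shader stages that share the GPR pool. */
enum shader_stage { STAGE_VS, STAGE_PS, STAGE_COUNT };

/* Hypothetical record of the per-stage GPR limits currently programmed. */
struct gpr_budget {
    unsigned limit[STAGE_COUNT];
};

/*
 * Returns true when the draw has to be discarded because the shader for
 * this stage needs more GPRs than the programmed limit. The result is
 * meant to be passed back up the call chain rather than stored in the
 * context.
 */
bool shader_exceeds_gpr_limit(const struct gpr_budget *budget,
                              enum shader_stage stage, unsigned gprs_used)
{
    if (gprs_used > budget->limit[stage]) {
        fprintf(stderr, "shader uses %u GPRs, limit is %u; discarding draw\n",
                gprs_used, budget->limit[stage]);
        return true;
    }
    return false;
}
```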
* r600g: use SQ_VTX_SEMANTIC_CLEAR to clear the semantic registers | Marek Olšák | 2012-10-31 | 4 | -99/+11
| Reviewed-by: Alex Deucher <[email protected]>
* gallium: expose ARB_map_buffer_alignment on Radeon | Marek Olšák | 2012-10-31 | 11 | -2/+18
| Reviewed-by: Brian Paul <[email protected]>
|
| v2: update relnotes-9.1
| v3: use align_malloc and align_free for malloced buffers in r300g
| v4: document the new CAP in the docs
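The v3 note refers to Mesa's align_malloc and align_free helpers. As a stand-in that needs nothing outside the C standard library, the sketch below uses C11 aligned_alloc to show what an aligned system-memory buffer involves. The 64-byte alignment and both function names are assumptions made for this illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Assumed alignment for mapped buffers in this sketch. */
#define MAP_ALIGNMENT 64u

/*
 * Allocate a buffer whose start address is a multiple of MAP_ALIGNMENT.
 * aligned_alloc expects the size to be a multiple of the alignment, so
 * the request is rounded up first. Release the result with free().
 */
void *alloc_mappable(size_t size)
{
    size_t padded = (size + MAP_ALIGNMENT - 1) & ~(size_t)(MAP_ALIGNMENT - 1);

    if (padded == 0)
        padded = MAP_ALIGNMENT;
    return aligned_alloc(MAP_ALIGNMENT, padded);
}

/* Check whether a pointer satisfies the assumed map alignment. */
int is_map_aligned(const void *ptr)
{
    return ((uintptr_t)ptr % MAP_ALIGNMENT) == 0;
}
```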
* r600g: use better sample positions for 8x MSAA | Marek Olšák | 2012-10-31 | 2 | -12/+12
| Taken from the Intel driver. The sample positions are actually a solution to the 8 queens puzzle. It gives more accurate and smoother AA.
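The 8 queens property means that, on an 8x8 sub-pixel grid, no two samples share a row, a column or a diagonal. A small checker for that property is sketched below; the function name and the one-column-per-row encoding are hypothetical, and the driver's actual sample positions are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>

#define NUM_SAMPLES 8

/*
 * col_of_row[r] is the grid column of the sample placed in row r, so each
 * row holds exactly one sample by construction. The layout is a queens
 * solution if no two samples share a column or a diagonal.
 */
bool is_queens_layout(const int col_of_row[NUM_SAMPLES])
{
    for (int i = 0; i < NUM_SAMPLES; i++) {
        if (col_of_row[i] < 0 || col_of_row[i] >= NUM_SAMPLES)
            return false; /* sample outside the grid */

        for (int j = i + 1; j < NUM_SAMPLES; j++) {
            int dc = col_of_row[i] - col_of_row[j];
            int dr = j - i;

            if (dc == 0 || dc == dr || dc == -dr)
                return false; /* same column or same diagonal */
        }
    }
    return true;
}
```

Any layout that passes this check covers every row and column of the grid exactly once, which spreads the samples evenly in both axes.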