mesa.git

	Commit message	Author	Age	Files	Lines
*	r600g/sb: improve alu packing on cayman	Vadim Girlin	2013-07-17	2	-15/+89
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Scheduler/register allocator in r600-sb was developed and optimized on evergreen (VLIW-5) hardware, so currently it's not optimal for VLIW-4 chips. This patch should improve performance on cayman gpus due to better alu packing, but also it tends to increase register usage, so overall positive effect on performance has to be proven by real benchmarks yet. Some results with bfgminer kernel on cayman: source bytecode: 60 gprs, 3905 alu groups, sbcl before the patch: 45 gprs, 4088 alu groups, sbcl with this patch: 55 gprs, 3474 alu groups. Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: fix handling of new multislot instructions on cayman	Vadim Girlin	2013-07-17	3	-5/+6
\| \| \| \| \| \| \|	Ex-scalar instructions that became multislot on cayman do replicate result to all channels - handle them similar to DOT4. Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: fix debug dump code in scheduler	Vadim Girlin	2013-07-17	1	-4/+5
\| \| \| \| \| \|	Update the stale debug code for other changes related to debug output. Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: fix initial register allocation	Vadim Girlin	2013-07-17	1	-0/+1
\| \| \| \| \| \| \| \| \| \|	Mark values that are members of the 'same register' constraint as preallocated in ra_init pass, this will prevent incorrect reallocation in scheduler in some cases. Should fix https://bugs.freedesktop.org/show_bug.cgi?id=66713 Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: move chip & class name functions to sb_context	Vadim Girlin	2013-07-17	4	-53/+55
\| \| \| \|	Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: fix handling of PS in source bytecode on cayman	Vadim Girlin	2013-07-17	1	-0/+5
\| \| \| \| \| \| \| \| \|	Actually PS doesn't make sense for cayman and isn't even mentioned in cayman docs, but llvm backend currently uses it in bytecode and, assuming that hw seems to be mostly ok with it, this will allow sb to parse such source bytecode correctly. Signed-off-by: Vadim Girlin <[email protected]>
*	r600g/sb: Initialize ra_checker member variables.	Vinson Lee	2013-07-17	1	-1/+1
\| \| \| \| \| \|	Fixes "Uninitialized scalar field" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]>
*	llvmpipe: support sRGB framebuffers	Roland Scheidegger	2013-07-16	2	-14/+57
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	Just use the new conversion functions to do the work. The way it's plugged into the blend code is quite hacktastic but follows all the same hacks as used by packed float format already. Only support 4x8bit srgb formats (rgba/rgbx plus swizzle), 24bit formats never worked anyway in the blend code and are thus disabled, and I don't think anyone is interested in L8/L8A8. Would need even more hacks otherwise. Unless I'm missing something, this is the last feature except MSAA needed for OpenGL 3.0, and for OpenGL 3.1 as well I believe. v2: prettify a bit, use separate function for packing. Reviewed-by: Jose Fonseca <[email protected]>
*	Revert "r300g: allow HiZ with a 16-bit zbuffer"	Marek Olšák	2013-07-15	1	-0/+1
\| \| \| \| \| \| \| \|	This reverts commit 631c631cbf5b7e84e42a7cfffa1c206d63143370. https://bugs.freedesktop.org/show_bug.cgi?id=66921 Cc: [email protected]
*	r300g/swtcl: fix a lockup in MSAA resolve	Marek Olšák	2013-07-15	1	-0/+7
\| \| \| \|	Cc: [email protected]
*	r300g/swtcl: fix geometry corruption by uploading indices to a buffer	Marek Olšák	2013-07-15	3	-45/+31
\| \| \| \| \| \| \| \| \| \| \| \| \|	The splitting of a draw call into several draw commands was broken, because the split sometimes took place in the middle of a primitive. The splitting was supposed to be dealing with the case when there are more indices than the maximum size of a CS. This commit throws that code away and uses a real index buffer instead. https://bugs.freedesktop.org/show_bug.cgi?id=66558 Cc: [email protected]
*	ilo: skip 3DSTATE_INDEX_BUFFER when possible	Chia-I Wu	2013-07-14	4	-59/+77
\| \| \| \| \| \|	When only the offset to the index buffer is changed, we can skip the 3DSTATE_INDEX_BUFFER if we always use 0 for the offset, and add (offset / index_size) to Start Vertex Location in 3DPRIMITIVE.
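The adjustment described above can be sketched as follows. This is only an illustration of the arithmetic, not the ilo driver's code: the helper name `adjusted_start_vertex` and its parameters are hypothetical, and the sketch assumes the byte offset is a multiple of the index size.

```c
#include <assert.h>

/* Hypothetical helper (not from the ilo driver): fold a byte offset into
 * the index buffer into the Start Vertex Location of an indexed draw, so
 * the index buffer itself can always be programmed with offset 0. */
static unsigned
adjusted_start_vertex(unsigned start, unsigned ib_offset, unsigned index_size)
{
   /* Assumption: indices are 1, 2 or 4 bytes wide. */
   assert(index_size == 1 || index_size == 2 || index_size == 4);
   /* Assumption: the offset is aligned to the index size. */
   assert(ib_offset % index_size == 0);

   return start + ib_offset / index_size;
}
```

For example, a 64-byte offset into a buffer of 16-bit indices moves the start location forward by 32 indices.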
*	r600g/sb: Initialize ra_constraint::cost.	Vinson Lee	2013-07-13	1	-1/+1
\| \| \| \| \| \|	Fixes "Uninitialized scalar field" reported by Coverity. Signed-off-by: Vinson Lee <[email protected]>
*	ilo: move a sanity check into its assert()	Chia-I Wu	2013-07-13	1	-5/+2
\| \| \| \| \| \|	The compiler does not know that ilo_3d_pipeline_estimate_size() is pure and can be eliminated in a release build in gen6_pipeline_end(). Move the call into the assert().
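A minimal sketch of the pattern, assuming a stand-in function with a visible side effect. The names `estimate_size`, `pipeline_end_before` and `pipeline_end_after` are hypothetical and are not the ilo functions named in the commit. When `NDEBUG` is defined, `assert()` expands to nothing, so a call placed inside it disappears together with the check.

```c
#include <assert.h>

static int estimate_calls;

/* Hypothetical stand-in for a size-estimation function that the compiler
 * cannot prove to be free of side effects. */
static int
estimate_size(void)
{
   estimate_calls++;
   return 64;
}

/* Before: the call is always made, even when assert() compiles away. */
static void
pipeline_end_before(int used)
{
   int estimate = estimate_size();

   assert(used <= estimate);
   (void)estimate;
   (void)used;
}

/* After: the call lives inside assert(), so a release build (NDEBUG)
 * drops both the check and the call. */
static void
pipeline_end_after(int used)
{
   assert(used <= estimate_size());
   (void)used;
}
```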
*	ilo: mark some states dirty when they are really changed	Chia-I Wu	2013-07-13	1	-0/+16
\| \| \| \| \|	The checks may seem redundant because cso_context handles them, but util_blitter does not have access to cso_context.
*	ilo: clean up ilo_blitter_pipe_begin()	Chia-I Wu	2013-07-13	3	-27/+39
\| \| \| \| \|	Document why certain states need to be saved, and fix a bug when blitting with scissor enabled.
*	r600g: don't use the CB/DB CP COHER logic on r6xx	Alex Deucher	2013-07-12	1	-2/+10
\| \| \| \| \| \| \| \| \|	There are hw bugs. Flush and inv event is sufficient. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66837 Signed-off-by: Alex Deucher <[email protected]>
*	nv30: fix KILL_IF breakage	Brian Paul	2013-07-12	1	-1/+1
\| \| \| \|	Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66858
*	tgsi: rename the TGSI fragment kill opcodes	Brian Paul	2013-07-12	14	-55/+53
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	TGSI_OPCODE_KIL and KILP had confusing names. The former was conditional kill (if any src component < 0). The latter was unconditional kill. At one time KILP was supposed to work with NV-style condition codes/predicates but we never had that in TGSI. This patch renames both opcodes: TGSI_OPCODE_KIL -> KILL_IF (kill if src.xyzw < 0) TGSI_OPCODE_KILP -> KILL (unconditional kill) Note: I didn't just transpose the opcode names to help ensure that I didn't miss updating any code anywhere. I believe I've updated all the relevant code and comments but I'm not 100% sure that some drivers had this right in the first place. For example, the radeon driver might have llvm.AMDGPU.kill and llvm.AMDGPU.kilp mixed up. Driver authors should review their code. Reviewed-by: Jose Fonseca <[email protected]>
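The conditional form described above (kill if any source component is below zero) can be sketched in scalar C. The function `kill_if` is a hypothetical illustration of the semantics stated in the message, not code from any TGSI interpreter or driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical illustration of KILL_IF semantics: the fragment is
 * discarded when any of the four source components is negative.
 * KILL, by contrast, would discard unconditionally. */
static bool
kill_if(const float src[4])
{
   assert(src != NULL);

   for (int i = 0; i < 4; i++) {
      if (src[i] < 0.0f)
         return true;
   }
   return false;
}
```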
*	softpipe: silence some MSVC warnings	Brian Paul	2013-07-12	2	-14/+14
\|
*	radeon/uvd: fall back to shader based decoding for MPEG2 on UVD 2.x v2	Christian König	2013-07-12	2	-5/+19
\| \| \| \| \| \| \| \| \| \| \|	UVD 2.x doesn't support hardware decoding of MPEG2, just use shader based decoding for those chipsets. Fixes: https://bugs.freedesktop.org/show_bug.cgi?id=66450 v2: fix interlacing as well Signed-off-by: Christian König <[email protected]>
*	r600g: x/y coordinates must be divided by block dim in dma blit	Christoph Bumiller	2013-07-11	2	-4/+16
\| \| \| \| \| \| \|	Note: this is a candidate for the 9.1 branch. Reviewed-by: Marek Olšák <[email protected]> Signed-off-by: Alex Deucher <[email protected]>
*	r600g/sb: Fix Android build v2	Chih-Wei Huang	2013-07-12	4	-7/+8
\| \| \| \| \| \| \|	Add the sb CXX files to the Android Makefile and also stop using some c++11 features. v2 (Vadim Girlin): use &bc[0] instead of bc.begin()
*	r600g/sb: improve math optimizations v2	Vadim Girlin	2013-07-11	11	-47/+435
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	This patch adds support for some math optimizations that are generally considered unsafe, which is why they are currently disabled for compute shaders. GL requirements are less strict, so they are enabled for GL shaders by default. In case of any issues with applications that rely on higher precision than guaranteed by GL, the 'sbsafemath' option in R600_DEBUG allows disabling them. v2 - always set proper src vector size for transformed instructions - check for clamp modifier in the expr_handler::fold_assoc Signed-off-by: Vadim Girlin <[email protected]>
*	ilo: reduce PIPE_CAP_MAX_TEXTURE_CUBE_LEVELS to 12	Chia-I Wu	2013-07-11	1	-2/+3
\| \| \| \|	So that there are at most (2^22 * 6) texels, lower than the 2^26 limit.
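The texel count above can be checked with a short calculation. This sketch assumes the cap counts mip levels, so that N levels imply a largest face edge of 2^(N-1); the helper name `cube_base_level_texels` is hypothetical.

```c
#include <assert.h>

/* Hypothetical helper: number of texels in the largest mip level of a
 * cube map, assuming max_levels mip levels give a base edge of
 * 2^(max_levels - 1) and a cube has 6 square faces. */
static unsigned long long
cube_base_level_texels(unsigned max_levels)
{
   assert(max_levels >= 1 && max_levels <= 16);

   unsigned long long edge = 1ull << (max_levels - 1);
   return edge * edge * 6;
}
```

With 12 levels the edge is 2^11, each face holds 2^22 texels, and the six faces together stay below 2^26. Under the same assumption, 13 levels would exceed that limit.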
*	ilo: correctly initialize undefined registers in fs	Chia-I Wu	2013-07-11	1	-5/+15
\| \| \| \| \|	Initialize all 4 channels of undefined registers (that is, TEMPs that are used before being assigned) in FS.
*	radeonsi: Handle TGSI_OPCODE_DDX/Y using local memory	Michel Dänzer	2013-07-10	4	-2/+103
\| \| \| \| \| \|	16 more little piglits. Reviewed-by: Tom Stellard <[email protected]>
*	radeonsi: Handle TGSI_OPCODE_TXD	Michel Dänzer	2013-07-10	1	-2/+25
\| \| \| \| \| \|	One more little piglit. Reviewed-by: Tom Stellard <[email protected]>
*	ilo: honor surface padding requirements	Chia-I Wu	2013-07-10	1	-0/+53
\| \| \| \|	The PRM specifies several padding requirements that we failed to honor.
*	util: treat denorm'ed floats like zero	Zack Rusin	2013-07-09	2	-0/+11
\| \| \| \| \| \| \| \| \| \| \| \| \|	The D3D10 spec is very explicit about treatment of denorm floats and the behavior is exactly the same for them as it would be for -0 or +0. This makes our shading code match that behavior, since OpenGL doesn't care and on a few CPUs it's faster (worst case the same). Float16 conversions will likely break but we'll fix them in a follow up commit. Signed-off-by: Zack Rusin <[email protected]> Reviewed-by: Jose Fonseca <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
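To illustrate what "treat denorms like zero" means for a single 32-bit float, here is a software sketch. It is not how the commit implements the behavior (the message does not say), and the function name `flush_denorm_to_zero` is hypothetical. Keeping the sign bit when flushing is an assumption of this sketch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical illustration: return f with denormal values replaced by
 * zero. An IEEE 754 single is zero or denormal exactly when its 8-bit
 * exponent field is 0; clearing the mantissa then yields +0 or -0. */
static float
flush_denorm_to_zero(float f)
{
   uint32_t bits;

   assert(sizeof(float) == sizeof(uint32_t));
   memcpy(&bits, &f, sizeof bits);

   if ((bits & 0x7f800000u) == 0)
      bits &= 0x80000000u;   /* keep only the sign bit (assumption) */

   memcpy(&f, &bits, sizeof f);
   return f;
}
```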
*	r600g: improve the mechanism for recognizing an empty CS	Marek Olšák	2013-07-08	3	-3/+8
\| \| \| \|	Reviewed-by: Alex Deucher <[email protected]>
*	r600g: explicitly flush caches for streamout-based buffer copying & clearing	Marek Olšák	2013-07-08	1	-0/+13
\| \| \| \| \| \| \|	It's done automatically for vertex buffers, but not for constant buffers, textures, and colorbuffers. Reviewed-by: Alex Deucher <[email protected]>
*	r600g: only flush the caches that need to be flushed during CP DMA operations	Marek Olšák	2013-07-08	3	-32/+117
\| \| \| \| \| \| \|	This should increase performance if constant uploads are done with the CP DMA, because only the cache that needs to be flushed is flushed. Reviewed-by: Alex Deucher <[email protected]>
*	r600g: split INVAL_READ_CACHES into vertex, tex, and const cache flags	Marek Olšák	2013-07-08	5	-27/+52
\| \| \| \| \| \| \|	also flushing any cache in evergreen_emit_cs_shader seems to be superfluous (we don't flush caches when changing the other shaders either) Reviewed-by: Alex Deucher <[email protected]>
*	r600g: adjust flush flags (v3)	Alex Deucher	2013-07-08	6	-7/+42
\| \| \| \| \| \| \| \| \| \| \| \| \|	1. flush SH with read caches 2. add flag for DB flushes 3. add flag for CB flushes v2: flush all CBs, remove redundant emit_state variable. v3: Marek: also set the new flags in r600_context_flush, the CP dma functions, and texture_barrier, and rename them Signed-off-by: Marek Olšák <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r600g: don't call buffer_wait in buffer_mmap_sync_with_rings	Marek Olšák	2013-07-08	1	-2/+1
\| \| \| \| \| \| \| \|	The winsys should do this, because it measures how much time we spend in buffer_map doing synchronization, which can be viewed with the gallium HUD. Reviewed-by: Alex Deucher <[email protected]>
*	r600g: don't read back the MSAA depth buffer if the read flag is not set	Marek Olšák	2013-07-08	1	-8/+8
\| \| \| \|	Reviewed-by: Alex Deucher <[email protected]>
*	r600g: don't flush the context in texture_transfer_map	Marek Olšák	2013-07-08	1	-5/+0
\| \| \| \| \| \|	the winsys does this automatically Reviewed-by: Alex Deucher <[email protected]>
*	r600g: fix texture offset computation for mapped MSAA depth buffers	Marek Olšák	2013-07-08	2	-16/+14
\| \| \| \| \| \| \| \| \|	It was wrong, because the offset shouldn't be applied to MSAA depth buffers. This small cleanup should prevent such issues in the future. This fixes a lockup in "piglit/fbo-depthstencil default_fb -samples=n". Reviewed-by: Alex Deucher <[email protected]>
*	r600g: fix color resolve for RGBX8 and RGBX16 integer formats	Marek Olšák	2013-07-08	1	-2/+2
\| \| \| \|	Reviewed-by: Alex Deucher <[email protected]>
*	r600g: enable fast MSAA color clear for array/3D/cube textures	Marek Olšák	2013-07-08	1	-4/+3
\| \| \| \|	Reviewed-by: Alex Deucher <[email protected]>
*	r600g: implement fast MSAA color clear for integer textures	Marek Olšák	2013-07-08	1	-9/+12
\| \| \| \| \| \| \|	this also fixes the fast clear with multiple colorbuffers and each having a different format Reviewed-by: Alex Deucher <[email protected]>
*	r600/uvd: fix check for UVD 2.x	Christian König	2013-07-08	1	-1/+1
\| \| \| \|	Signed-off-by: Christian König <[email protected]>
*	nvc0: enable very initial support for nvf0 (GK110)	Ben Skeggs	2013-07-05	4	-5/+75
\| \| \| \| \| \| \|	Shaders need a lot of work still. Basic stuff generally works, so this is basically just fine for gnome-shell, OA etc at this point. Signed-off-by: Ben Skeggs <[email protected]>
*	gallivm: do per-pixel lod calculations for explicit lod	Roland Scheidegger	2013-07-04	1	-1/+2
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	d3d10 requires per-pixel lod calculations for explicit lod, lod bias and explicit derivatives, and we should probably do it for OpenGL too - at least if they are used from vertex or geometry shaders (so doesn't apply to lod bias) this doesn't just affect neighboring pixels. Some code was already there to handle this so fix it up and enable it. There will no doubt be a performance hit unfortunately, we could do better if we knew we had a real vector shift instruction (with variable shift count) but this requires AVX2 on x86 (or an AMD Bulldozer family cpu). Don't do anything for lod bias and explicit derivatives yet, though no special magic should be needed for them either. Likewise, the size query is still broken just the same. v2: Use information if lod is a (broadcast) scalar or not. The idea would be to base this on the actual value, for now just pretend it's a scalar in fs and not a scalar otherwise (so, per-pixel lod is only used in gs/vs but same code is generated for fs as before). Reviewed-by: Jose Fonseca <[email protected]>
*	mesa,glsl,gallium: remove GLSLSkipStrictMaxVaryingLimitCheck and dependencies	Marek Olšák	2013-07-02	12	-12/+0
\| \| \| \| \| \|	Not needed with do_dead_builtin_varyings. Reviewed-by: Ian Romanick <[email protected]>
*	draw/translate: fix instancing	Zack Rusin	2013-06-28	4	-16/+16
\| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \| \|	We were incorrectly computing the buffer offset when using the instances. The buffer offset is always equal to: start_instance * stride + (instance_num / instance_divisor) * stride We were completely ignoring the start instance quite often, producing instances that were completely wrong, e.g. if start instance = 5, instance divisor = 2, then on the first iteration it should be: 5 * stride, not (5/2) * stride as we'd have currently, and if start instance = 1, instance divisor = 3, then on the first iteration it should be: 1 * stride, not 0 as we'd have. This fixes it and adjusts all the code to the changes. Signed-off-by: Zack Rusin <[email protected]>
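The offset formula from the message can be written out as a small function. This is a sketch of the stated arithmetic only; the helper name `instance_buffer_offset` and its signature are hypothetical and do not come from the draw or translate modules.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper implementing the formula quoted in the commit:
 *   start_instance * stride + (instance_num / instance_divisor) * stride
 * The division is an integer division, so the start instance must be
 * added outside of it. */
static size_t
instance_buffer_offset(unsigned start_instance, unsigned instance_num,
                       unsigned instance_divisor, size_t stride)
{
   /* Assumption: a divisor of 0 (non-instanced data) is handled elsewhere. */
   assert(instance_divisor > 0);

   return (size_t)start_instance * stride +
          (size_t)(instance_num / instance_divisor) * stride;
}
```

With a stride of 16 bytes, start instance 5 and divisor 2 give an offset of 80 on the first iteration (5 * stride), and start instance 1 with divisor 3 gives 16 (1 * stride), matching the two cases in the message.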
*	nvc0: allow frame dropping in h264	Maarten Lankhorst	2013-07-01	1	-3/+0
\| \| \| \| \| \| \| \|	The only reason the checks existed was paranoia: when I first wrote the code I wasn't sure it was correct. Now that I am, the asserts triggered when XBMC was dropping frames, so remove them. NOTE: This is a candidate for the 9.1 branch.
*	r300g/compiler: Prevent regalloc from swizzling texture operands v2	Tom Stellard	2013-06-30	5	-0/+124
\| \| \| \| \| \| \| \| \|	https://bugs.freedesktop.org/show_bug.cgi?id=63520 NOTE: This is a candidate for the stable branches. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Alex Deucher <[email protected]>
*	r300g/compiler/tests: Add an assembly parser	Tom Stellard	2013-06-30	5	-16/+200
\| \| \| \| \| \| \|	The assembly parser can be used to load r300 assembly dumps and run them through any of the r300 compiler passes. Reviewed-by: Alex Deucher <[email protected]>