mesa.git

	Commit message	Author	Age	Files	Lines
*	cell: perform triangle cull a little earlier	Jonathan Adamczewski	2009-05-21	1	-31/+74
	In spu_tri.c:setup_sort_vertices() triangles are culled after the vertices are sorted. This patch moves the check a little earlier and performs the actual check a little faster through intrinsics and a little trickery. Code size is reduced, and less work is done before a triangle is deemed OK to skip.
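As a rough illustration of culling before any sorting work, here is a minimal scalar sketch in plain C. It is not the SPU-intrinsic code from this commit; the `vec2` type, the `cull_mode` enum and both function names are hypothetical, and the winding convention (positive area means counter-clockwise with y pointing up) is an assumption.

```c
#include <stdbool.h>

/* Hypothetical vertex type: only the window-space x/y matter for culling. */
struct vec2 { float x, y; };

/* Hypothetical cull modes, named for illustration only. */
enum cull_mode { CULL_NONE, CULL_CW, CULL_CCW };

/* Twice the signed area of triangle (v0, v1, v2). The sign encodes the
 * winding: positive means counter-clockwise when y points up (assumed). */
static float signed_area2(struct vec2 v0, struct vec2 v1, struct vec2 v2)
{
    float ex = v1.x - v0.x, ey = v1.y - v0.y;
    float fx = v2.x - v0.x, fy = v2.y - v0.y;
    return ex * fy - ey * fx;
}

/* Returns true when the triangle can be skipped before any vertex sorting
 * or interpolation setup has been done. */
static bool should_cull(struct vec2 v0, struct vec2 v1, struct vec2 v2,
                        enum cull_mode mode)
{
    float area = signed_area2(v0, v1, v2);

    if (area == 0.0f)
        return true;               /* degenerate: covers no pixels */
    if (mode == CULL_CCW)
        return area > 0.0f;
    if (mode == CULL_CW)
        return area < 0.0f;
    return false;                  /* CULL_NONE keeps both windings */
}
```

Running this test first means a rejected triangle costs only a few subtractions and multiplies, which is the point of moving the check ahead of the sort.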
*	cell: unroll inner loop of spu_render.c:cmd_render()	Jonathan Adamczewski	2009-05-21	1	-22/+18
	It was taking approximately 50 cycles to extract the vertex indices, calculate the vertex_header pointers and call tri_draw() for each set of three vertices. Unrolled, it takes less than 100 cycles to extract, unpack, calculate pointers and call tri_draw() eight times. It does have a nasty jump-tabled switch. I'm sure that there's a better way... Code size of spu_render.o gets larger due to the extra constants and work in the inner loop, there are extra stack saves and loads because more registers are in use, and there is an added assert. spu_tri.o gets a little smaller.
*	cell: use some SPU intrinsics to get slightly better code in eval_inputs()	Brian Paul	2009-02-16	1	-4/+7
	Suggested by Jonathan Adamczewski. There may be more places to do this...
*	cell: new/tighter code for computing fragment program inputs	Brian Paul	2009-02-15	1	-91/+76
*	cell: combine eval_z(), eval_w() functions	Brian Paul	2009-02-15	1	-20/+27
*	cell: Specify constant as float for CEILF().	Jonathan Adamczewski	2009-01-14	1	-1/+1
	Without the f, the constant is treated as a double, resulting in slower arithmetic and libgcc conversion calls each time CEILF() is used.
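The effect of the missing suffix can be seen in plain C. The sketch below assumes a CEILF()-style macro of the shape shown; the macro bodies, the constant value and the helper names are illustrative and not copied from spu_tri.c. `sizeof` is used only to expose the type the compiler gives to the sum.

```c
#include <stddef.h>

/* Assumed shapes of the macro before and after the fix (illustrative). */
#define CEILF_DOUBLE(X) ((float) (int) ((X) + 0.99999))   /* double constant */
#define CEILF_FLOAT(X)  ((float) (int) ((X) + 0.99999f))  /* float constant  */

/* Size of the intermediate sum: the unsuffixed constant promotes the whole
 * addition to double, the suffixed one keeps it in float. */
static size_t sum_size_without_suffix(float x) { return sizeof(x + 0.99999); }
static size_t sum_size_with_suffix(float x)    { return sizeof(x + 0.99999f); }

/* Both variants round the same way for ordinary inputs; only the cost of
 * the arithmetic differs on hardware where double math is slow. */
static float ceil_with_double_constant(float x) { return CEILF_DOUBLE(x); }
static float ceil_with_float_constant(float x)  { return CEILF_FLOAT(x); }
```

On a target where double arithmetic is slow or needs library support, the first form pays for a float-to-double conversion, a double add and a conversion back on every use.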
*	cell: SIMDize sorting in setup_sort_vertices()	Jonathan Adamczewski	2009-01-05	1	-55/+42
	Put setup.v{min,mid,max,provoke} into a union with qword vertex_headers. Rewrite vertex sorting to more efficiently handle the packed data items. Reduces spu_tri.o by ~128 bytes.
*	cell: SIMDize some subtractions	Jonathan Adamczewski	2009-01-05	1	-8/+10
	Put edge.{dx,dy} into a union with a vector and perform subtractions in setup_sort_vertices() on vectors. Reduces spu_tri.o by ~300 bytes.
*	cell: improvements to spu_tri.c	Jonathan Adamczewski	2009-01-04	1	-42/+52
	Replace int setup.span{left,right}[2] with vec_uint4 setup.span.quad.
	SIMDize calculate_mask() and inline it into flush_spans().
	Set setup.span.quad members using spu_shuffle() or spu_sel().
	Reduces spu_tri.o by ~116 bytes.
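For readers unfamiliar with the span mask, here is a scalar sketch of what a calculate_mask()-style function computes for one 2x2 quad. It is an assumption about the logic, not the SIMD code from this commit; the mask bit names and the `span` struct are hypothetical, with index 0 standing for the top row and index 1 for the bottom row.

```c
/* Hypothetical coverage bits for the four pixels of a 2x2 quad. */
enum {
    MASK_TOP_LEFT     = 1,
    MASK_TOP_RIGHT    = 2,
    MASK_BOTTOM_LEFT  = 4,
    MASK_BOTTOM_RIGHT = 8
};

/* Hypothetical span bounds: pixels in [left, right) are covered on each of
 * the two rows (index 0 = top row, index 1 = bottom row). */
struct span {
    int left[2];
    int right[2];
};

/* Coverage mask for the quad whose left column is x. */
static unsigned quad_mask(const struct span *s, int x)
{
    unsigned mask = 0;

    if (x     >= s->left[0] && x     < s->right[0]) mask |= MASK_TOP_LEFT;
    if (x + 1 >= s->left[0] && x + 1 < s->right[0]) mask |= MASK_TOP_RIGHT;
    if (x     >= s->left[1] && x     < s->right[1]) mask |= MASK_BOTTOM_LEFT;
    if (x + 1 >= s->left[1] && x + 1 < s->right[1]) mask |= MASK_BOTTOM_RIGHT;

    return mask;
}
```

The four tests are independent comparisons against the same four bounds, which is why packing the bounds into one vector lets them be evaluated together.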
*	CELL: two-sided stencil fixes	Robert Ellison	2008-11-11	1	-4/+16
	With these changes, the tests/stencil_twoside test now works.
	- Eliminate blending from the stencil_twoside test, as it produces an unneeded dependency on having blending working
	- The spe_splat() function will now work if the register being splatted and the destination register are the same
	- Separate fragment code generated for front-facing and back-facing fragments. Often these are the same; if two-sided stenciling is on, they can be different. This is easier and faster than generating code that does both tests and merges the results.
	- Fixed a cut/paste bug where if the back Z-pass stencil operation were different from all the other operations, the back Z-fail results were incorrect.
*	CELL: stencil bug fixes	Robert Ellison	2008-10-30	1	-1/+1
	Two definitive bugs in stenciling were fixed.
	The first, reversed registers in the generated Select Bytes (selb) instruction, caused the stenciling INCR and DECR operations to fail dramatically, putting new values in where old values were supposed to be and vice versa.
	The second caused stencil tiles to not be read from or written to main memory by the SPUs. A per-SPU flag, spu.read_depth, was used to indicate whether the SPU should be reading depth tiles, and was set only when depth was enabled. A second flag, spu.read_stencil, was set when stenciling was enabled, but never referenced. As stenciling and depth are in the same tiles on the Cell, and there is no corresponding TAG_WRITE_TILE_STENCIL to complement TAG_WRITE_TILE_COLOR and TAG_WRITE_TILE_Z, I fixed this by eliminating the unused "spu.read_stencil", renaming "spu.read_depth" to "spu.read_depth_stencil", and setting it if either stenciling or depth is enabled.
	I also added an optimization to the fragment ops generation code that avoids calculating stencil values and/or stencil writemask when the stencil operations are all KEEP.
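Both fixes are easy to picture in scalar C. The sketch below models the bitwise select performed by selb (a mask bit of 0 takes the first operand, a mask bit of 1 takes the second) and shows how swapping the two value operands inverts the result. All function names here are hypothetical and none of this is the generated SPE code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bitwise select with selb-style semantics: where a mask bit is 0 the result
 * takes the bit from a, where it is 1 the result takes the bit from b. */
static uint32_t select_bits(uint32_t a, uint32_t b, uint32_t mask)
{
    return (a & ~mask) | (b & mask);
}

/* Intended behaviour: pixels that pass (mask bits set) get the new value,
 * the rest keep the old one. */
static uint32_t update_stencil(uint32_t old_val, uint32_t new_val,
                               uint32_t pass_mask)
{
    return select_bits(old_val, new_val, pass_mask);
}

/* The kind of bug described above: the two value operands are reversed, so
 * passing pixels keep the old value and failing pixels get the new one. */
static uint32_t update_stencil_swapped(uint32_t old_val, uint32_t new_val,
                                       uint32_t pass_mask)
{
    return select_bits(new_val, old_val, pass_mask);
}

/* Depth and stencil share a tile, so the tile must be fetched when either
 * test is enabled (mirrors the renamed read_depth_stencil flag). */
static bool need_depth_stencil_tile(bool depth_enabled, bool stencil_enabled)
{
    return depth_enabled || stencil_enabled;
}
```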
*	cell: implement KIL instruction	Brian Paul	2008-10-16	1	-1/+4
*	cell: get rid of last usage of float4 union/typedef	Brian Paul	2008-10-15	1	-34/+29
	Results in slightly tighter code.
*	cell: simplify triangle front/back face determination	Brian Paul	2008-10-15	1	-46/+23
*	cell: send rasterizer state to SPUs in proper way, remove front_winding hack	Brian Paul	2008-10-15	1	-2/+2
*	cell: updated vertex dump/debug code	Brian Paul	2008-10-15	1	-9/+14
*	cell: more clean-up in spu_tri.c	Brian Paul	2008-10-13	1	-84/+16
*	cell: remove dead code, clean-up, reformatting	Brian Paul	2008-10-13	1	-90/+24
*	cell: finish-up perspective-corrected interpolation	Brian Paul	2008-10-13	1	-45/+82
*	cell: remove old texture code	Brian Paul	2008-10-13	1	-66/+1
*	cell: updates in response to draw's struct vertex_info changes	Brian Paul	2008-10-10	1	-2/+2
*	cell: implement basic TXP instruction in fragment shaders	Brian Paul	2008-10-09	1	-1/+1
	Lots of restrictions for now (one 2D texture, no mipmaps, etc.), but basic texture demos work. TEX, TXD, TXP do the same thing for the time being.
*	CELL: changes to generate SPU code for stenciling	Robert Ellison	2008-10-03	1	-5/+30
	This set of code changes is for stencil code generation support. Both one-sided and two-sided stenciling are supported.
	In addition to the raw code generation changes, these changes had to be made elsewhere in the system:
	- Added new "register set" feature to the SPE assembly generation. A "register set" is a way to allocate multiple registers and free them all at the same time, delegating register allocation management to the spe_function unit. It's quite useful in complex register allocation schemes (like stenciling).
	- Added and improved SPE macro calculations. These are operations between registers and unsigned integer immediates. In many cases, the calculation can be performed with a single instruction; the macros will generate the single instruction if possible, or generate a register load and register-to-register operation if not. These macro functions are: spe_load_uint() (which has new ways to load a value in a single instruction), spe_and_uint(), spe_xor_uint(), spe_compare_equal_uint(), and spe_compare_greater_uint().
	- Added facing to fragment generation. While rendering, the rasterizer needs to be able to determine front- and back-facing fragments, in order to correctly apply two-sided stencil. That requires these changes:
	  - Added front_winding field to the cell_command_render block, so that the state tracker could communicate to the rasterizer what it considered to be the front-facing direction.
	  - Added fragment facing as an input to the fragment function.
	  - Calculated facing is passed during emit_quad().
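The "single instruction if possible" idea can be sketched as a small classifier over the SPU immediate-load instructions (il, ila, ilh, ilhu, and the two-instruction ilhu + iohl fallback). This is an illustration of the selection logic only: the enum and function names are hypothetical, and the real spe_load_uint() may test the cases in a different order or cover additional ones.

```c
#include <stdint.h>

/* Hypothetical labels for the instruction sequences considered here. */
enum load_plan {
    LOAD_IL,         /* il:   16-bit immediate, sign-extended to 32 bits  */
    LOAD_ILA,        /* ila:  18-bit unsigned immediate                   */
    LOAD_ILH,        /* ilh:  same 16-bit value in both halfwords         */
    LOAD_ILHU,       /* ilhu: 16-bit value in upper halfword, lower zero  */
    LOAD_ILHU_IOHL   /* ilhu then iohl: arbitrary 32-bit value            */
};

/* Pick the cheapest sequence that can materialise `value` in a word. */
static enum load_plan plan_load_uint(uint32_t value)
{
    /* Representable as a sign-extended 16-bit immediate. */
    if (value <= 0x7fffu || value >= 0xffff8000u)
        return LOAD_IL;
    /* Fits in 18 unsigned bits. */
    if (value <= 0x3ffffu)
        return LOAD_ILA;
    /* Upper and lower halfwords are identical. */
    if ((value >> 16) == (value & 0xffffu))
        return LOAD_ILH;
    /* Only the upper halfword is non-zero. */
    if ((value & 0xffffu) == 0)
        return LOAD_ILHU;
    return LOAD_ILHU_IOHL;
}

/* Number of instructions the chosen plan emits. */
static int plan_instruction_count(enum load_plan plan)
{
    return plan == LOAD_ILHU_IOHL ? 2 : 1;
}
```

The same pattern extends to the other macros listed above: try the immediate form of the operation first, and fall back to loading the constant into a temporary register followed by the register-to-register form.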
*	cell: evaluate multiple fragment inputs	Brian Paul	2008-09-12	1	-1/+7
*	cell: setup fragment program inputs in SOA format	Brian Paul	2008-09-12	1	-56/+56
	Also remove old code, etc.
*	cell: initial support for fragment shader code generation.	Brian Paul	2008-09-11	1	-0/+35
	TGSI shaders are translated into SPE instructions which are then sent to the SPEs for execution. Only a few opcodes work, no swizzling yet, no support for constants/immediates, etc.
*	cell: asst. clean-up	Brian Paul	2008-09-11	1	-5/+5
*	cell: remove old per-fragment code, replace with all new code	Brian Paul	2008-09-11	1	-96/+0
*	cell: checkpoint commit of new per-fragment processing	Brian Paul	2008-09-11	1	-0/+30
	Do code generation for alpha test, z test, stencil, blend, colormask and framebuffer/tile read/write as a single code block. Ian's previous blend/z/stencil test code is still there but mostly disabled and will be removed soon.
*	cell: comments	Brian Paul	2008-09-11	1	-1/+4
*	cell: asst fixes to get driver building/running again.	Brian	2008-08-25	1	-0/+1
	Note that SPU vertex transformation is disabled at this time.
*	gallium: refactor/replace p_util.h with util/u_memory.h and util/u_math.h	Brian Paul	2008-08-24	1	-1/+0
	Also, rename p_tile.[ch] to u_tile.[ch]
*	cell: more multi-texture fixes (mostly working now)	Brian	2008-04-01	1	-9/+9
*	cell: checkpoint: more multi-texture work	Brian	2008-04-01	1	-4/+30
*	cell: more work for multi-texture support	Brian	2008-03-31	1	-1/+1
*	cell: initial work to support multi-texture	Brian	2008-03-31	1	-1/+1
*	cell: Implement code-gen for logic op	Ian Romanick	2008-03-26	1	-26/+33
	This also implements code-gen for the float-to-packed color conversion. It's currently hardcoded for A8R8G8B8, but that can easily be fixed as soon as other color depths are supported by the Cell driver.
*	cell: Change code-gen for CONST_COLOR blend factor	Ian Romanick	2008-03-21	1	-0/+2
	Previously the constant color blend factor was compiled into the generated code. This meant that the code had to be regenerated each time the constant color was changed. This doesn't fit with the model used in Gallium.
	As-is, the code could be better. The constant color is loaded for every quad processed, even if it is not used. Also, if a lot of (1-x) blend factors are used, 1.0 will be loaded and reloaded into registers many times.
*	cell: Fix bus error when there is no depth buffer	Ian Romanick	2008-03-20	1	-0/+3
*	cell: Use code-gen for alpha blend	Ian Romanick	2008-03-20	1	-19/+37
	So far this is only tested when GL_BLEND is disabled.
*	cell: Initial code-gen for alpha / stencil / depth testing	Ian Romanick	2008-03-17	1	-14/+9
	Alpha test is currently broken because all per-fragment testing occurs before alpha is calculated. Stencil test is currently broken because the Z-clear code asserts if there is a stencil buffer.
*	Code reorganization: move files into their places.	José Fonseca	2008-02-15	1	-0/+926
	This is in a separate commit to ensure renames are properly preserved.