path: root/src/gallium/auxiliary/rtasm
Commit message  [Author, Age, Files, Lines]
* cell: datatype clean-ups in SPE rtasm  [Brian Paul, 2009-01-11, 2 files, -105/+99]
|
* gallium: added comment/annotation support to PPC rtasm  [Brian Paul, 2009-01-10, 2 files, -62/+187]
|
* gallium: s/false/FALSE/  [Brian Paul, 2009-01-10, 1 file, -1/+1]
|
* rtasm: Remove spurious semi-colons after function bodies.  [José Fonseca, 2008-12-30, 1 file, -5/+5]
|
* CELL: use variant-length fragment ops programs  [Robert Ellison, 2008-11-21, 2 files, -4/+14]
|
|     This is a set of changes that optimizes the memory use of fragment operation programs (by using and transmitting only as much memory as is needed for the fragment ops programs, instead of maximal sizes), and eliminates the dependency on hard-coded maximal program sizes.
|
|     State that is not dependent on fragment facing (i.e. that isn't using two-sided stenciling) will only save and transmit a single fragment operation program, instead of two identical programs.
|
|     - Added the ability to emit a LNOP (No Operation (Load)) instruction. This is used to pad the generated fragment operations programs to a multiple of 8 bytes, which is necessary for proper operation of the dual instruction pipeline, and also required for proper SPU-side decoding.
|     - Added the ability to allocate and manage a variant-length struct cell_command_fragment_ops. This structure now puts the generated function field at the end, where it can be as large as necessary.
|     - On the PPU side, we now combine the generated front-facing and back-facing code into a single variant-length buffer (and only use one if the two sets of code are identical) for transmission to the SPU.
|     - On the SPU side, we pull the correct sizes out of the buffer, allocate a new code buffer if the one we have isn't large enough, and save the code to that buffer. The buffer is deallocated when the SPU exits.
|     - Commented out the emit_fetch() static function, which was not being used.
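The LNOP padding described in this commit can be sketched as follows. This is a minimal illustration under stated assumptions, not the code from the commit: `struct code_buf`, `pad_to_8_bytes()` and `CODE_BUF_MAX` are hypothetical names, and the `LNOP_WORD` value is an assumed encoding that should be checked against the SPU instruction set documentation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed encoding of the SPU "lnop" instruction (verify against the ISA). */
#define LNOP_WORD 0x00200000u

/* Hypothetical capacity of the illustrative buffer below. */
#define CODE_BUF_MAX 64

/* Hypothetical fixed-capacity buffer of 32-bit SPU instructions. */
struct code_buf {
    uint32_t inst[CODE_BUF_MAX];
    size_t   num_inst;
};

/* Append lnop words until the code size is a multiple of 8 bytes.
 * Each instruction is 4 bytes, so at most one lnop is ever added.
 * Returns the number of padding instructions appended. */
size_t pad_to_8_bytes(struct code_buf *buf)
{
    size_t added = 0;
    while ((buf->num_inst * sizeof(uint32_t)) % 8 != 0) {
        assert(buf->num_inst < CODE_BUF_MAX);
        buf->inst[buf->num_inst++] = LNOP_WORD;
        added++;
    }
    return added;
}
```

A buffer holding an odd number of instructions gains one padding word; an even-sized buffer is left untouched.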
* CELL: fix stencil twiddling, stencil invert  [Robert Ellison, 2008-11-13, 1 file, -3/+3]
|
|     Many stencil tests were failing because of a failure to read the stencil buffer, due to "twiddling" (or "untwiddling") "an unsupported texture format". This is fixed for the case of a stencil/Z S824Z format (which twiddles just like the 32-bit color formats).
|
|     tests/stencilwrap.c was failing on the GL_INVERT test, because the emitted code for "spe_xori" turned out not to be an actual "xori" instruction, but rather a "stqd" instruction, because of a typo in the rtasm code. This is now fixed, and tests/stencil_wrap now works.
* gallium: add missing prototypes  [Brian Paul, 2008-11-12, 1 file, -0/+6]
|
* cell: move semicolons to silence warnings w/ other compilers  [Brian Paul, 2008-11-12, 1 file, -189/+189]
|
* cell: fix typo in EMIT_ macro  [Brian Paul, 2008-11-12, 1 file, -1/+1]
|
* rtasm: Use INLINE keyword. Compile for all platforms, not only GALLIUM_CELL.  [Michal Krol, 2008-11-12, 1 file, -9/+5]
|
* rtasm: Compile only for GALLIUM_CELL.  [Michal Krol, 2008-11-12, 1 file, -0/+4]
|
* CELL: two-sided stencil fixes  [Robert Ellison, 2008-11-11, 1 file, -2/+5]
|
|     With these changes, the tests/stencil_twoside test now works.
|
|     - Eliminate blending from the stencil_twoside test, as it produces an unneeded dependency on having blending working
|     - The spe_splat() function will now work if the register being splatted and the destination register are the same
|     - Separate fragment code generated for front-facing and back-facing fragments. Often these are the same; if two-sided stenciling is on, they can be different. This is easier and faster than generating code that does both tests and merges the results.
|     - Fixed a cut/paste bug where if the back Z-pass stencil operation were different from all the other operations, the back Z-fail results were incorrect.
* gallium: grow SPE instruction buffer as needed  [Brian Paul, 2008-10-29, 1 file, -16/+41]
|
* gallium: no longer pass max_inst to ppc_init_func()  [Brian Paul, 2008-10-29, 2 files, -2/+2]
|
* gallium: use execmem for PPC code, grow instruction buffer as needed  [Brian Paul, 2008-10-29, 2 files, -21/+50]
|
* gallium: fix alignment parameter passed to u_mmAllocMem()  [Brian Paul, 2008-10-29, 1 file, -2/+2]
|
|     Was 32, now 5. The param is expressed as a power-of-two exponent. The net effect is that the alignment was a no-op on X86 but on PPC we always got the same memory address every time rtasm_exec_malloc() was called.
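To illustrate why 5 rather than 32 is the right value when the parameter is an exponent, here is a small sketch of rounding an address up to a 2^n boundary. The helper `align_up_pow2()` is hypothetical and is not the u_mmAllocMem() implementation; it only shows the arithmetic.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to a multiple of 2^align2 bytes.
 * align2 is an exponent: 5 means 32-byte alignment, whereas passing 32
 * would request 2^32-byte alignment (or overflow a 32-bit shift). */
uintptr_t align_up_pow2(uintptr_t addr, unsigned align2)
{
    /* The shift below is only defined for exponents smaller than the word size. */
    assert(align2 < sizeof(uintptr_t) * 8);
    uintptr_t mask = ((uintptr_t)1 << align2) - 1;
    return (addr + mask) & ~mask;
}
```

With an exponent of 5, address 33 rounds up to 64, and an address that is already a multiple of 32 is returned unchanged.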
* gallium: prefix memory manager functions with u_ to differentiate from functions in mesa/main/mm.c  [Brian Paul, 2008-10-29, 1 file, -4/+4]
|
* gallium: test for PIPE_OS_LINUX instead of __linux__  [Brian Paul, 2008-10-29, 1 file, -4/+5]
|
* gallium: added ppc_vnmsubfp()  [Brian Paul, 2008-10-29, 2 files, -1/+12]
|
* scons: ppc support.  [Michel Dänzer, 2008-10-23, 1 file, -0/+1]
|
* gallium: remove ppc_vload_float(), rename ppc_vecmove() -> ppc_vmove().  [Brian Paul, 2008-10-22, 2 files, -23/+2]
|
* gallium: added ppc_vzero()  [Brian Paul, 2008-10-22, 2 files, -0/+13]
|
* gallium: added ppc_vload_float(), for limited cases  [Brian Paul, 2008-10-22, 2 files, -0/+22]
|
* gallium: fix-up confusing register allocation masks in rtasm_ppc.c  [Brian Paul, 2008-10-22, 2 files, -21/+36]
|
|     Plus, add ppc_reserve_register() func.
* gallium: added ppc_lvewx()  [Brian Paul, 2008-10-22, 2 files, -0/+11]
|
* cell: implement many more PPC instructions for code gen  [Brian Paul, 2008-10-22, 3 files, -41/+704]
|
* cell: add emit_RI10s() which does range checking on the 10-bit signed immediate field  [Brian Paul, 2008-10-10, 2 files, -10/+30]
|
|     This type of checking should be expanded to cover more instructions...
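The kind of range check this commit describes can be sketched as below. The function names are hypothetical and this is not the body of emit_RI10s(); it only shows what "fits in a 10-bit signed field" means and how such a value would be masked into its field.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if value fits in a 10-bit two's-complement field: -512..511. */
bool fits_signed_10(int32_t value)
{
    return value >= -512 && value <= 511;
}

/* Return the 10-bit field contents for a checked immediate.
 * The caller is assumed to shift the result into position in the
 * instruction word; that layout is not modeled here. */
uint32_t encode_signed_10(int32_t value)
{
    assert(fits_signed_10(value));
    return (uint32_t)value & 0x3ffu;
}
```

Checking before masking matters because an out-of-range value would otherwise be silently truncated into a different, valid-looking immediate.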
* cell: additional 'offset' checking in spe_lqd(), spe_stqd()  [Brian Paul, 2008-10-10, 1 file, -4/+14]
|
* cell: fix assertions in spe_lqd(), spe_stqd()  [Brian Paul, 2008-10-10, 1 file, -2/+2]
|
* CELL: fixing stencil bugs  [Robert Ellison, 2008-10-10, 1 file, -2/+2]
|
|     These are the defects found and fixed so far. Several more have been observed; I'm working on them.
|
|     - Fixed an error in spe_load_uint() that caused incorrect values to be loaded if the given unsigned value had the low 18 bits as 0, and that caused inefficient code to be emitted if the given value had the high 14 bits as 0.
|     - Fixed a problem in stencil code generation where optional registers weren't tracked correctly.
|     - Fixed a problem that the stencil function NEVER was acting as ALWAYS.
|     - Fixed several problems that could occur if stenciling were enabled but depth was disabled.
|     - Fixed a problem with two-sided stencil writemask handling that could cause a stencil writemask to not be applied.
|     - Fixed several state permutations that were incorrectly flagged as not requiring stencil values to be calculated.
* Merge commit 'origin/gallium-0.1' into gallium-0.2  [Keith Whitwell, 2008-10-10, 2 files, -1/+19]
|\
| |     Conflicts:
| |         src/gallium/auxiliary/gallivm/instructionssoa.cpp
| |         src/gallium/auxiliary/gallivm/soabuiltins.c
| |         src/gallium/auxiliary/rtasm/rtasm_x86sse.c
| |         src/gallium/auxiliary/rtasm/rtasm_x86sse.h
| |         src/mesa/main/texenvprogram.c
| |         src/mesa/shader/arbprogparse.c
| |         src/mesa/shader/prog_statevars.c
| |         src/mesa/state_tracker/st_draw.c
| |         src/mesa/vbo/vbo_exec_draw.c
| * gallium: replace assertion with conditional/recovery code  [Brian, 2008-10-06, 1 file, -1/+5]
| |
| |     The assertion failed when we ran out of exec memory. Found with conform texcombine test.
| * rtasm: fix debug build  [Keith Whitwell, 2008-10-06, 1 file, -1/+1]
| |
| * rtasm: add sse_movntps  [Keith Whitwell, 2008-10-03, 2 files, -0/+14]
| |
| * rtasm: add prefetch instructions  [Keith Whitwell, 2008-10-02, 2 files, -0/+31]
| |
* | cell: fix incorrect bitmask in spe_load_uint()  [Brian Paul, 2008-10-09, 1 file, -1/+1]
| |
* | cell: implement function calls from shader code. fslight demo runs now.  [Brian Paul, 2008-10-08, 2 files, -14/+73]
| |
| |     Used for SIN, COS, EXP2, LOG2, POW instructions. TEX next.
| |
| |     Fixed some bugs in MIN, MAX, DP3, DP4, DPH instructions.
| |
| |     In rtasm code:
| |     Special-case spe_lqd(), spe_stqd() functions so they take byte offsets but low-order 4 bits are shifted out. This makes things consistent with SPU assembly language conventions.
| |     Added spe_get_registers_used() function.
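The byte-offset convention for spe_lqd() and spe_stqd() can be illustrated with a short sketch. The helper name is hypothetical, and the assumption that the shifted value must fit a 10-bit signed field is taken from the range-checking commits elsewhere in this log rather than from the code itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Convert a byte offset into a quadword-granular immediate.
 * Returns false if the offset is not a multiple of 16 bytes, or if the
 * shifted value does not fit a 10-bit signed field (assumed limit). */
bool qword_offset_to_imm(int32_t byte_offset, int32_t *imm)
{
    if (byte_offset % 16 != 0)
        return false;
    int32_t shifted = byte_offset / 16;   /* drop the low-order 4 bits */
    if (shifted < -512 || shifted > 511)
        return false;
    *imm = shifted;
    return true;
}
```

A caller passes the same byte offset it would write in SPU assembly, for example 32 for the third quadword, and the encoder stores 2 in the instruction.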
* | gallium: asst. clean-ups  [Brian Paul, 2008-10-08, 1 file, -11/+17]
| |
| |     Don't use register qualifier. Doxygen-ize comments. Remove 'extern'.
* | gallium: better instruction printing for SPE code  [Brian Paul, 2008-10-08, 1 file, -10/+36]
| |
* | CELL: changes to generate SPU code for stenciling  [Robert Ellison, 2008-10-03, 2 files, -30/+257]
| |
| |     This set of code changes is for stencil code generation support. Both one-sided and two-sided stenciling are supported.
| |
| |     In addition to the raw code generation changes, these changes had to be made elsewhere in the system:
| |
| |     - Added new "register set" feature to the SPE assembly generation. A "register set" is a way to allocate multiple registers and free them all at the same time, delegating register allocation management to the spe_function unit. It's quite useful in complex register allocation schemes (like stenciling).
| |     - Added and improved SPE macro calculations. These are operations between registers and unsigned integer immediates. In many cases, the calculation can be performed with a single instruction; the macros will generate the single instruction if possible, or generate a register load and register-to-register operation if not. These macro functions are: spe_load_uint() (which has new ways to load a value in a single instruction), spe_and_uint(), spe_xor_uint(), spe_compare_equal_uint(), and spe_compare_greater_uint().
| |     - Added facing to fragment generation. While rendering, the rasterizer needs to be able to determine front- and back-facing fragments, in order to correctly apply two-sided stencil. That requires these changes:
| |         - Added front_winding field to the cell_command_render block, so that the state tracker could communicate to the rasterizer what it considered to be the front-facing direction.
| |         - Added fragment facing as an input to the fragment function.
| |         - Calculated facing is passed during emit_quad().
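The "register set" idea from this commit, allocating several registers and releasing them in one call, might look roughly like the sketch below. Every name here is hypothetical and the data layout is an assumption made for illustration; it is not the spe_function implementation.

```c
#include <assert.h>

#define NUM_REGS      128   /* the SPU has 128 registers */
#define MAX_SET_DEPTH 8     /* hypothetical nesting limit */

struct reg_alloc {
    /* owner[r] is 0 if register r is free, else the set depth that owns it. */
    unsigned char owner[NUM_REGS];
    unsigned depth;         /* current register-set nesting depth, starts at 1 */
};

void reg_alloc_init(struct reg_alloc *a)
{
    for (int r = 0; r < NUM_REGS; r++)
        a->owner[r] = 0;
    a->depth = 1;
}

/* Allocate the lowest free register and tag it with the current set depth.
 * Returns -1 if no register is free. */
int reg_alloc_get(struct reg_alloc *a)
{
    for (int r = 0; r < NUM_REGS; r++) {
        if (a->owner[r] == 0) {
            a->owner[r] = (unsigned char)a->depth;
            return r;
        }
    }
    return -1;
}

/* Open a new register set; later allocations belong to it. */
void reg_set_begin(struct reg_alloc *a)
{
    assert(a->depth < MAX_SET_DEPTH);
    a->depth++;
}

/* Close the current set, freeing every register allocated inside it. */
void reg_set_end(struct reg_alloc *a)
{
    assert(a->depth > 1);
    for (int r = 0; r < NUM_REGS; r++)
        if (a->owner[r] == a->depth)
            a->owner[r] = 0;
    a->depth--;
}
```

Code generation for a complex state such as stenciling can then bracket its temporaries with `reg_set_begin()` and `reg_set_end()` instead of tracking each register individually.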
* | rtasm: add prefetch instructions  [Keith Whitwell, 2008-10-02, 2 files, -0/+31]
| |
* | rtasm: Implement immediate group 1 instructions. Fix SIB emission.  [José Fonseca, 2008-09-29, 2 files, -15/+62]
| |
* | gallium: SPU register comments  [Brian Paul, 2008-09-26, 1 file, -2/+2]
| |
* | cell: use different opcodes for spe_move() depending on even/odd address  [Brian Paul, 2008-09-19, 1 file, -1/+7]
| |
* | gallium: added spe_code_size()  [Brian Paul, 2008-09-19, 2 files, -0/+8]
| |
* | cell: change spe_complement() to take a src and dst reg, like other instructions  [Brian Paul, 2008-09-19, 2 files, -8/+10]
| |
* | CELL: add codegen for logic op, color mask  [Robert Ellison, 2008-09-19, 2 files, -1/+26]
| |
| |     - rtasm_ppc_spe.c, rtasm_ppc_spe.h: added a new macro function "spe_load_uint" for loading and splatting unsigned integers in a register; it will use "ila" for values 18 bits or less, "ilh" for word values that are symmetric across halfwords, "ilhu" for values that have zeroes in their bottom halfwords, or "ilhu" followed by "iohl" for general 32-bit values. Of the 15 color masks of interest, 4 are 18 bits or less, 2 are symmetric across halfwords, 3 are zero in the bottom halfword, and 6 require two instructions to load.
| |     - cell_gen_fragment.c: added full codegen for logic op and color mask.
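The instruction selection this commit describes for spe_load_uint can be sketched as a classification function. The enum and function names are hypothetical, and the order of the tests is an assumption; the sketch only mirrors the four cases listed in the commit message.

```c
#include <assert.h>
#include <stdint.h>

enum load_strategy {
    LOAD_ILA,        /* one instruction: value fits in 18 bits */
    LOAD_ILH,        /* one instruction: both halfwords are equal */
    LOAD_ILHU,       /* one instruction: low halfword is zero */
    LOAD_ILHU_IOHL   /* two instructions: general 32-bit value */
};

/* Pick the cheapest way to load a 32-bit constant, testing the
 * single-instruction forms before falling back to the two-step load. */
enum load_strategy choose_load(uint32_t value)
{
    if ((value & 0xfffc0000u) == 0)          /* high 14 bits clear */
        return LOAD_ILA;
    if ((value >> 16) == (value & 0xffffu))  /* symmetric across halfwords */
        return LOAD_ILH;
    if ((value & 0xffffu) == 0)              /* bottom halfword is zero */
        return LOAD_ILHU;
    return LOAD_ILHU_IOHL;
}
```

For example, a mask such as 0x00ff00ff takes the halfword form, while 0xff0000ff matches none of the single-instruction cases and needs the two-instruction sequence.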
* | CELL: finish fragment ops blending (except for unusual D3D modes)  [Robert Ellison, 2008-09-18, 2 files, -1/+48]
| |
| |     - Added new "macro" functions spe_float_min() and spe_float_max() to rtasm_ppc_spe.{ch}. These emit instructions that cause the minimum or maximum of each element in a vector of floats to be saved in the destination register.
| |     - Major changes to cell_gen_fragment.c to implement all the blending modes (except for the mysterious D3D-based PIPE_BLENDFACTOR_SRC1_COLOR, PIPE_BLENDFACTOR_SRC1_ALPHA, PIPE_BLENDFACTOR_INV_SRC1_COLOR, and PIPE_BLENDFACTOR_INV_SRC1_ALPHA).
| |     - Some revamping of code in cell_gen_fragment.c: use the new spe_float_min() and spe_float_max() functions (instead of expanding these calculations inline via macros); create and use an inline utility function for handling "optional" register allocation (for the {1,1,1,1} vector, and the blend color vectors) instead of expanding with macros; use the Float Multiply and Subtract (fnms) instruction to simplify and optimize many blending calculations.
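The per-element semantics of spe_float_min() and spe_float_max() can be modeled in scalar C as a compare followed by a select. This is only a behavioral sketch with hypothetical names; whether the emitted SPU code uses exactly a compare-and-select pair is an assumption, not something the commit message states.

```c
#include <assert.h>

#define VEC_LEN 4   /* an SPU register holds four 32-bit floats */

/* Element-wise minimum: dst[i] = min(a[i], b[i]). */
void vec4_float_min(float dst[VEC_LEN], const float a[VEC_LEN],
                    const float b[VEC_LEN])
{
    for (int i = 0; i < VEC_LEN; i++) {
        int a_greater = a[i] > b[i];        /* compare step builds a mask */
        dst[i] = a_greater ? b[i] : a[i];   /* select step picks per element */
    }
}

/* Element-wise maximum: dst[i] = max(a[i], b[i]). */
void vec4_float_max(float dst[VEC_LEN], const float a[VEC_LEN],
                    const float b[VEC_LEN])
{
    for (int i = 0; i < VEC_LEN; i++) {
        int a_greater = a[i] > b[i];
        dst[i] = a_greater ? a[i] : b[i];
    }
}
```

Blending code can use the pair to clamp a color vector, taking the maximum against a vector of zeros and then the minimum against a vector of ones.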
* | gallium: emit SPU instructions in assembler-compatible syntax  [Brian Paul, 2008-09-15, 1 file, -8/+12]
| |
* | Fixed emit_RRR  [Jonathan White, 2008-09-15, 1 file, -1/+1]
| |