path: root/src/gallium/auxiliary/tgsi/tgsi_sse2.c
Commit message (Author, Age, Files, Lines)
* gallium: remove the swizzling parts of ExtSwizzle (Keith Whitwell, 2009-10-23, 1 file, -26/+8)
| | | | | | | | | These haven't been used by the mesa state tracker since the conversion to tgsi_ureg, and it seems that none of the other state trackers are using them either. This helps simplify one of the biggest surprises when starting off with TGSI shaders.
* Merge branch 'mesa_7_6_branch' (Brian Paul, 2009-09-24, 1 file, -5/+5)
|\ | | | | | | | | | | | | | | | | | | | | Conflicts: src/mesa/drivers/dri/r600/r700_assembler.c src/mesa/drivers/dri/r600/r700_chip.c src/mesa/drivers/dri/r600/r700_render.c src/mesa/drivers/dri/r600/r700_vertprog.c src/mesa/drivers/dri/r600/r700_vertprog.h src/mesa/drivers/dri/radeon/radeon_span.c
| * tgsi/sse: Pass the lodbias, not zero. More comments. (Brian Paul, 2009-09-24, 1 file, -5/+5)
| | | | | | | | This fixes the glean/glsl1 "texture2D(), with bias" test when using SSE.
* | tgsi/sse: remove old comments (Brian Paul, 2009-09-24, 1 file, -8/+0)
| |
* | tgsi/sse: implement SEQ, SGT, SLE, SNE (Brian Paul, 2009-09-24, 1 file, -4/+4)
| |
* | tgsi: handle some src/dst aliasing in tgsi_sse2.c (Keith Whitwell, 2009-09-13, 1 file, -8/+23)
| | | | | | | | | | | | | | | | | | | | | | | | Src/Dst aliasing (aka SOA dependencies) requires some care to ensure intermediate results do not overwrite yet-to-be read source registers. This change ensures that MOV/SWZ handle this correctly, which is poor but no worse than the current tgsi_exec.c path. Remove the fallback as there is nothing to be gained correctness-wise between the two implementations now. Fixing this properly looks like a bit of work in this code, but might be easily achieved by sending destination writes to temporary storage.
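The aliasing hazard described in this commit message can be illustrated with a minimal sketch. The `vec4_reg` type and both function names are hypothetical and are not taken from tgsi_sse2.c; the sketch only assumes a swizzled move whose destination may be the same register as its source, and shows the "send destination writes to temporary storage" idea the message mentions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 4-channel register; not the real tgsi_exec_machine layout. */
typedef struct { float chan[4]; } vec4_reg;

/* Naive swizzled move: writes dst one channel at a time. If dst and src
 * are the same register, later channels read already-overwritten values. */
static void mov_swizzle_naive(vec4_reg *dst, const vec4_reg *src,
                              const int swz[4])
{
    for (int i = 0; i < 4; i++)
        dst->chan[i] = src->chan[swz[i]];
}

/* Safe variant: stage every result in temporary storage, then commit all
 * four channels at once, so aliasing between dst and src is harmless. */
static void mov_swizzle_safe(vec4_reg *dst, const vec4_reg *src,
                             const int swz[4])
{
    vec4_reg tmp;
    for (int i = 0; i < 4; i++) {
        assert(swz[i] >= 0 && swz[i] < 4);
        tmp.chan[i] = src->chan[swz[i]];
    }
    memcpy(dst, &tmp, sizeof tmp);
}
```

With the swizzle `{1, 0, 3, 2}` applied in place to `{1, 2, 3, 4}`, the safe variant yields `{2, 1, 4, 3}`, whereas the naive variant reads back its own writes and yields `{2, 2, 4, 4}`.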
* | tgsi: implement saturation (Keith Whitwell, 2009-09-12, 1 file, -17/+26)
|/ | | | Fix recent performance regression.
* tgsi: remove redundant CND0 opcode (Keith Whitwell, 2009-09-01, 1 file, -4/+0)
| | | | Can be implemented with CMP src2, src1, src0
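The equivalence stated in this message can be checked with a small scalar sketch. The per-channel semantics used here are assumptions, not quoted from this log: CMP is taken as `dst = (src0 < 0) ? src1 : src2` and CND0 as `dst = (src2 >= 0) ? src0 : src1`. All function names are hypothetical.

```c
#include <assert.h>

/* Assumed CMP semantics: select src1 where src0 is negative, else src2. */
static float sketch_cmp(float src0, float src1, float src2)
{
    return (src0 < 0.0f) ? src1 : src2;
}

/* Assumed CND0 semantics: select src0 where src2 is non-negative. */
static float sketch_cnd0(float src0, float src1, float src2)
{
    return (src2 >= 0.0f) ? src0 : src1;
}

/* CND0 rewritten as "CMP src2, src1, src0", as the commit message says. */
static float sketch_cnd0_via_cmp(float src0, float src1, float src2)
{
    return sketch_cmp(src2, src1, src0);
}

/* Compare both forms for an ordinary (non-NaN) selector value. */
static void sketch_check(float selector)
{
    assert(sketch_cnd0(10.0f, 20.0f, selector) ==
           sketch_cnd0_via_cmp(10.0f, 20.0f, selector));
}
```

Under these assumed definitions the two forms agree for every ordered selector value. They differ only when the selector is NaN, because both comparisons are then false and each form falls through to a different operand.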
* tgsi: check for SOA dependencies in SSE and PPC code generators (Brian Paul, 2009-08-20, 1 file, -0/+4)
| | | | Fall back to interpreter for now. This doesn't happen very often.
* Merge branch 'mesa_7_5_branch' (Brian Paul, 2009-08-18, 1 file, -0/+4)
|\
| * tgsi/sse: we don't implement saturation modes yet (Brian Paul, 2009-08-18, 1 file, -0/+4)
| | | | | | | | Fixes piglit fp-generic tests/shaders/generic/lrp_sat.fp, bug 23316.
* | tgsi: report opcode name in addition to the number when translation fails (Brian Paul, 2009-08-03, 1 file, -2/+5)
| |
* | Rename TGSI LOOP instructions to better match their usage. (Michal Krol, 2009-07-31, 1 file, -2/+2)
| | | | | | | | | | | | | | | | The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour is similar to a C language for-loop. The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now there is no name collision.
* | gallium: fix SSE shadow texture instructions (Brian Paul, 2009-07-29, 1 file, -3/+3)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | When sampling a 2D shadow map we need 3 texcoord components, not 2. The third component (distance from light source) is compared against the texture sample to return the result (visible vs. occluded). Also, enable proper handling of TGSI_TEXTURE_SHADOW targets in Mesa->TGSI translation. There's a possibility for breakage in gallium drivers if they fail to handle the TGSI_TEXTURE_SHADOW1D / TGSI_TEXTURE_SHADOW2D / TGSI_TEXTURE_SHADOWRECT texture targets for TGSI_OPCODE_TEX/TXP instructions, but that should be easy to fix. With these changes, progs/demos/shadowtex.c renders properly again with softpipe.
* | gallium: remove deprecated TGSI opcodes (Keith Whitwell, 2009-07-23, 1 file, -12/+0)
| | | | | | | | | | | | Various opcodes which can be implemented trivially with other TGSI opcodes, such as matrix multiplication and negation. These were not used by any state tracker or implemented by any of the drivers.
* | gallium: remove multiple aliases for TGSI opcodes (Keith Whitwell, 2009-07-22, 1 file, -16/+8)
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This is a source of ongoing confusion. TGSI has multiple names for opcodes where the same semantics originate in multiple shader APIs. For instance, TGSI includes both Mesa/GLSL and DX/SM30 names for opcodes with the same semantics, but aliases those names to the same underlying opcode number. This makes it very difficult to visually inspect two sets of opcodes (eg in state tracker & driver) and check if they implement the same functionality. This patch arbitrarily rips out the versions of the opcodes not currently favoured by the mesa state tracker and leaves us with a single name for each distinct operation.
* | gallium: simplify tgsi_full_immediate struct (Keith Whitwell, 2009-07-22, 1 file, -1/+1)
| | | | | | | | | | | | | | | | | | | | | | | | | | Remove the need to have a pointer in this struct by just including the immediate data inline. Having a pointer in the struct introduces complications like needing to alloc/free the data pointed to, uncertainty about who owns the data, etc. There doesn't seem to be a need for it, and it is unlikely to make much difference plus or minus to performance. Added some asserts as we now will trip up on immediates with more than four elements. There were actually already quite a few such asserts, but the >4 case could be used in the future to specify indexable immediate ranges, such as lookup tables.
* | tgsi: get texturing working in vertex shader sse2 path (Keith Whitwell, 2009-07-20, 1 file, -6/+6)
| |
* | tgsi: fix regression in indexed const lookups (Keith Whitwell, 2009-07-20, 1 file, -2/+4)
| | | | | | | | | | | | | | | | | | | | | | This function was calling get_input_base() and get_output_base() to get the names of a couple of registers to use as temps. Those functions no longer return registers, so adjust it to get the registers elsewhere. This change doesn't address the issue that it's a fairly poor way to grab a register name by calling a function with an apparently unrelated meaning.
* | tgsi: simplify and fix sse KIL implementation (Keith Whitwell, 2009-07-16, 1 file, -36/+28)
| | | | | | | | | | | | Use sse_movmskps to extract the correct bits of the comparison result for use in updating the killmask. Simplify some logic around identifying the set of necessary comparisons to make.
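The mask extraction described in this message can be sketched in portable C. This is an illustration under stated assumptions, not the code from this commit: the helpers emulate the effect of the SSE `cmpltps` and `movmskps` instructions on four lanes, the kill condition is assumed to be "source component less than zero", and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Emulates cmpltps against zero: a lane becomes all ones where v < 0. */
static void cmplt_zero_emul(const float v[4], uint32_t out[4])
{
    for (int i = 0; i < 4; i++)
        out[i] = (v[i] < 0.0f) ? 0xFFFFFFFFu : 0u;
}

/* Emulates movmskps: gather the top bit of each 32-bit lane into a
 * 4-bit integer, lane 0 in bit 0. */
static unsigned movmskps_emul(const uint32_t lanes[4])
{
    unsigned mask = 0;
    for (int i = 0; i < 4; i++)
        mask |= (unsigned)(lanes[i] >> 31) << i;
    return mask;
}

/* Update a 4-bit kill mask (one bit per fragment in the quad) from one
 * source channel; a caller would repeat this for each channel it must
 * test. */
static unsigned kil_channel(unsigned killmask, const float v[4])
{
    uint32_t cmp[4];
    cmplt_zero_emul(v, cmp);
    return killmask | movmskps_emul(cmp);
}
```

For channel values `{1, -1, 0, -0.5}` and an empty kill mask, fragments 1 and 3 are negative, so the result is `0xA`. Bits already set in the incoming mask are preserved by the OR.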
* | tgsi: initial texturing support on sse path (Keith Whitwell, 2009-07-16, 1 file, -19/+183)
| | | | | | | | | | Most obvious problem is drawpixels comes out blocky, but this may be an existing issue of KIL on the sse path.
* | tgsi: make sse function callout mechanism more generic (Keith Whitwell, 2009-07-16, 1 file, -40/+48)
| | | | | | | | Take a list of arguments rather than hardcoding TEMP_R0.
* | tgsi: reduce x86 reg usage in tgsi_sse generated programs (Keith Whitwell, 2009-07-16, 1 file, -113/+77)
| | | | | | | | | | | | Pass the tgsi_exec_machine struct in directly and just hold a single pointer to this struct, rather than keeping one for each of its internal members.
* | tgsi: make function call code in tgsi_sse.c less opaque (Keith Whitwell, 2009-07-16, 1 file, -23/+86)
|/ | | | | | Explicitly pass src and dst arguments (previously dst argument was also being used as a src). Separate argument handling from the rest of the function call emit.
* tgsi: SSE code generator doesn't yet support indirect addressing of temp regs (Brian Paul, 2009-04-24, 1 file, -0/+29)
| | | | Fall back to interpreter in this case.
* tgsi/sse2: Cleanup NRM/NRM4 implementation. (Michal Krol, 2009-04-10, 1 file, -25/+76)
| | | | | | Fix comments. Make sure .w is set to 1.0 for NRM. Optimise for non-.xyzw writemasks.
* tgsi/sse2: Fix build. (Michal Krol, 2009-04-09, 1 file, -1/+1)
|
* tgsi/sse2: Fix ARL instruction. (Michal Krol, 2009-04-09, 1 file, -0/+1)
|
* tgsi/sse2: Fix LIT instruction. (Michal Krol, 2009-04-09, 1 file, -1/+1)
|
* util: Move p_debug.h into util module. (José Fonseca, 2009-02-18, 1 file, -1/+1)
| | | | | The debug functions depend on several util functions for OS abstractions, and these depend on debug functions, so a separate module is not possible.
* gallium: fix glean's vertProg1 (Alan Hourihane, 2009-02-16, 1 file, -0/+1)
| | | | RSQ test 2 (reciprocal square root of negative value)
* tgsi: Fix build -- rename Size to NrTokens. (Michal Krol, 2009-02-10, 1 file, -1/+1)
|
* tgsi: Implement OPCODE_SSG/SGN. (Michal Krol, 2008-11-26, 1 file, -1/+29)
|
* tgsi: Implement OPCODE_ARR. (Michal Krol, 2008-11-26, 1 file, -1/+6)
|
* tgsi: Implement OPCODE_ROUND for SSE2 backend. (Michal Krol, 2008-11-26, 1 file, -1/+28)
|
* tgsi: Fix a bug with saving/restoring xmm registers upon func call. (Michal Krol, 2008-11-12, 1 file, -3/+3)
|
* gallium: use PIPE_ARCH_SSE to protect use of SSE intrinsics only (Brian, 2008-11-09, 1 file, -9/+33)
| | | | | | This allows us to use SSE codegen with debug builds again. When PIPE_ARCH_SSE is set (w/ gcc -msse -msse2) we will also use the gcc SSE intrinsic functions.
* gallium: implement SSE codegen for TGSI_OPCODE_NRM/NRM4 (Brian, 2008-11-08, 1 file, -1/+33)
|
* gallium: added SSE for DP2, DP2A (Brian Paul, 2008-11-07, 1 file, -2/+22)
|
* Merge commit 'origin/gallium-0.1' into gallium-0.2 (Brian Paul, 2008-11-05, 1 file, -3/+32)
|\ | | | | | | | | | | | | | | | | Conflicts: src/gallium/auxiliary/rtasm/rtasm_execmem.c src/mesa/shader/slang/slang_emit.c src/mesa/shader/slang/slang_log.c src/mesa/state_tracker/st_atom_framebuffer.c
| * gallium: call tgsi_set_exec_mask() and use exec mask in SSE ARL code (Brian Paul, 2008-11-05, 1 file, -3/+32)
| | | | | | | | | | This prevents vertex shaders from referencing invalid memory locations when the shader is operating on less than four vertices or fragments.
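The idea in this message, ignoring address values from lanes that are not executing, can be sketched as follows. This is a hypothetical scalar illustration rather than the generated SSE code: it assumes ARL takes the floor of each source value, that bit i of the execution mask marks lane i as active, that active lanes hold values within `int32_t` range, and that inactive lanes should get address 0.

```c
#include <assert.h>
#include <stdint.h>

/* Floor-to-integer without libm; assumes x is within int32_t range. */
static int32_t floor_to_int(float x)
{
    int32_t t = (int32_t)x;          /* truncates toward zero */
    return ((float)t > x) ? t - 1 : t;
}

/* Hypothetical masked ARL: inactive lanes may hold garbage, so their
 * source values are never converted and their address is forced to 0. */
static void arl_masked(const float src[4], unsigned exec_mask,
                       int32_t addr[4])
{
    for (int i = 0; i < 4; i++) {
        int active = (exec_mask >> i) & 1u;
        addr[i] = active ? floor_to_int(src[i]) : 0;
    }
}
```

With sources `{2.7, -1.2, 1e30, 5.0}` and execution mask `0xB` (lane 2 inactive), the addresses are `{2, -2, 0, 5}`, so the out-of-range value in the unused lane can no longer index invalid memory.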
| * tgsi: Implement OPCODE_TRUNC. (michal, 2008-11-05, 1 file, -1/+17)
| |
* | tgsi: Implement OPCODE_TRUNC. (michal, 2008-11-05, 1 file, -1/+17)
| |
* | gallium: Introduce PIPE_ARCH_SSE define for SSE support. (José Fonseca, 2008-10-07, 1 file, -1/+1)
| | | | | | | | | | | | | | Besides meaning x86 and x86-64 architecture, it also depends on SSE2 support enabled on gcc. This fixes the linux-debug build.
* | tgsi: Include p_config.h. (José Fonseca, 2008-10-01, 1 file, -0/+2)
| |
* | cell: Moved X86 checks to wrap #include section so that Cell targets will compile again. (Jonathan White, 2008-09-30, 1 file, -2/+2)
| |
* | tgsi: SSE2 optimized exp2, log2 and pow implementations. (José Fonseca, 2008-09-30, 1 file, -76/+211)
|/ | | | | | | | | | | | Special care must be taken when calling compiler generated SSE2 functions from the runtime generated SSE2: saving the xmm registers, and notify gcc the stack is not 16byte aligned. It would be more efficient to keep the stack pointer 16byte aligned, but too hairy, and not consistent in all x86 architectures. This has been tested in linux x86 and windows x86 userspace. Not tested on x86-64 because it is broken for other reasons (even without this change).
* tgsi: Cleanup code. (Michal Krol, 2008-09-08, 1 file, -50/+37)
|
* gallium: refactor/replace p_util.h with util/u_memory.h and util/u_math.h (Brian Paul, 2008-08-24, 1 file, -1/+1)
| | | | Also, rename p_tile.[ch] to u_tile.[ch]
* gallium: replace LOG2() macro with util_fast_log2() inline func (Brian Paul, 2008-08-22, 1 file, -4/+4)
|