| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
These haven't been used by the mesa state tracker since the
conversion to tgsi_ureg, and it seems that none of the
other state trackers are using it either.
This helps simplify one of the biggest suprises when starting off with
TGSI shaders.
|
|\
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
src/mesa/drivers/dri/r600/r700_assembler.c
src/mesa/drivers/dri/r600/r700_chip.c
src/mesa/drivers/dri/r600/r700_render.c
src/mesa/drivers/dri/r600/r700_vertprog.c
src/mesa/drivers/dri/r600/r700_vertprog.h
src/mesa/drivers/dri/radeon/radeon_span.c
|
| |
| |
| |
| | |
This fixes the glean/glsl1 "texture2D(), with bias" test when using SSE.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Src/Dst aliasing (aka SOA dependencies) requires some care to ensure
intermediate results do not overwrite yet-to-be read source registers.
This change ensures that MOV/SWZ handle this correctly, which is poor but
no worse than the current tgsi_exec.c path. Remove the fallback as there
is nothing to be gained correctness-wise between the two implementations now.
Fixing this properly looks like a bit of work in this code, but might be
easily achieved by sending destination writes to temporary storage.
|
|/
|
|
| |
Fix recent performance regression.
|
|
|
|
| |
Can be implemented with CMP src2, src1, src0
|
|
|
|
| |
Fall back to interpreter for now. This doesn't happen very often.
|
|\ |
|
| |
| |
| |
| | |
Fixes piglit fp-generic tests/shaders/generic/lrp_sat.fp, bug 23316.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| | |
The LOOP/ENDLOOP pair is renamed to BGNFOR/ENDFOR as its behaviour
is similar to a C language for-loop.
The BGNLOOP2/ENDLOOP2 pair is renamed to BGNLOOP/ENDLOOP as now
there is no name collision.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
When sampling a 2D shadow map we need 3 texcoord components, not 2.
The third component (distance from light source) is compared against
the texture sample to return the result (visible vs. occluded).
Also, enable proper handling of TGSI_TEXTURE_SHADOW targets in Mesa->TGSI
translation. There's a possibility for breakage in gallium drivers if
they fail to handle the TGSI_TEXTURE_SHADOW1D / TGSI_TEXTURE_SHADOW2D /
TGSI_TEXTURE_SHADOWRECT texture targets for TGSI_OPCODE_TEX/TXP instructions,
but that should be easy to fix.
With these changes, progs/demos/shadowtex.c renders properly again with
softpipe.
|
| |
| |
| |
| |
| |
| | |
Various opcodes which can be implemented trivially with other TGSI opcodes,
such as matrix multiplication and negation. These were not used by any
state tracker or implemented by any of the drivers.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This is a source of ongoing confusion. TGSI has multiple names for
opcodes where the same semantics originate in multiple shader APIs.
For instance, TGSI includes both Mesa/GLSL and DX/SM30 names for
opcodes with the same semantics, but aliases those names to the same
underlying opcode number.
This makes it very difficult to visually inspect two sets of opcodes
(eg in state tracker & driver) and check if they implement the same
functionality.
This patch arbitarily rips out the versions of the opcodes not currently
favoured by the mesa state tracker and leaves us with a single name
for each distinct operation.
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
Remove the need to have a pointer in this struct by just including
the immediate data inline. Having a pointer in the struct introduces
complications like needing to alloc/free the data pointed to, uncertainty
about who owns the data, etc. There doesn't seem to be a need for it,
and it is unlikely to make much difference plus or minus to performance.
Added some asserts as we now will trip up on immediates with more
than four elements. There were actually already quite a few such asserts,
but the >4 case could be used in the future to specify indexable immediate
ranges, such as lookup tables.
|
| | |
|
| |
| |
| |
| |
| |
| |
| |
| |
| |
| |
| | |
This function was calling get_input_base() and get_output_base() to
get the names of a couple of register to use as temps. Those
functions no longer return registers, so adjust it to get the
registers elsewhere.
This change doesn't address the issue that it's a fairly poor way to
grab a register name by calling a function with an apparently
unrelated meaning.
|
| |
| |
| |
| |
| |
| | |
Use sse_movmskps to extract the correct bits of the comparison result
for use in updating the killmask. Simplify some logic around
identifying the set of necessary comparisons to make.
|
| |
| |
| |
| |
| | |
Most obvious problem is drawpixels comes out blocky, but this may be
an existing issue of KIL on the sse path.
|
| |
| |
| |
| | |
Take a list of arguments rather than hardcoding TEMP_R0.
|
| |
| |
| |
| |
| |
| | |
Pass the tgsi_exec_machine struct in directly and just hold a single
pointer to this struct, rather than keeping one for each of its
internal members.
|
|/
|
|
|
|
| |
Explictly pass src and dst arguments (previously dst argument was also
being used as a src). Separate argument handling from the rest of
the function call emit.
|
|
|
|
| |
Fall back to interpreter in this case.
|
|
|
|
|
|
| |
Fix comments.
Make sure .w is set to 1.0 for NRM.
Optimise for non-.xyzw writemasks.
|
| |
|
| |
|
| |
|
|
|
|
|
| |
The debug functions depend on several util function for os abstractions, and
these depend on debug functions, so a seperate module is not possible.
|
|
|
|
| |
RSQ test 2 (reciprocal square toot of negative value)
|
| |
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
|
| |
This allows us to use SSE codegen with debug builds again.
When PIPE_ARCH_SSE is set (w/ gcc -msse -msse2) we will also use the
gcc SSE intrinsic functions.
|
| |
|
| |
|
|\
| |
| |
| |
| |
| |
| |
| |
| | |
Conflicts:
src/gallium/auxiliary/rtasm/rtasm_execmem.c
src/mesa/shader/slang/slang_emit.c
src/mesa/shader/slang/slang_log.c
src/mesa/state_tracker/st_atom_framebuffer.c
|
| |
| |
| |
| |
| | |
This prevents vertex shaders from referencing invalid memory locations when
the shader is operating on less than four vertices or fragments.
|
| | |
|
| | |
|
| |
| |
| |
| |
| |
| |
| | |
Besides meaning x86 and x86-64 architecture, it also depends on SSE2
support enabled on gcc.
This fixes the linux-debug build.
|
| | |
|
| |
| |
| |
| | |
compile again.
|
|/
|
|
|
|
|
|
|
|
|
|
| |
Special care must be taken when calling compiler generated SSE2 functions
from the runtime generated SSE2: saving the xmm registers, and notify gcc
the stack is not 16byte aligned.
It would be more efficient to keep the stack pointer 16byte aligned, but
too hairy, and not consistent in all x86 architectures.
This has been tested in linux x86 and windows x86 userspace. Not tested on
x86-64 because it is broken for other reasons (even without this change).
|
| |
|
|
|
|
| |
Also, rename p_tile.[ch] to u_tile.[ch]
|
| |
|