| Commit message | Author | Age | Files | Lines |
| |
|
|
|
|
|
| |
This commit adds the shader cache, shader state, shader variants, etc. It
does not add the shader compiler, though.
|
| |
|
|
|
|
| |
The command parser manages batch buffers and command submissions.
|
| |
|
| |
|
| |
|
| |
|
|
|
|
|
| |
This commit adds some boilerplate code. The header files found under include/
are copied from i965.
|
|
|
|
|
| |
This is a wrapper for libdrm_intel to allow the pipe driver to stay OS
agnostic.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Courtesy of clang:
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1483:10: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
tmp[2] = lp_build_swizzle_aos(coord_bld, ddx_ddy[1], swizzle02);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1487:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_add(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1491:56: warning: array index of '2' indexes past the end of an array (that contains 2 elements) [-Warray-bounds]
rho_vec = lp_build_max(coord_bld, rho_vec, tmp[2]);
^ ~
src/gallium/auxiliary/gallivm/lp_bld_sample.c:1430:10: note: array 'tmp' declared here
LLVMValueRef ddx_ddy[2], tmp[2], rho_vec;
^
|
|
|
|
|
|
|
|
|
|
| |
Only 13 affected programs in shader-db, but they were all helped.
total instructions in shared programs: 368877 -> 368851 (-0.01%)
instructions in affected programs: 1576 -> 1550 (-1.65%)
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Three-source instructions have a vertical stride overloaded to 4, which
prevents directly using vec4 uniforms as arguments. Instead we need to
insert a MOV instruction to do the replication for the three-source
instruction.
With this in place, we can use three-source instructions in the vertex
shader. While some thought needs to go into deciding whether it's better
to use a three-source instruction rather than a sequence of equivalent
instructions (when one or more sources are uniforms or immediates), this
will allow us to skip a lot of ugly lowering code and use the BFE and
BFI2 instructions directly.
Reviewed-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves the tracing timeout and printing into the winsys and adds
a debug environment variable for it (R600_DEBUG=trace_cs).
Many files are touched because of the winsys API changes.
v2: Do not write the lockup file if the IB unique ID does not match the last one
Signed-off-by: Jerome Glisse <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
There was a lot of code in evergreen_compute_internal.c that was not
being used at all and most of it was duplicating code from other parts
of the driver.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2:
- Fix usage of set_constant_buffer()
- Fix typo in comment
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
v2:
- Bump R600_NUM_ATOMS
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
The state tracker should be responsible for waiting for the kernel to
finish.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Buffer size should be in bytes, not dwords.
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We don't support pre-2.6 kernels anyway - the install docs say 2.6.28
for DRI - and apparently this confuses ld.so's sorting when multiple
libGLs are installed. Just remove it.
Note: this is a candidate for the stable branches.
Acked-by: Kenneth Graunke <[email protected]>
Signed-off-by: Adam Jackson <[email protected]>
|
|
|
|
|
|
|
| |
Better than uncached for writes, which are common for vertex buffer
upload, etc.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
New textures or vertex buffers don't always require patching and
re-emitting the shaders. So do a better job of figuring out when we
actually have to patch the shader.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Removes 75/78 state-dependent recompiles in GLB2.7 (the remaining 3 are
due to FBO-rendering size predictions). We currently expose
GL_ARB_color_buffer_float on GL core, so we may mis-predict there, but I'm
about to send a patch for removing that silly extension in that case.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Note: this is a candidate for the 9.1 branch.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
| |
Note: this is a candidate for the stable branches.
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
| |
If a bug in an app/state-tracker causes a vertex buffer fetch of vertex
elements that are not bound, simply return zeros instead of crashing.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Many applications don't exit cleanly, and others may create and destroy a
screen multiple times, so we only write the </trace> tag and close the file
at exit time.
|
|
|
|
| |
We were setting the fragment shader, which wasn't needed.
|
|
|
|
|
| |
Useful for core dumps, where calling tgsi_dump() from gdb is not an
option.
|
|
|
|
|
|
|
|
|
|
|
| |
clang supports most gcc options and extensions, with some exceptions.
The biggest advantage of using clang is that compilation times are much
shorter.
One can tell scons to use clang when building by invoking it as
CC=clang CXX=clang++ scons libgl-xlib
|
|
|
|
|
| |
Clang does not support __artificial__. Instead match precisely what's
in the clang headers.
|
|
|
|
| |
-fvisibility=hidden is already set elsewhere for the whole tree.
|
|
|
|
|
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
From low to high bits, the sample positions are packed y0,x0,y1,x1...
Fixes arb_texture_multisample-sample-position piglit.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
| |
We were assigning an incorrect const register for immediates, and
potentially writing the immediate consts to the wrong location. This fixes
an incorrect-rendering bug with xonotic.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Set a few extra registers to make sure we are in the proper state for
clearing. Also add some debug options to mark all state dirty in
clear and gmem operations to aid in debugging.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
| |
There is a bit we need to set for 2D vs 3D fetch, to tell the hw whether
there are two or three valid input components.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous approach of using the dst register as an intermediate
temporary doesn't work in a lot of cases. For example, if the dst
register is the same as one of the src registers.
For now, just simplify it and always allocate a new register to use as
an intermediate. In some cases this will result in more registers used
than required. I think the best solution would be to implement an
optimization pass to reduce the number of registers used, which would
also solve the problem we have now of not being able to use GPRs that
are assigned for TGSI_FILE_INPUT.
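The aliasing problem described above can be illustrated with a small toy model. This is only a sketch: the register file, the `mad_via_dst()` and `mad_via_tmp()` helpers, and the multiply-add lowering are hypothetical and are not taken from the driver. They show why reusing dst as scratch space goes wrong once dst is also a source.

```c
#include <stdint.h>

/* Hypothetical toy register file; not the driver's data structures. */
enum { NUM_REGS = 8 };

/* Lower dst = a * b + c in two steps, using dst as the intermediate.
 * If dst is the same register as c, step 1 clobbers c before it is read. */
static void mad_via_dst(int32_t *r, int dst, int a, int b, int c)
{
    r[dst] = r[a] * r[b];   /* step 1: overwrites dst (and c, if aliased) */
    r[dst] = r[dst] + r[c]; /* step 2: reads the clobbered value */
}

/* Same lowering, but with a separately allocated temporary register. */
static void mad_via_tmp(int32_t *r, int dst, int a, int b, int c, int tmp)
{
    r[tmp] = r[a] * r[b];   /* step 1: sources stay intact */
    r[dst] = r[tmp] + r[c]; /* step 2: c still holds its original value */
}
```

With r1 = 2, r2 = 3, r3 = 4 and dst = c = r3, the first variant yields 12 while the second yields the intended 10, at the cost of one extra register.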
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
It is useful for debugging.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Get rid of a few self-defined macros:
ALIGN() -> align()
min() -> MIN2()
max() -> MAX2()
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Oops, didn't notice that I had left it stubbed out.
Also, make things fail a bit more gracefully when things go wrong.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Really this should be set based on buffer format, not on color vs
depth/stencil. Probably there should be more formats that set the bit
as we add support for more render target formats.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
The libelf implementation that is distributed here:
http://www.mr511.de/software/english.html
requires calling elf_version() prior to calling elf_memory()
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Lighter weight than using streamout. Only evergreen
and newer asics support embedded data as src with
CP DMA.
Reviewed-by: Jerome Glisse <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unlike GEN6, the bits of entry count are distributed like this:
width = (entry_count & 0x0000007f); /* bits [6:0] */
height = (entry_count & 0x001fff80) >> 7; /* bits [20:7] */
depth = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */
The maximum entry count is still limited to 2^27.
This was noted while going over the PRM. No test is impacted, because
1<<20 (the bit that moved) is much larger than GL_UNIFORM_BLOCK_MAX_SIZE,
GL_MAX_TEXTURE_BUFFER_SIZE, or MAX_*_UNIFORM_COMPONENTS.
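As a self-contained sketch of the split described above, the three masks can be wrapped in a helper. The `surface_extent` struct and `split_entry_count()` name are hypothetical and used only for illustration; the masks and shifts are the ones quoted in this message.

```c
#include <stdint.h>

/* Hypothetical container for the three fields; not a driver type. */
struct surface_extent {
    uint32_t width;
    uint32_t height;
    uint32_t depth;
};

/* Split an entry count into the three bit ranges listed above. */
static struct surface_extent split_entry_count(uint32_t entry_count)
{
    struct surface_extent e;
    e.width  =  entry_count & 0x0000007f;        /* bits [6:0]   */
    e.height = (entry_count & 0x001fff80) >> 7;  /* bits [20:7]  */
    e.depth  = (entry_count & 0x7fe00000) >> 21; /* bits [30:21] */
    return e;
}
```

For example, an entry count of 1 << 20 lands entirely in the height field (as bit 13 of height), and 1 << 21 is the lowest bit of depth.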
v2: Explain more in the commit message (by anholt)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The inverse repeat count should take up bits 31:15 and is in U1.16. Fixes
the "Restarting lines within a single Begin/End block" subtest of piglit
linestipple, and gets the other failing subtests much closer to passing.
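A rough sketch of the encoding, assuming the field holds 1/repeat as an unsigned fixed-point value with 1 integer bit and 16 fractional bits, placed at bits 31:15. The `pack_inverse_repeat()` helper is hypothetical and only illustrates the arithmetic; it is not the driver's code.

```c
#include <stdint.h>

/* Hypothetical helper: encode 1/repeat_count as U1.16 in bits 31:15.
 * Assumes repeat_count >= 1, so the inverse is at most 1.0. */
static uint32_t pack_inverse_repeat(uint32_t repeat_count)
{
    /* U1.16 has 16 fractional bits, so 1.0 is represented as 1 << 16. */
    uint32_t inv_u1_16 = (uint32_t)((1.0 / repeat_count) * (1 << 16));

    /* The 17-bit value occupies bits 31:15 of the dword. */
    return inv_u1_16 << 15;
}
```

With a repeat count of 1 the inverse is exactly 1.0, which sets only bit 31; a repeat count of 2 sets only bit 30.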
v2: Rewrite commit message with more detailed piglit info (by anholt)
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
Wrong fields were used when dumping width and height.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Never existed? At least never supported. Doesn't appear in 965, G45,
or ILK documentation.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, the only kind of ir_jump that would terminate a basic
block was "return". However, the other possible types of ir_jump
("break", "continue", and "discard") should terminate a basic block
too. This patch modifies basic block analysis so that it terminates a
basic block on any type of ir_jump, not just ir_return.
Fixes piglit test dead-code-break-interaction.shader_test.
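The change in behaviour can be sketched with a toy instruction stream. The `toy_kind` enum and both helpers below are hypothetical stand-ins, not the GLSL IR classes; the flag selects between the old rule (only "return" ends a block) and the new rule (any jump ends a block).

```c
#include <stddef.h>

/* Hypothetical instruction kinds standing in for the real IR nodes. */
enum toy_kind { TOY_ASSIGN, TOY_RETURN, TOY_BREAK, TOY_CONTINUE, TOY_DISCARD };

/* Old rule: only a return ends a block. New rule: any jump does. */
static int toy_ends_block(enum toy_kind k, int any_jump_terminates)
{
    if (k == TOY_RETURN)
        return 1;
    return any_jump_terminates && k != TOY_ASSIGN;
}

/* Count the basic blocks in a flat instruction list. */
static size_t toy_count_blocks(const enum toy_kind *insts, size_t n,
                               int any_jump_terminates)
{
    size_t blocks = 0;
    int in_block = 0;

    for (size_t i = 0; i < n; i++) {
        if (!in_block) {        /* first instruction of a new block */
            blocks++;
            in_block = 1;
        }
        if (toy_ends_block(insts[i], any_jump_terminates))
            in_block = 0;       /* the next instruction starts a new block */
    }
    return blocks;
}
```

For the list assign, break, assign, discard, assign, the old rule sees a single block, while the new rule sees three, so later passes no longer treat code after a "break" as part of the same block.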
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|