| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
rc_find_free_temporary_list() returns signed integer
(in case of lack of free temporary registers returns -1),
so new_index in radeon_rename_regs() should be signed.
https://bugs.freedesktop.org/show_bug.cgi?id=54867
Signed-off-by: Marek Olšák <[email protected]>
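
A minimal sketch of why the index has to be signed. Only the -1 return convention comes from the message above; find_free_temporary() and try_rename() are hypothetical stand-ins, not the r300 compiler code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for rc_find_free_temporary_list(): returns the index
 * of the first unused register, or -1 when every register is taken. */
static int find_free_temporary(const bool *used, unsigned count)
{
    for (unsigned i = 0; i < count; i++) {
        if (!used[i])
            return (int)i;
    }
    return -1;
}

/* Hypothetical caller. Stores the index and returns true on success,
 * returns false (leaving *out_index untouched) when no register is free. */
static bool try_rename(const bool *used, unsigned count, unsigned *out_index)
{
    assert(out_index != NULL);

    /* Signed on purpose: with an unsigned variable the -1 would wrap to
     * UINT_MAX and the "< 0" test below could never be true. */
    int new_index = find_free_temporary(used, count);
    if (new_index < 0)
        return false;

    *out_index = (unsigned)new_index;
    return true;
}
```

With every register in use, try_rename() reports failure instead of handing back a wrapped-around index.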
|
|
|
|
|
|
|
| |
Fixes "Uninitialized scalar field" defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't perform DCE using the liveness pass between GVN and GCM
because it relies on the correct schedule, but GVN doesn't care about
preserving correctness - it's rescheduled later by GCM.
This patch makes dce_cleanup pass perform simple DCE
between GVN and GCM instead of relying on liveness pass.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088
Signed-off-by: Vadim Girlin <[email protected]>
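
A rough sketch of what a schedule-independent "simple DCE" could look like: it only follows def-use links, so instruction order does not matter. The struct layout and function names are assumptions made for illustration and are not taken from the sb optimizer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { MAX_SRC = 2, NO_VALUE = -1 };

/* Hypothetical instruction record. */
struct instr {
    int dst;               /* value id defined here, or NO_VALUE */
    int src[MAX_SRC];      /* value ids read here, or NO_VALUE */
    bool has_side_effect;  /* e.g. a store or export; never removed */
    bool dead;             /* set by simple_dce() */
};

/* True if any live instruction reads `value`, regardless of position. */
static bool value_is_used(const struct instr *code, int count, int value)
{
    for (int i = 0; i < count; i++) {
        if (code[i].dead)
            continue;
        for (int s = 0; s < MAX_SRC; s++) {
            if (code[i].src[s] == value)
                return true;
        }
    }
    return false;
}

/* Marks instructions whose result is never read, repeating until nothing
 * changes so that whole dead chains disappear. Returns the number removed. */
static int simple_dce(struct instr *code, int count)
{
    assert(code != NULL || count == 0);

    int removed = 0;
    bool progress = true;
    while (progress) {
        progress = false;
        for (int i = 0; i < count; i++) {
            struct instr *in = &code[i];
            if (in->dead || in->has_side_effect)
                continue;
            if (in->dst != NO_VALUE && value_is_used(code, count, in->dst))
                continue;
            in->dead = true;
            removed++;
            progress = true;
        }
    }
    return removed;
}
```

The fixed-point loop trades speed for simplicity; a use-count worklist would do the same job in fewer passes.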
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a
few problems:
- it breaks Windows builds because env[LLVM_CXXFLAGS] is never set there
- it is merging not only rtti, but the whole cxxflags (defines etc)
which has proven to be a source of troubles (breaks debugging etc.)
|
|
|
|
|
|
|
|
| |
LLVM 3.3 does not know about CIK processors, and the code paths for SI
and CIK are the same.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "9.2" <[email protected]>
|
| |
|
|
|
|
|
| |
Fix debug error message. Add switch case for PIPE_SHADER_COMPUTE.
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
| |
* The rtti fix actually dug up a bug in the scons build scripts.
* Autotools took the LLVM cpp and cxx flags, while scons only took
the cpp flags.
* This grabs the cxx flags and applies them where needed. We may
want to make the same change for the llvm cpp flags in scons.
* The only linux platform I can find with LLVM no-rtti is Ubuntu.
* Fixes bug #70471
Tested-by: Vinson Lee <[email protected]>
|
|
|
|
|
|
|
|
| |
Actually implemented by draw module.
Tested piglit ARB_depth_clamp tests, which pass 100%.
Trivial.
|
|
|
|
|
|
|
|
| |
Textures that likely reside in VRAM, are mapped for reading, and
don't require direct mapping should be staged into GTT to avoid bad
performance. This fixes readback performance of VDPAU surfaces.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
This new bind flag forces linear storage, but does not have other
side effects like R600_RESOURCE_FLAG_TRANSFER.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
| |
This fixes a crash in Unigine Heaven 3.0, and probably in some
other apps.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently it's hardcoded in the shader, so every change requires
compilation of the shader variant, killing the performance
in Serious Sam 3 and probably other apps.
This patch passes alpha_ref in the user sgpr and removes it from
the shader key.
Signed-off-by: Vadim Girlin <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
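
A small sketch of the idea, assuming a toy variant cache: because alpha_ref is not part of the key, changing it never triggers a compile. All names here (shader_key, variant_cache, user_data, draw) are hypothetical and do not mirror the radeonsi structures.

```c
#include <assert.h>
#include <stddef.h>

enum { MAX_VARIANTS = 8 };

/* Hypothetical shader key: only state that changes the generated code.
 * alpha_ref is deliberately not part of it. */
struct shader_key {
    unsigned alpha_func;
};

struct variant_cache {
    struct shader_key keys[MAX_VARIANTS];
    unsigned count;
    unsigned compiles;  /* how many variants had to be compiled */
};

/* Hypothetical per-draw data, standing in for a value passed in a user SGPR. */
struct user_data {
    float alpha_ref;
};

/* Returns the variant slot for `key`, "compiling" only on a cache miss.
 * Returns -1 when the cache is full. */
static int get_variant(struct variant_cache *cache, struct shader_key key)
{
    assert(cache != NULL);
    for (unsigned i = 0; i < cache->count; i++) {
        if (cache->keys[i].alpha_func == key.alpha_func)
            return (int)i;
    }
    if (cache->count == MAX_VARIANTS)
        return -1;
    cache->keys[cache->count] = key;
    cache->compiles++;
    return (int)cache->count++;
}

/* Updates the cheap per-draw value, then looks up the shader variant. */
static int draw(struct variant_cache *cache, struct shader_key key,
                struct user_data *ud, float alpha_ref)
{
    ud->alpha_ref = alpha_ref;
    return get_variant(cache, key);
}
```

Two draws that differ only in alpha_ref reuse the same slot; only a different alpha_func costs a compile.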
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes the issue where dst and src are the same reg and the operation
on one channel overwrites the source for other channels, e.g.:
UMUL TEMP[2].xyz, TEMP[0].xyzz, TEMP[2].xxxx
In this example the result of the operation on channel x is written in
TEMP[2].x and then used as a second source operand for channels y and z
instead of original value in TEMP[2].x.
This patch stores the results in temp reg and moves them to
dst after performing operation on all channels.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70327
Signed-off-by: Vadim Girlin <[email protected]>
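
The fix can be illustrated with a plain C model of a per-channel multiply. This is only a sketch: umul_channels(), the swizzle arrays and the write mask encoding are assumptions for illustration, not the r600 shader translator.

```c
#include <assert.h>
#include <stdint.h>

enum { NUM_CHANNELS = 4 };

/* dst[c] = src0[swz0[c]] * src1[swz1[c]] for every channel c in writemask.
 * dst may alias src0 or src1, so results go to a temporary first and are
 * moved to dst only after all channels have been computed. */
static void umul_channels(uint32_t dst[NUM_CHANNELS],
                          const uint32_t src0[NUM_CHANNELS],
                          const int swz0[NUM_CHANNELS],
                          const uint32_t src1[NUM_CHANNELS],
                          const int swz1[NUM_CHANNELS],
                          unsigned writemask)
{
    uint32_t tmp[NUM_CHANNELS] = { 0 };

    for (int c = 0; c < NUM_CHANNELS; c++) {
        if (!(writemask & (1u << c)))
            continue;
        assert(swz0[c] >= 0 && swz0[c] < NUM_CHANNELS);
        assert(swz1[c] >= 0 && swz1[c] < NUM_CHANNELS);
        tmp[c] = src0[swz0[c]] * src1[swz1[c]];
    }

    /* Only now overwrite dst; sources were read from the original values. */
    for (int c = 0; c < NUM_CHANNELS; c++) {
        if (writemask & (1u << c))
            dst[c] = tmp[c];
    }
}
```

For the UMUL example above with TEMP[0] = (2, 3, 4, 5) and TEMP[2] = (10, 20, 30, 40), writing directly into dst would compute y as 3 * 20 after x has been overwritten; with the temporary, y is 3 * 10.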
|
|
|
|
|
|
| |
Now that we support start, assert on start + num < max samplers
Reported by xexaxo
|
|
|
|
|
|
|
|
| |
With code dump enabled LLVM may generate disassembly during compilation.
Show this disassembly when available and prefer it to SI bytecode dump.
Reviewed-by: Tom Stellard <[email protected]>
Signed-off-by: Jay Cornwall <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix coord wrapping (and face selection too) in case of edges.
Unfortunately, the coord wrapping is way more complicated than what
the code did, as it depends on the face and the direction where the
texel falls off the face (the logic needed to get this right in fact
seems utterly ridiculous).
Also fix a bug in (y direction under/overflow) face selection.
And get rid of complicated cube corner handling. Just like edge case,
the coord wrapping was wrong and it seems very difficult to fix.
I'm near certain it can't always work anyway (though ordinary seamless
filtering on edge has actually a similar problem but not as severe)
because we don't have per-pixel face, hence could have multiple corner
texels which would make it very difficult to average the remaining texels
correctly. Hence simply pick a texel which would only have fallen off one
edge but not both instead, which is not quite accurate but actually I think
should be enough to meet OpenGL (but not d3d10) requirements.
v2: small fixes suggested by Brian, add some comments.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The previous limit of 128*1024 was reported on IRC to cause frequent
recompiles in some apps due to shader variant thrashing, leading to
noticeable lags.
Note that the LP_MAX_SHADER_VARIANTS limit (1024) was more or less impossible
to reach, since even simple fragment shaders without texturing (glxgears) used
more than twice 128 instructions, hence the instruction limit would have
always been reached first (excluding things like trivial shaders not writing
color). Even with the new limit it is VERY likely the instruction limit is hit
first.
Should help with such lags due to recompiles (though other shader types have
their own limits, LP_MAX_SETUP_VARIANTS and DRAW_MAX_SHADER_VARIANTS, in
particular the latter seems a bit small (128)).
Reviewed-by: Brian Paul <[email protected]>
|
| |
|
|
|
|
|
|
|
|
| |
We should be able to safely set the framebuffer state without a
fragment shader bound. bind_ps_state will take care of updating the
necessary state bits later.
v2: check in update_db_shader_control
|
|
|
|
|
|
|
|
| |
Unless the polygon fill mode is different from PIPE_POLYGON_MODE_FILL,
so checking the polygon mode is sufficient.
Testing done: no regression in polygon-mode-offset
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
As we're moving towards expanding the number of subpixel
bits and the width of the variables used in the computations
we need to make this code a bit more centralized.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
It doesn't work (decodes to garbage) with most videos on UVD 3.0. Worse
yet, it often results in random memory corruption or GPU hangs. Rumor
has it only the newest UVD hardware could do it anyway.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The DPB size calculations seem to be off; there is various random
corruption happening, even with advanced profile. Always assuming
a minimum number of references appears to fix it, similarly to
H.264. This might overallocate the DPB. Also clean up the SPS/PPS
field setup so that it matches VC-1 specifications better.
With these changes, all advanced profile VC-1 files I could get my
hands on work fine.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
UVD can only support NV12 in the case of hardware decoding, but we
can still use all other formats for software decoding. Use the UNKNOWN
profile to signal that we're not interested in hardware decoding.
v2: use profile instead of entrypoint
Reviewed-by: Christian König <[email protected]>
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
This doesn't fix any known issue. I'm just following the docs.
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Otherwise it is fairly confusing.
|
|
|
|
|
|
|
|
| |
The new sampler bind sends us NULL samplers, so we need to count
the number of valid samplers ourselves. This fixes ~500 piglit
regressions from the sampler rework.
While we're at it, let's also support start.
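
A sketch of binding a range that may contain NULL entries and recounting the valid samplers afterwards. MAX_SAMPLERS, struct sampler_slots and bind_samplers() are hypothetical names, and the inclusive bound in the assert is this sketch's own choice.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SAMPLERS 16  /* hypothetical limit for this sketch */

struct sampler_slots {
    void *samplers[MAX_SAMPLERS];
    unsigned num_valid;  /* number of non-NULL entries */
};

/* Binds `num` sampler states starting at slot `start`. A NULL `states`
 * array, or a NULL entry in it, unbinds the corresponding slot. */
static void bind_samplers(struct sampler_slots *slots, unsigned start,
                          unsigned num, void **states)
{
    assert(start + num <= MAX_SAMPLERS);

    for (unsigned i = 0; i < num; i++)
        slots->samplers[start + i] = states ? states[i] : NULL;

    /* The caller's `num` says nothing about how many entries are valid,
     * so count the non-NULL slots ourselves. */
    slots->num_valid = 0;
    for (unsigned i = 0; i < MAX_SAMPLERS; i++) {
        if (slots->samplers[i])
            slots->num_valid++;
    }
}
```

Recounting over all slots keeps the count correct even when a later call unbinds only part of an earlier range.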
|
|
|
|
|
|
|
| |
No need to keep a copy of the message in system memory anymore,
since it should now be in GART memory on newer chips.
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
operations"
This reverts commit 7948ed1250cae78ae1b22dbce4ab23aceacc6159.
It caused graphical corruption. I've got no idea why.
Bugzilla:
https://bugs.freedesktop.org/show_bug.cgi?id=70042
https://bugs.freedesktop.org/show_bug.cgi?id=68451
Conflicts:
src/gallium/drivers/r600/evergreen_hw_context.c
src/gallium/drivers/r600/r600_hw_context.c
src/gallium/drivers/r600/r600_pipe.h
|
|
|
|
|
|
|
| |
All texture instructions can use offsets, not just TXF. Offsets into
the literals array were wrong, too.
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Compute samplers are advertised, but not implemented.
I think that's intentional.
|
| |
|
|
|
|
|
|
|
|
|
| |
As we march over the source buffer we're uploading in pieces, we
need to memcpy from the current offset, not the start of the buffer.
Fixes graphical corruption when drawing very large vertex buffers.
Cc: "9.2" <[email protected]>
Reviewed-by: Matthew McClure <[email protected]>
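
The bug class described above can be sketched as a chunked copy loop. upload_in_pieces() is a hypothetical helper written for illustration, not the driver's upload path.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies `size` bytes from src to dst in pieces of at most `chunk` bytes.
 * Returns the number of pieces that were copied. */
static size_t upload_in_pieces(unsigned char *dst, const unsigned char *src,
                               size_t size, size_t chunk)
{
    assert(chunk > 0);

    size_t offset = 0;
    size_t pieces = 0;
    while (offset < size) {
        size_t n = size - offset < chunk ? size - offset : chunk;

        /* Read from src + offset. Reading from plain src here would
         * upload the first piece over and over again. */
        memcpy(dst + offset, src + offset, n);

        offset += n;
        pieces++;
    }
    return pieces;
}
```

A buffer that fits in one piece behaves the same either way, which is why this kind of mistake only shows up with very large buffers.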
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Tom Stellard <[email protected]>
https://bugs.freedesktop.org/show_bug.cgi?id=70106
|
|
|
|
| |
Remove the assignment and the no-op function.