| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the input's xyz are 0.0, the output
should be 0.0. This is due to the fact that
Inf * 0 = 0 for dx9. To handle this case,
cap the result of RSQ to FLT_MAX. We have
FLT_MAX * 0 = 0.
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We should use the absolute value of the input as input to ureg_RSQ.
Moreover, an input of 0.0 should return FLT_MAX.
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
POW doesn't map directly to the TGSI instruction, since we should
take the absolute value of src0.
Fixes black textures in some games.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
| |
Cc: "10.4" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Say c1 and c2 are declared in the shader and c0 is given by the app.
We would then have read c0, c1 and c2 as given by the app, instead
of c0 from the app and c1 and c2 as declared in the shader.
This correction fixes several issues in some games.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
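One way to picture the intended result, reading the message as saying that constants declared in the shader take precedence over app-supplied values in the same slots. Everything here (`struct local_const`, `resolve_consts`, the flat array layout of four floats per register) is a hypothetical illustration, not the state tracker's actual data structures.

```c
#include <string.h>

/* Hypothetical record for a constant declared inside the shader. */
struct local_const {
    unsigned index;   /* register number, e.g. 1 for c1 */
    float value[4];
};

/* Fill `out` (4 floats per register) with what the shader should see:
 * app-supplied values everywhere, except in slots the shader declares. */
void resolve_consts(float *out, const float *app, unsigned count,
                    const struct local_const *locals, unsigned num_locals)
{
    /* Start from the app-supplied constants. */
    memcpy(out, app, count * 4 * sizeof(float));
    /* Overwrite the slots that the shader declares itself. */
    for (unsigned i = 0; i < num_locals; i++) {
        if (locals[i].index < count)
            memcpy(out + 4 * locals[i].index, locals[i].value,
                   4 * sizeof(float));
    }
}
```

With c1 and c2 declared in the shader, only c0 keeps the value the app provided.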
|
|
|
|
|
|
|
|
|
|
|
| |
According to docs and Wine, these two vs outputs have
to be saturated.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
don't support integers
The shader code already treats them as floats when the card doesn't support integers.
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Convert them to shader booleans at an earlier stage.
The previous code is fine, but a later patch will convert
integers at an earlier stage, so do
the same for booleans.
Reviewed-by: Tiziano Bacocco <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Adds ATI1 and ATI2 support to nine.
They map to PIPE_FORMAT_RGTC1_UNORM and PIPE_FORMAT_RGTC2_UNORM,
but need special handling.
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Signed-off-by: Xavier Bouchoux <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
According to MSDN, we must act as if the user didn't ask for sRGB if we don't
support it.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Buffers in the MANAGED pool are supposed to keep their content in a RAM buffer,
with a copy in VRAM if there is enough memory (the driver manages memory and decides when
to delete the buffer in VRAM).
This is not implemented properly in nine: a VRAM copy is created
when the RAM memory is filled, and the VRAM copy gets synced with the RAM
memory updates.
Due to some issues (in the implementation or in app logic), it can happen that
we try to create a sampler view of the resource before we have created the
VRAM resource. This hack creates the resource when we hit this case, which prevents
crashing, but doesn't help with the resource content.
This fixes several games crashing at launch.
Acked-by: Axel Davy <[email protected]>
Acked-by: David Heidelberg <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Stanislaw Halik <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
While the previous code behaved correctly in general,
this new code is more readable (it avoids checking all gallium formats
manually) and has better-defined behaviour for depth stencil resources.
Reviewed-by: Tiziano Bacocco <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The implicit swapchains are destroyed when the device instance is
destroyed. However, this is not the case for non-implicit swapchains,
and the application may have kept a reference to the swapchain
buffers to reuse them.
Fixes problems with the battle.net launcher.
Cc: "10.4" <[email protected]>
Tested-by: Nick Sarnie <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This->surfaces contains the surfaces associated with the levels
and faces. This->surfaces[6*Level] is what we want here,
since it gives us a face descriptor for the level 'Level'.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Signed-off-by: Xavier Bouchoux <[email protected]>
Cc: "10.4" <[email protected]>
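The `6*Level` expression suggests a flat array holding six face surfaces per mip level. A minimal sketch of that indexing, assuming the layout is level-major with six consecutive faces per level; the `cube_surface_index` helper and the `CUBE_FACES` macro are hypothetical names.

```c
#include <assert.h>

#define CUBE_FACES 6

/* Index of the surface for (level, face) in a flat, level-major array. */
unsigned cube_surface_index(unsigned level, unsigned face)
{
    /* A cube texture has exactly six faces per level. */
    assert(face < CUBE_FACES);
    return CUBE_FACES * level + face;
}
```

Under that assumption, `cube_surface_index(Level, 0)` equals `6*Level`, the first face of the requested level, which is enough to obtain a descriptor for that level.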
|
|
|
|
|
|
|
|
|
|
| |
Use settings similar to u_sampler_view_default_template.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The cap means D3DFVF_XYZRHW vertices will see clipping.
This is not the case when
PIPE_CAP_TGSI_VS_WINDOW_SPACE_POSITION is supported, since
it'll disable clipping.
Reviewed-by: Tiziano Bacocco <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
It's done by testing the existence of the point sprite output register *after* parsing the vertex shader.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Reviewed-by: Axel Davy <[email protected]>
Signed-off-by: Xavier Bouchoux <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
Cc: "10.4" <[email protected]>
|
|
|
|
|
|
|
|
| |
The clip state was reset every time, incurring an overhead.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: David Heidelberg <[email protected]>
Signed-off-by: Axel Davy <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: David Heidelberg <[email protected]>
|
| |
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
This was inadvertently disabled by
761e36b4caab4e8e09a4c2b1409a825902fc7d2c.
|
|
|
|
|
|
|
|
|
| |
Instead of passing a pointer to the scratch buffer via user sgprs, we
now patch the shader with the buffer address using reloc information
from the LLVM generated ELF.
v2:
- Make sure not to break older LLVM.
|
|
|
|
|
|
|
|
|
| |
v2:
- Use strdup for copying reloc names.
- Free reloc memory.
v3:
- Add free_relocs parameter to radeon_shader_binary_free_members()
|
| |
|
|
|
|
|
|
|
| |
This should fix this performance regression:
https://bugs.freedesktop.org/show_bug.cgi?id=88227
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
|
|
|
| |
I wanted to read it, so I wrote parsing.
|
|
|
|
|
| |
Execution will end at the cl->next, because that's what ct0ea/ct1ea get
programmed to.
|
|
|
|
|
| |
Everything from ETC1 to RGBA64 was getting its top bit dropped, but we
didn't use any of those formats.
|
|
|
|
|
| |
Theoretically it should apply after dithering as well, but dithering for
565 happens in fixed function in the TLB store.
|
|
|
|
|
| |
Since unpack only happens on things read from the A register file, we have
to leave them as something that can be allocated to A (temp or uniform).
|
|
|
|
| |
I want it from another location.
|
|
|
|
|
| |
It would mean different unpacking behavior, since only the A file does
unpack (with PM==0).
|
|
|
|
|
| |
No difference on shader-db, but prevents definite regressions in the
blending changes.
|
|
|
|
|
| |
No difference on shader-db, but will become more important as I introduce
more use of pack flags with the blending changes.
|
|
|
|
|
|
| |
It turns out the simulator was not treating this bit the same as the RPi,
and I'd forgotten to remove it when turning on early Z. The result was
that you'd get big chunks of your rendering missing.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0543630d0b0d9d9f6eefbc14fbd3385d4de37ba0.
It caused flickering artifacts in Steam games such as Team Fortress 2 or
Left 4 Dead 2.
We could probably only enable this optimization by also making sure the
shader code only uses either SI_PARAM_LINEAR_CENTROID or
SI_PARAM_LINEAR_CENTER, not both. This would probably require a shader
variant.
Sorry I didn't remember this when reviewing the reverted change.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
| |
If, for example, only the x/y/w components of in.xyzw are actually used,
we still need to have a group of four registers and assign all four
components. The hardware can't write in.xy and in.w to discontiguous
registers. To handle this, pad with a dummy NOP instruction, to keep
the neighbor chain contiguous.
This fixes a problem noticed with firefox OMTC.
Signed-off-by: Rob Clark <[email protected]>
|
| |
|
|
|
|
| |
Fixes the remaining ARB_color_buffer_float rendering tests.
|
| |
|
|
|
|
|
|
|
|
| |
No need to recheck the FS compile when the VS source has changed, but
there *is* a need to recheck the VS compile when the compiled VS has
changed (since the live inputs may change).
Fixes es3conform's blend test.
|
|
|
|
|
|
|
| |
The util_pack_color() thing only sets up the low bits of the union, so
only return them, too. Fixes intermittent failure on
fbo-alphatest-formats and es3conform's framebuffer-objects test under
simulation.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Turns out this was harmful to code quality:
total instructions in shared programs: 39487 -> 38845 (-1.63%)
instructions in affected programs: 22522 -> 21880 (-2.85%)
This costs us yet another register, which is painful (since it means more
programs might fail to compile). However, the alternative was causing us
trouble where we'd save/restore r3 while it contained a MIN-ed direct
texture offset, causing the kernel to fail to validate our shaders (such
as in GLB2.7).
|