| Commit message (Collapse) | Author | Age | Files | Lines |
The start_compute_cs atom initializes some config and context registers
to the values needed for running compute shaders. When a compute shader
is dispatched, this atom is emitted after the start_cs_cmd atom, which
initializes registers that are common to both 3D and compute.
Reviewed-by: Marek Olšák <[email protected]>
|
Some packets require the shader type bit (bit 1) to be set when
used for compute shaders. The pkt_flag will be initialized to
RADEON_CP_PACKET3_COMPUTE_MODE for any struct r600_command_buffer used
for dispatching compute shaders, and it will be OR'd with the result of
the PKT3 macro when adding a new packet to a struct r600_command_buffer.
Reviewed-by: Marek Olšák <[email protected]>
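The flag handling described above can be sketched roughly as follows. This is a minimal illustration, not the driver code: every `example_`/`EXAMPLE_` name is hypothetical, and the simplified header layout (type in bits 31:30, count in bits 29:16, opcode in bits 15:8) is an assumption made for the sketch. Only the idea of OR-ing a per-buffer pkt_flag into each packet header comes from the message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the driver's packet macros (assumed layout). */
#define EXAMPLE_PKT_TYPE3      (3u << 30)
#define EXAMPLE_COMPUTE_MODE   (1u << 1)   /* shader type bit (bit 1) */
#define EXAMPLE_PKT3(op, count) \
    (EXAMPLE_PKT_TYPE3 | (((uint32_t)(count) & 0x3FFFu) << 16) | \
     (((uint32_t)(op) & 0xFFu) << 8))

#define EXAMPLE_CB_SIZE 64

/* Hypothetical command buffer: pkt_flag is fixed when the buffer is set up. */
struct example_command_buffer {
    uint32_t buf[EXAMPLE_CB_SIZE];
    unsigned num_dw;
    uint32_t pkt_flag;   /* 0 for 3D, EXAMPLE_COMPUTE_MODE for compute */
};

static void example_cb_init(struct example_command_buffer *cb, int is_compute)
{
    cb->num_dw = 0;
    cb->pkt_flag = is_compute ? EXAMPLE_COMPUTE_MODE : 0;
}

/* Every packet header is OR'd with the buffer's flag as it is added. */
static void example_emit_pkt3(struct example_command_buffer *cb,
                              unsigned op, unsigned count)
{
    assert(cb->num_dw < EXAMPLE_CB_SIZE);
    cb->buf[cb->num_dw++] = EXAMPLE_PKT3(op, count) | cb->pkt_flag;
}
```

Keeping the flag in the buffer means callers that emit packets do not need to know whether they are building a 3D or a compute command stream.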
|
No lockups here.
|
For copy propagation, we've dropped the use of a GRF in favor of a
(probably later) use of a different GRF. This definitely requires
invalidating intervals.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Since live intervals are based on ip, removing an instruction trashes
the intervals unless we were to go do some surgery. These happen to
usually remove a use of a grf, so it's time to recalculate, anyway.
Reviewed-by: Kenneth Graunke <[email protected]>
NOTE: This is a candidate for the 8.0 release branch.
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
This has less impact than for the FS (4k savings), because it was partially
done already, but makes things more consistent.
Reviewed-by: Kenneth Graunke <[email protected]>
|
Cuts compile time for brw_fs.h changes from 2.7s to .7s and reduces
i965_dri.so size by 70k.
Reviewed-by: Kenneth Graunke <[email protected]>
|
underlying pipe driver.
|
stderr is not visible on Windows.
|
Let galahad warnings be true warnings.
|
As the wrapped pipe driver may hold internal references.
|
And not the wrapped driver's objects.
|
Some of these helpers use debug_get_option, which also works in release builds.
|
I don't think it's possible or even useful to use the extension with GLSL 1.2.
Reviewed-by: Brian Paul <[email protected]>
|
We factor out all the EGL book-keeping into dri2_create_image() and
simplify the wayland case by using dupImage.
Signed-off-by: Kristian Høgsberg <[email protected]>
|
We have the same switch and allocation code in two places.
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
Signed-off-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
This reverts commit cbffaf20e9e6154310ba68bb2b44adc37ba83bcd.
Use the PRIx64 macro in the fprintf() call instead, as suggested
by Dylan Noblesmith.
Reviewed-by: José Fonseca <[email protected]>
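For reference, a minimal sketch of printing a 64-bit value with the PRIx64 macro from <inttypes.h>. The function names are hypothetical and the code is not taken from the patch. The macro expands to the correct length modifier for the platform's printf, so the same format string works whether uint64_t is `unsigned long` or `unsigned long long`.

```c
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: write a 64-bit value as zero-padded hex to a stream. */
static void example_print_u64_hex(FILE *out, uint64_t value)
{
    /* PRIx64 is a string literal, so it is concatenated into the format. */
    fprintf(out, "value = 0x%016" PRIx64 "\n", value);
}

/* Same formatting into a caller-supplied buffer; returns snprintf's result. */
static int example_format_u64_hex(char *buf, size_t size, uint64_t value)
{
    return snprintf(buf, size, "0x%016" PRIx64, value);
}
```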
|
We'll revert the #define fprintf __mingw_fprintf change next.
Reviewed-by: José Fonseca <[email protected]>
|
ROUND and TRUNC are implemented with one function to reduce code duplication.
Note: ROUND isn't actually used yet, but probably will be soon.
Reviewed-by: José Fonseca <[email protected]>
|
Converting CMP to SLT+LRP didn't work when src2 or src3 was Inf/NaN.
That's the case for GLSL sqrt(0). sqrt(0) actually happens in many
piglit auto-generated tests that use the distance() function.
v2: remove debug/devel code, per Jose
Reviewed-by: José Fonseca <[email protected]>
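A small sketch of why a select built from SLT+LRP breaks on Inf/NaN. It is written as scalar C rather than shader code, and the function names are hypothetical. It assumes the usual semantics: CMP picks one of two values depending on whether the first operand is negative, SLT produces 1.0 or 0.0, and LRP computes t*b + (1-t)*c. The unselected operand is still multiplied by zero, and 0 * Inf (or 0 * NaN) is NaN under IEEE 754, which poisons the result.

```c
#include <math.h>

/* Direct select: the unselected operand never takes part in arithmetic. */
static float example_cmp_select(float a, float b, float c)
{
    return (a < 0.0f) ? b : c;
}

/* Emulation with SLT followed by LRP. */
static float example_cmp_slt_lrp(float a, float b, float c)
{
    float t = (a < 0.0f) ? 1.0f : 0.0f;   /* SLT: t = (a < 0) ? 1 : 0 */
    return t * b + (1.0f - t) * c;        /* LRP: 0 * Inf yields NaN  */
}
```

With a = -1, b = 2 and c = INFINITY, the direct select returns 2, while the emulated version evaluates 1*2 + 0*Inf and returns NaN. For finite operands the two agree.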
|
Was previously implemented with FLOOR.
Fixes quite a few piglit tests of float->int conversion, integer
division, etc.
v2: clean up left over debug/devel code, per Jose
Reviewed-by: José Fonseca <[email protected]>
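The message does not name the operation. Assuming it is a round-toward-zero (truncation) step, as the float->int and integer-division fixes suggest, the sketch below shows where a FLOOR-based version goes wrong: the two agree for non-negative inputs and differ for negative non-integers. Function names are hypothetical.

```c
/* Truncation: a C float-to-int cast rounds toward zero. */
static int example_f2i_trunc(float x)
{
    return (int)x;
}

/* Floor: rounds toward negative infinity (for values that fit in int). */
static int example_f2i_floor(float x)
{
    int i = (int)x;                 /* rounds toward zero first */
    return (x < (float)i) ? i - 1 : i;
}
```

For x = -1.5, truncation gives -1 and floor gives -2, so a float->int conversion or integer division built on floor is off by one for negative operands.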
|
If the 'dst' register is the same as the 'pass' register we'll generate
invalid code. Use a temporary register in that case.
Reviewed-by: José Fonseca <[email protected]>
|
Redo this commit, and remove the inclusion of gl2ext.h
from src/mapi/glapi/glapi_priv.h. The include was added in
8f3be339850ead96f9c6200db4e0db1f74e39d13 to fix a missing prototype for
glDrawBuffersNV and others, but it's not possible to include both
glext.h and gl2ext.h from the same file.
I don't see the missing prototype here (with or without shared glapi)
so I'm just removing the offending #include.
Also, since we're redoing this, update to the most recent gl2ext.h.
Signed-off-by: Kristian Høgsberg <[email protected]>
|
That old bug was hidden by the clipper always interpolating in 3D space
no matter what it should have been doing. Now that the interpolation
has been fixed, the bug shows up.
Fixes fdo 51364.
Signed-off-by: Olivier Galibert <[email protected]>
Signed-off-by: José Fonseca <[email protected]>
|
Calling glGenerateMipmap could overwrite vertex buffer state, leading
to incorrect rendering or crashes depending on the Gallium driver.
This was happening on WebGL Conformance test texture-size.
Before 784dd51198433e5c299da4a7742c68d21d68d1c1 this was covered up
by redundant vertex buffer validation.
Reviewed-by: Stéphane Marchesin <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
This reverts commit d1665388ce53d23ee7853e5083ce6f7192061109.
|
This is a big win for savage2, hon and yofrankie. 62 new programs for
savage2/hon get 16-wide mode, along with one for humus demos and two
for tropics. Even a few shaders from tropics see reductions of 15% or
more.
total instructions in shared programs: 216536 -> 207353 (-4.24%)
instructions in affected programs: 123941 -> 114758 (-7.41%)
In benchmarking Tropics, only a .040% +/- 034% performance improvement
was observed (n=90). Rather disappointing, but I was primarily
motivated to do this patch by a regression in the number of 16-wide
shaders compiled after a GRF texturing on IVB patch I'm working on.
Hopefully this helps avoid that regression.
Reviewed-by: Kenneth Graunke <[email protected]>
|
This shaves a few instructions off of a ton of programs. For 12
shaders from tropics and sanctuary, it's enough reduction in register
pressure to get 16-wide mode. 7 shaders from heroes of newerth and
savage2 are hurt by about 1.1%, where copy propagation of negates ends
up preventing coalescing, but we could regain that by doing dataflow
analysis in our copy propagation.
No significant performance difference in tropics (n=11)
Reviewed-by: Kenneth Graunke <[email protected]>
|