| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
| |
This adds a tgsi utility tgsi_add_point_sprite to transform a geometry
shader to emulate wide points by drawing quads. This utility adds an
extra output for the original point position if the point position is
to be written to a stream output buffer. It also assumes the driver will
add a constant for inverse viewport scale after the user defined constants.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This could be used by any driver where the device doesn't directly
support two-sided lighting. This code modifies a fragment shader
to accecpt back-face colors and choose between the front/back colors
depending on the triangle's front-face sign.
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These functions deal with inclusive coordinates, hence a 0/0/0/0 rect
returned when there's no intersection doesn't actually represent an empty
rectangle. Hence return 0/-1/0/-1 instead.
This fixes some problems in llvmpipe with empty scissor rects (which up
to now didn't really matter because while the intersect test returned the
wrong result all pixels were scissored away later anyway).
|
|
|
|
|
|
|
|
|
|
|
|
| |
It isn't really obvious if intersection test should take into account empty
rectangles or if the caller should do it. But it looks like most callers
actually verified one of the rects but not the other, but since correctly
returning an empty rect that other rect could actually be empty leading to
more bugs. Hence just verify both rects for emptyness in the intersection
test itself which makes the code easier in the caller (though it will be
slower if the caller knows the rectangles are non-empty).
Reviewed-by: Zack Rusin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This patch adds some more helper functions such as
. tgsi_transform_temps_decl
. tgsi_transform_output_decl
. tgsi_transform_dst_reg
. tgsi_transform_src_reg
Reviewed-by: Brian Paul <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
v2: fix errant _GNU_SOURCE test, per Matt Turner.
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Like util_set_vertex_buffers_count(), this basically just copies a
pipe_index_buffer object, taking care of refcounting.
|
|
|
|
|
| |
Reviewed-by: Alex Deucher <[email protected]>
Acked-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
| |
On 32-bit we need to use PRIu64 flags for printfs,
otherwise this segfaults in R600_DEBUG=help otherwise.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "11.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The NIR cursor API is exactly what we want for the builder's insertion
point. This simplifies the API, the implementation, and is actually
more flexible as well.
This required a bit of reworking of TGSI->NIR's if/loop stack handling;
we now store cursors instead of cf_node_lists, for better or worse.
v2: Actually move the cursor in the after_instr case.
v3: Take advantage of nir_instr_insert (suggested by Connor).
v4: vc4 build fixes (thanks to Eric).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> [v1]
Reviewed-by: Jason Ekstrand <[email protected]> [v4]
Acked-by: Connor Abbott <[email protected]> [v4]
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: lots of improvements
This is like identity or trace, but simpler. It doesn't wrap most states.
Run with:
GALLIUM_DDEBUG=1000 [executable]
where "executable" is the app and "1000" is in miliseconds, meaning that
the context will be considered hung if a fence fails to signal in 1000 ms.
If that happens, all shaders, context states, bound resources, draw
parameters, and driver debug information (if any) will be dumped into:
/home/$username/dd_dumps/$processname_$pid_$index.
Note that the context is flushed after every draw/clear/copy/blit operation
and then waited for to find the exact call that hangs.
You can also do:
GALLIUM_DDEBUG=always
to do the dumping after every draw/clear/copy/blit operation without
flushing and waiting.
Examples of driver states that can be dumped are:
- Hardware status registers saying which hw block is busy (hung).
- Disassembled shaders in a human-readable form.
- The last submitted command buffer in a human-readable form.
v2: drop pipe-loader changes, drop SConscript
rename dd.h -> dd_pipe.h
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
| |
This allows creating compute-only and debug contexts.
Reviewed-by: Brian Paul <[email protected]>
Acked-by: Christian König <[email protected]>
Acked-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I used this as some testing ground for investigating some compiler
bits initially (e.g. lrint calls etc.), figured I could do much better
in the end just for fun...
This is mathematically equivalent, but uses some tricks to avoid
doubles and also replaces some float math with ints. Good for another
performance doubling or so. As a side note, some quick tests show that
llvm's loop vectorizer would be able to properly vectorize this version
(which it failed to do earlier due to doubles, producing a mess), giving
another 3 times performance increase with sse2 (more with sse4.1), but this
may not apply to mesa.
No piglit change.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This code (lifted straight from the extension) was doing things the most
inefficient way you could think of.
This drops some of the more expensive float operations, in particular
- int-cast floors (pointless, values always positive)
- 2 raised to (signed) integers (replace with simple exponent manipulation),
getting rid of a misguided comment in the process (implement with table...)
- float division (replace with mul of reverse of those exponents)
This is like 3 times faster (measured for float3_to_rgb9e5), though it depends
(e.g. llvm is clever enough to replace exp2 with ldexp whereas gcc is not,
division is not too bad on cpus with early-exit divs).
Note that keeping the double math for now (float x + 0.5), as the results may
otherwise differ.
Acked-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I intend to remove nir_builder::cf_node_list, so I can't have this code
poking at it directly. The proper way is to set the insertion point and
then simply insert things there.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This makes it easy for NIR passes to inspect what kind of shader they're
operating on.
Thanks to Michel Dänzer for helping me figure out where TGSI stores the
shader stage information.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We want to start reworking and expanding this code, but it'll be a lot
easier to do once we disentangle it from the rest of the stuff in nir.c.
Unfortunately, there are a few unavoidable dependencies in nir.c on
methods we'd rather not expose publicly, since if not used in very
specific situations they can cause Bad Things (tm) to happen. Namely, we
need to do some magical control flow munging when adding/removing jumps.
In the future, we may disallow adding/removing jumps in
nir_instr_insert_*() and nir_instr_remove(), and use separate functions
that are part of the control flow modification code, but for now we
expose them and put them in a separate, private header.
Signed-off-by: Connor Abbott <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Tessellation control shaders write to outputs as OUT[ADDR[0].x][0], make
sure to parse the indirect dimension on outputs.
Also tess control inputs/outputs and tess eval input declarations need
to receive the same treatment as geometry shader inputs.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: - lots of changes according to Emil Velikov's comments
- implemented radeon_winsys::read_registers
v3: - a lot of new work, many of them adapt to libdrm interface changes
Squashed patches:
winsys/amdgpu: implement radeon_winsys context support
winsys/amdgpu: add reference counting for contexts
winsys/amdgpu: add userptr support
winsys/amdgpu: allocate IBs like normal buffers
winsys/amdgpu: add IBs to the buffer list, adapt to interface changes
winsys/amdgpu: don't use KMS handles as reloc hash keys
winsys/amdgpu: sync buffer accesses to different rings
winsys/amdgpu: use dependencies instead of waiting for last fence v2
gallium/radeon: unify buffer_wait and buffer_is_busy in the winsys interface (amdgpu part)
winsys/amdgpu: track fences per ring and be thread-safe
winsys/amdgpu: simplify waiting on a variable in amdgpu_fence_wait
gallium/radeon: allow the winsys to choose the IB size (amdgpu part)
winsys/amdgpu: switch to new amdgpu_cs_query_fence_status interface
winsys/amdgpu: handle fence and dependencies merge
winsys/amdgpu follow libdrm change to move user fence into UMD
winsys/amdgpu: use amdgpu_bo_va_op for va map/unmap v2
winsys/amdgpu: use the new tiling flags
winsys/amdgpu: switch to new GTT_USWC definition
winsys/amdgpu: expose amdgpu_cs_query_reset_state to drivers
winsys/amdgpu: fix valgrind warnings
winsys/amdgpu: don't use VRAM with APUs that don't have much of it
winsys/amdgpu: require LLVM 3.6.1 for VI because of bug fixes there
winsys/amdgpu: remove amdgpu_winsys::num_cpus
winsys/amdgpu: align BO size to page size
winsys/amdgpu: reduce BO cache timeout
winsys/amdgpu: remove useless flushing and waiting in amdgpu_bo_set_tiling
winsys/amdgpu: use amdgpu_device_handle as a unique device ID instead of fd
winsys/amdgpu: use safer access to amdgpu_fence_wait::signalled
winsys/amdgpu: allow maximum IB size of 4 MB
winsys/amdgpu: add ip_instance into amdgpu_fence
gallium/radeon: add RING_COMPUTE instead of RADEON_FLUSH_COMPUTE
winsys/amdgpu: set the ring type at CS initilization
winsys/amdgpu: query the GART page size from the kernel
winsys/amdgpu: correctly wait for shared buffers to become idle
winsys/amdgpu: set the amdgpu_cs_fence structure only once at fence creation
winsys/amdgpu: add a specific error message for cs_submit -> -ENOMEM
winsys/amdgpu: check num_active_ioctls before calling amdgpu_bo_wait_for_idle
winsys/amdgpu: clear user fence BO after allocating it
winsys/amdgpu: fix user fences
winsys/amdgpu: make amdgpu_winsys_create public
winsys/amdgpu: remove thread offloading
winsys/amdgpu: flatten the amdgpu_cs_context structure and simplify more
v4: require libdrm 2.4.63
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Leo Liu <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Ilia Mirkin <[email protected]>
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The cumulative value is useful for queries like the number of shader
compilations.
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
vl/vl_mpeg12_bitstream.c: In function 'decode_slice':
vl/vl_mpeg12_bitstream.c:928:19: warning: unused variable 'extra' [-Wunused-variable]
unsigned extra = vl_vlc_get_uimsbf(&bs->vlc, 1);
^
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
lp_bld_tgsi_soa.c: In function 'lp_emit_immediate_soa':
lp_bld_tgsi_soa.c:3065:18: warning: unused variable 'size' [-Wunused-variable]
const uint size = imm->Immediate.NrTokens - 1;
^
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
This will help remove some duplicated code from radeon.
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
| |
Reviewed-by: Kai Wasserbäch <[email protected]>
|
|
|
|
|
|
| |
As it is needed for exp2.
Trivial.
|
|
|
|
|
| |
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
| |
This reverts commit a27ec5dc460b91dc44675f48cddbbb2631ee824f. It
breaks the intended behaviour of pipe_loader_probe() with ndev==0 as
relied upon by clover to query the number of devices available to the
pipe loader in the system.
Acked-by: Emil Velikov <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
| |
Use MSVCRT functions instead. Their semantics are slightly
different but they can be made to work as expected.
Also, use the same code paths for both MSVCRT and MinGW.
https://bugs.freedesktop.org/show_bug.cgi?id=91418
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Its only use is to implement a custom version of LLVMDumpValue
on some Windows and embedded platforms.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
All LLVM API calls that require an ostream object have been removed from
the disassemble() function, so we don't need to use this class to wrap
_debug_printf() we can just call this function directly.
Reviewed-by: Jose Fonseca <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
| |
There is no need for this.
v2: handle redundant clip state changes in st/mesa
|
| |
|