| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
All the udev code is in the loader, so there is no
reason for us to include this header.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
Signed-off-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In the specification text of NV_vertex_program1_1, the upper
limit of the RCC instruction is written as 1.884467e+19 in
scientific notation, but as 0x5F800000 in binary. But the binary
version translates to 1.84467e+19 rather than 1.884467e+19 in
scientific notation.
Since the lower-limit equals 2^-64 and the binary version equals
2^+64, let's assume the value in scientific notation is a typo
and implement this using the value from the binary version
instead.
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Erik Faye-Lund <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|
|
|
|
| |
Signed-off-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
ureg_program is allocated on the heap so we can just bump the
number of immediates that it can handle. It's needed for d3d10.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to handle a lot more immediates and in order to do that
we also switch from allocating this structure on the stack to
allocating it on the heap.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We only supported up to 256 immediates, which isn't enough. We had
code which was allocating immediates as an allocated array, but it
was always used along a statically backed array for performance
reasons. This commit adds code to skip that performance optimization
and always use just the dynamically allocated immediates if the
number of them is too great.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The number of allowed temporaries increases almost with every
iteration of an api. We used to support 128, then we started
increasing and the newer api's support 4096+. So if we notice
that the number of temporaries is larger than our statically
allocated storage would allow we just treat them as indexable
temporaries and allocate them as an array from the start.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we were really doing F2I. And also move it to generic section.
(Note that for llvmpipe the code generated is definitely bad, due to lack
of unsigned conversions with sse. I think though what llvm does (using scalar
conversions to 64bit signed either with x87 fpu (32bit) or sse (64bit)
including lots of domain changes is quite suboptimal, could do something like
is_large = arg >= 2^31
half_arg = 0.5 * arg
small_c = fptoint(arg)
large_c = fptoint(half_arg) << 1
res = select(is_large, large_c, small_c)
which should be much less instructions but that's something llvm should do
itself.)
This fixes piglit fs/vs-float-uint-conversion.shader_test (maybe more, needs
GL 3.0 version override to run.)
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Zack Rusin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
When we clipped a line weren't copying the provoking vertex
color to the second vertex. We also weren't checking for
first vs. last provoking vertex.
Fixes failures found with the new piglit line-flat-clip-color test.
Cc: "10.0, 10.1" <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
To match the CALLOC_STRUCT() call.
Cc: "10.0, 10.1" <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gallivm soa code supported only a single level of nesting for
control flow opcodes (if, switch, loops...) but the d3d10 spec
clearly states that those are nested within functions. To support
nesting of conditionals inside functions we need to store the
nesting data inside function contexts and keep a stack of those.
Furthermore we make sure that if nesting for subroutines is deeper
than 32 then we simply ignore all subsequent 'call' invocations.
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have code generation paths that carry out swizzles of AoS vectors via
bitwise shifts, as these tend to generate more efficient code than
straightforward byte shuffles. But when the input is a constant the
additional bitwise arithmetic operations somehow don't really get
constant propagated properly, evenutally causing assertion failure in
InstCombine pass.
Therefore avoid the bug by using the trivial shuffles for constant
inputs.
Although the sample LLVM IR can cause a crash with any LLVM version,
this was only seen in practice with LLVM 3.2.
Reviewed-by: Matthew McClure <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
When the min_index is very large (or very negative), the multipliation
can overflow 32 bits and result in an incorrect map pointer
modification.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This was discovered as a result of the draw-elements-base-vertex-neg
piglit test, which passes very negative offsets in, followed up by large
indices. The nouveau code correctly adjusts the pointer, but the
translate code needs to do the proper inverse correction. Similarly fix
up the SSE code to do a 64-bit multiply to compute the proper offset.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For a variety of reasons mmap (selinux and pax to name
a few) and can fail and with current code. This will
result in a crash in the driver, if not worse.
This has been the case since the inception of the
gallium copy of rtasm.
Cc: 9.1 9.2 10.0 <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=73473
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
| |
Otherwise they will be NULL when stage destroy is invoked prematurely,
(i.e, on out of memory).
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
| |
Whitelist platforms instead of blacklisting, as several pthread
implementations are missing pthread_barrier_t, in particular MacOSX.
|
|
|
|
|
|
|
| |
Note that PIPE_ROUTINE now returns an int.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
| |
Never used.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a serious regression introduced
in 4e549ddb500cf677b6fa16d9ebdfa67cc23da097.
Cc: 9.2 10.0 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
It's unused and shouldn't be used at all in my opinion.
If some driver doesn't support the unsynchronized flag, u_upload_mgr should
avoid the synchronization by other means, e.g. by using the DONTBLOCK flag.
|
| |
|
|
|
|
|
|
|
| |
This is the recommended way for streaming vertices. Always use this if you
need to upload vertices every frame.
Reviewed-by: Christian König <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Even with depth clipping disabled, vertices which have negative w coords
must be discarded. And since we don't have a proper guardband implementation
yet (relying on driver to handle all values except infs/nans in rasterization
for such points) we need to kill them off manually (as they can end up with
coordinates inside viewport otherwise).
v2: use 0.0f instead of 0 (spotted by Brian).
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Trivial.
|
|
|
|
| |
Trivial.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We were calling draw_total_vs_outputs() too early. The call to
draw_pt_emit_prepare() could result in the vertex size changing.
So call draw_total_vs_outputs() after draw_pt_emit_prepare().
This fix would seem to be needed for the non-LLVM code as well,
but it's not obvious. Instead, I added an assertion there to
try to catch this problem if it were to occur there.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=72926
Cc: 10.0 <[email protected]>
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Instead of skipping x/y clipping completely if there's point_tri_clip points
use guard band clipping. This should be easier (previously we could not disable
generating the x/y bits in the clip mask for llvm path, hence requiring custom
clip path), and it also allows us to enable this for tris-as-points more easily
too (this would require custom tri clip filtering too otherwise). Moreover,
some unexpected things could have happen if there's a NaN or just a huge number
in some tri-turned-point, as the driver's rasterizer would need to deal with it
and that might well lead to undefined behavior in typical rasterizers (which
need to convert these numbers to fixed point). Using a guardband should hence
be more robust, while "usually" guaranteeing the same results. (Only "usually"
because unlike hw guardbands draw guardband is always just twice the vp size,
hence small vp but large points could still lead to different results.)
Unfortunately because the clipmask generated is completely unaffected by guard
band clipping, we still need a custom clip stage for points (but not for tris,
as the actual clipping there takes guard band into account).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
pipe_loader_drm.c: In function 'pipe_loader_drm_probe_fd':
pipe_loader_drm.c:120:4: error: implicit declaration of function 'loader_get_pci_id_for_fd' [-Werror=implicit-function-declaration]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Culled out of the "loader: refactor duplicated code into loader util lib"
patch by Rob Clark.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
All the various window system integration layers duplicate roughly the
same code for figuring out device and driver name, pci-id's, etc. Which
is sad. So extract it out into a loader util lib.
v2 (Emil)
* Separate the introduction of libloader from the code de-duplication.
* Strip out non-pci devices support.
* Add scons + Android build system support.
* Add VISIBILITY_CFLAGS to avoid exporting the loader funcs.
v3 (Emil)
* PIPE_OS_ANDROID is undefined at this scope, use ANDROID
* Make sure we define _EGL_NO_DRM when building only swrast
Signed-off-by: Rob Clark <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.
This was the sed script I used:
$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#
# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g
# Rename emails
s/[email protected]/[email protected]/
s/[email protected]/[email protected]/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\[email protected]/[email protected]/g
s/keithw\[email protected]/[email protected]/g
s/[email protected]/[email protected]/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/[email protected]/[email protected]/
# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
OpenGL does whole-point clipping, that is a large point is either fully
clipped or fully unclipped (the latter means it may extend beyond the
viewport as long as the center is inside the viewport). d3d9 (d3d10 has
no large points) however requires points to be clipped after they are
expanded to a rectangle. (Note some IHVs are known to ignore GL rules at
least with some hw/drivers.)
Hence add a rasterizer bit indicating which way points should be clipped
(some drivers probably will always ignore this), and add the draw interaction
this requires. Drivers wanting to support this and using draw must support
large points on their own as draw doesn't implement vp clipping on the
expanded points (it potentially could but the complexity doesn't seem
warranted), and the driver needs to do viewport scissoring on such points.
Conflicts:
src/gallium/drivers/llvmpipe/lp_context.c
src/gallium/drivers/llvmpipe/lp_state_derived.c
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It's possible to bind a smaller buffer as a constant buffer, than
what the shader actually uses/requires. This could cause nasty
crashes. This patch adds the architecture to pass the maximum
allowable constant buffer index to the jit to let it make
sure that the constant buffer indices are always within bounds.
The behavior follows the d3d10 spec, which says the overflow
should always return all zeros, and overflow is only defined
as access beyond the size of the currently bound buffer. Accesses
beyond the declared shader constant register size are not
considered an overflow and expected to return garbage but consistent
garbage (we follow the behavior which some wlk tests expect which
is to return the actual values from the bound buffer).
Signed-off-by: Zack Rusin <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
An example why it is required:
Let's say there's a fragment shader writing to gl_FragData[0..1].
The user calls: glDrawBuffers(2, {GL_NONE, GL_COLOR_ATTACHMENT0});
That means gl_FragData[0] is unused and gl_FragData[1] is written
to GL_COLOR_ATTACHMENT0.
st/mesa was skipping the GL_NONE draw buffer, therefore gl_FragData[0]
was written to GL_COLOR_ATTACHMENT0, which was wrong.
This commit fixes it, but drivers must also be fixed not to crash when
binding NULL colorbuffers. There is also a new set of piglit tests for this.
The MSAA state also had to be fixed not to crash when reading fb->cbufs[0].
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
The initial value of cso_context::sample_mask_saved is irrelevant as it
will be overwritten with cso_context::sample_mask in
cso_save_sample_mask. Therefore it is cso_context::sample_mask that
needs to be properly initialized.
This fixes regressions in blits and mipmap generation after adding
support for sample_mask to llvmpipe.
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
| |
code cleanup.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
| |
Prevents a memory leak.
Reviewed-by: Tom Stellard <[email protected]>
CC: "10.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
For scaled resolve. The filter is only good for magnification.
If somebody has an idea how to implement a good filter for minification,
I'm all ears. I'd have to use derivatives probably.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We need this for integer formats and upside-down blits, which Radeons don't
support for MSAA resolving.
It can be used by calling util_blitter_blit.
Reviewed-by: Brian Paul <[email protected]>
|