| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
helper class.
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
|
|
| |
Identifiers with double underscores are reserved, and using them has
undefined behavior according to the C++ spec. It's unlikely to make
any difference, but...
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
| |
Tested-by: Tom Stellard <[email protected]>
|
|
|
|
|
|
| |
This prevents a build failure on some systems.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For seamless cube filtering it is necessary to determine new faces and new
coords per sample. The logic for this is _seriously_ complex (what needs
to happen is very "asymmetric" wrt face, x/y under/overflow), further
complicated by the fact that if the 4 samples are in a corner (meaning we
only have actually 3 samples, and all 3 are on different faces) then
falling off the edge is happening _both_ on x and y axis simultaneously.
There was a noticeable performance hit in mesa's cubemap demo when seamless
filtering was forced on (just below 10 percent or so in a debug build, when
disabling all filtering hacks, otherwise it would probably be a bit more) and
when always doing the logic, hence use a branch which it only does it if any
of the pixels in a quad (or in two quads) actually hit this. With that there
was no measurable performance hit in the cubemap demo (neither in a debug nor
release buidl), but this will vary (cubemap demo very rarely hits edges).
Might also be different on other cpus, as this forces SoA sampling path which
potentially can be quite a bit slower.
Note that as for corners, this code gets all the 3 samples which actually
exist right, and the 4th texel will simply be the same as one of the others,
meaning that filter weights will be a bit wrong. This however should be
enough for full OpenGL (but not d3d10) compliance.
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Using atomic function for ncs is superfluous since it is
protected by a mutex anyway. Also lock the mutex only once
while retrieving the next CS for submission.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Rico Schüller <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Rico Schüller <[email protected]>
|
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Rico Schüller <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: fix commit message
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Rico Schüller <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Rico Schüller <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
rc_find_free_temporary_list() returns signed integer
(in case of lack of free temporary registers returns -1),
so new_index in radeon_rename_regs() should be signed.
https://bugs.freedesktop.org/show_bug.cgi?id=54867
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
Fixes "Uninitialized scalar field" defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Vadim Girlin <[email protected]>
|
|
|
|
|
|
|
| |
Fixes a regression since 02b632d8e8f2b14c155740d28c276b5869305c60.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Fixes "Uninitialized pointer field" defect reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Everything necessary for these appears to be implemented. We'll want to
add more tests to guard against bugs, but it should be functionally
complete.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
translate_sse.c contains code for msabi on x86_64, but it appears to be
untested.
Currently arguments 1 and 2 passed to the generated code are moved as 32-bit
quantities into the registers used by sysvabi, irrespective of the architecture.
Since these may be pointers, they must be moved as 64-bit quantities to avoid
truncation.
Commit f4dd0991719ef3e2606920c5100b372181c60899 disabled tranlate_sse.c on MinGW
x86_64, I don't know if was due to this issue, or a different one...
Signed-off-by: Jon TURNEY <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Cygwin also uses the msabi calling convention on x86_64, not the sysvabi calling
convention
Signed-off-by: Jon TURNEY <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
ignored, and an empty message aborts the commit.
|
|
|
|
|
|
|
|
|
|
|
| |
implementation which uses mmap()
The heap is NX on 64-bit Cygwin, so use the rtasm_exec_malloc() implementation
which uses mmap() to allocate an anonymous page with execute permission, rather
than the one which just uses malloc().
Signed-off-by: Jon TURNEY <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
MSVC doesn't have a strcasecmp() function; it uses _stricmp() instead.
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
With most of the virtual functions gone, brwInitVtbl() is now tiny.
Merging it into the caller allows us to delete the entire file.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that i915 and i965 have been split, the separation between
intelDestroyContext and brw_destroy_context is kind of arbitrary.
This patch replaces the only brw->vtbl.destroy() call with the body
of brw_destroy_context (the only implementation of that virtual
function).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
dri_bo_release is a helper function that calls drm_intel_bo_unreference
but then also sets the pointer to NULL. This is unnecessary, since
brw_destroy_context is called from intelDestroyContext, which also frees
brw completely.
If you're still trying to access them, you've got bigger problems.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Having almost the entire body of the function indented one level for a
check that should never happen seems silly. Just early return.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Since the i915/i965 split, there's only one implementation of this
virtual function. We may as well just call it directly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
Since the i915/i965 split, there's only one implementation of this
virtual function. We may as well just call it directly.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit f878d20 (glsl: Update ir_variable::max_ifc_array_access
properly), I accidentally used the wrong kind of check to determine
whether the variable being accessed was an interface instance (I used
var->get_interface_type() != NULL when I should have used
var->is_interface_instance()). As a result, if an unnamed interface
block contained a struct which contained an array,
update_max_array_access() would mistakenly interpret the struct as a
named interface block and try to dereference a null
var->max_ifc_array_access.
This patch corrects the check, fixing the null dereference.
Fixes piglit test interface-block-struct-nesting.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=70368
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In desktop GLSL, location qualifiers are case-insensitive. In GLSL
ES, they are case-sensitive. This patch handles the difference by
using a new function to match layout qualifiers,
match_layout_qualifier(), which calls either strcmp() or strcasecmp()
as appropriate.
Fixes piglit tests:
- layout-not-case-sensitive-in.geom
- layout-not-case-sensitive-max-vert.geom
- layout-not-case-sensitive-out.geom
- layout-not-case-sensitive.frag
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
|
|
|
|
|
|
|
| |
This fixes a number of piglit crashes when running on a hacked up llvmpipe.
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
This just adds the missing bits so the ubo tests don't crash.
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
From the ARB_geometry_shader4 spec (section Geometry Shader outputs):
"The built-in special variable gl_Position is intended to hold the
homogeneous vertex position. Writing gl_Position is optional."
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
v2 (Bryan Cain <[email protected]>): fix 2D array indexing order.
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2 (Paul Berry <[email protected]>: Split out to separate patch
(previously this was part of "glsl: add builtins for geometry
shaders.")
Signed-off-by: Dave Airlie <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Exposing 10-bit color configs confuses too many applications that try to
use the chooser to pick an 8 bit config. The chooser consider an fbconfig
with more bits a better match and will thus give a 10 bit config when an
application asks for a config with GLX_RED_SIZE 1 or 8.
One key example is glxinfo, which does this, and then doesn't specify that
it needs a config where GLX_DRAWABLE_TYPE has the GLX_WINDOW_BIT set.
This way it ends up with a 10 bit config that it can't use to create a
GLX window and fails to log extensions.
This reverts commit f354bcc1770e9df88db51eba5a4543a09ca6d128.
https://bugs.freedesktop.org/show_bug.cgi?id=70557
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We can't perform DCE using the liveness pass between GVN and GCM
because it relies on the correct schedule, but GVN doesn't care about
preserving correctness - it's rescheduled later by GCM.
This patch makes dce_cleanup pass perform simple DCE
between GVN and GCM instead of relying on liveness pass.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=70088
Signed-off-by: Vadim Girlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Andreas Boll <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Two extra instructions in some heroesofnewerth shaders, but a win for
everything else.
total instructions in shared programs: 1531352 -> 1530815 (-0.04%)
instructions in affected programs: 121898 -> 121361 (-0.44%)
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Scheduling debugging now prints:
Instructions before scheduling (reg_alloc 1)
0: linterp vgrf20, hw_reg2, hw_reg3, hw_reg4,
1: linterp vgrf21, hw_reg2, hw_reg3, hw_reg4+16,
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Useful for tracking down problems in dependency calculations.
Scheduling debugging now prints:
clock 2, scheduled: linterp vgrf5, hw_reg2, hw_reg3, hw_reg0,
child 0, 53 parents: fb_write (null), (null), (null), (null),
child 1, 2 parents: tex vgrf4, vgrf5, (null), (null),
child 2, 52 parents: placeholder_halt (null), (null), (null), (null),
clock 4, scheduled: linterp vgrf5+1, hw_reg2, hw_reg3, hw_reg0+16,
child 0, 52 parents: fb_write (null), (null), (null), (null),
child 1, 1 parents: tex vgrf4, vgrf5, (null), (null),
now available
child 2, 51 parents: placeholder_halt (null), (null), (null), (null),
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 94d05bf87a21bd364e84f699a0064e5fba58a6f9 as it has a
few problems:
- it breaks windows builds becuase env[LLVM_CXXFLAGS] is never set there
- it is merging not only rtti, but the whole cxxflags (defines etc)
which has proven to be a source of troubles (breaks debugging etc.)
|
|
|
|
|
|
|
|
| |
LLVM 3.3 does not know about CIK processors, and the codes paths for SI
and CIK are the same.
Reviewed-by: Marek Olšák <[email protected]>
Cc: "9.2" <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is required in order for clang to correctly handle the OpenCL C
barrier() builtin which has the following restrictions acording to
the OpenCL 1.1 Specification:
If barrier is inside a conditional statement, then all work-items must
enter the conditional if any work-item enters the conditional statement
and executes the barrier.
If barrier is inside a loop, all work-items must execute the barrier for
each iteration of the loop before any are allowed to continue execution
beyond the barrier.
By linking before otimizations, we can replace calls to barrier() with
calls to a target specific intrinsic which has the noduplicate attribute
This attribute prevents clang from performing optimizations which could
violate the above rules.
This attribute must be applied to the call instruction that invokes
the function, so it is not enough to add this attribute the barrier()
declaration.
As a bonus this will probably speed up compile times since we will no
longer need to run link-time optimizations.
|