| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
| |
This factors out the validation that is common with clBuildProgram().
v2:
- Code cleanups.
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
| |
v2:
- Drop dependency on LLVM >= 3.5.1
- Rename si_create_shader() to si_shader_binary_read()
|
|
|
|
|
| |
v2:
- Drop dependency on LLVM >= 3.5.1
|
|
|
|
|
|
|
| |
This adds a query which allows drivers to access the config
information of a specific function within the LLVM generated ELF
binary. This makes it possible for the driver to handle ELF
binaries with multiple kernels / global functions.
|
|
|
|
|
|
| |
It's annoying with octave. Reported by Michael Burian.
Cc: 10.2 10.3 <[email protected]>
|
|
|
|
| |
Signed-off-by: Dieter Nützel <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We are about to change mesa to spawn threads for deferred glCompileShader and
glLinkProgram, and we need to make sure those threads can send compiler
warnings/errors to the debug output safely.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
glsl_type has several static hash tables and a static ralloc context. They
need to be protected by a mutex as they are not thread-safe.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=69200
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There may be two contexts compiling shaders at the same time, and we want the
anonymous struct id to be globally unique.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_strtod and _mesa_strtof may be called from multiple threads. They need
to be thread-safe.
v2: platform checks are now done in configure.ac
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
With the assumptions that xlocale.h implies newlocale and strtof_l. SCons is
updated to define HAVE_XLOCALE_H on linux and darwin.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Both core mesa and glsl have their own wrappers for strtof_l. Merge
and move them to util/. They are compiled with a C++ compiler so that
we can make them thread-safe in a following commit.
Signed-off-by: Chia-I Wu <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This removes the need for the gallium rasterizer state
to listen to viewport changes.
Thanks to Marek Olšák <[email protected]>.
Reviewed-by: Marek Olšák <[email protected]>
Signed-off-by: Mathias Froehlich <[email protected]>
|
|
|
|
|
|
|
|
| |
This reverts commit cabc93c5adc9ea62be901621eff5ce4cb9574791.
Mark thinks the failures on the SNB GT2 in the lab are actually because
of faulty hardware, not instruction compaction. The GT1 didn't see any
problems after changes to the compaction code.
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Multiplication is commutative.
instructions in affected programs: 48314 -> 47954 (-0.75%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Helps a small number of vertex shaders in the games Dungeon Defenders
and Shank, as well as an internal benchmark.
instructions in affected programs: 2801 -> 2719 (-2.93%)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Use the UST value provided in the PRESENT_COMPLETE_NOTIFY event
rather than gettimeofday(), which gives us the presentation time
instead of the time when SwapBuffers was called. Suggested by
Keith Packard. This relies on the fact that the X DRI3/Present
implementations use microseconds for UST.
v3: Properly ignore PresentCompleteKindMSCNotify; multiply in 64 bits
(caught by Keith Packard).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Keith Packard <[email protected]> [v3]
Reviewed-by: Marek Olšák <[email protected]> [v1]
|
|
|
|
|
|
|
|
|
|
|
| |
These source files support actual geometry shaders, so using "gs" for
the name makes a lot of sense. We're going to be adding SIMD8 geometry
shader support as well, at which point "vec4_gs" will be a misnomer.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The brw_gs.[ch] and brw_gs_emit.c source files contain code for
emulating fixed-function unit functionality (VF primitive decomposition
or SOL) using the GS unit. They do not contain code to support proper
geometry shaders.
We've taken to calling that code "ff_gs" (see brw_ff_gs_prog_key,
brw_ff_gs_prog_data, brw_context::ff_gs, brw_ff_gs_compile,
brw_ff_gs_prog). So it makes sense to make the filenames match.
Signed-off-by: Kenneth Graunke <[email protected]>
Acked-by: Matt Turner <[email protected]>
Acked-by: Jason Ekstrand <[email protected]>
Acked-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GL functions and driver hooks use corresponding names---for example,
glMapBufferRange and Driver.MapBufferRange. But our implementation was
called "intel_bufferobj_map_range," which has the words "map" and
"buffer" swapped, as well as randomly adding "obj."
FlushMappedBufferRange was even trickier: it ordered the words
3, "obj", 1, 2, 4: intel_bufferobj_flush_mapped_range.
Even though the old names were consistent, I always had trouble
rearranging the jumble of words when searching for a function,
and it took a few tries to eventually land there.
The new names match the word order of GL and the driver hooks;
FlushMappedBufferRange is simply brw_flush_mapped_buffer_range.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This prevents us from silently overflowing the stack arrays, and allows
arbitrary stack depths.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=85454
Cc: [email protected]
Reported-and-Tested-by: Nick Sarnie <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The OpenGL 4.0 core profile specification, section 2.17.3
Transform Feedback Draw Operations says:
"The error INVALID_VALUE is generated if <stream> is greater
than or equal to the value of MAX_VERTEX_STREAMS.
...
The error INVALID_OPERATION
is generated if EndTransformFeedback has never been called
while the object named by id was bound."
Fixes the piglit test:
ARB_transform_feedback3/arb_transform_feedback3-draw_using_invalid_stream_index
(with the test itself fixed to eliminate an unrelated failure)
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Fixes 14 ARB_vp tests (which had no lowering done), and should improve
performance of indirect uniform array access in GLSL.
|
| |
|
|
|
|
| |
This function is only called when it would return true.
|
|
|
|
|
| |
This signal doesn't terminate the program now, it terminates the program
soon. So you have to actually validate the code in the instruction.
|
| |
|
| |
|
|
|
|
|
|
|
| |
When we're checking if the framebuffer is sRGB capable, call
is_format_supported() with the PIPE_BIND_DISPLAY_TARGET flag.
Reviewed-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 20836c81851e0df29a8ee9c86e5e5388738c840b.
255 is a huge number. If you have a loop with 255 iterations, unrolling it
will exceed the SM3 instruction limit. Let's use the default again.
The comment about a SM3 limit doesn't make sense. For SM3, we generally
want 32 (default) or a lower number due to the SM3 instruction limit, which
is 512 instructions. For SM4, we can try higher numbers if needed, but
some shaders can end up being pretty huge and shader compilation can take
more time.
This fixes a shader compile failure on R500/SM3. Reported on IRC.
Cc: 10.2 10.3 <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Signed-off-by: David Heidelberger <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
I forgot that we cannot emit vertex shader state on a chip without VS.
In such a case, clip_halfz is handled by the Draw module.
|
|
|
|
|
| |
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
| |
Fixes piglit/polygon-mode-offset.
Cc: 10.2 10.3 [email protected]
Reviewed-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Caveat: Shaders using UBO/sampler indexing will
not be optimized by SB, due to SB not currently
supporting the necessary CF_INDEX_[01] index
registers.
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
| |
Requires evergreen/cayman
Signed-off-by: Glenn Kennard <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GL side of this extension just provides an accessor via glGetIntegerv for
the value of GL_CONTEXT_RELEASE_BEHAVIOR so it is trivial to implement. There
is a constant on the context for the value of the enum which is initialised to
GL_CONTEXT_RELEASE_BEHAVIOR_FLUSH. The extension is always enabled because it
doesn't need any driver interaction to retrieve the value.
If the value of the enum is anything but FLUSH then _mesa_make_current will
now refrain from calling _mesa_flush. This should only affect drivers that
explicitly change the enum to a non-default value.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Different platforms require the offset to be in different units. However,
the generator fixes all of this up for us and only requires an offset in
bytes. Previously, we were getting this wrong all over the place. Some
computed/used it correctly as bytes while others treated the offset as
whole registers or computed it as bytes or bytes*2 in SIMD16 mode. This
commit cleans all this up and makes us properly treat it as bytes
everywhere.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
| |
I thought this would be a clever way to make spilling less expensive.
However, it appears that the oword read/write messages we are using for
spilling ignore the execution size and assume SIMD16 whenever working with
more than one register.
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
| |
This enables ARB_conditional_render_inverted.
Signed-off-by: Tobias Klausmann <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fanin (merge) nodes require it's srcs to be "adjacent" in consecutive
scalar registers. Keep track of instruction neighbors in copy-
propagation step and avoid eliminating mov's which would cause an
instruction to need multiple distinct left and/or right neighbors.
This lets us not fall on our face when we encounter things like:
1: MOV TEMP[2], IN[0].xyzw
2: TEX OUT[0].xy, TEMP[2], SAMP[0], SHADOW2D
3: MOV TEMP[2].xy, IN[0].yxzz
4: TEX OUT[0].zw, TEMP[2], SAMP[0], SHADOW2D
5: END
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Always insert extra mov's for the tex coord into the fanin. This
simplifies things a bit, and avoids a scenario where multiple sam
instructions can have mutually exclusive input's to it's fanin, for
example:
1: TEX OUT[0].xy, IN[0].xyxx, SAMP[0], 2D
2: TEX OUT[0].zw, IN[0].yxxx, SAMP[0], 2D
The CP pass can always remove the mov's that are not actually needed,
so better to start out with too many mov's in the front end, than not
enough.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
dscis -> noscis
dbypass -> nobypass
a bit more consistant w/ nobin, etc. And IMO a bit more sensible names.
Signed-off-by: Rob Clark <[email protected]>
|