| Commit message (Collapse) | Author | Age | Files | Lines |
| |
|
| |
|
|
|
|
| |
between begin_query and end_query
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
v2: lots of cosmetic changes
|
|
|
|
|
|
|
| |
The pipe query interface is reused. The list of available queries can be
obtained using pipe_screen::get_driver_query_info.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
"Conditional jump or move depends on uninitialised value(s)"
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
https://bugs.freedesktop.org/show_bug.cgi?id=62573
Tested-by: Andreas Boll <[email protected]>
|
|
|
|
|
| |
Postprocessing is an internal meta op and should restore the states
it changes.
|
|
|
|
| |
Reviewed-by: Michel Dänzer <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
| |
Virtual address is used for PIPE_QUERY_SO* queries in
r600_emit_query_begin, but not in r600_emit_query_end.
This will trigger a GPU fault when one of those queries is
made and virtual address is enabled.
Note: this is a candidate for the 9.1 branch
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
| |
Since we are UMA, in most cases the GPU blit doesn't make much sense for
texture upload.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Optimize out parts of the render target that are scissored out by taking
into account maximal scissor bounds in fd_gmem_render_tiles().
This is a big win on things like gnome-shell which frequently do partial
screen updates.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
target-specific variables are undefined when used as pre-requisites.
instead, use secondary-expansion.
I noticed this when building the patch:
i965: Add a driconf option to disable flush throttling
Signed-off-by: Adrian Marius Negreanu <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since half of ir_validate uses asserts() (the other using printf() then
abort()), there's not much use to calling it in a release build. Cuts
6.3% of the startup time of TF2.
NOTE: This is a candidate for the stable branches.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
This is really not generic conversion stuff and the code very particular to
these formats.
|
|
|
|
|
|
|
| |
Fixes assign instead of compare defects reported by Coverity.
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch changes the arrays in brw_vue_map (which only ever contain
values from -1 to 58) from ints to signed chars. This reduces the
size of the struct from 488 bytes to 136 bytes.
Reviewed-by: Kenneth Graunke <[email protected]>
v2: fix STATIC_ASSERT to use 127 instead of 128.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
With the introduction of geometry shaders, fragment inputs will no
longer come exclusively from the vertex shader; sometimes they come
from the geometry shader. So the name "vp_outputs_written" will
become a misnomer. This patch renames vp_outputs_written to
input_slots_valid, to reflect the true meaning of the bitfield from
the fragment shader's point of view: it indicates which of the
possible input slots contain valid data that was written by the
previous shader stage.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch modifies post-GS pipeline stages (transform feedback, clip,
sf, fs) to refer to the VUE map through brw->vue_map_geom_out rather
than brw->vs.prog_data->vue_map. This ensures that when geometry
shader support is added, these pipeline stages will consult the
geometry shader output VUE map when appropriate, rather than the
vertex shader output VUE map.
v2: Fixed some stale "CACHE_NEW_VS_PROG" comments.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Currently, the GPU pipeline has one active VUE map in effect at any
given time--the one representing the layout of vertex data coming from
the vertex shader. However, when geometry shaders are added, they
will have their own independent VUE map. Later pipeline stages (clip,
sf, fs) will need to consult the geometry shader VUE map if a geometry
shader is in use, and the vertex shader VUE map otherwise.
This patch adds a new field to brw_context, vue_map_geom_out, which
contains the VUE map that should be used by later pipeline stages. It
also adds a new state flag, BRW_NEW_VUE_MAP_GEOM_OUT, which is
signalled whenever the contents of the VUE map changes.
Since we don't support geometry shaders yet, vue_map_geom_out is
currently set only by the brw_vs_prog state atom.
v2: Don't set vue_map_geom_out in do_vs_prog--that's redundant and
possibly problematic for precompiles. Only set it in
brw_upload_vs_prog. Also, make a copy instead of using a
pointer--this makes it possible to detect when the VUE map hasn't
changed, so we can avoid redundant state uploads.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Future patches will allow for there to be separate VUE maps when both
a geometry shader and a vertex shader are in use. When this happens,
we will want to have correspondingly separate outputs_written
bitfields. Moving outputs_written into the VUE map will make this
easy.
For consistency with the terminology used in the VUE map, the bitfield
is renamed to "slots_valid" in the process.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen7 adds mask bits to the message header for a URB write which allow
the write to apply only to certain channels. We don't use this
functionality, so to ensure that the entire write always occurs, we
emit an OR instruction to set the mask bits.
With the advent of geometry shaders, URB writes won't just happen at
the end of a thread; they will happen in mid-thread too. Thus, we can
no longer rely on channel 0 being enabled, so we need to emit the OR
instruction in WE_all mode to ensure that it is executed.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
The new name clarifies that it represents *one more* than the maximum
possible brw_varying_slot value.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch removes the terminology "vert_result" from the i965 driver,
replacing it with "varying". The old terminology, "vert_result", was
confusing because (a) it referred to the enum gl_vert_result, which no
longer exists (it was replaced with gl_varying_slot), and (b) it
implied a vertex output, but with the advent of geometry shaders, it
could be either a vertex or a geometry output, depending what shaders
are in use. The generic term "varying" is less confusing.
No functional change.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
v2: Whitespace fixes.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bump MAX_DEPTH_TEXTURE_SAMPLES to match what GetInternalformativ is
claiming. Since that limit is what is actually enforced now, this
doesn't actually change anything except the queried value.
There's still no piglits verifying that multisample depth textures work,
but this works in the Unigine demos.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Extends _mesa_check_sample_count() to properly support the
TEXTURE_2D_MULTISAMPLE and TEXTURE_2D_MULTISAMPLE_ARRAY targets, which
have subtly different limits than renderbuffers.
This resolves the remaining TODO in the implementation of
TexImage*DMultisample.
V2: - Don't introduce spurious block.
- Do this in multisample.c instead.
- Fix typo in error message.
- Inline spec quotes
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pulls the checking of the sample count into a helper function, and
extends the existing logic to include the interactions with both
ARB_texture_multisample and ARB_internalformat_query.
_mesa_check_sample_count() checks a desired sample count against a
a combination of target/internalformat, and returns the error enum
to be produced, if any. Unfortunately the conditions are messy and the
errors vary.
V2: - Tidy up spurious block.
- Move _mesa_check_sample_count() to multisample.c instead; It
doesn't really belong in fbobject.c or teximage.c.
- Inlined spec quotes
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Now that we support ARB_texture_multisample, there are multiple targets
accepted for this query, and they may have target-dependent limits, so
pass the target to the driverfunc.
For example, the sampling hardware may not be able to do general
texelFetch() for some format/sample count combination, but the driver
may still be able to implement a reasonable resolve operation, so it can
be supported for renderbuffers.
V2: - Don't break Gallium compile.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
path.
Signed-off-by: Dmitry Cherkassov <[email protected]>
Signed-off-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
And use this (and the code for r11g11b10 packed float to float conversion)
in the soa texturing code (the generated code looks quite good).
Should be an order of magnitude faster probably than using the fallback
(not measured).
Tested with piglit texwrap GL_EXT_packed_float and
GL_EXT_texture_shared_exponent respectively (didn't find much else using
it).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The blit-based paths for TexImage, GetTexImage, and ReadPixels aren't very
fast with software rasterizer. Now Gallium drivers have the ability to turn
them off.
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Initial version contributed by: Martin Andersson <[email protected]>
This is only used if the memcpy path cannot be used and if no transfer ops
are needed. It's pretty similar to our TexImage and GetTexImage
implementations.
The motivation behind this is to be able to use ReadPixels every frame and
still have at least 20 fps (or 60 fps with a powerful GPU and CPU)
instead of 0.5 fps.
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I'll need the _mesa_readpixels_needs_slow_path function for the blit-based
version, but it's also useful to have this memcpy-based path in one place
and not scattered across several functions.
v2: add "const" to function parameters
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I'll need both new functions for later. For now, it consolidates the code
for determining what the transfer ops should be and makes it a little bit
smarter.
v2: added "const"
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
NOTE: This is a candidate for the stable branches.
Reviewed-by: Brian Paul <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New conversion code to handle conversion from/to r11g11b10 AoS to/from
SoA floats, and also add code for conversion from rgb9e5 AoS to float SoA
(which works pretty much the same as r11g11b10 except for the packing).
(This code should also be used for texture sampling instead of
relying on u_format conversion but it's not yet, so rgb9e5 is unused.)
Unfortunately a crazy amount of hacks is necessary to get the conversion
code running in llvmpipe's generate_unswizzled_blend, which isn't well
suited for formats where the storage representation has nothing to do
with what's needed for blending (moreover, the conversion will convert
from packed AoS values, which is the storage format, to float SoA values,
because this is much more natural for the conversion, and likewise from
SoA values to packed AoS values - but the "blend" (which includes
trivial things like partial mask) works on AoS values, so incoming fs
values will go SoA->AoS, values from destination will go packed
AoS->SoA->AoS, then do blend, then AoS->SoA->packed AoS which probably
isn't the most efficient way though the shuffles are probably bearable).
Passes piglit fbo-blending-formats (with GL_EXT_packed_float parameter),
still need to verify Inf/NaNs (where most of the complexity in the
conversion comes from actually).
v2: drop the (very bogus) rgb9e5 part, and do component extraction
in the helper code for r11g11b10 to float conversion, making the code
slightly more compact (suggested by Jose), now that there are no other
callers left this works quite well. (Could do the same for the
opposite way but it's less than ideal there, final part of packing
needs to be done in caller anyway and there'd be another conditional.)
v3: minor style and comment fixes. Also fix a potential issue with
negative zero being potentially returned by max(src, zero) as we
don't have well-defined min/max behavior (fortunately no additonal cost).
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
This helps minimize confusion / effort when moving between branches or
helping others.
Reviewed-by: Alex Deucher <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally when submitting the first batch buffer after a flush, we
check whether the GPU has completed processing of the first batch
buffer of the previous frame. If it hasn't, we wait for it to finish
before submitting any more batches. This prevents GPU-heavy and
CPU-light applications from racing too far ahead of the current frame,
but at the expense of possibly lower frame rates. Sometimes when
benchmarking we want to disable this mechanism.
This patch adds the driconf option "disable_throttling" to disable the
throttling mechanism.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
NOTE: This is a candidate for the 9.1 branch.
Fixes piglit's texture-immutable-levels test.
Reported-by: Marek Olšák <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
| |
Note: This is a candidate for the stable branches.
Signed-off-by: Adam Jackson <[email protected]>
|