| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
More than half of the stuff in intel_reg.h had nothing whatsoever to do
with registers and really belongs in brw_defines.h anyway.
Signed-off-by: Jason Ekstrand <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
brw_emit_pipe_control_flush.
This hardware race condition has caused problems several times already
(see "i965: Fix cache pollution race during L3 partitioning set-up.",
"i965: Fix brw_render_cache_set_check_flush's PIPE_CONTROLs." and
"i965: intel_texture_barrier reimplemented"). The problem is that
whenever we attempt to both flush and invalidate multiple caches with
a single pipe control command the flush and invalidation happen in
reverse order, so the contents flushed from the R/W caches aren't
guaranteed to become visible from the invalidated caches after the
PIPE_CONTROL command completes execution if some concurrent rendering
workload happened to pollute any of the invalidated R/O caches in the
short window of time between the invalidation and flush.
This makes sure that brw_emit_pipe_control_flush() has the effect
expected by most callers of making the contents flushed from any R/W
caches visible from the invalidated R/O caches.
Cc: "12.0 11.1 11.2" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
* Declare loop index variable at loop site (idr)
* Make arrays of MI_MATH instructions 'static const' (idr)
* Remove commented debug code (idr)
* Updated comment in set_query_availability (Ken)
* Replace switch with if/else in hsw_result_to_gpr0 (Ken)
* Only divide GL_FRAGMENT_SHADER_INVOCATIONS_ARB by 4 on
hsw and gen8 (Ken)
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Its previous name was somewhat misleading, this really behaves like a
RW cache flush rather than an invalidation.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
readable.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
grep -lr 'sub license' | while read f; do \
sed --in-place -e 's/sub license/sublicense/' $f ;\
done
grep -lr 'NON-INFRINGEMENT' | while read f; do \
sed --in-place -e 's/NON-INFRINGEMENT/NONINFRINGEMENT/' $f ;\
done
As noted by Matt, both of these changes match the MIT license text found
at http://opensource.org/licenses/MIT.
Signed-off-by: Ian Romanick <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
Why was that ever a thing?
Signed-off-by: Ian Romanick <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: Use macros for HW binding table edits (Topi)
v3: Add Broadwell support.
v4: Make hardware binding table bit definitions even more clearer (Ken)
Cc: [email protected]
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables using XY_FAST_COPY_BLT only for Yf/Ys tiled buffers.
It can be later turned on for other tiling patterns (X,Y) too.
V3: Flush in between sequential fast copy blits.
Fix src/dst alignment requirements.
Make can_fast_copy_blit() helper.
Use ffs(), is_power_of_two()
Move overlap computation inside intel_miptree_blit().
V4: Use _mesa_regions_overlap() function.
Add check for src_buffer == dst_buffer.
Simplify horizontal and vertical alignment computations.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously whenever a primitive is drawn the driver would call
_mesa_check_conditional_render which blocks waiting for the result of
the query to determine whether to render. On Gen7+ there is a bit in
the 3DPRIMITIVE command which can be used to disable the primitive
based on the value of a state bit. This state bit can be set based on
whether two registers have different values using the MI_PREDICATE
command. We can load these two registers with the pixel count values
stored in the query begin and end to implement conditional rendering
without stalling.
Unfortunately these two source registers were not in the whitelist of
available registers in the kernel driver until v3.19. This patch uses
the command parser version from intel_screen to detect whether to
attempt to set the predicate data registers.
The predicate enable bit is currently only used for drawing 3D
primitives. For blits, clears, bitmaps, copypixels and drawpixels it
still causes a stall. For most of these it would probably just work to
call the new brw_check_conditional_render function instead of
_mesa_check_conditional_render because they already work in terms of
rendering primitives. However it's a bit trickier for blits because it
can use the BLT ring or the blorp codepath. I think these operations
are less useful for conditional rendering than rendering primitives so
it might be best to leave it for a later patch.
v2: Use the command parser version to detect whether we can write to
the predicate data registers instead of trying to execute a
register load command.
v3: Simple rebase
v4: Changes suggested by Kenneth Graunke: Split the
load_64bit_register function out to a separate patch so it can be
a shared public function. Avoid calling
_mesa_check_conditional_render if we've already determined that
there's no query object. Some styling fixes.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recomendation [sic] is to set this field to 1 always. Programming it to default
value of 0, may have -ve impact on performance for MSAA WLs.
Another don't suck bit which needs to get set.
The patch wasn't as well tested as I would have liked, primarily I don't have
perf numbers for it, but it's getting to a point where it is in danger of being
lost.
v2: v1 was a mix of two patches. Since 0x7004 is masked, we only need to set it
once at initialization and make sure the pma workaround doesn't set the mask bit
(which it doesn't).
Move LRI to init gpu state (Ken)
Add a comment.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm not really sure of the origins of the existing flag names. Modern docs have
some slightly different names. Having the correct names makes it easier to
determine if existing PIPE_CONTROL flag settings are correct, as well as making
adding new PIPE_CONTROLs easier.
This originally came up while I was trying to implement workarounds and spotted
some things called, "flush" which should have been called "invalidate."
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NOTE: The implementation was initially one patch, this. All the history is kept
here, even though all the core mesa changes were moved to the parent of this
patch.
This patch implements ARB_pipeline_statistics_query. This addition to GL does
not add a new API. Instead, it adds new tokens to the existing query APIs. The
work to hook up the new tokens is trivial due to it's similarity to the previous
work done for the query APIs. I've implemented all the new tokens to some
degree, but have stubbed out the untested ones at the entry point for Begin().
Doing this should allow the remainder of the code to be left in.
The new tokens give GL clients a way to obtain stats about the GL pipeline.
Generally, you get the number of things going in, invocations, and number of
things coming out, primitives, of the various stages. There are two immediate
uses for this, performance information, and debugging various types of
misrendering. I doubt one can use these for debugging very complex applications,
but for piglit tests, it should be quite useful.
Tessellation shaders, and compute shaders are not addressed in this patch
because there is no upstream implementation. I've implemented how I believe
tessellation shader stats will work for Intel hardware (though there is a bit of
ambiguity). Compute shaders are a bit more interesting though, and I don't yet
know what we'll do there.
For the lazy, here is a link to the relevant part of the spec:
https://www.opengl.org/registry/specs/ARB/pipeline_statistics_query.txt
Running the piglit tests
http://lists.freedesktop.org/archives/piglit/2014-November/013321.html
(http://cgit.freedesktop.org/~bwidawsk/piglit/log/?h=pipe_stats)
yield the following results:
> piglit-run.py -t stats tests/all.py output/pipeline_stats
> [5/5] pass: 5 Running Test(s): 5
v2:
- Don't allow pipeline_stats to be per stream (Ilia). This may (not sure) be
needed for AMD_transform_feedback4, which we do not support.
> If AMD_transform_feedback4 is supported then GEOMETRY_SHADER_PRIMITIVES_-
> EMITTED_ARB counts primitives emitted to any of the vertex streams for
> which STREAM_RASTERIZATION_AMD is enabled.
- Remove comment from GL3.txt because it is only used for extensions that are
part of required versions (Ilia)
- Move the new tokens to a new XML doc instead of using the main GL4x.xml (Ilia)
- Add a fallthrough comment (Ilia)
- Only divide PS invocations by 4 on HSW+ (Ben)
v3:
- Add ARB_pipeline_statistics_query to relnotes.html
- Add ARB_pipeline_statistics_query.xml to the Makefile.am, and master XML (Ilia)
- Correct extension number (Ilia)
- Add link to xml in the main GL API xml (Ilia)
- remove special GS case from gen6_end_query (Ian)
- Make lookup table static so gcc doesn't initialized it on every call (Ian)
- Use if (_mesa_has_geometry_shaders(ctx)) instead of explicit checks (Ian)
- Core mesa parts moved into a prep patch (Ilia)
v4:
- Change to 10.6 relnotes since we missed 10.5 window
- Moved compute shader stuff into the switch statement (Jordan)
- Jordan: Add compute shader support
v5:
- Fixed relnote style (Ilia)
v6:
- Rebased on master which beat me to adding the first relnotes - essentially
this undoes v5 (which had a typo anyway)
- Some code style fixes (Ken)
- Remove some excess comments (Ken)
- Unify tessellation failure style - unreachable (Ken)
- Fix workaround comment for PS invocations (Ken)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
This patch adds macros needed for the HiZ PMA stall optimization.
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Tungsten Graphics Inc. was acquired by VMware Inc. in 2008. Leaving the
old copyright name is creating unnecessary confusion, hence this change.
This was the sed script I used:
$ cat tg2vmw.sed
# Run as:
#
# git reset --hard HEAD && find include scons src -type f -not -name 'sed*' -print0 | xargs -0 sed -i -f tg2vmw.sed
#
# Rename copyrights
s/Tungsten Gra\(ph\|hp\)ics,\? [iI]nc\.\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./g
/Copyright/s/Tungsten Graphics\(,\? [iI]nc\.\)\?\(, Cedar Park\)\?\(, Austin\)\?\(, \(Texas\|TX\)\)\?\.\?/VMware, Inc./
s/TUNGSTEN GRAPHICS/VMWARE/g
# Rename emails
s/[email protected]/[email protected]/
s/[email protected]/[email protected]/g
s/jrfonseca-at-tungstengraphics-dot-com/jfonseca-at-vmware-dot-com/
s/jrfonseca\[email protected]/[email protected]/g
s/keithw\[email protected]/[email protected]/g
s/[email protected]/[email protected]/g
s/thomas-at-tungstengraphics-dot-com/thellstom-at-vmware-dot-com/
s/[email protected]/[email protected]/
# Remove dead links
s@Tungsten Graphics (http://www.tungstengraphics.com)@Tungsten Graphics@g
# C string src/gallium/state_trackers/vega/api_misc.c
s/"Tungsten Graphics, Inc"/"VMware, Inc"/
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
| |
Performed via:
$ for file in *; do sed -i 's/ *//g'; done
Signed-off-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
- MMIO registers for draw parameters
- New bit in 3DPRIMITIVE command to enable indirection
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
We'll need to write this register to start/stop performance counters.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
| |
This command reads a value from memory and writes it to a register (the
opposite of MI_STORE_REGISTER_MEM). It's only available on Gen7+.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen7+ supports four transform feedback streams. Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so use that.
Gen6 only supports a single stream, so the single #define should be
fine. However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name. For one,
the word "NUM" doesn't appear in the actual name of the register.
It's also confusingly generic, as it doesn't exist on Gen7+. Add a
"GEN6_" prefix for clarity.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen7+ supports four transform feedback streams. Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so we use that.
Gen6 only supports a single stream, so the single #define should be
fine. However, SO_NUM_PRIMS_WRITTEN was confusingly generic, as it
doesn't exist on Gen7+. Add a "GEN6_" prefix for clarity.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
| |
v2: Remove unused DV_PF_* macros, too. (change by Ken)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Now that i915's forked off, they don't need to live in a shared directory.
Acked-by: Kenneth Graunke <[email protected]>
Acked-by: Chad Versace <[email protected]>
Acked-by: Adam Jackson <[email protected]>
(and I hear second hand that idr is OK with it, too)
|
| |
|
|
|
|
|
| |
OUT_BATCH is far more amenable to the upcoming relocations being done for TTM
support.
|
|
This driver comes from Tungsten Graphics, with a few further modifications by
Intel.
|