| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
There is no need for this to be in the common code.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Eric renamed these from dri_bufmgr_* and intel_bufmgr_* to drm_intel_*
in libdrm commit 4b9826408f65976a1a13387beda748b65e03ec52, circa 2008,
but we've been using the legacy names this whole time.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
These two functions do the exact same thing. One returns a uint64_t,
and the other takes the same uint64_t and truncates it to a uint32_t.
We only need the uint64_t variant - the caller can truncate if it wants.
This patch gives us one function, intel_batchbuffer_reloc, that does
the 64-bit thing.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The same thing is done in i915_update_program called by i915InvalidateState.
Why do it twice.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: add missing %s
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Yf/Ys tiling never got used in i965 due to not delivering
the expected performance benefits. So, this patch is deleting
this dead code in favor of adding it later in ISL when we
actually find it useful. ISL can then share this code between
vulkan and GL.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fast copy blit was primarily added to support Yf/Ys detiling.
But, Yf/Ys tiling never got used in i965 due to not delivering
the expected performance benefits. Also, replacing legacy blits
with fast copy blit didn't help the benchmarking numbers. This
is probably due to a h/w restriction that says "start pixel for
Fast Copy blit should be on an OWord boundary". This restriction
causes many blit operations to skip fast copy blit and use legacy
blits. So, this patch is deleting this dead code in favor of
adding it later when we actually find it useful.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We've already required Kernel 3.6 on Gen6+ since Mesa 9.2 (May 2013,
commit 92d2f5acfadea672417b6785710c9e8b7f605e41). It seems reasonable
to require it for Gen4-5 as well, bumping the requirement from 2.6.39.
This is necessary for glClientWaitSync with a timeout to work, which
is a feature we expose on Gen4-5. Without it, we would fall back to an
infinite wait, which is pretty bad.
See kernel commit 172cf15d18889313bf2c3bfb81fcea08369274ef in 3.6+.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
It's indexed by buffer, not stream. BRW_MAX_SOL_BUFFERS and
MAX_VERTEX_STREAMS happen to both be 4, so there's no actual bug.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We create the BO when creating a transform feedback object, and only
destroy it when deleting that object. So it won't be NULL.
CID: 1401410
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Automake generation rules are replicated for android.
$* macro was expected to return "hsw" but instead gives "hsw.{h,c}"
so $(basename $*) is used as a workaround
to set the correct --chipset option for brw_oa.py script.
Build tested with nougat-x86
Fixes: e565505 "i965: Add script to gen code for OA counter queries"
Reviewed-by: Tapani Pälli <[email protected]>
Acked-by: Robert Bragg <[email protected]>
Acked-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Game ported from D3D9 which expects sqrt() to compute the absolute
value as explained in the spec.
This gets rid of the NaN values as well as the black squares
with RadeonSI.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=97338
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This will allow to force computing the absolute value for sqrt()
and inversesqrt() in order to follow D3D9 behaviour for buggy
apps that rely on it.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Any users of KitKat are likely using an older version of Mesa and
KitKat support adds complexity to the make files. Dropping support
allows removing the MESA_LOLLIPOP_BUILD make variable in various make
files.
Signed-off-by: Rob Herring <[email protected]>
Reviewed-by: Tapani Pälli <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The only remaining case is the brw_oa.py generator which pipes the
generated file to stdout. That will be resolved with later commits.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Acked-by: Vedran Miletić <[email protected]>
Acked-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Used only internally.
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Lionel Landwerlin <[email protected]>
Acked-by: Vedran Miletić <[email protected]>
Acked-by: Juha-Pekka Heikkila <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
|
|
| |
All the plumbing is in place so the extension just needs to be
advertised.
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't really "do" anything because the default tiling for the
winsys buffer is X tiled. We do however want the X tiled modifier to
work correctly from the API perspective, which would imply that if you
set this modifier, and later do a get_modifier, you get back at least X
tiled.
Running with a modified kmscube, here are the bandwidth measurements.
Linear:
Read bandwidth: 1039.31 MiB/s
Write bandwidth: 1453.56 MiB/s
Y-tiled:
Read bandwidth: 458.29 MiB/s
Write bandwidth: 542.12 MiB/s
X-tiled:
Read bandwidth: 575.01 MiB/s
Write bandwidth: 606.25 MiB/s
Cc: Kristian Høgsberg <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Acked-by: Daniel Stone <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch begins introducing how we'll actually handle the potentially
many modifiers coming in from the API, how we'll store them, and the
structure in the code to support it.
Prior to this patch, the Y-tiled modifier would be entirely ignored. It
shouldn't actually be used until this point because we've not bumped the
DRIimage extension version (which is a requirement to use modifiers).
Measuring later in the series with kmscube:
Linear:
Read bandwidth: 1048.44 MiB/s
Write bandwidth: 1483.17 MiB/s
Y-tiled:
Read bandwidth: 471.13 MiB/s
Write bandwidth: 589.10 MiB/s
Similar functionality was introduced and then reverted here:
commit 6a0d036483caf87d43ebe2edd1905873446c9589
Author: Ben Widawsky <[email protected]>
Date: Thu Apr 21 20:14:58 2016 -0700
i965: Always use Y-tiled buffers on SKL+
v2: Use last set bit instead of first set bit in modifiers to address
bug found by Daniel Stone.
v3: Use the new priority modifier selection thing. This nullifies the
bug fixed by v2 also.
v4: Get rid of modifier compaction which originally served another
purpose and now serves none (Jason)
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Daniel Stone <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At image creation create a path for dealing with the linear modifier.
This works exactly like the old usage flags where __DRI_IMAGE_USE_LINEAR
was specified.
During development of this patch series, it was decided that a lack of
modifier was an insufficient way to express the required modifiers. As a
result, 0 was repurposed to mean a modifier for a LINEAR layout.
NOTE: This patch was added for v3 of the patch series.
v2: Rework the algorithm for modifier selection to go from a bitmask
based selection to this priority value.
v3: Make DRM_FORMAT_MOD_INVALID allowed at selection as a way of
identifying no modifiers found (because 0 is LINEAR) (Jason)
v4: Remove the logic to prune unknown modifiers (like those from other
vendors) and simply handle is in select_best_modifier (Jason)
Requested-by: Jason Ekstrand <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
New to the patch series after reordering things for landing smaller
chunks.
This will essentially enable modifiers from clients that were just
enabled in previous patches. A client could use the modifiers by
setting all of them at create, but had no way to actually query them
after creating the surface (ie. stupid clients could be broken before
this patch, but in more ways than this).
Obviously, there are no modifiers being actually stored yet - so this
patch shouldn't do anything other than allow the API to get back 0 (or
the LINEAR modifier).
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I intend to need to get to the devinfo structure, and storing the screen
is an easy way to do that.
It seems to be the consensus that you cannot share an image between
multiple screens.
Scape-goat: Rob Clark <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
Acked-by: Daniel Stone <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Recent glibc generates this warning:
brw_performance_query.c:1648:13: warning: In the GNU C Library, "minor" is defined
by <sys/sysmacros.h>. For historical compatibility, it is
currently defined by <sys/types.h> as well, but we plan to
remove this soon. To use "minor", include <sys/sysmacros.h>
directly. If you did not intend to use a system-defined macro
"minor", you should undefine it after including <sys/types.h>.
min = minor(sb.st_rdev);
So, include sys/sysmacros.h to shut up the warning.
v2: Use the AC_HEADER_MAJOR defines to figure out the right header
(thanks to Jonathan Gray for helping me not break non-glibc systems)
Reviewed-by: Matt Turner <[email protected]> [v1]
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was used for aubdumping (deleted a while ago) and INTEL_DEBUG=bat
decoding (deleted recently).
While we're changing parameters, delete the wrapper macro and make the
actual function brw_state_batch instead of __brw_state_batch.
This subsumes a patch by Emil Velikov to drop this from BLORP.
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This deletes all of our handwritten code in favor of autogenerated
genxml-based decoding. This should be much more usable, as the old
code isn't entirely accurate - we updated some things for new
generations, but not everything.
Aubinator has one annoying limitation: it has no idea how many entries
to print when encountering e.g. 3DSTATE_BINDING_TABLE_POINTERS_VS. It
picks an arbitrary number, which may skip decoding valid data, and may
print extra garbage entries.
We do a better job here by making brw_state_batch track the size of the
data stored at a particular batchbuffer offset. Then, we can divide by
the structure size to obtain the exact number of entries.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This should give substantially better decoding, as the public libdrm
decoder hasn't been properly maintained in years.
For now, we reuse the existing state dumping mechanism. We'll improve
that in the next patch.
To avoid increasing the size of the driver, we restrict this feature
to debug builds of Mesa. There's probably very little use for it in
release builds anyway.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Detecting register write support by trial and error introduces a
stall at screen creation time, which it would be nice to avoid.
Certain command parser versions guarantee this will work (see the
giant comment in intelInitScreen2 below, or a few commits ago):
- Ivybridge: version >= 1 (kernel v3.16)
- Baytrail: version >= 2 (kernel v3.19)
- Haswell: version >= 7 (kernel v4.8)
For simplicity, we don't bother with version 1 in this patch.
This assumes that the user hasn't disabled aliasing PPGTT via a kernel
command line parameter. Don't do that - you're only breaking things.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If we can't write registers, then the effective command parser version
is 0 - it may exist, but it's not usefully enabling anything.
See kernel commit 1ca3712ca3429a617ed6c5f87718e4f6fe4ae0c6 (in v4.8)
where the kernel starts doing this for us. This makes us do more or
less the same thing on older kernels.
This should preserve a bit of sanity by allowing us to perform a
screen->cmd_parser_version > N check to determine that we really can
use the features promised by command parser version N.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
| |
This should help us figure out the complexities of which kernel
versions we need to get various features on various platforms.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In commit d2590eb65ff28a9cbd592353d15d7e6cbd2c6fc6 I enabled GL 4.5
on Haswell...but failed to check if we could do indirect compute
shader dispatch...and query buffer objects.
Indirect compute shader dispatch requires command parser version 5
(kernel commit 7b9748cb513a6bef4af87b79f0da3ff7e8b56cd8, which is in
Linux v4.4). On earlier kernels we would have disabled
ARB_compute_shader, which is a mandatory part of OpenGL 4.3+.
Query buffer objects currently require MI_MATH and MI_LOAD_REGISTER_REG,
which mean command parser version 7 (Linux v4.8). On earlier kernels
we would have disabled ARB_query_buffer_object, which is a mandatory
part of OpenGL 4.4+.
The new version support looks like:
- Kernel 4.1 and older => OpenGL 3.3
- Kernel 4.2-4.3 => OpenGL 4.2
- Kernel 4.4-4.7 => OpenGL 4.3
- Kernel 4.8+ => OpenGL 4.5
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The PRMs state that this packet is 16 DWORDS long. Ensure that the last
three DWORDS are zeroed as required by the hardware when allocating a
null surface state.
Cc: <[email protected]>
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
with the convenient property of being able to scale by an integer (80)
to nanosecond units.
For Skylake the frequency is 12MHz or a scale factor of 83.333333
This updates gen_device_info to track a floating point timebase_scale
factor and makes corresponding _queryobj.c changes to no longer assume a
scale factor of 80 works across all gens.
Although the gen6_ code could have been been left alone, the changes
keep the code more comparable, and it now shares a few utility functions
for scaling raw timestamps and calculating deltas. The utility for
calculating deltas takes into account 32 or 36bit overflow depending on
the current kernel version.
Note: this leaves the timestamp handling of ARB_query_buffer_object
untouched, which continues to use an incorrect scale of 80 on Skylake
for now. This is more awkward to solve since the scaling is currently
done using a very limited uint64 ALU available to the command parser
that doesn't support multiply or divide where it's already taking a
large number of instructions just to effectively multiple by 80.
This fixes piglit arb_timer_query-timestamp-get on Skylake
v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also switches a use of strncpy + strncat
for snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.
This issue with strncpy was picked up by Coverity.
CID: 1402201
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
just as earlier gens do.
CC: "17.0 13.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=96743
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Fix the build on OpenBSD by removing an uneeded include for asm/unistd.h.
Signed-off-by: Jonathan Gray <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
Unintentionally introduced by yours truly with the i965 compiler move.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
% pattern rules are a GNU extension. As there is only one file here
avoid patterns and globbing entirely to fix the build on non-GNU make.
Signed-off-by: Jonathan Gray <[email protected]>
v2 [Emil Velikov: brw_oa.py dependency]
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
| |
For now we always return true, follow-up patches will handle fail scenarios.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
Acked-by: Timothy Arceri <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Tested-by: Mike Lothian <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Tested-by: Mike Lothian <[email protected]>
|
|
|
|
|
|
|
| |
Acked-by: Timothy Arceri <[email protected]>
Acked-by: Marek Olšák <[email protected]>
Tested-by: Dieter Nützel <[email protected]>
Tested-by: Mike Lothian <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Nothing special here other than a brief introduction to modifier
selection. Originally this was part of another patch but was split out
from
gbm: Introduce modifiers into surface/bo creation by request of Emil.
Requested-by: Emil Velikov <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This is just a stub for now and will be filled in later.
This was split out of an earlier patch
Requested-by: Emil Velikov <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
The GL driver had a driconf option (which doesn't make much sense) and
the Vulkan driver had a hand-rolled environment variable. Instead,
let's tie both into the INTEL_DEBUG mechanism and unify things.
Reviewed-by: Topi Pohjolainen <[email protected]>
|