| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes following build issues:
In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/common/dri_util.c:45:
vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found
...
In file included from vendor/intel/external/android_ia/mesa/src/mesa/drivers/dri/i965/intel_screen.c:44:
vendor/intel/external/android_ia/mesa/src/util/xmlpool.h:103:10: fatal error: 'xmlpool/options.h' file not found
Fixes: 601093f9 (xmlconfig: move into src/util)
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Chih-Wei Huang <[email protected]>
|
|
|
|
|
|
|
| |
Since make_surface() can fail, if the format isn't support by hw or
simlar error, we need to check the result before dereferencing it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Fixes: 601093f95ddf ("xmlconfig: move into src/util")
Tested-by: Eric Engestrom <[email protected]>
Tested-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
It's a single atomic add, so it makes sense to inline it.
Improves performance in Piglit's drawoverhead microbenchmark's
"DrawArrays ( 1 VBO, 0 UBO, 0 ) w/ no state change" subtest by
0.400922% +/- 0.310389% (n=350) on my i7-7700HQ.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: attempt to fix Android build (Emil)
v3: add missing include path
Reviewed-by: Marek Olšák <[email protected]> (v1)
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
In commit 877128505431adaf817dc8069172ebe4a1cdf5d8, José replaced the
Tungsten Graphics copyright notices with VMware, as Tungsten is gone.
I later imported brw_bufmgr.c, reintroducing a Tungsten copyright.
This commit does the equivalent of José's change to the new file.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reformats the copyright header to match what we use in most of the
newer parts of the driver. There are a few minor alterations: we change
"COPYRIGHT HOLDERS, AUTHORS AND/OR ITS SUPPLIERS" to the standard
"AUTHORS OR COPYRIGHT HOLDERS", and move the permission notice to the
proper place (it should be in the middle, so "next paragraph" actually
refers to something).
Both of these changes match the OSI's MIT License text:
https://opensource.org/licenses/MIT
I copied this from genX_state_upload.c.
|
|
|
|
|
|
|
| |
This reverts commit a7617a49fbde2fcfccdab22886aeabdbf8abb8e4.
glthread disables itself automatically and therefore has no effect
on the game.
|
|
|
|
|
|
|
| |
This is useless.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We already have this little optimization for color clears. Now that
we're actually tracking whether or not a slice has any fast-clear
blocks, it's easy enough to add for depth clears too.
Improves performance of GFXBench 4 TRex at 1920x1080 by:
- Skylake GT4: 0.905932% +/- 0.0620197% (n = 30)
- Apollolake: 0.382434% +/- 0.1134730% (n = 25)
v2: (by Ken) Rebase and drop intel_mipmap_tree.c changes, as they're
no longer necessary (other patches already landed to do that part)
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When changing the clear value, we need to resolve any fast cleared data.
Previously, we were performing resolves on every slice with HiZ enabled.
We only need to resolve slices that a) have fast clear data, and b)
aren't about to be cleared to the new color. In the latter case, we
were actually doing a resolve, and then a fast clear - when we could
skip both, causing the existing fast cleared area to be updated to the
new clear value for no additional work.
This patch stops using intel_miptree_prepare_access in favor of a more
optimal open coded loop that knows about our clear operation.
v2: (by Ken) Rebase on islification, write a real commit message.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
I want to use it in brw_clear.c.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
From 25-26 min fps to 31, used the game in conjuction with a mod (full
invasion 2) beaumaris castle map and 200 bots.
|
|
|
|
|
|
|
|
| |
Here the AUX_USAGE_* mode indicates that we have HiZ, so we will have
a HiZ buffer. But Coverity doesn't know that, so it thinks it might
be NULL because we checked hiz_buf != NULL earlier.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
Caught by Coverity (CID 1415680).
Cc: "17.2" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Increase the value, not the pointer to the stack variable.
Caught by Coverity (CID 1415574). Not shipped in a real release.
Cc: "17.2" <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Performance delta on Core i5-4570 + Radeon R9 270:
Overlord: +20% in certain locations
Overlord II: +20% in certain locations
Oil Rush: +12% in most locations
War Thunder: +4-9% in benchmarks
Saints Row 2: +10-35% in certain locations
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
As Chris commented, it makes more sense to have batch buffer flushes
before the query. Usually applications like frame_retrace do a series
of queries and in that case, with flushes at the end of the queries,
we might still have the first query contained in 2 different batchs.
More generally it would be quite usual to have the query contained in
2 batch buffers because we never now what's the fill rate of the
current batch buffer.
If we move the flushing at the beginning of the queries, it's pretty
much guaranteed that queries will be contained in a single batch
buffer (unless the amount of commands is huge, but then it's only fair
to include reloading request times in the measurements).
Fixes: adafe4b733c02 ("i965: perf: minimize the chances to spread queries across batchbuffers")
Reported-by: Chris Wilson <[email protected]>
Signed-off-by: Lionel Landwerlin <[email protected]>
Cc: "17.2 17.1" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
No need for all that switching when we can just assign a nice little
variable with the number of layers.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
gen4 have commands which start with KernelStartPointer, which is a
struct, so if we initialize it struct = { 0 }, we get warnings on some
compilers:
"GCC (pre 4.9?) can throw a Wmissing-braces on[1] while clang
-Wmissing-field-initializers [2]." - Emil
[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=53119
[2] https://bugs.llvm.org/show_bug.cgi?id=21689
This change works around that and will silence such warnings. It is both
a GCC and a clang extension.
v2:
- Use {} instead of memset macro (Matt)
Signed-off-by: Rafael Antognolli <[email protected]>
Cc: Jason Ekstrand <[email protected]>
Cc: Matt Turner <[email protected]>
Cc: Emil Velikov <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The extension should be in the list as returned by getExtensions().
Seems to have gone unnoticed since close to nobody wants to change the
vblank mode for the software driver.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Alex Deucher <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The set of formats which supports CCS_E is actually fairly small on
gen9. However, everything that supports fast-clears on gen8 also
supports fast-clears on gen9+. The one very annoying exception is
that blending is broken for non-0/1 clear colors with sRGB formats.
In order to solve that problem, we do a resolve to get rid of the
clear color. Another option would be to just not fast-clear with
non-0/1 clear colors however non-0/1 + blending + sRGB is uncommon
enough that this shouldn't be a significant performance problem.
This appears to help gl_manhattan31_off by about 2%.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This commit replaces the generic "flags" parameter with a more explicit
aux usage parameter. This leads to a lot of duplicated code at the
moment but this will all get cleaned up directly.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
This will be a bit more convenient momentarily. It's also more correct
because it makes prepare_texture take sRGB into account.
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
This requires us to start using the partial clear state. It makes
things quite a bit more complicated but it's still a fairly
straightforward exercise in diagram following.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Now that we have this field, it's much easier to switch on it than to
walk an if ladder that checks different things.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
We also simplify the way we handle stencil since we know a priori that
it will have ISL_AUX_USAGE_NONE.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
The only real change here is that we now reject clear colors for MCS
with certain formats on gen < 9 because we can't trust that the
reinterpretation will work. This may cause some MCS partial resolves.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Our attempts to do it automatically are problematic at best. In order
to really be precise, we need to know both the desired aux usage and
whether or not clear is supported. The current automatic mechanism
doesn't cover this. This commit itself is not a functional change since
it just reworks everything to be in terms of a silly helper. Later
commits will switch things over to more sensible ways of choosing usage.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
We keep the old and possibly broken method of determining aux usage
intact for now. Therefore, the only functional change here is that we
may call finish_render a bit more accurately.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Multisample surfaces only have a single miplevel so there's no reason to
be passing the extra parameters around. It only leads to confusion.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This commit changes layer_range_length to return locical layers and also
changes the way we allocate the aux_state field to not allocate extra
layers for MCS. This will be important as we're about to start doing
significantly more detailed tracking of MCS state.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
intel_miptree_supports_ccs_e should handle the gen >= 9 requirement and
there's no reason why we can't do CCS_E on window system buffers so long
as we resolve.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
| |
Nothing created through intel_miptree_create_for_renderbuffer will ever
be exposed externally so there's no need to set FOR_SCANOUT.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
| |
It turns out that if you have rendering in-flight with CCS_E enabled and
you go to do a depth resolve without flushing, the CCS data may never
hit the memory.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
This fixes the Piglit ARB_texture_views rendering-formats test.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
(Patch written by Ken, but entirely comments written by Chris.)
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The non-LLC story was a horror show. We uploaded data via pwrite
(drm_intel_bo_subdata), which would stall if the cache BO was in
use (being read) by the GPU. Obviously, we wanted to avoid that.
So, we tried to detect whether the buffer was busy, and if so, we'd
allocate a new BO, map the old one read-only (hopefully not stalling),
copy all shaders compiled since the dawn of time to the new buffer,
upload our new one, toss the old BO, and let the state upload code
know that our program cache BO changed. This was a lot of extra data
copying, and flagging BRW_NEW_PROGRAM_CACHE would also cause a new
STATE_BASE_ADDRESS to be emitted, stalling the entire pipeline.
Not only that, but our rudimentary busy tracking consistented of a flag
set at execbuf time, and not cleared until we threw out the program
cache BO. So, the first shader upload after any drawing would hit this
"abandon the cache and start over" copying path.
This is largely unnecessary - it's just ancient and crufty code. We can
use the same persistent mapping paths on all platforms. On non-ancient
kernels, this will use a write combining map, which should be reasonably
fast.
One aspect that is worse: we do occasionally grow the program cache BO,
and copy the old contents to the newer BO. This will suffer from UC
readback performance now. To mitigate this, we use the MOVNTDQA based
streaming memcpy on platforms with SSE 4.1 (all Gen7+ atoms). Gen4-5
are unfortunately going to be penalized.
v2: Add MOVNTDQA path, rebase on other map flag changes.
v3: Drop cache->bo_used_by_gpu too (caught by Chris Wilson).
Reviewed-by: Matt Turner <[email protected]>
|