| Commit message | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Emit clip state on Gen6+ using brw_batch_emit helper, using pack structs
from genxml.
v3:
- Lots of style fixes (Ken)
- Do not set CullTestEnableBitMask on Gen8+ (Ken)
v4:
- Do not include brw_defines_common.h.
v5 (Ken): s/BRW_NEW_WM_PROG_DATA/BRW_NEW_FS_PROG_DATA/
Signed-off-by: Rafael Antognolli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This emits 3DSTATE_WM_DEPTH_STENCIL on Gen8+ or DEPTH_STENCIL_STATE
(and the relevant pointer packets) on Gen6-7.5 from a single function.
v3:
- Watch for BRW_NEW_BATCH too on gen < 8 (Ken)
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Rafael Antognolli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Make atoms initialization compile conditionally based on the target
platform.
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
| |
v3 (Rafael): Drop aub parameter
v4 (Ken): Squash in gen4/g45 automake fixes
Signed-off-by: Kenneth Graunke <[email protected]>
Signed-off-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Ironlake documentation is terrible, so it's unclear whether or not
this field exists there. It definitely doesn't exist on Sandybridge
and later. It definitely does exist on G45.
We haven't been setting it for our normal vertex attributes - just
the SGVs (VertexID, InstanceID, BaseVertex, BaseInstance, DrawID).
We should be consistent. My guess is that it isn't necessary and
doesn't exist - this patch drops it from the SGV elements, making
them follow the behavior of most attributes.
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
These macros are defined in brw_defines.h, which contains a lot of
macros that conflict with autogenerated code from genxml. But we need to
use them (the MOCS macros) in some of that same genxml code.
Moving them to brw_context.h solves that problem and we don't have to
include brw_defines.h.
Signed-off-by: Rafael Antognolli <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The Android native fence in i965 has two fds: _EGLSync::SyncFd and
brw_fence::sync_fd.
The semantics of __DRI2fenceExtensionRec::create_fence_fd are unclear on
whether the DRI driver takes ownership of the incoming fd (which is the
same incoming fd from eglCreateSync). i965 did take ownership, but all
other Mesa drivers do not; instead, they dup the incoming fd. As
a result, _EGLSync::SyncFd and brw_fence::sync_fd were the same fd, and
both egl_dri2 and i965 believed they owned it. On eglDestroySync, that
led to a double-close.
Fix the double-close by making brw_dri_create_fence_fd dup the incoming
fd, just like the other drivers do.
Signed-off-by: Randy Xu <[email protected]>
Test: Run Vulkan and GLES stress test and no crash.
Fixes: 6403e376511 ("i965/sync: Implement fences based on Linux sync_file")
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
[chadv: Polish the commit message]
Cc: [email protected]
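The ownership rule described in this message can be illustrated with a minimal sketch. It is not the i965 code: `demo_fence`, `demo_fence_import_fd` and `demo_fence_destroy` are hypothetical names, and the only real call used is POSIX `dup()`. The point is that the importer closes only its own duplicate, so the caller can still close the original fd exactly once.

```c
#include <unistd.h>

/* Hypothetical stand-in for a driver-side fence; not the real brw_fence. */
struct demo_fence {
   int sync_fd;
};

/* Import a caller-owned fd by duplicating it. The caller keeps ownership
 * of caller_fd. Returns 0 on success, -1 if dup() fails. */
int
demo_fence_import_fd(struct demo_fence *fence, int caller_fd)
{
   fence->sync_fd = dup(caller_fd);
   return fence->sync_fd < 0 ? -1 : 0;
}

/* Close only the fence's private duplicate. */
void
demo_fence_destroy(struct demo_fence *fence)
{
   if (fence->sync_fd >= 0) {
      close(fence->sync_fd);
      fence->sync_fd = -1;
   }
}
```

With this split, destroying the fence and then closing the original fd are two independent closes of two different descriptors, so neither side can trigger a double-close.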
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Samplers are encoded into the instruction word, so there's no need to
make space in the uniform file.
Previously matrix_columns and vector_elements were set to 0, making this
else case a no-op. Commit 75a31a20af26 changed that, causing malloc
corruption in thousands of tests on i965.
Fixes: 75a31a20af26 ("glsl: set vector_elements to 1 for samplers")
Reviewed-by: Jason Ekstrand <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100871
|
|
|
|
|
|
|
| |
We already have BRW_NEW_BATCH, which completely covers all the cases
that BRW_NEW_CONTEXT would handle. Drop it.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
| |
There's no reason for this as far as I can tell.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Gen4-5 and Gen8+ already set this, but Gen6-7.5 did not. We ought to
be consistent - the answer depends on the API, not the hardware generation.
The Sandybridge PRM says about RASTRULE_UPPER_RIGHT:
"To match OpenGL point rasterization rules (round to +infinity, where
this is the upper right direction wrt OpenGL screen origin of lower
left)."
So this is likely the one we should use.
Reviewed-by: Rafael Antognolli <[email protected]>
|
|
|
|
|
|
|
|
| |
We set this unconditionally on every other platform. Zero (Manhattan)
isn't even listed as an option in the Sandybridge docs - only "true".
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The original Broadwater and Crestline platforms computed antialiased
line distances using "manhattan" distance, aka a + b = c. Eaglelake
and Cantiga added "true" distance, which apparently does something
like max(a, b) + min(a, b) / 4. Not exactly "true", but at least
more accurate.
The G45 documentation indicates that the old manhattan distance setting
is "only for debug purposes" and should never be used. The Ironlake
documentation no longer mentions AALINEDISTANCE_MANHATTAN, though it
does still contain the narrative about the feature.
At any rate, we should use the more accurate mode.
Reviewed-by: Plamena Manolova <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
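To make the difference between the two distance modes concrete, here is a small sketch of the two formulas quoted above. The function names are hypothetical, and the second formula is only the approximation this message guesses at ("something like"), not a documented hardware equation. Taking absolute values of the inputs is an assumption added for the sketch.

```c
/* Absolute value without pulling in libm. */
static float
demo_absf(float v)
{
   return v < 0.0f ? -v : v;
}

/* "Manhattan" distance as described above: a + b. */
float
demo_aa_distance_manhattan(float a, float b)
{
   return demo_absf(a) + demo_absf(b);
}

/* The "true" mode as guessed above: max(a, b) + min(a, b) / 4. */
float
demo_aa_distance_approx(float a, float b)
{
   float x = demo_absf(a);
   float y = demo_absf(b);
   float hi = x > y ? x : y;
   float lo = x > y ? y : x;
   return hi + lo / 4.0f;
}
```

For a = 3 and b = 4 the Euclidean length is 5; the manhattan formula gives 7, while the approximation gives 4.75, which is why the second mode is the more accurate of the two.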
|
|
|
|
| |
Should have been removed in commit ad55b1a7701a
|
|
|
|
|
|
|
| |
These are no longer used since the previous commit.
Acked-by: Elie Tournier <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
| |
IVB is running into some spilling issues in piglit with the
loop removed. However those tests are not really reflective
of a real world use case, also fp64 is brand new to IVB
so we leave the spilling issues to be resolved at a later
time.
Run time for shader-db on my machine goes from ~795 seconds to
~665 seconds.
shader-db results BDW:
total instructions in shared programs: 12969459 -> 12968891 (-0.00%)
instructions in affected programs: 1463154 -> 1462586 (-0.04%)
helped: 3622
HURT: 3326
total cycles in shared programs: 246453572 -> 246504318 (0.02%)
cycles in affected programs: 208842622 -> 208893368 (0.02%)
helped: 24029
HURT: 35407
total loops in shared programs: 2931 -> 2931 (0.00%)
loops in affected programs: 0 -> 0
helped: 0
HURT: 0
total spills in shared programs: 14560 -> 14498 (-0.43%)
spills in affected programs: 2270 -> 2208 (-2.73%)
helped: 17
HURT: 2
total fills in shared programs: 19671 -> 19632 (-0.20%)
fills in affected programs: 2060 -> 2021 (-1.89%)
helped: 17
HURT: 2
LOST: 17
GAINED: 40
Most of the hurt shaders are 1-2 instructions, with what looks like a max of 7.
I've looked at the worst cycles regressions and as far as I can tell it's just
a scheduling difference.
Acked-by: Elie Tournier <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
This avoids repeated translations of the enum.
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
These names make it easier to understand what is going on in
regards to references.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I accidentally moved the bo->bufmgr dereference above the NULL check
when cleaning up this code.
While passing NULL to free() is a common pattern...passing NULL to
unmap seems pretty bad. You really ought to know whether you have
a buffer or not. We don't want to paper over bugs like that. So,
just drop the NULL check altogether.
CID: 1405006
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
| |
If ret is 0, we return. If ret is not 0, we return. This is dead.
CID: 1405013 (Structurally dead code (UNREACHABLE))
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Andreas Boll <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Andreas Boll <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Andreas Boll <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following Clang warnings.
brw_fs_channel_expressions.cpp:219:12: warning: enumeration values 'ir_unop_ballot', 'ir_unop_read_first_invocation', and 'ir_binop_read_invocation' not handled in switch [-Wswitch]
switch (expr->operation) {
^
1 warning generated.
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
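A -Wswitch warning of this kind goes away once every enumerator has a case label. The sketch below uses a made-up enum (`demo_op`) rather than the real `ir_expression_operation`, and it only assumes that the fix lists the new operations explicitly; the actual patch may handle them differently.

```c
/* Hypothetical operation enum; DEMO_OP_BALLOT plays the role of a
 * newly added enumerator that an older switch did not know about. */
enum demo_op {
   DEMO_OP_NEG,
   DEMO_OP_ADD,
   DEMO_OP_BALLOT,
};

/* Returns 1 if the operation can be split per channel, 0 otherwise.
 * Every enumerator has a case label and there is no default, so the
 * compiler can still warn when another enumerator is added later. */
int
demo_op_is_splittable(enum demo_op op)
{
   switch (op) {
   case DEMO_OP_NEG:
   case DEMO_OP_ADD:
      return 1;
   case DEMO_OP_BALLOT:
      return 0;
   }
   return 0; /* not reached for valid enum values */
}
```

Leaving out a `default:` label is deliberate here: it keeps -Wswitch useful for future additions to the enum.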
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes the following Clang warning.
In file included from radeon_debug.c:32:
./radeon_common_context.h:500:19: warning: duplicate 'const' declaration specifier [-Wduplicate-decl-specifier]
extern const char const *radeonVendorString;
v2: - do not remove the duplicate 'const' qualifier, fix it
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
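For readers unfamiliar with the warning: `const char const *` applies `const` twice to the pointed-to characters and leaves the pointer itself writable. The sketch below shows the form that makes both the characters and the pointer read-only. The names are hypothetical, and it is an assumption that this is what "fix it" in v2 refers to.

```c
/* The first const protects the characters; the const after the '*'
 * protects the pointer itself from being reassigned. */
const char *const demo_vendor_string = "Demo Vendor";

/* Accessor returning the read-only string. */
const char *
demo_get_vendor(void)
{
   return demo_vendor_string;
}
```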
|
|
|
|
|
|
| |
These one bit values are booleans.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
unsigned long is a terrible type for a bitfield - if you need fewer
than 32 bits, it wastes 4 bytes. If you need more, things break on
32-bit builds. Just use unsigned.
Even that's a bit ridiculous as we only have one flag today.
Still, it's at least somewhat better.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The drm_i915_gem_create ioctl structure uses a __u64 for the size,
so we should probably use uint64_t to match. In theory, we could
probably have a BO larger than 4GB, using a 48-bit PPGTT - it just
wouldn't be mappable in the CPU's 32-bit address space.
Reviewed-by: Chris Wilson <[email protected]>
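The risk with a narrower size type is silent truncation. As a minimal sketch (the helper name is hypothetical and no real bufmgr function is shown), squeezing a size of more than 4GB through a 32-bit integer keeps only the low 32 bits:

```c
#include <stdint.h>

/* Simulates storing a 64-bit buffer size in a 32-bit field:
 * the upper 32 bits are discarded. */
uint32_t
demo_size_as_u32(uint64_t size)
{
   return (uint32_t)size;
}
```

A 5 GiB request (`5ull << 30`) would come back as 1 GiB, which is why matching the kernel's `__u64` with `uint64_t` is the safer choice.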
|
|
|
|
|
|
|
|
|
| |
Theoretically, with a 48-bit address space, we could have buffers
with an alignment of >= 4GB. It's a bit silly, but the exec_object
structs (drm_i915_gem_exec_object2) use a __u64 for this, so we may
as well use the same type as the kernel API.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
| |
struct drm_i915_gem_set_tiling's stride field is a __u32.
intel_mipmap_tree::stride is a uint32_t. Using unsigned long just
doesn't make sense. Switching also lets us drop many pointless
locals that only existed to deal with the type mismatch.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
| |
The ioctl structs contain __u64 offset and size fields, so make them
uint64_t rather than unsigned long.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For some reason we passed tiling by pointer, through several layers,
even though the functions only read the initial value, and never
actually change it. We even had a do-while loop that executed until
the tiling mode matched - except it always did, so it only ran once.
We then had bogus error handling in case it changed the tiling mode
to something nonsensical...which it never did.
Drop all this nonsense.
Reviewed-by: Chris Wilson <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We check these bitfields when computing the Haswell max GL version.
We need to set them ahead of time, or they won't exist, and all our
checks will fail. That sets the max core profile GL version to 4.2.
This introduces the bizarre situation where asking for a GL context
with version 4.3+ fails, but asking for a GL core profile context
with version <= 4.2 actually promotes you to a 4.5 context.
GLX_MESA_query_renderer also reported the bogus 4.2 value.
Now it shows 4.5.
Cc: "17.0" <[email protected]>
Reported-and-tested-by: Rafael Ristovski <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This restores the performance warnings removed in:
i965: Drop brw_bo_map[_gtt] wrappers which issue perf warnings.
but adds them for nearly all BO mapping, and also for wait_rendering.
Because we add this to the core bufmgr, we automatically get stall
warnings in all callers, unlike before where only a few callsites used
the wrappers that gave stall warnings.
We also do it a bit differently: we simply measure how long set_domain
takes (the part that stalls), and complain if it's more than 0.01 ms.
We don't bother calling brw_bo_busy(), and we don't measure the mmap
time (which doesn't stall). This should be more accurate.
Reviewed-by: Daniel Vetter <[email protected]>
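The measurement approach described here (time only the blocking call, warn above 0.01 ms) can be sketched as follows. All `demo_*` names are hypothetical and the blocking call is passed in as a callback; this is not the bufmgr code, and the choice of `CLOCK_MONOTONIC` is an assumption.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 199309L
#endif
#include <stdbool.h>
#include <time.h>

/* Threshold quoted in the commit message: 0.01 ms. */
#define DEMO_STALL_THRESHOLD_MS 0.01

/* Elapsed time between two timestamps, in milliseconds. */
double
demo_elapsed_ms(const struct timespec *start, const struct timespec *end)
{
   return (double)(end->tv_sec - start->tv_sec) * 1000.0 +
          (double)(end->tv_nsec - start->tv_nsec) / 1e6;
}

/* True if the measured time counts as a stall worth warning about. */
bool
demo_is_stall(double elapsed_ms)
{
   return elapsed_ms > DEMO_STALL_THRESHOLD_MS;
}

/* Time only the blocking call (the stand-in for set_domain) and
 * report whether it stalled. */
bool
demo_timed_wait(void (*wait_fn)(void *), void *data)
{
   struct timespec start, end;

   clock_gettime(CLOCK_MONOTONIC, &start);
   wait_fn(data);
   clock_gettime(CLOCK_MONOTONIC, &end);

   return demo_is_stall(demo_elapsed_ms(&start, &end));
}
```

Timing the wait directly avoids a separate "is it busy?" query before mapping, which matches the reasoning given above for not calling `brw_bo_busy()`.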
|
|
|
|
|
|
| |
Less boilerplate.
Reviewed-by: Daniel Vetter <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In theory gcc is free to re-load them, and if a concurrent
execbuf races and updates bo->offset64 then we have a problem:
execbuffer api requires that the ->presumed_offset and the one
we used for the reloc matches. It does not require that the value
is sensible, which means no locks needed, just a consistent load.
Ken said his next series will nuke this, so just hand-roll the
kernel's READ_ONCE idea inline.
FIXME: Most callers of brw_emit_reloc recompute the relocation
themselves, which means this doesn't really fix the race. But the long
term plan is to move to per-context relocation handling, which will
fix this all properly. So leave this for now as just a reminder.
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
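A hand-rolled version of the kernel's READ_ONCE idea is a single load through a volatile-qualified pointer, with the loaded value then reused for both the presumed offset and the relocation. The sketch below uses hypothetical names (`demo_bo`, `demo_emit_reloc`) and is not the actual `brw_emit_reloc`. It also assumes a platform where an aligned 64-bit load is a single access; on 32-bit targets such a load may still tear.

```c
#include <stdint.h>

/* Hypothetical buffer object with an offset another thread may update. */
struct demo_bo {
   uint64_t offset64;
};

/* Load the value once through a volatile pointer so the compiler
 * cannot re-read it later. */
static inline uint64_t
demo_read_once_u64(const uint64_t *p)
{
   return *(const volatile uint64_t *)p;
}

/* Use the same loaded value for the presumed offset and for the
 * relocated address, so the two always agree. */
uint64_t
demo_emit_reloc(const struct demo_bo *bo, uint32_t delta,
                uint64_t *presumed_offset)
{
   uint64_t offset = demo_read_once_u64(&bo->offset64);

   *presumed_offset = offset;
   return offset + delta;
}
```

The value does not need to be current, only consistent between the two uses, which is why no lock is taken.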
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This was done because the kernel has 1 global address space, shared
with all render clients, for gtt mmap offsets, and that address space
was only 32bit on 32bit kernels.
This was fixed in
commit 440fd5283a87345cdd4237bdf45fb01130ea0056
Author: Thierry Reding <[email protected]>
Date: Fri Jan 23 09:05:06 2015 +0100
drm/mm: Support 4 GiB and larger ranges
which shipped in 4.0. Of course you still want to limit the bo cache
to a reasonable size on 32bit apps to avoid ENOMEM, but that's better
solved by tuning the cache a bit. On 64bit, this was never an issue.
On top, mesa never set this, so it's all dead code. Collect and trash it.
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
is_reusable was needed by uxa because it couldn't keep track of its
scanout buffers and used this as a proxy. Disabling reuse is a silly
idea, we set this once at start. Remove both.
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Iirc this was used by uxa for persistent mmaps of the frontbuffer. For
mesa all the set_domain stuff needed before a synchronized mmap is handled
within the bufmgr, so no reason ever to call this.
Inline the implementation into its only internal user.
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Entirely unused, and really shouldn't be used. The alloc functions already
take care of this. And even in a future where we're not going to
h/v-align tiled buffers in the bufmgr, but only in isl, I think we
still want to adjust the tiling mode in the bufmgr, since that ties in
closely to mmaps and stuff like that.
get_tiling is still needed for the import paths (until we have modifiers
everywhere).
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
Entirely unused, mesa instead used the BO_ALLOC_FOR_RENDER flag.
Signed-off-by: Daniel Vetter <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Suggested by Chris Wilson. A tiny bit simpler.
Reviewed-by: Daniel Vetter <[email protected]>
|
|
|
|
|
|
| |
Matches the class name and the header file name.
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
indent -i3 -nut -br -brs -npcs -ce --no-tabs -Tuint32_t -Tuint64_t
plus some manual fixes because those aren't quite the right settings.
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The bacon is all gone.
This renames both the class and the related functions. We're about to
run indent on the bufmgr code, so no need to worry about fixing bad
indentation.
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The stupid reason for eliminating these functions is that I'm about
to rename drm_bacon_bo_map() to brw_bo_map(), which makes the real
function have the short name, rather than the wrapper.
I'm also planning on reworking our mapping code soon, so we use WC
mappings and proper unsynchronized mappings on non-LLC platforms.
It will be easier to do that without thinking about the stall
warnings and wrappers.
My eventual hope is to put the performance warnings in the BO map
function itself, so all callers gain the warning.
Acked-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
| |
Less bacon.
Acked-by: Jason Ekstrand <[email protected]>
|