| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 9b66351f5b274f3d79cb2c48afa3b2fcc2bf3442)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 373d88a7117150de984510453e1c30a455987686)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit 879d24c49727cfc6c62cbd5bca58efad4c914e40)
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
(cherry picked from commit fcef88d13a9ebdcadc6a878e9284c55651785301)
|
|
|
|
|
|
|
|
|
|
|
|
| |
The crash is due to NULL pColorBlendState, which is legal if the
pipeline has rasterization disabled or if the subpass of the render pass
the pipeline is created against does not use any color attachments.
Test: Sample subpasses from LunarG can run without crash
Signed-off-by: Xu,Randy <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Cc: "17.0 13.0" <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The current condition always evaluated to true, but we only want to flush
on the first submit. Rename the variable to do_flush, and only
emit on the first iteration.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
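
A minimal sketch of the pattern described in the message above, not the driver's actual code. The names `submit_chunks`, `emit_flush` and `flushes_emitted` are hypothetical; the point is only that `do_flush` is true on the first loop iteration and false afterwards.

```c
#include <stdbool.h>

/* Hypothetical counter standing in for emitting a real flush packet. */
static int flushes_emitted;

static void emit_flush(void)
{
    flushes_emitted++;
}

/* Submit `count` chunks, flushing only before the first one. */
static void submit_chunks(int count)
{
    for (int i = 0; i < count; i++) {
        bool do_flush = (i == 0);   /* true only on the first iteration */

        if (do_flush)
            emit_flush();
        /* ... submit chunk i here ... */
    }
}
```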
|
|
|
|
|
|
|
|
|
|
|
| |
Since we already do fabs on the one source, we're guaranteed to get
positive infinity if we get any infinity at all. Since +inf only has
one IEEE 754 representation, we can use an integer comparison and avoid
all of the ordered/unordered issues.
Cc: Dave Airlie <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
Reviewed-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
This just uses an 8-bit clear and packs the values.
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 2845a108a9a8bd4b0e6e9b590c976452fb99eb10.
This breaks VK-GL-CTS randomly.
./deqp-vk --deqp-case=dEQP-VK.texture.filtering.3d.formats.r4g4b4a4*
bounces around here from 6/6 to 3/6 or 4/6 to hanging.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
| |
glNewList() swaps dispatch tables, and we don't have anything in
place to handle that in glthread.
Tested-by: Michel Dänzer <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This was meant to check the index type to get the correct
index, not the last emitted one. This fixes:
dEQP-VK.pipeline.input_assembly.primitive_restart.index_type_uint32.triangle_strip_with_adjacency
Reviewed-by: Bas Nieuwenhuizen <[email protected]>
Cc: "13.0 17.0" <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
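
A small sketch of the idea, assuming the "index" in the message above is the primitive restart index, which depends on the width of the bound index type. The enum and function name are hypothetical and not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical enum; real APIs define their own index-type constants. */
enum index_type { INDEX_TYPE_UINT16, INDEX_TYPE_UINT32 };

/* The restart index is the all-ones value of the current index type. */
static uint32_t primitive_restart_index(enum index_type type)
{
    return type == INDEX_TYPE_UINT32 ? 0xffffffffu : 0xffffu;
}
```

Selecting the value from the current index type, rather than from whatever was emitted last, keeps the two in sync when the index type changes between draws.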
|
|
|
|
|
|
|
|
|
| |
I haven't seen this causing problems in practice, but for correctness
we should also check if rename succeeded to avoid breaking accounting
and leaving a .tmp file behind.
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
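
A hedged sketch of the check described above, using only the standard C `rename()` and `remove()` calls. The function name `publish_tmp_file` is hypothetical, and the cache's real accounting is omitted.

```c
#include <stdbool.h>
#include <stdio.h>

/* Move a finished temporary file into place. On failure, remove the
 * temporary so it is not left behind. Returns true on success. */
static bool publish_tmp_file(const char *tmp_path, const char *final_path)
{
    if (rename(tmp_path, final_path) != 0) {
        remove(tmp_path);   /* best-effort cleanup of the .tmp file */
        return false;       /* caller must not count the entry as stored */
    }
    return true;
}
```

A caller would update its size accounting only when this returns true.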
|
|
|
|
|
|
|
|
|
|
|
| |
At the time of target file check, .tmp file is already created and file
lock is held, so we should remove the .tmp, like in other error paths.
With this, piglit no longer leaves a large number of empty .tmp files
behind, which waste directory entries and may interfere with eviction.
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
It seems there is a bug because:
- 20 bytes are compared, but only 1 byte stored_keys step is used
- entries can overlap each other by 19 bytes
- index_mmap is ~1.3M in size, but only first 64K is used
With this fix for Deus Ex:
- startup time (from launch to Feral logo): ~38s -> ~16s
- disk_cache_has_key() hit rate: ~50% -> ~96%
Signed-off-by: Grazvydas Ignotas <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
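
The stride problem listed above can be illustrated with a small index sketch. The constants and function names are assumptions for illustration (20-byte keys, 64K slots chosen from the first two key bytes), not the cache's real layout. With a 20-byte step, 64K slots occupy 1,310,720 bytes, which matches the ~1.3M figure in the message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define KEY_SIZE    20          /* bytes per key, as in the message above */
#define INDEX_SLOTS (1u << 16)  /* assumption: 64K slots in the index */

/* Byte offset of the slot for `key`. The slot number is taken from the
 * leading key bytes (assumed scheme) and scaled by KEY_SIZE, not by 1,
 * so neighbouring entries never overlap. */
static size_t slot_offset(const uint8_t *key)
{
    uint32_t slot = ((uint32_t)key[0] | ((uint32_t)key[1] << 8))
                    & (INDEX_SLOTS - 1u);
    return (size_t)slot * KEY_SIZE;
}

/* `stored_keys` must point to INDEX_SLOTS * KEY_SIZE bytes. */
static void index_put_key(uint8_t *stored_keys, const uint8_t *key)
{
    memcpy(stored_keys + slot_offset(key), key, KEY_SIZE);
}

static bool index_has_key(const uint8_t *stored_keys, const uint8_t *key)
{
    return memcmp(stored_keys + slot_offset(key), key, KEY_SIZE) == 0;
}
```

With a 1-byte step, storing a key in slot 2 would overwrite 19 bytes of the key in slot 1, which is consistent with the low hit rate reported above.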
|
|
|
|
|
|
| |
Fixes crashes after recent upload rework.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
| |
There should be minimal gain, if any, for nvc0, but nv50 may end up
noticing more often that the lod argument is uniform. This, in turn,
will remove the need for some unnecessary transformations, which were
being hit due to the checks being done pre-ssa.
Signed-off-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This prevents textureQueryLevels, which maps as LODQ, from ending up
with a xyzw writemask, which is illegal.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100061
Signed-off-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Helps mainly Feral-ported games, due to their use of fma()
shader-db changes:
total instructions in shared programs : 3901147 -> 3842505 (-1.50%)
total gprs used in shared programs : 471258 -> 467359 (-0.83%)
total local used in shared programs : 27405 -> 27361 (-0.16%)
total bytes used in shared programs : 35749888 -> 35214176 (-1.50%)
          local     gpr    inst   bytes
helped       17    1829    4091    4091
hurt          4      44       3       3
Signed-off-by: Karol Herbst <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Cc: [email protected]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Since switching to LRU eviction, the only user of these predicate
functions resolves the directory entry stats itself, so pass them in
directly. This saves calling fstat and strlen twice (and the
expensive strlen is skipped entirely if the access time is newer).
v2: Update for empty cache dir detection changes
v3: Fix passing string length to predicate with the +1 for NULL
termination and also pass sb as pointer
v4: Missed ampersand for passing sb as pointer
Reviewed-by: Grazvydas Ignotas <[email protected]>
Acked-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously each time we saw a variable we just created a duplicate
entry in the list. This is particularly bad for loops, where we add
everything twice; throw nested loops into the mix and the
list grows exponentially.
This stops the glsl-vs-unroll-explosion test, which has 16 nested
loops, from reaching the test's mem usage limit in this pass. The
test now hits the mem limit in opt_copy_propagation_elements()
instead.
I suspect this was also part of the reason this pass can be so
slow with some shaders.
Reviewed-by: Thomas Helland <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes a bunch of piglit crashes that hit an assert() when trying
to delete the framebuffer. The assert() was triggered because
WinSysDrawBuffer was set to NULL before glDeleteFramebuffers()
was called.
Tested-by: Michel Dänzer <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
Suggested-by: Damian Dixon <[email protected]>
Reviewed-by: Elie Tournier <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99789
|
|
|
|
|
|
| |
This is way more convenient than having two separate dword fields.
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
It all just works since it's just a hardware register so we might as
well turn it on.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
In the end, pipeline statistics queries look a lot like occlusion
queries, only with between 1 and 11 begin/end pairs being generated
instead of just the one.
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In order to get accurate statistics, we need to disable statistics for
blits, clears, and the surface state memcpy at the top of each secondary
command buffer. There are two possible approaches to this:
1) Disable before the blit/memcpy and re-enable afterwards
2) Move emitting 3DSTATE_VF_STATISTICS from initialization and make it
part of pipeline state and then just disable statistics before
blits and memcpy operations.
Emitting 3DSTATE_VF_STATISTICS should be fairly cheap so it doesn't
really matter which path we take. We choose the second option as it's
more consistent with the way the rest of the statistics are enabled and
disabled.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
| |
It's in 3DSTATE_CLIP, so it doesn't really need the extra detail. This
matches what we do for VS, FS, etc.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
The new version is a nice GPU parallel to cpu_write_query_result and it
nicely handles things like dealing with 32 vs. 64-bit offsets in the
destination buffer.
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Not all queries are the same. Even the two queries we support today
require a different amount of data per slot. Once we introduce pipeline
statistics queries, the size will vary wildly.
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
We're about to make slots variable-length and always having the
available bits at the front makes certain operations substantially
easier once we do that.
Reviewed-By: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the Vulkan 1.0.39 Specification:
"If VK_QUERY_RESULT_64_BIT is not set and the result overflows a
32-bit value, the value may either wrap or saturate."
So we can either clamp or wrap. Wrapping is both easier and what the
user gets if they use vkCmdCopyQueryPoolResults, and we should be
consistent. We could make vkCmdCopyQueryPoolResults clamp but it's
annoying and ends up burning extra batch for something the spec clearly
doesn't require.
Reviewed-By: Lionel Landwerlin <[email protected]>
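
The two behaviours the quoted spec text permits can be sketched as below. The function names are hypothetical; the first corresponds to the wrapping choice described in the message, the second to the clamping alternative.

```c
#include <stdint.h>

/* Wrap: keep the low 32 bits, which is what a truncating store does. */
static uint32_t result_wrap32(uint64_t value)
{
    return (uint32_t)value;
}

/* Saturate: clamp to the largest 32-bit value instead. */
static uint32_t result_saturate32(uint64_t value)
{
    return value > UINT32_MAX ? UINT32_MAX : (uint32_t)value;
}
```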
|
|
|
|
|
|
| |
Reviewed-by: Marek Olšák <[email protected]>
Cc: 17.0 <[email protected]>
Signed-off-by: Alex Deucher <[email protected]>
|
|
|
|
| |
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
for threaded gallium
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
for threaded gallium
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
| |
required by threaded gallium
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
| |
threaded gallium can't use pipe_context's LLVM target machine, because
create_shader_selector can be called from a non-driver thread.
Reviewed-by: Timothy Arceri <[email protected]>
|
| |
|
|
|
|
| |
This is simpler for drivers.
|
|
|
|
|
|
|
|
|
|
| |
Now that there's a timebase_scale in gen_device_info which is
effectively the 'period' this switches anv_GetPhysicalDeviceProperties
to using this common device info to initialize the timestampPeriod
device limit.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock
with the convenient property of being able to scale by an integer (80)
to nanosecond units.
For Skylake the frequency is 12MHz, or a scale factor of 83.333333.
This updates gen_device_info to track a floating point timebase_scale
factor and makes corresponding _queryobj.c changes to no longer assume a
scale factor of 80 works across all gens.
Although the gen6_ code could have been left alone, the changes
keep the code more comparable, and it now shares a few utility functions
for scaling raw timestamps and calculating deltas. The utility for
calculating deltas takes into account 32 or 36bit overflow depending on
the current kernel version.
Note: this leaves the timestamp handling of ARB_query_buffer_object
untouched, which continues to use an incorrect scale of 80 on Skylake
for now. This is more awkward to solve since the scaling is currently
done using a very limited uint64 ALU available to the command parser
that doesn't support multiply or divide, and it already takes a
large number of instructions just to effectively multiply by 80.
This fixes piglit arb_timer_query-timestamp-get on Skylake
v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too.
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
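
The arithmetic behind the scale factors above is 1e9 / 12.5e6 = 80 ns per tick and 1e9 / 12e6 ≈ 83.333 ns per tick. Below is a rough sketch of the two kinds of helper the message mentions (scaling raw timestamps and computing deltas across a counter wrap). The function names and signatures are hypothetical, not the driver's.

```c
#include <stdint.h>

/* Nanoseconds per tick for a timestamp clock running at `hz`. */
static double timebase_scale_for_hz(double hz)
{
    return 1e9 / hz;
}

/* Convert a raw tick count to nanoseconds using a floating-point scale. */
static uint64_t raw_timestamp_to_ns(double timebase_scale, uint64_t raw)
{
    return (uint64_t)(timebase_scale * (double)raw);
}

/* Delta between two raw timestamps read from a counter that is `bits`
 * wide (for example 32 or 36, must be below 64), allowing for one wrap. */
static uint64_t raw_timestamp_delta(uint64_t t0, uint64_t t1, unsigned bits)
{
    if (t0 > t1)
        return (UINT64_C(1) << bits) + t1 - t0;
    return t1 - t0;
}
```

An integer scale of 80 would under-report Skylake intervals by about 4%, which is why the scale is tracked as a floating-point value.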
|
|
|
|
|
|
|
|
| |
Older versions of GCC don't like compound literals in static const
variable declarations because they don't think it's an actual constant
value.
Reviewed-by: Lionel Landwerlin <[email protected]>
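
For illustration, the kind of declaration the message refers to and a portable alternative might look like this. The struct and variable names are hypothetical, and the actual change may have taken a different form.

```c
/* Hypothetical type used only for this example. */
struct extent {
    int width;
    int height;
};

/* Form that older GCC may reject with "initializer element is not
 * constant", because the compound literal is not treated as a constant:
 *
 *   static const struct extent default_extent =
 *       (struct extent) { .width = 4, .height = 4 };
 */

/* Portable form: a plain brace initializer without the cast-like prefix. */
static const struct extent default_extent = { .width = 4, .height = 4 };
```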
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds some missing return value checks for all uses of snprintf in
brw_performance_query.c. This also replaces a use of strncpy + strncat
with snprintf for consistency and to avoid the chance of the strncpy
leaving an unterminated string in the dest buffer if the src is too
long.
This issue with strncpy was picked up by Coverity.
CID: 1402201
Signed-off-by: Robert Bragg <[email protected]>
Reviewed-by: Lionel Landwerlin <[email protected]>
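
A minimal sketch of the snprintf pattern described above, assuming a hypothetical helper `format_counter_name` that joins two strings. It is not the code from brw_performance_query.c.

```c
#include <stdbool.h>
#include <stdio.h>

/* Join `prefix` and `name` into `dst`. Returns false if snprintf reports
 * an error or the result did not fit. On truncation, dst still holds a
 * NUL-terminated string (for dst_size > 0), unlike strncpy, which can
 * leave dst unterminated when the source is too long. */
static bool format_counter_name(char *dst, size_t dst_size,
                                const char *prefix, const char *name)
{
    int len = snprintf(dst, dst_size, "%s%s", prefix, name);

    return len >= 0 && (size_t)len < dst_size;
}
```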
|
|
|
|
|
| |
Fixes: d8d81fbc316 ("mesa: Add infrastructure for a worker thread to process GL commands.")
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Otherwise it'll be missing in the tarball and make distcheck will fail.
Fixes: 05dd4a1104e ("glapi: Generate GL API marshalling code from the XML.")
Signed-off-by: Emil Velikov <[email protected]>
|