| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
| |
Found while running shader-db under valgrind.
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The 32-bit to 64-bit conversions need to have the 32-bit
data source elements aligned to 64-bit but only with doubles as
destination type.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Tested-by: Mark Janes <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The optimization in unpack_64 is clearly subsumed with the opt_algebraic
optimizations in the previous commit. The pack optimization may not be
quite handled by opt_algebraic but opt_algebraic should get the really
bad cases. Also, it's been broken since it was merged and we've never
noticed so it must not be doing anything.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
There's nothing "double" about it other than, perhaps, the fact that it
packs two 32-bit values.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
NIR is a typeless IR and the two opcodes, when considered bitwise, do
exactly the same thing. There's no reason to have two versions.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We can only do the optimization if the source *is* SSA.
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "13.0 17.0" <[email protected]>
|
|
|
|
|
|
| |
for shaders
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Remove local definition of RADEON_INFO_TILE_CONFIG and use the correct
macro provided by libdrm_radeon RADEON_INFO_TILING_CONFIG.
Latter was present as of libdrm 2.4.22, sirca 2010.
Signed-off-by: Emil Velikov <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes regressions from c505d6d852220f4aaaee161465dd2c579647e672.
Switching from using gl_shader_program to gl_program for the pipline
objects CurrentProgram array meant we were freeing gl_shader_programs
immediately after glDeleteProgram was called, but the spec states
the program should only get deleted once it is no longer in use.
To work around this we add a new ReferencedPrograms array to track
gl_shader_programs in use.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
| |
Acked-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the buffer has been freed by the kernel under memory pressure, it is
invalid to try and access the backing storage for that buffer in the
future - the backing storage is not recreated automatically. As such we
need to mark the GL object as being freed for unretained buffers and so
recreate the object on next use.
Futhermore from the GL_APPLE_object_purgeable:
"In contrast, by calling ObjectUnpurgeableAPPLE with an <option> of
UNDEFINED_APPLE, the application is indicating that it intends to
recreate the contents of the storage from scratch. Further, the
application is is stating that it would like the GL to do only the
minimal amount of work set PURGEABLE_APPLE to FALSE. If
ObjectUnpurgeableAPPLE is called with the <option> set to
UNDEFINED_APPLE, then ObjectUnpurgeableAPPLE will return the value
UNDEFINED_APPLE."
we must always report GL_UNDEFINED_APPLE when called with
glObjectUnpurgeable(GL_UNDEFINED_APPLE).
Testcase: piglit/object_purgeable-api-*
Signed-off-by: Chris Wilson <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This partially reverts commit 97217a40f97cdeae0304798b607f704deb0c3558.
It leaves ES 2.0 support in place per Ian's suggestion, because ES 2.0
is designed to work on hardware like i915.
Chrome only uses the GPU if you have GL >= 2.0, and using i915 (and
prog_execute) actually hurt performance compared with the software
paths.
|
|
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Fixes: 9d16f3903e2 ("driconf: add allow_higher_compat_version option")
|
|
|
|
|
|
|
|
|
| |
v2: s/force_compat_profile/allow_higher_compat_version
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: s/force_compat_profile/allow_higher_compat_version
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
v2: s/force_compat_profile/allow_higher_compat_version
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Mesa currently doesn't allow to create 3.1+ compatibility profiles
mainly because various features are unimplemented and bugs can
happen.
However, some buggy apps request a compat profile without using
any old features unimplemented in mesa, and they fail to start.
This option should help some games to run but it's not enough
for all (eg. Dying Light).
v2: - s/force_compat_profile/allow_higher_compat_version
Signed-off-by: Samuel Pitoiset <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Edmondo Tommasina <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Emil Velikov <[email protected]>
Acked-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes two GL ES 3.0 CTS tests on Sandy Bridge:
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_linear
ES3-CTS.functional.texture.mipmap.cube.base_level.linear_nearest
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0 13.0" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0 13.0" <[email protected]>
|
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "17.0" <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
| |
Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99660
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
| |
This fixes 143 of the new piglit tests added by Nicolai
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
For the on-disk shader cache we want to be able to differentiate
between a program that was linked and one that was loaded from cache.
V2:
- don't return the new enum directly to the application when queried,
instead return GL_TRUE or GL_FALSE as required. Fixes google-chrome
corruptions when using cache.
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reverts commit 0bac2551e40410e2251daf4fd9faf69310ab34ce.
Now that we position the guardband correctly (applying translations
in addition to scaling) and made it as large (or larger) than the
render target, this shouldn't be necessary.
Now we leave guardband clipping enabled 100% of the time, like the
Windows driver does.
Fixes GL45-CTS.gtf21.GL2FixedTests.clip.clip. It tries to draw a
16384x64 rectangle, and it appears that some kind of numerical
imprecisions in the clipper result in some edge pixels going missing.
The Windows driver passes this test because of guardband clipping.
Cc: "17.0" <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we disabled the guardband when the viewport was smaller than
the framebuffer on Gen6-7.5, to prevent portions of primitives from
being draw outside of the viewport. On Gen8+, we relied on the viewport
extents test to effectively scissor this away for us.
We can simply always enable scissoring instead. We already include the
viewport in the scissor rectangle, so this will effectively do the
viewport extents test for us. (The only difference is that the scissor
rectangle doesn't support sub-pixel values. I think that's okay.)
Given that the viewport extents test is essentially a second scissor,
and is enabled for basically all 3D drawing on Gen8+, it stands to
reason that scissoring is cheap. Enabling the guardband reduces the
cost of clipping, which is expensive.
The Windows driver appears to never disable guardband clipping, and
appears to use scissoring in this case. I don't know if they leave
it on universally though.
This fixes misrendering in Blender, where the "floor plane" grid lines
started rendering at wrong angles after I disabled XY clipping of line
primitives. Enabling the guardband seems to solve the issue.
Cc: "17.0" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=99339
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
(Patch co-authored by Jason and Ken.)
We scaled the guardband based on the viewport size, but failed to
take into account the translation portion of the viewport transform.
This meant the guardband was always centered around the origin.
We want it to be centered around the screen-space drawing area,
which is the intersection of the viewport and the render target.
At best, getting this wrong would reduce the guardband's effectiveness
in some cases. At worst, it might break things - objects outside of the
guardband are trivially rejected, so getting the guardband in the wrong
place and leaving guardband clipping enabled could cause problems.
v2: drop clamping of positive maximums.
Cc: "17.0" <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The next patch will make the guardband calculation dependent on the
transformation matrix. Instead of computing it in both atoms, just
combine them into a single atom.
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The GLX specification says about glXDestroyPixmap:
"The storage for the GLX pixmap will be freed when it is not current
to any client."
We're not really following this language to the letter: some of the storage
is freed immediately (in particular, the dri3_drawable, which contains both
GLXDRIdrawable and loader_dri3_drawable). So we NULL out the pointers to
that freed storage; the previous patches added the corresponding NULL-pointer
checks.
This fixes memory corruption in piglit
./bin/glx-visuals-depth/stencil -pixmap -auto
Cc: 17.0 <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The term "lossless compression" could potentially mean multisample
color compression, single-sample color compression or HiZ because they
are all lossless. The term CCS_E, however, has a very precise meaning;
in ISL and is only used to refer to single-sample color compression.
It's also much shorter which is nice.
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add assert checking that num_sources is never larger than 3.
This prevents Coverity from concluding that the unhandled
cases of num_sources not being 0-3 are relevant.
Coverity-Id: 1399480-1399489
Signed-off-by: Robert Foss <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This does point at the front-end emitting silly code that could have
been optimized out, but the current fsign implementation would emit
bogus IR if abs was set for the argument (because it would apply the
abs modifier on an unsigned integer type), and we shouldn't rely on
the upper layer's optimization passes for correctness.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Gallium drivers have had this for a while. It makes sense to support
it consistently across drivers, so expose it in i965 as well.
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
At this point, the pitch is in bytes. We haven't yet divided the pitch
by 4 for tiled surfaces, so abs(pitch) may be larger than 32K. This
means the bit 15 trick won't work.
The caller now has signed integers anyway, so just pass those through
and do the obvious check.
Cc: "17.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
| |
jip should always be negative here as its the result of
do instruction - while instruction.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We are casting from a signed 32bit int to an unsigned 16bit int
so shift 15 bits rather than 16.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Applications may delete a shader program, create a new one, and bind it
before the next draw. With terrible luck, malloc may randomly return a
chunk of memory for the new gl_program that happened to be the exact
same pointer as our previously bound gl_program. In this case, our
logic to detect new programs in brw_upload_pipeline_state() would break:
if (brw->vertex_program != ctx->VertexProgram._Current) {
brw->vertex_program = ctx->VertexProgram._Current;
brw->ctx.NewDriverState |= BRW_NEW_VERTEX_PROGRAM;
}
Because the pointer is the same, we'd think it was the same program.
But it could be wildly different - a different stage altogether,
different sets of resources, and so on. This causes utter chaos.
As unlikely as this seems, I believe I hit this when running a subset
of the CTS in a loop, in a group of tests that churns through simple
programs, deleting and rebuilding them. Presumably malloc uses a
bucketing cache of sorts, and so freeing up a gl_program and allocating
a new one fairly quickly causes it to reuse that memory.
The result was that brw->vertex_program->info.num_ssbos claimed the
program had SSBOs, while brw->vs.base.prog_data.binding_table claimed
that there were none. This was crazy, because the binding table is
calculated from info.num_ssbos - the shader info appeared to change
between shader compile time and draw time. Careful use of watchpoints
revealed that it was being clobbered by rzalloc's memset when building
an entirely different program...
Fortunately, our 0xd0d0d0d0 canary for unused binding table entries
caused us to crash out of bounds when trying to upload SSBOs, or we
may have never discovered this heisenbug.
Fixes crashes in GL45-CTS.compute_shader.sso-case2 when using a hacked
cts-runner that only runs GL45-CTS.compute_shader.s* in EGL config ID 5
at 64x64 in a loop with 100 iterations.
Cc: "17.0 13.0 12.0" <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch implements a new type of struct brw_fence, one that is based
struct sync_file.
This completes support for EGL_ANDROID_native_fence_sync.
* Background
Linux 4.7 added a new file type, struct sync_file. See
commit 460bfc41fd52959311ed0328163f785e023857af
Author: Gustavo Padovan <[email protected]>
Date: Thu Apr 28 10:46:57 2016 -0300
Subject: dma-buf/sync_file: de-stage sync_file headers
A sync file is a cross-driver explicit synchronization primitive. In a
sense, sync_file's relation to synchronization is similar to dma_buf's
relation to memory: both are primitives that can be imported and
exported across drivers (at least in theory).
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rename to brw_fence_insert_locked(). This is correct because the fence's
mutex is effectively locked, as all callers are also *creators* of the
fence, and have not yet returned the new fence.
This reduces noise in the next patch, which defines and uses
brw_fence_insert(), an unlocked variant.
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Pre-patch, brw_sync.c ignored the return value of
intel_batchbuffer_flush().
When intel_batchbuffer_flush() fails during eglCreateSync
(brw_dri_create_fence), we now give up, cleanup, and return NULL.
When it fails during glFenceSync, however, we blindly continue and hope
for the best because there does not exist yet a way to tell core GL that
sync creation failed.
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This a refactor patch; no expected changed in behavior.
Add `enum brw_fence_type` and brw_fence::type. There is only one type
currently, BRW_FENCE_TYPE_BO_WAIT. This patch reduces a lot of noise in
the next, which adds new type BRW_FENCE_TYPE_SYNC_FD.
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A variant of intel_batchbuffer_flush() with parameters for in and out
fence fds.
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
This bool maps to I915_PARAM_HAS_EXEC_FENCE_FD.
Reviewed-by: Rafael Antognolli <[email protected]>
Tested-by: Rafael Antognolli <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
The path in question (... dri/intel/server) was removed years ago.
Signed-off-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Analogous to previous commit.
Cc: "12.0 13.0" <[email protected]>
Signed-off-by: Emil Velikov <[email protected]>
|