| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
| |
This is similar to blorp_gen8_hiz_clear_attachments except that it takes
actual images instead of trusting in the already set depth state.
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
SIMD4x2 dispatch mode has been removed in GEN11. We're not using
it anyways in Mesa. Adding few asserts to make it explicit.
Use GEN_GEN macro in place of devinfo->gen (Ken)
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
StateCacheInvalidation is required on all gen7+ platforms. We
don't need to update this check for every new gen h/w unless
this requirement is changed. So, dropping the check for latest
gen h/w.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This pass performs an "ambiguate" operation on a CCS-compressed surface
by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly
detailed notion of how the CCS is laid out so this is fairly simple to
do. On gen7, the CCS tiling is quite crazy but that isn't an issue
because we can only do CCS on single-slice images so we can just blast
over the entire CCS buffer if we want to.
Reviewed-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We have to start render targets at binding table index 0 in order to use
headerless FB write messages, and in fact already assume this in a bunch
of places in the code. Let's finish that off, and not bother storing 0
in a struct to pretend to add it in a few places.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This creates two new internal dependencies, idep_nir_headers and
idep_nir. The former encapsulates the generation of nir_opcodes.h and
nir_builder_opcodes.h and adding src/compiler/nir as an include path.
This ensures that any target that needs nir headers will have the
includes and that the generated headers will be generated before the
target is build. The second, idep_nir, includes the first and
additionally links to libnir.
This is intended to make it easier to avoid race conditions in the build
when using nir, since the number of consumers for libnir and it's
headers are quite high.
Acked-by: Eric Engestrom <[email protected]>
Signed-off-by: Dylan Baker <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Older OpenGL defines two equations for converting from signed-normalized
to floating point data. These are:
f = (2c + 1)/(2^b - 1) (equation 2.2)
f = max{c/2^(b-1) - 1), -1.0} (equation 2.3)
Both OpenGL 4.2+ and OpenGL ES 3.0+ mandate that equation 2.3 is to be
used in all scenarios, and remove equation 2.2. DirectX uses equation
2.3 as well. Intel hardware only supports equation 2.3, so Gen7.5+
systems that use the vertex fetcher hardware to do the conversions
always get formula 2.3.
This can make a big difference for 10-10-10-2 formats - the 2-bit value
can represent 0 with equation 2.3, and cannot with equation 2.2.
Ivybridge and older were using equation 2.2 for OpenGL, and 2.3 for ES.
Now that Ivybridge supports OpenGL 4.2, this is wrong - we need to use
the new rules, at least in core profile. That would leave Gen4-6 doing
something different than all other hardware, which seems...lame.
With context version promotion, applications that requested a pre-4.2
context may get promoted to 4.2, and thus get the new rules. Zero cases
have been reported of this being a problem. However, we've received a
report that following the old rules breaks expectations. SuperTuxKart
apparently renders the cars red when following equation 2.2, and works
correctly when following equation 2.3:
https://github.com/supertuxkart/stk-code/issues/2885#issuecomment-353858405
So, this patch deletes the legacy equation 2.2 support entirely, making
all hardware and APIs consistently use the new equation 2.3 rules.
If we ever find an application that truly requires the old formula, then
we'd likely want that application to work on modern hardware, too. We'd
likely restore this support as a driconf option. Until then, drop it.
This commit will regress Piglit's draw-vertices-2101010 test on
pre-Haswell without the corresponding Piglit patch to accept either
formula (commit 35daaa1695ea01eb85bc02f9be9b6ebd1a7113a1):
draw-vertices-2101010: Accept either SNORM conversion formula.
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chris Forbes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fix incomplete check of input params in blorp_surf_convert_to_uncompressed()
which can lead to NULL pointer dereferencing.
Fixes: 5ae8043fed2 ("intel/blorp: Add an entrypoint for doing
bit-for-bit copies")
Fixes: f395d0abc83 ("intel/blorp: Internally expose
surf_convert_to_uncompressed")
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Andres Gomez <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
The only reason why we needed that version was because the Vulkan driver
needed to be able to create the surface states so it could handle
indirect clear colors. Now that blorp handles them natively, there's no
need for the extra entrypoint.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This doesn't go all the way of avoiding the txf_ms if it's fast-cleared,
however it does at least make us only do it once. This should improve
performance of MSAA resolves in the presence of lots of clear color.
Without the patch, enabling fast-clears in the multisampling Sascha demo
drops the framerate by about 10%. With this patch, enabling fast-clears
increases the demo's framerate by 25%.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
|
| |
That name is already taken by one of the helpers in blorp_nir_builder.h
and, while we haven't moved the guts of blorp_blit.c there yet, we'd
like to start using some things from that header.
Reviewed-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Nanley Chery <[email protected]>
|
|
|
|
|
|
|
|
| |
This makes our MOCS settings significantly more flexible.
Cc: "17.3" <[email protected]>
Tested-by: Lyude Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Cc: "17.3" <[email protected]>
Tested-by: Lyude Paul <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
It's a neat idea, and still useful in some cases, but the intel common
code is used by i965 and anvil only, this is a little clearer.
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The caller can now use brw_stage_prog_data::program_size which is set
by the brw_compile_* functions.
Cc: Jason Ekstrand <[email protected]>
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Lionel Landwerlin <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
It's redundant with nir_shader::info::stage.
Acked-by: Timothy Arceri <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This moves us away to the array of pointers model and onto a model where
each param is represented by a generic uint32_t handle. We reserve 2^16
of these handles for builtins that get generated by somewhere inside the
compiler and have well-defined meanings. Generic params have handles
whose meanings are defined by the driver.
The primary downside to this new approach is that it moves a little bit
of the work that we would normally do at compile time to draw time. On
my laptop this hurts OglBatch6 by no more than 1% and doesn't seem to
have any measurable affect on OglBatch7. So, while this may come back
to bite us, it doesn't look too bad.
Reviewed-by: Jordan Justen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This allows building and installing the Intel "anv" Vulkan driver using
meson and ninja, the driver has been tested against the CTS and has
seems to pass the same series of tests (they both segfault when the CTS
tries to run wayland wsi tests).
There are still a mess of TODO, XXX, and FIXME comments in here. Those
are mostly for meson bugs I'm trying to fix, or for additional things to
implement for other drivers/features.
I have configured all intermediate libraries and optional tools to not
build by default, meaning they will only be built if they're pulled in
as a dependency of a target that will actually be installed) this allows
us to avoid massive if chains, while ensuring that only the bits that
need to be built are.
v2: - enable anv, x11, and wayland by default
- add configure option to disable valgrind
v3: - fix typo in meson_options (Nicholas)
v4: - Remove dead code (Eric)
- Remove change to generator that was from v0 (Eric)
- replace if chain with loop (Eric)
- Fix typos (Eric)
- define HAVE_DLOPEN for both libdl and builtin dl cases (Eric)
v5: - rebase on util string buffer implementation
Signed-off-by: Dylan Baker <[email protected]>
Reviewed-by: Eric Anholt <[email protected]> (v4)
|
|
|
|
|
|
| |
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
|
|
|
|
| |
Vulkan needs to be able to clear any texture you can create. We want to
add support for VK_FORMAT_R8_SRGB and we need to use L8_UNORM_SRGB to do
that so we need to be able to clear it.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
| |
Gen4-6 can only handle surfaces up to 8192. Only Gen7+ can do 16384.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
I want to be able to copy between buffer objects using BLORP in the i965
driver. Anvil already had code to do this, in a reasonably efficient
manner - first using large bpp copies, then smaller bpp copies.
This patch moves that logic into BLORP as blorp_buffer_copy(), so we
can use it in both drivers.
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes warnings like
warning: implicit conversion from enumeration type 'enum isl_format' to
different enumeration type 'enum GEN10_SURFACE_FORMAT'
[-Wenum-conversion]
.SourceElementFormat = ISL_FORMAT_R32_UINT,
^~~~~~~~~~~~~~~~~~~
Reviewed-by: Emil Velikov <[email protected]>
Reviewed-by: Eric Engestrom <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The PRM SKL-Vol 2b-05.16 says:
"Within a VERTEX_ELEMENT_STATE structure, if a Component Control
field is set to something other than VFCOMP_STORE_SRC, no
higher-numbered Component Control fields may be set to
VFCOMP_STORE_SRC. In other words, only trailing components can be set
to something other than VFCOMP_STORE_SRC."
Since we set the component 1 to VFCOMP_STORE_0 on gen8+, and
VFCOMP_STORE_IID on gen5+, and we are not using components 2 and 3,
let's also set them to VFCOMP_STORE_0.
Signed-off-by: Rafael Antognolli <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
v2 (Jason): Adjust directly in surf_fake_rgb_with_red()
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101910
CC: [email protected]
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The kernel only cares about whether the object is to be written to or
not, only reduces (reloc.read_domains, reloc.write_domain) down to just
!!reloc.write_domain. When we use NO_RELOC, the kernel doesn't even read
those relocs and instead userspace has to pass that information in the
execobject.flags. We can simplify our reloc api by also removing the
unused read/write domains and only pass the resultant flags.
The caveat to the above are when we need to make the kernel aware that
certain objects need to take into account different work arounds.
Previously, this was done using the magic (INSTRUCTION, INSTRUCTION)
reloc domains. NO_RELOC requires this to be passed in the execobject
flags as well, and now we push that up the callstack.
The API is more compact, more expressive of what happens underneath, but
unfortunately requires more knowledge of the system at the point of use.
Conversely it also means that knowledge is specific and not generally
applied and so not overused.
text data bss dec hex filename
8502991 356912 424944 9284847 8dacef lib/i965_dri.so (before)
8500455 356912 424944 9282311 8da307 lib/i965_dri.so (after)
v2: (by Ken) Rebase.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
We already have a helper for doing this in BLORP, this just moves the
logic into ISL where we can share it with other components.
Reviewed-by: Lionel Landwerlin <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
| |
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
This will falsely trigger an assert on number of layers once
isl is used for 3D layouts of Gen4 cube maps.
Reviewed-by: Jason Ekstrand <[email protected]>
Signed-off-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Apparently, the sampler has some sort of precision issues for
non-normalized texture coordinates with linear filtering. This caused
some small precision issues in scaled blits. Work around this by using
normalized coordinates. There is some extra work necessary because Gen6
uses TEX (instead of TXF) for some multisample resolve blits.
Fixes piglit.spec.arb_framebuffer_object.fbo-blit-stretch on SNB.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68365
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2:
- Do layered resolves.
(Jason Ekstrand):
- Replace "bt" suffix with "attachment".
- Rename helper function to prepare_ccs_resolve.
- Move blorp_params_init() into helper function.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: Update commit title (Jason Ekstrand)
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]> (v1)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2 (Jason Ekstrand):
- Update commit title.
- Check aux level and layer as well.
v3 (Jason Ekstrand):
- Move the non-aux layer check.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]> (v1)
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
| |
Previously the offset was only applied in the TXF case.
Signed-off-by: Ian Romanick <[email protected]>
Suggested-by: Jason Ekstrand <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Otherwise the values used for coordinate normalization use the wrong
sizes.
Signed-off-by: Ian Romanick <[email protected]>
Suggested-by: Jason Ekstrand <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We call convert_to_single_slice so they may end up with a non-trivial
offset that needs to be taken into account.
v2 (idr): Also set needs_src_offset. Suggested by Jason.
Fixes ES2-CTS.functional.texture.specification.basic_copyteximage2d.cube_rgba
and ES2-CTS.functional.texture.specification.basic_copytexsubimage2d.cube_rgba
on G45.
Signed-off-by: Ian Romanick <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101284
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
| |
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
| |
The blorp_hiz_op entrypoint always acts on a full subresource of a HiZ
buffer so we can just set the flag unconditionally.
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101283
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|