| Commit message (Collapse) | Author | Age | Files | Lines |
... | |
|
|
|
|
|
|
|
| |
v2 (Ken): Drop GEN8_RASTER_FRONT_WINDING_CCW in raster state
Add emission of pma stall.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Ken): Added switch cases for gen8/9 in texel_fetch(). These
were wrongly introduced in blit-enabling patch.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2 (Ken): Use payload directly instead of retyping it into vec8.
Drop the implied header, it isn't used for gen6+ anyway.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we hardcoded "VS URB Starting Address" to 2 (in 8kB chunks),
which meant VS URB data would start at an offset of 16kB.
However, on Haswell GT3 and Gen8+, we allocate the first 32kB for the
push constant region. This means that the PS push constant and VS URB
data regions overlap, which can lead to corruption.
v2 (Ken): Better description of the change, and do not change vs_size
from 2 to 1.
Cc: [email protected]
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Currently the size is sizeof(float) times too large. One reserves
GEN6_BLORP_VBO_SIZE many floats whereas GEN6_BLORP_VBO_SIZE stands
for the size of vertex buffer in bytes.
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Topi Pohjolainen <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Fixes dEQP-GLES31.functional.shaders.multisample_interpolation tests:
- interpolate_at_sample.non_multisample_buffer.sample_n_default_framebuffer
- interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_rbo
- interpolate_at_sample.non_multisample_buffer.sample_n_singlesample_texture
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The coverage mask is not sufficient - in per-sample mode, we also need
to AND with a mask representing the samples being processed by the
current fragment shader invocation.
Fixes 18 dEQP-GLES31.functional.shaders.sample_variables tests:
sample_mask_in.bit_count_per_sample.multisample_{rbo,texture}_{1,2,4,8}
sample_mask_in.bit_count_per_two_samples.multisample_{rbo,texture}_{4,8}
sample_mask_in.bits_unique_per_sample.multisample_{rbo,texture}_{1,2,4,8}
sample_mask_in.bits_unique_per_two_samples.multisample_{rbo,texture}_{4,8}
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The ARB_sample_shading specification says that setting gl_SampleMask
bits to 0 means that the corresponding sample "should be considered
uncovered for the purposes of multisample fragment operations
(Section 4.1.3)."
The OpenGL 4.4 specification, section 17.3.3 ("Multisample Fragment
Operations") specifies:
"No changes to the fragment alpha or coverage values are made at this
step if MULTISAMPLE is disabled, or if the value of SAMPLE_BUFFERS
is not one."
oMask output alters coverage masks and can kill pixels. We need to
disable it in the above case, which conveniently corresponds to
key->multisample_fbo being false.
Khronos bug #12188 also spells this out clearly:
https://cvs.khronos.org/bugzilla/show_bug.cgi?id=12188
Fixes two Piglit tests:
tests/spec/arb_sample_shading/builtin-gl-sample-mask-simple 0
tests/spec/arb_sample_shading/builtin-gl-sample-mask 0
Fixes 21 ES3 conformance tests:
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_0
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_1
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8.samples_0.mask_7
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_4
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_5
ES31-CTS.sample_variables.mask.rgba8i.samples_0.mask_7
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_3
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_4
ES31-CTS.sample_variables.mask.rgba8ui.samples_0.mask_6
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_zero
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_0
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_2
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_5
ES31-CTS.sample_variables.mask.rgba32f.samples_0.mask_7
Fixes 9 dEQP-GLES31.functional.shaders.sample_variables tests:
sample_mask.discard_half_per_pixel.default_framebuffer
sample_mask.discard_half_per_pixel.singlesample_rbo
sample_mask.discard_half_per_pixel.singlesample_texture
sample_mask.discard_half_per_sample.default_framebuffer
sample_mask.discard_half_per_sample.singlesample_rbo
sample_mask.discard_half_per_sample.singlesample_texture
sample_mask.discard_half_per_two_samples.default_framebuffer
sample_mask.discard_half_per_two_samples.singlesample_rbo
sample_mask.discard_half_per_two_samples.singlesample_texture
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I'm going to need a key entry meaning "we have a multisample FBO,
and multisampling is enabled" in an upcoming patch. This is basically
wm_key->compute_sample_id, except that it also checks that the SAMPLE_ID
system value is read.
The only use of wm_key->compute_sample_id is in emit_sampleid_setup(),
which is only called when handling the SAMPLE_ID system value. So we
can just eliminate the check and generalize the field.
v2: Also update the Vulkan driver.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This was only used by the old gl_SampleID calculations. The new code
doesn't need to handle 2x specially.
v2: Delete it from the Vulkan driver, too.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
On Gen7+, the thread payload provides the sample ID - we can read it
in two instructions, without any elaborate calculations. We don't even
need a state dependency - this will properly produce zero in the
non-MSAA case. Unfortunately, we need the state flag anyway, so we
may as well continue to use it to produce a single MOV 0 instead of
SHR/AND.
For some reason, the sample ID field is always zero on Gen7/7.5, so
we can't use this yet. However, it works fine on Gen8+. So, land the
code and use it where it's working, and leave a TODO for later.
v2: Fix register types in the comment (caught by Matt Turner!).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
| |
This just moves the simple case first.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Some CAPs are stored as 64-bit value while Mesa stores
the related constant as 32-bit value.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, opt_vector_float() always interpreted MOV sources as
floating point, and always created a MOV with a F-type destination.
This meant that we could mess up sequences of integer loads, such as:
mov vgrf6.0.x:D, 0D
mov vgrf6.0.y:D, 1D
mov vgrf6.0.z:D, 2D
mov vgrf6.0.w:D, 3D
Here, integer 0/1/2/3 become approximately 0.0f, so we generated:
mov vgrf6.0:F, [0F, 0F, 0F, 0F]
which is clearly wrong. We can properly handle this by converting
integer values to float (rather than bitcasting), and emitting a type
converting MOV:
mov vgrf6.0:D, [0F, 1F, 2F, 3F]
To do this, see first see if the integer values (converted to float)
are representable. If so, we use a D-type MOV. If not, we then try
the floating point values and an F-type MOV. We make zero not impose
type restrictions. This is important because 0D would imply a D-type
MOV, but is often used in sequences such as MOV 0D, MOV 0x3f800000D,
where we want to use an F-type MOV.
Fixes about 54 dEQP-GLES2 failures with the vec4 VS backend. This
recently became visible due to changes in opt_vector_float() which
made it optimize more cases, but it was a pre-existing bug.
Apparently it also manages to turn more integer loads into VFs,
producing the following shader-db statistics on Haswell:
total instructions in shared programs: 7084195 -> 7082191 (-0.03%)
instructions in affected programs: 246027 -> 244023 (-0.81%)
helped: 1937
total cycles in shared programs: 65669642 -> 65651968 (-0.03%)
cycles in affected programs: 531064 -> 513390 (-3.33%)
helped: 1177
v2: Handle the type of zero better.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We don't handle this properly - we'd have to perform the type conversion
before trying to convert the value to a VF.
While we could do that, it doesn't seem particularly useful - most
vector loads should be consistently typed (all float or all integer).
As a special case, we do allow type-converting MOVs of integer 0, as
it's represented the same regardless of the type. I believe this case
does actually come up.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
After the previous patch, this helper is only called in one place.
So, just fold it back in - there are a lot of parameters here and
not much code.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reworks opt_vector_float() so that there's only one place that
flushes out any accumulated state and emits a VF.
v2: Don't break the sequence for non-representable numbers - just skip
recording their values. Only break it for non-MOVs or register
changes.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally, we split uniforms at the end but in Vulkan, we bail because we
don't want pull constants. However, we still need them split because
pack_uniforms relies on it.
I really don't like this patch not because it doesn't work (it does) but
because now that we're using MOV_INDIRECT, uniform numbers and sizes don't
really matter anymore. In the FS backend, uniform splitting and packing is
handled all at once (actual re-assignment of locations happens later) and
we really should do it that way in vec4 eventually as well.
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
|
| |
This was actually caught by Ken in review the first time around but somehow
didn't get fixed before the patches were pushed. :-(
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
| |
We shouldn't be reading the const_index directly
Reviewed-by: Iago Toral Quiroga <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94998
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=95001
|
|
|
|
|
|
|
|
| |
v2: Also depend on atomic counters.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
They were already declared as such. It was changed here:
commit 31f0967fb50101437d2568e9ab9640ffbcbf7ef9
Author: Ian Romanick <[email protected]>
Date: Wed Sep 2 14:43:18 2015 -0700
i965: Make intel_miptree_map_raw static
Cc: Ian Romanick <[email protected]>
Signed-off-by: Ben Widawsky <[email protected]>
Reviewed-by: Timothy Arceri <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Color clears should respect each drawbuffer's color mask state.
Previously, we tried to leave the color mask untouched. However,
_mesa_meta_drawbuffers_from_bitfield() ended up rebinding all the
color drawbuffers in a different order, so we ended up pairing
drawbuffers with the wrong color mask state.
The new _mesa_meta_drawbuffers_and_colormask() function does the
same job as the old _mesa_meta_drawbuffers_from_bitfield(), but
also rearranges the color mask state to match the new drawbuffer
configuration.
This code was largely ripped off from Gallium's st_Clear code.
This fixes ES31-CTS.draw_buffers_indexed.color_masks, which binds
up to 8 drawbuffers, sets color masks for each, and then calls
glClearBufferfv to clear each buffer individually. ClearBuffer
causes us to rebind only one drawbuffer, at which point we used
ctx->Color.ColorMask[0] (draw buffer 0's state) for everything.
We could probably delete _mesa_meta_drawbuffers_from_bitfield(),
but I'd rather not think about the i965 fast clear code. Topi is
rewriting a bunch of that soon anyway, so let's delete it then.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94847
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This allows meta operations to inspect the existing color mask, and
then do their own smashing.
BlitFramebuffer and Clear already override the color mask, so this
was also redundant.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
We have had a guard against OOB array access of images on IVB for a long
time, but it can actually cause hangs on any GPU generation. This can
happen due to getting an untyped SURFACE_STATE for a typed message. We
didn't used to hit this with the piglit test on anything other than IVB
because the OOB in the test would cause us to go past the top of the pull
constant UBO and we would get a surface index of 0 which is was always a
valid surface. Now that we're pushing small arrays, we can end up grabbing
garbage from the GRF and going to some random index which causes a hang.
The solution is to just do the bounds check on all hardware.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94944
Reviewed-by: Francisco Jerez <[email protected]>
Tested-by: Mark Janes <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
CompressedTexImage
Enable drivers to use their own implementation of this method instead of
the mesa default. Since the drivers that currently overwrite
dd_function_table::CompressedTexSubImage also overwrite
::CompressedTexImage, there should be no behavioral change.
Signed-off-by: Nanley Chery <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Up until now, we have been able to assume that all push constants are
vec4-aligned because this is what the GL driver gives us. In Vulkan, we
need to be able to support full std140 because we get the layout from the
client.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
The surface format table hasn't entirely been kept up-to-date. This commit
marks a couple more compressed formats as sampleable on gen8+ and adds the
A4B4G4R4 format as renderable on gen9.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
This removes a hack introduced in 1999 in the first version of
fakeglx.c, with the comment:
/* XXX revisit this after 3.0 is finished. */
Mesa 4.0 was released in 2001. It is now 2016, and Mesa 11.0 was
released last year.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The vishandle member of XMesaVisualInfo is used to support the
comparison of XVisualInfo instances by pointer value, in
find_glx_visual(). The comparison however will always be false, as in
every case the comparison is made, the VisualInfo instance being
compared to is a new allocation passed in through a GLX API call.
In addition, the XVisualInfo instance pointed to by vishandle is itself
never freed, causing a memory leak. Since vishandle is essentially
useless, we just remove it and thereby also fix the leak.
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The returned XVisualInfo from glXChooseVisual/glXGetVisualFromFBConfig
is being cached in XMesaVisual.vishandle (and unconditionally
overwritten on subsequent calls). However, these entry points are
specified to return XVisualInfo instances to be owned by the caller and
freed with XFree(), so the return values should not be retained.
With this change, XMesaVisual.vishandle is essentially unused and will
be removed in a subsequent change.
v2: update commit message
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
driCreateContextAttribs() emits an error if bit
__DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS is set for an ES context. But,
EGL_EXT_create_context_robustness and EGL 1.5 both allow creation of
robust ES contexts. One requests a robust ES context by setting the
EGL_CONTEXT_OPENGL_ROBUST_ACCESS *attribute*, which Mesa's EGL layer
translates into the __DRI_CTX_FLAG_ROBUST_BUFFER_ACCESS *bit*.
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
| |
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Unfortunately, this also means that we need to use a slightly different
algorithm for assign_constant_locations. The old algorithm worked based on
the assumption that each read of a uniform value read exactly one float.
If it encountered a MOV_INDIRECT, it would immediately bail and push the
whole thing. Since we can now read ranges using MOV_INDIRECT, we need to
be able to push a series of floats without breaking them up. To do this,
we use an algorithm similar to the on in split_virtual_grfs.
Reviewed-by: Kristian Høgsberg <[email protected]>
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Kristian Høgsberg <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|