| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
It's called by the inline intel_batchbuffer_begin() function which
itself is used in BEGIN_BATCH. So in sequence of code emitting multiple
packets, we have inlined this ~200 byte function multiple times. Making
it an out-of-line function presumably improved icache usage.
Improves performance of Gl32Batch7 by 3.39898% +/- 0.358674% (n=155) on
Ivybridge.
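
The trade-off described above can be sketched as follows. This is a minimal illustration, not the driver's code: `struct batch`, `batch_require_space()` and `batch_begin()` are hypothetical names, and the flush is reduced to resetting a counter.

```c
#include <assert.h>

/* Hypothetical batch buffer; field names are illustrative only. */
struct batch {
    unsigned used;     /* dwords already emitted */
    unsigned capacity; /* total dwords available */
    unsigned flushes;  /* how often the batch was flushed */
};

/* Out-of-line helper: the larger body exists once in the binary.
 * In a real build it would live in its own .c file so the compiler
 * cannot inline it back into every caller. */
void batch_require_space(struct batch *b, unsigned dwords)
{
    assert(dwords <= b->capacity);
    if (b->capacity - b->used < dwords) {
        b->used = 0; /* stand-in for submitting the batch */
        b->flushes++;
    }
}

/* Inline wrapper: only a call and an add are duplicated at each
 * packet-emitting site. */
static inline void batch_begin(struct batch *b, unsigned dwords)
{
    batch_require_space(b, dwords);
    b->used += dwords;
}
```

Each call site then pays for a function call instead of a copy of the whole space check, which is the icache saving the message speculates about.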
Reviewed-by: Abdiel Janulgue <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Since there is no way to create immutable texture buffers in GL ES,
mutable buffer textures are allowed to back images. See issue 7 of the
GL_OES_texture_buffer specification.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Samuel Pitoiset <[email protected]>
|
|
|
|
|
|
|
|
|
| |
No longer called from any other file.
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Tested-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The prepare_mipmap_level() wrapper for _mesa_prepare_mipmap_level() is
not needed. It only served to undo the GL_TEXTURE_1D_ARRAY height/depth
change that was made before the call to prepare_mipmap_level().
Said another way, regardless of how the meta code manipulates the height/
depth dims for GL_TEXTURE_1D_ARRAY, the gl_texture_image dimensions are
correctly set up by _mesa_prepare_mipmap_levels().
Tested by plugging _mesa_meta_GenerateMipmap() into the swrast driver
and testing with piglit.
v2 (idr): Early out of the mipmap generation loop when dstImage is NULL.
This can occur for immutable textures that have a limited range of
levels or in the presence of memory allocation failures. Fixes
arb_texture_view-mipgen on Intel platforms.
Reviewed-by: José Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Tested-by: Ian Romanick <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This adds the glXCreateContextAttribsARB() function for the xlib/swrast
driver. This allows more piglit tests to run with this driver.
For example, without this patch we get:
$ bin/fbo-generatemipmap-1d -auto
piglit: error: waffle_config_choose failed due to WAFFLE_ERROR_UNSUPPORTED_ON_PLATFORM: GLX_ARB_create_context is required in order to request an OpenGL version not equal to the default value 1.0
piglit: error: Failed to create waffle_config for OpenGL 2.0 Compatibility Context
piglit: info: Failed to create any GL context
PIGLIT: {"result": "skip" }
Reviewed-by: Jose Fonseca <[email protected]>
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The whole st_generate_mipmap() function was overly complicated. Now
we just call the new _mesa_prepare_mipmap_levels() function to prepare
the texture mipmap memory, then call the generate function which fills
in the texture images.
This fixes a failed assertion in llvmpipe/softpipe which is hit with the
new piglit generatemipmap-base-change test. Also fixes some device errors
(format mismatches) with the VMware svga driver.
v2: fix a comment typo, per Sinclair
Reviewed-by: Sinclair Yeh <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Simplifies the loops in generate_mipmap_uncompressed() and
generate_mipmap_compressed(). Will be used in the state tracker too.
Could probably be used in the meta code. If so, some additional
clean-ups can be done after that.
v2: use unsigned types instead of GLuint, per Ian
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
There is no linear filtering for integer formats, so we should always
be using CLAMP_TO_EDGE mode.
Fixes 46 dEQP cases on Ivybridge (which were likely broken by commit
0faf26e6a0a34c3544644852802484f2404cc83e).
This workaround doesn't appear to be necessary on any other hardware;
I haven't found any documentation mentioning errata in this area.
v2: Only apply on Ivybridge/Baytrail to avoid regressing GLES3.1 tests.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]> [v1]
|
|
|
|
|
|
|
| |
This reverts commit 60d6a8989ab44cf47accee6bc692ba6fb98f6a9f.
It's pretty sketchy, and apparently regressed a bunch of dEQP tests
on Sandybridge.
|
|
|
|
|
|
|
|
|
|
| |
Avoid using internal structures from another API.
v2: rebased and moved includes so they don't cause problems when VDPAU isn't installed.
Signed-off-by: Christian König <[email protected]>
Reviewed-by: Marek Olšák <[email protected]> (v1)
Reviewed-by: Leo Liu <[email protected]>
|
|
|
|
|
|
|
|
| |
OES_texture_buffer combines bits from a number of desktop extensions.
When they're all available, turn it on.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
| |
Allow ES 3.1 contexts to access the texture buffer functionality.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We need to add a new bit since the GL ES exts require functionality from
a combination of texture buffer extensions as well as images (for
imageBuffer) support. Additionally, not all GPUs support all the texture
buffer functionality (e.g. rgb32 isn't supported by nv50).
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This fixes all failures with dEQP tests in this area. While
ARB_texture_buffer_object explicitly says that GetTexLevelParameter & co
should not be supported, GL 3.1 reverses this decision and allows all of
these queries there.
Conversely, there is no text that forbids the buffer-specific queries
from being used with non-buffer images.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
The only place this was used was in a gallium debug function that
had to be manually enabled.
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Yuanhan Liu decided these were useful for linear filtering in
commit 76669381 (circa 2011). Prior to that, we never set them;
it seems he tried to preserve that behavior for nearest filtering.
It turns out they're useful for nearest filtering, too: setting
these fixes the following dEQP-GLES3 tests:
functional.fbo.blit.rect.nearest_consistency_mag
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_min
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_min_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_mag_reverse_src_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_dst_y
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_x
functional.fbo.blit.rect.nearest_consistency_out_of_bounds_min_reverse_src_dst_y
Apparently, BLORP has always set these bits unconditionally.
However, setting them unconditionally appears to regress tests using
texture projection, 3D samplers, integer formats, and vertex shaders,
all in combination, such as:
functional.shaders.texture_functions.textureprojlod.isampler3d_vertex
Setting them on Gen4-5 appears to regress Piglit's
tests/spec/arb_sampler_objects/framebufferblit.
Honestly, it looks like the real problem here is a lack of precision.
I'm just hacking around problems here (as embarrassing as it is).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When using seamless cube map mode and NEAREST filtering, we explicitly
overrode the wrap modes to CLAMP_TO_EDGE. This was to implement the
following spec text:
"If NEAREST filtering is done within a miplevel, always apply
wrap mode CLAMP_TO_EDGE."
However, textureGather() ignores the sampler's filtering mode, and
instead returns the four pixels that would be blended by LINEAR
filtering. This implies that we should do proper seamless filtering,
and include pixels from adjacent cube faces.
It turns out that we can simply delete the NEAREST -> CLAMP_TO_EDGE
overrides. Normal cube map sampling works by first selecting the
face, and then nearest filtering fetches the closest texel. If the
nearest texel was on a different face, then that face would have been
chosen. So it should always be within the face anyway, which
effectively performs CLAMP_TO_EDGE.
Fixes 86 dEQP-GLES31.texture.gather.basic.cube.* tests.
Signed-off-by: Kenneth Graunke <[email protected]>
Suggested-by: Ian Romanick <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Our driver uses the brw_render_cache mechanism to track buffers we've
rendered to and are about to sample from.
Previously, we did a single PIPE_CONTROL with the following bits set:
- Render Target Flush
- Depth Cache Flush
- Texture Cache Invalidate
- VF Cache Invalidate
- Instruction Cache Invalidate
- CS Stall
This combined both "top of pipe" invalidations and "bottom of pipe"
flushes, which isn't how the hardware is intended to be programmed.
The "top of pipe" invalidations may happen right away, without any
guarantees that rendering using those caches has completed. That
rendering may continue altering the caches. The "bottom of pipe"
flushes do wait for the rendering to complete. The CS stall also
prevents further work from happening until data is flushed out.
What we wanted to do was wait for rendering to complete, flush the new
data out of the render and depth caches, wait, then invalidate any
stale data in read-only caches. We can accomplish this by doing the
"bottom of pipe" flushes with a CS stall, then the "top of pipe"
invalidations as a second PIPE_CONTROL. The flushes will wait until the
rendering is complete, and the CS stall will prevent the second
PIPE_CONTROL with the invalidations from executing until the first
is done.
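
The ordering described above can be sketched like this. It is only an illustration under assumptions: the flag values, `struct cmd_stream` and both functions are hypothetical stand-ins that record packets rather than program hardware.

```c
#include <assert.h>

/* Hypothetical flag bits; real PIPE_CONTROL encodings differ. */
enum {
    PC_RENDER_TARGET_FLUSH      = 1 << 0,
    PC_DEPTH_CACHE_FLUSH        = 1 << 1,
    PC_TEXTURE_CACHE_INVALIDATE = 1 << 2,
    PC_VF_CACHE_INVALIDATE      = 1 << 3,
    PC_INSTRUCTION_INVALIDATE   = 1 << 4,
    PC_CS_STALL                 = 1 << 5
};

#define MAX_PACKETS 8

/* Stand-in command stream that just records the flags of each packet. */
struct cmd_stream {
    unsigned packets[MAX_PACKETS];
    int count;
};

static void emit_pipe_control(struct cmd_stream *cs, unsigned flags)
{
    assert(cs->count < MAX_PACKETS);
    cs->packets[cs->count++] = flags;
}

/* Two packets instead of one: flush and stall first, invalidate second. */
void emit_render_cache_flush(struct cmd_stream *cs)
{
    /* Bottom of pipe: waits for rendering, writes dirty data out. */
    emit_pipe_control(cs, PC_RENDER_TARGET_FLUSH |
                          PC_DEPTH_CACHE_FLUSH |
                          PC_CS_STALL);
    /* Top of pipe: runs only after the stalled packet has finished. */
    emit_pipe_control(cs, PC_TEXTURE_CACHE_INVALIDATE |
                          PC_VF_CACHE_INVALIDATE |
                          PC_INSTRUCTION_INVALIDATE);
}
```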
Fixes dEQP-GLES3.functional.texture.specification.teximage2d_pbo
subtests on Braswell and Skylake. These tests hit the meta PBO
texture upload path, which binds the PBO as a texture and samples
from it, while rendering to the destination texture. The tests
then sample from the texture.
For now, we leave Gen4-5 alone. It probably needs work too, but
apparently it hasn't even been setting the (G45+) TC invalidation
bit at all...
v2: Add Sandybridge post-sync non-zero workaround, for safety.
Cc: [email protected]
Suggested-by: Francisco Jerez <[email protected]>
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
dEQP-GLES31.functional.fbo.no_attachments.* draws a quad with no
framebuffer attachments, using a shader that discards based on
gl_FragCoord. It uses occlusion queries to inspect whether pixels
are rendered or not.
Unfortunately, the hardware is not dispatching any pixel shaders,
so discards never happen, and the full quad of pixels increments
PS_DEPTH_COUNT, making the occlusion query results bogus.
To understand why, we have to delve into the WM_INT internal
signalling mechanism's formulas.
The "WM_INT::Pixel Shader Kill Pixel" signal is defined as:
    3DSTATE_WM::ForceKillPixel == ON ||
    (3DSTATE_WM::ForceKillPixel != Off &&
     !WM_INT::WM_HZ_OP &&
     3DSTATE_WM::EDSC_Mode != PREPS &&
     (WM_INT::Depth Write Enable || WM_INT::Stencil Write Enable) &&
     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
     (3DSTATE_PS_EXTRA::PixelShaderKillsPixels ||
      3DSTATE_PS_EXTRA::oMask Present to RenderTarget ||
      3DSTATE_PS_BLEND::AlphaToCoverageEnable ||
      3DSTATE_PS_BLEND::AlphaTestEnable ||
      3DSTATE_WM_CHROMAKEY::ChromaKeyKillEnable))
Because there is no depth or stencil buffer, writes to those buffers
are disabled. So the highlighted condition is false, making the whole
"Kill Pixel" condition false. This then feeds into the following
"WM_INT::ThreadDispatchEnable" condition:
    3DSTATE_WM::ForceThreadDispatch != OFF &&
    !WM_INT::WM_HZ_OP &&
    3DSTATE_PS_EXTRA::PixelShaderValid &&
    (3DSTATE_PS_EXTRA::PixelShaderHasUAV ||
     WM_INT::Pixel Shader Kill Pixel ||
     WM_INT::RTIndependentRasterizationEnable ||
     (!3DSTATE_PS_EXTRA::PixelShaderDoesNotWriteRT &&
      3DSTATE_PS_BLEND::HasWriteableRT) ||
     (WM_INT::Pixel Shader Computed Depth Mode != PSCDEPTH_OFF &&
      (WM_INT::Depth Test Enable || WM_INT::Depth Write Enable)) ||
     (3DSTATE_PS_EXTRA::Computed Stencil && WM_INT::Stencil Test Enable) ||
     (3DSTATE_WM::EDSC_Mode == 1 && (WM_INT::Depth Test Enable ||
                                     WM_INT::Depth Write Enable ||
                                     WM_INT::Stencil Test Enable)))
Given that there's no depth/stencil testing, no writeable render target,
and the hardware thinks kill pixel doesn't happen, all of these
conditions are false. We have to whack some bit to make PS invocations
happen. There are many options.
Curro suggested using the UAV bit. There's some precedent for doing
that - we set it for fragment shaders that do SSBO/image/atomic writes
when no color buffer writes are enabled. We can simply include discard
here too.
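
To make the reasoning easier to follow, the two formulas quoted above can be transcribed into plain C booleans. This is only a sketch: `struct wm_inputs` and its field names are hypothetical, and the numeric value chosen for `EDSC_PREPS` is an assumption made so the comparison compiles.

```c
#include <assert.h>
#include <stdbool.h>

enum force { FORCE_NORMAL, FORCE_ON, FORCE_OFF };

#define EDSC_PREPS 2 /* assumed encoding, for illustration only */

/* Hypothetical flattened view of the state the formulas read. */
struct wm_inputs {
    enum force force_kill_pixel, force_thread_dispatch;
    int edsc_mode;
    bool hz_op;
    bool depth_write, stencil_write, depth_test, stencil_test;
    bool ps_kills_pixels, omask_to_rt, alpha_to_coverage;
    bool alpha_test, chroma_key_kill;
    bool ps_valid, ps_has_uav, rt_independent_raster;
    bool ps_does_not_write_rt, has_writeable_rt;
    bool computed_depth, computed_stencil;
};

/* "WM_INT::Pixel Shader Kill Pixel" */
bool ps_kill_pixel(const struct wm_inputs *s)
{
    return s->force_kill_pixel == FORCE_ON ||
           (s->force_kill_pixel != FORCE_OFF &&
            !s->hz_op &&
            s->edsc_mode != EDSC_PREPS &&
            (s->depth_write || s->stencil_write) &&
            (s->ps_kills_pixels || s->omask_to_rt ||
             s->alpha_to_coverage || s->alpha_test ||
             s->chroma_key_kill));
}

/* "WM_INT::ThreadDispatchEnable" */
bool thread_dispatch_enable(const struct wm_inputs *s)
{
    return s->force_thread_dispatch != FORCE_OFF &&
           !s->hz_op &&
           s->ps_valid &&
           (s->ps_has_uav ||
            ps_kill_pixel(s) ||
            s->rt_independent_raster ||
            (!s->ps_does_not_write_rt && s->has_writeable_rt) ||
            (s->computed_depth && (s->depth_test || s->depth_write)) ||
            (s->computed_stencil && s->stencil_test) ||
            (s->edsc_mode == 1 &&
             (s->depth_test || s->depth_write || s->stencil_test)));
}
```

With a valid shader that discards but no depth, stencil or color attachment, both functions return false, and setting `ps_has_uav` alone is enough to turn dispatch back on.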
Fixes 64 dEQP-GLES31.functional.fbo.no_attachments.* tests.
v2: Add a comment suggested and written by Jason Ekstrand.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Francisco Jerez <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The st_texture_object documentation says:
"the number of 1D array layers will be in height0"
We can't minify that.
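
A small sketch of the distinction, assuming the usual "shift right, clamp to 1" minification rule. `minify()` and `level_size_1d_array()` are hypothetical helpers, not state tracker functions.

```c
#include <assert.h>

/* Size of a dimension at a given mipmap level, never below 1. */
static unsigned minify(unsigned value, unsigned level)
{
    unsigned v = value >> level;
    return v ? v : 1;
}

/* For a 1D array texture the second size is a layer count, so only
 * the width shrinks with the level. */
void level_size_1d_array(unsigned width0, unsigned layers, unsigned level,
                         unsigned *width, unsigned *height)
{
    assert(width && height);
    *width = minify(width0, level);
    *height = layers; /* layer count stays the same at every level */
}
```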
Spotted by luck. No app is known to hit this issue.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
v2: comment about the purpose of the code
v3: also compare texFormat,
add a perf debug message,
formatting fixes
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
This fixes a crash when post-processing is enabled in SW:KotOR.
v2: fix const-ness
v3: move assignment into the if() block
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v2: fix arithmetic for special opcodes,
fix fog state, cleanup
v3: simplify handling of special opcodes,
fix rebinding with different textargets or fog equation,
lots of formatting fixes
v4: adapt to the compile early, fix later architecture,
formatting fixes
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
| |
the state tracker will use it
Acked-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Signed-off-by: Miklós Máté <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
While here, remove the intermediate glsl_feature_level variable.
Signed-off-by: Edward O'Callaghan <[email protected]>
Signed-off-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This reduces some of the craziness required for handling buffer
blocks. The problem is that each shader stage holds its own information
about a block in memory. We were copying that information to a
program-wide list, but the per-stage information remained, meaning
that when a binding was updated we needed to update all versions of it.
This changes the per-stage blocks to instead point to a single
version of the block information in the program list.
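
The resulting layout can be pictured roughly as below. This is a sketch with hypothetical type and function names, not the linker's data structures: the program owns the block records and each stage keeps pointers into that list.

```c
#include <assert.h>

#define MAX_BLOCKS 4

struct block_info {
    const char *name;
    unsigned binding;
};

/* A stage no longer owns copies; it points into the program list. */
struct stage_info {
    struct block_info *blocks[MAX_BLOCKS];
    unsigned num_blocks;
};

struct program_info {
    struct block_info blocks[MAX_BLOCKS]; /* the single copy */
    unsigned num_blocks;
    struct stage_info vertex, fragment;
};

void stage_reference_block(struct program_info *prog,
                           struct stage_info *stage, unsigned index)
{
    assert(index < prog->num_blocks && stage->num_blocks < MAX_BLOCKS);
    stage->blocks[stage->num_blocks++] = &prog->blocks[index];
}

/* One write updates what every stage sees. */
void program_set_binding(struct program_info *prog, unsigned index,
                         unsigned binding)
{
    assert(index < prog->num_blocks);
    prog->blocks[index].binding = binding;
}
```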
Acked-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the ES 3.2 spec, section 16.1.1 (Selecting Buffers for Reading):
"An INVALID_ENUM error is generated if src is not BACK or one of
the values from table 15.5."
Table 15.5 contains NONE and COLOR_ATTACHMENTi.
Mesa properly returned INVALID_ENUM for unknown enums, but it decided
what was known by using read_buffer_enum_to_index, which handles all
enums in every API. So enums that were valid in GL were making it
past the "valid enum" check. Such targets would then be classified
as unsupported, and we'd raise INVALID_OPERATION, but that's technically
the wrong error code.
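
The intended ordering of the two checks can be sketched as follows. The `X_` constants mirror the standard GL enum values, but `read_buffer_error()`, the `supported` flag and the limit of 8 color attachments are assumptions made for illustration, not Mesa's code.

```c
#include <assert.h>
#include <stdbool.h>

#define X_NO_ERROR           0x0000
#define X_NONE               0x0000
#define X_FRONT              0x0404
#define X_BACK               0x0405
#define X_INVALID_ENUM       0x0500
#define X_INVALID_OPERATION  0x0502
#define X_COLOR_ATTACHMENT0  0x8CE0
#define X_MAX_COLOR_ATTACHMENTS 8 /* assumed limit */

enum api { API_GL, API_GLES3 };

/* ES only accepts BACK, NONE and COLOR_ATTACHMENTi. */
static bool is_legal_es3_read_buffer(unsigned src)
{
    if (src == X_NONE || src == X_BACK)
        return true;
    return src >= X_COLOR_ATTACHMENT0 &&
           src < X_COLOR_ATTACHMENT0 + X_MAX_COLOR_ATTACHMENTS;
}

/* 'supported' stands in for "the current framebuffer has this buffer". */
unsigned read_buffer_error(enum api api, unsigned src, bool supported)
{
    /* API-specific enum check first, so desktop-only enums fail here. */
    if (api == API_GLES3 && !is_legal_es3_read_buffer(src))
        return X_INVALID_ENUM;
    if (!supported)
        return X_INVALID_OPERATION;
    return X_NO_ERROR;
}
```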
Fixes dEQP-GLES31's
functional.debug.negative_coverage.get_error.buffer.read_buffer
v2: Only call read_buffer_enum_to_index when required (Eduardo).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The KHR_debug spec doesn't actually say we should handle this, but that
is most likely an oversight - it says to check against strlen and
generate errors if length is negative. It appears they just forgot to
explicitly spell out that we should then proceed to actually handle it.
Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the KHR_debug spec, section 5.5.5 (Externally Generated Messages):
"If <length> is negative, it is implied that <buf> contains a null
terminated string. The error INVALID_VALUE will be generated if the
number of characters in <buf>, excluding the null terminator when
<length> is negative, is not less than the value of
MAX_DEBUG_MESSAGE_LENGTH."
This indicates that length should be set to strlen for all types, not
just GL_DEBUG_TYPE_MARKER. We want it to be after validate_length()
so we still generate appropriate errors.
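
A minimal sketch of that ordering, assuming a deliberately small limit. `MAX_MESSAGE_LENGTH`, this stand-in `validate_length()` and `resolve_message_length()` are illustrative, not Mesa's implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_MESSAGE_LENGTH 16 /* hypothetical, small for illustration */

/* The character count, excluding any terminator, must be below the limit. */
static bool validate_length(const char *buf, int length)
{
    size_t n = length < 0 ? strlen(buf) : (size_t)length;
    return n < MAX_MESSAGE_LENGTH;
}

/* Returns the length to use, or -1 where INVALID_VALUE would be raised. */
int resolve_message_length(const char *buf, int length)
{
    assert(buf);
    if (!validate_length(buf, length))
        return -1;
    /* Only after validation: a negative length means "null terminated",
     * for every message type. */
    if (length < 0)
        length = (int)strlen(buf);
    return length;
}
```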
Fixes crashes from uncaught std::string exceptions in many
dEQP-GLES31.functional.debug.error_filters.* tests.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
From the KHR_debug spec:
"Applications can query the number of messages currently in the log by
obtaining the value of DEBUG_LOGGED_MESSAGES, and the string length
(including its null terminator) of the oldest message in the log
through the value of DEBUG_NEXT_LOGGED_MESSAGE_LENGTH."
Because we weren't including the null terminator, many dEQP tests
called glGetDebugMessageLog with a bufSize parameter that was 1 too
small, and unable to contain the message, so we skipped returning it,
failing many cases.
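
The off-by-one can be illustrated with a small sketch. Both function names are hypothetical; the point is only that the reported length and the size check on retrieval must agree on counting the terminator.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Length to report for the oldest message: characters plus the NUL. */
size_t next_logged_message_length(const char *msg)
{
    return msg ? strlen(msg) + 1 : 0;
}

/* Copies the message only if the caller's buffer can hold it whole. */
bool fetch_message(const char *msg, char *out, size_t buf_size)
{
    assert(msg && out);
    size_t needed = strlen(msg) + 1;
    if (buf_size < needed)
        return false; /* message is skipped, as in the failing tests */
    memcpy(out, msg, needed);
    return true;
}
```

A caller that sizes its buffer from the reported length now always gets the message back.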
Fixes 298 dEQP-GLES31.functional.debug.* tests.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Stephane Marchesin <[email protected]>
Reviewed-by: Eduardo Lima Mitev <[email protected]>
|
|
|
|
|
|
|
|
|
| |
This fixes a regression introduced by commit a8eea696 "st/mesa: honour sized
internal formats in st_choose_format (v2)".
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94657
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=94671
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
_mesa_is_multisample_enabled.
This removes any dependency on driver validation of the number of
framebuffer samples.
Signed-off-by: Bas Nieuwenhuizen <[email protected]>
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
Tested-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Francisco Jerez <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
| |
No shader-db changes on Broadwell
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
NIR already has this optimization and it can do much better than the little
peephole in the backend.
No shader-db change on Haswell or Broadwell.
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously we were doing the lowering by hand in vec4_visitor::emit_lrp.
By doing it in NIR, we have the opportunity for NIR to do additional
optimization of the expanded code.
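
For reference, the operation being lowered is a linear interpolation. The sketch below assumes the usual definition, x*(1-a) + y*a; the exact instruction sequence NIR emits is not shown in this message, and `lrp_lowered()` is a hypothetical name.

```c
#include <assert.h>

/* Linear interpolation expanded into basic arithmetic. Once written
 * this way, a generic optimizer can fold constants or share
 * subexpressions with neighbouring code. */
float lrp_lowered(float x, float y, float a)
{
    assert(a >= 0.0f && a <= 1.0f); /* typical use; not required */
    return x * (1.0f - a) + y * a;
}
```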
This also enables optimizations added by the next commit.
shader-db results:
G4X / Ironlake
total instructions in shared programs: 4024401 -> 4016538 (-0.20%)
instructions in affected programs: 447686 -> 439823 (-1.76%)
helped: 2623
HURT: 0
total cycles in shared programs: 84375846 -> 84328296 (-0.06%)
cycles in affected programs: 16964960 -> 16917410 (-0.28%)
helped: 2556
HURT: 41
Unsurprisingly, no changes on later platforms.
v2: Formatting and comment changes suggested by Matt.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I noticed some heap corruption running virgl tests, and valgrind
helped me to track it down to the following error:
==29272== Invalid write of size 4
==29272== at 0x90283D4: push_loop_stack (brw_eu_emit.c:1307)
==29272== by 0x9029A7D: brw_DO (brw_eu_emit.c:1750)
==29272== by 0x90554B0: fs_generator::generate_code(cfg_t const*, int) (brw_fs_generator.cpp:1999)
==29272== by 0x904491F: brw_compile_fs (brw_fs.cpp:5685)
==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272== by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)
==29272== by 0x8C84325: _mesa_link_program (shaderapi.c:1042)
==29272== by 0x8C851D7: _mesa_LinkProgram (shaderapi.c:1515)
==29272== by 0x4E4B8E8: add_shader_program (vrend_renderer.c:880)
==29272== Address 0xf2f3cb0 is 0 bytes after a block of size 112 alloc'd
==29272== at 0x4C2AA98: calloc (vg_replace_malloc.c:711)
==29272== by 0x8ED11F7: ralloc_size (ralloc.c:113)
==29272== by 0x8ED1282: rzalloc_size (ralloc.c:134)
==29272== by 0x8ED14C0: rzalloc_array_size (ralloc.c:196)
==29272== by 0x9019C7B: brw_init_codegen (brw_eu.c:291)
==29272== by 0x904F565: fs_generator::fs_generator(brw_compiler const*, void*, void*, void const*, brw_stage_prog_data*, unsigned int, bool, gl_shader_stage) (brw_fs_generator.cpp:124)
==29272== by 0x9044883: brw_compile_fs (brw_fs.cpp:5675)
==29272== by 0x8FC5DC5: brw_codegen_wm_prog (brw_wm.c:137)
==29272== by 0x8FC7663: brw_fs_precompile (brw_wm.c:638)
==29272== by 0x8FA4040: brw_shader_precompile(gl_context*, gl_shader_program*) (brw_link.cpp:51)
==29272== by 0x8FA4A9A: brw_link_shader (brw_link.cpp:260)
==29272== by 0x8DEF751: _mesa_glsl_link_shader (ir_to_mesa.cpp:3006)
if_depth_in_loop is an array of size p->loop_stack_array_size, and
push_loop_stack() will access if_depth_in_loop[p->loop_stack_depth+1],
thus the condition to grow the array should be
p->loop_stack_array_size <= (p->loop_stack_depth + 1) (it's currently
off by 2...)
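
The corrected growth condition can be sketched in isolation. `struct loop_stack` and its functions are hypothetical stand-ins for the code generator state, using plain `calloc`/`realloc` instead of ralloc.

```c
#include <assert.h>
#include <stdlib.h>

struct loop_stack {
    int *if_depth_in_loop;
    int array_size; /* number of allocated entries */
    int depth;      /* index of the current entry */
};

int loop_stack_init(struct loop_stack *s, int initial_size)
{
    assert(initial_size > 0);
    s->if_depth_in_loop = calloc((size_t)initial_size, sizeof(int));
    s->array_size = s->if_depth_in_loop ? initial_size : 0;
    s->depth = 0;
    return s->if_depth_in_loop ? 0 : -1;
}

int loop_stack_push(struct loop_stack *s)
{
    /* The write below hits index depth + 1, so that index must fit. */
    if (s->array_size <= s->depth + 1) {
        int new_size = s->array_size * 2;
        int *grown = realloc(s->if_depth_in_loop,
                             (size_t)new_size * sizeof(int));
        if (!grown)
            return -1;
        s->if_depth_in_loop = grown;
        s->array_size = new_size;
    }
    s->depth++;
    assert(s->depth < s->array_size);
    s->if_depth_in_loop[s->depth] = 0;
    return 0;
}
```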
This can be reproduced by running the following test with virgl test
server:
LIBGL_ALWAYS_SOFTWARE=y GALLIUM_DRIVER=virpipe bin/shader_runner
./tests/shaders/glsl-fs-unroll-explosion.shader_test -auto
Signed-off-by: Marc-André Lureau <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
Add code to handle GL_INTERNALFORMAT_PREFERRED.
Add code to deal with GL_RENDERBUFFER being passed into ChooseTextureFormat.
Reviewed-by: Alejandro Piñeiro <[email protected]>
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
| |
Cc: [email protected]
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Alejandro Piñeiro <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use. The ES 3.0 specification is completely clear, and has
always stated this.
However, the GL specification has changed behavior in 4.1, 4.2, and
4.4. The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording. However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.
This patch changes Meta to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
| |
Because the rules for sRGB are so insane, we change brw_blorp_miptrees
to take decode_srgb and encode_srgb flags, which control linearization
of the source and destination separately.
This should make it easy to implement whatever crazy combination of
rules people throw at us. For now, it should be equivalent.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Topi Pohjolainen <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
According to the ES 3.0 and GL 4.4 specifications, glBlitFramebuffer
is supposed to perform sRGB decoding and encoding whenever sRGB formats
are in use. The ES 3.0 specification is completely clear, and has
always stated this.
However, the GL specification has changed behavior in 4.1, 4.2, and
4.4. The original behavior stated that no sRGB encoding should occur.
The 4.4 behavior matches ES 3.0's wording. However, implementing the
new behavior appears to break applications such as Left 4 Dead 2.
This patch changes Meta to apply the ES 3.x rules in ES 3.x, but
leaves OpenGL alone for now, to avoid breaking applications.
Meta implements several other functions in terms of BlitFramebuffer,
and many of those explicitly do not perform sRGB encoding. So, this
patch explicitly disables sRGB encoding in those other functions,
preserving the existing (correct) behavior.
If you're from the future and are reading this, hi! Welcome to
the "fun" of debugging sRGB problems! Best of luck!
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
|
|
| |
Re-order flags in the order in which they appear in the OpenGL spec in the
description of MemoryBarrier().
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Marek Olšák <[email protected]>
|
|
|
|
|
|
|
| |
v2: support both TGSI_TEXTURE_2D and _RECT
Reviewed-by: Marek Olšák <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
|