| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
|
|
|
|
|
|
| |
That way, if we do the usual thing of multiplying vector_elements by
matrix_columns we get the actual number of components in the type as per
component_slots().
While we're at it, we also switch to using the actual C++ field
initializers for vector_elements and matrix_columns.
Reviewed-by: Dave Airlie <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, we had a bunch of code in each stage to figure out how many
slots we needed in stage_prog_data.param. This code was mostly identical
across the stages and had been copied and pasted around. Unfortunately,
this meant that any time you did something special, you had to add code for
it to each of these places. In particular, none of the stages took
subroutines into account; they were working entirely by accident. By
taking this data from the NIR shader, we know the exact number of entries
we need and everything goes a bit smoother.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
The next commit will add code to codegen_vs_prog that requires the NIR
shader to be there in all cases. It doesn't hurt anything to just move it
from brw_vs_emit to its only caller.
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
| |
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
GLSL IR vs. NIR shader-db results for vec4 programs on i965:
total instructions in shared programs: 1499328 -> 1388354 (-7.40%)
instructions in affected programs: 1245199 -> 1134225 (-8.91%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on G4x:
total instructions in shared programs: 1436799 -> 1325825 (-7.72%)
instructions in affected programs: 1205599 -> 1094625 (-9.20%)
helped: 7469
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Iron Lake:
total instructions in shared programs: 1436654 -> 1325682 (-7.72%)
instructions in affected programs: 1205503 -> 1094531 (-9.21%)
helped: 7468
HURT: 2440
GLSL IR vs. NIR shader-db results for vec4 programs on Sandy Bridge:
total instructions in shared programs: 2016249 -> 1787033 (-11.37%)
instructions in affected programs: 1850547 -> 1621331 (-12.39%)
helped: 14856
HURT: 1481
GLSL IR vs. NIR shader-db results for vec4 programs on Ivy Bridge:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Bay Trail:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
GLSL IR vs. NIR shader-db results for vec4 programs on Haswell:
total instructions in shared programs: 1848027 -> 1648216 (-10.81%)
instructions in affected programs: 1660279 -> 1460468 (-12.03%)
helped: 14668
HURT: 1369
I also ran our full suite of benchmarks on a Haswell and had the following
statistically significant (according to ministat) changes:
Test master-glsl master-nir diff
bench_OglGeomPoint 461.556 463.006 1.450
bench_OglTerrainFlyInst 184.484 187.574 3.090
bench_OglTerrainPanInst 132.412 136.307 3.895
bench_OglTexFilterAniso 19.653 19.645 -0.008
bench_OglTexFilterTri 58.333 58.009 -0.324
bench_OglVSInstancing 65.049 65.327 0.278
bench_trexoff 69.474 69.694 0.220
bench_valley 40.708 41.125 0.417
v2 (Jason Ekstrand):
- Remove more uses of NirOptions as a switch
- New shader-db numbers
- Added benchmark numbers
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 0a1adaf11d051b71b4c46aabee2e5342f2d6aef3 (nir: Report progress
from nir_lower_system_values().) introduced a bug caught by Valgrind:
==823== Conditional jump or move depends on uninitialised value(s)
==823== at 0xB09020C: convert_block (nir_lower_system_values.c:68)
==823== by 0xB079FB8: foreach_cf_node (nir.c:1310)
==823== by 0xB07A0AF: nir_foreach_block (nir.c:1336)
==823== by 0xB09026B: convert_impl (nir_lower_system_values.c:79)
...
==823== Uninitialised value was created by a stack allocation
==823== at 0xB090249: convert_impl (nir_lower_system_values.c:76)
which is trivially fixed by initializing progress.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some loops may have phi nodes that look like:
foo = ...
loop {
bar = phi(foo, bar)
...
}
in which case we can remove the phi node and replace all uses of 'bar'
with 'foo'. In particular, there are some L4D2 vertex shaders with loops
that, after optimization, look like:
/* succs: block_1 */
loop {
block block_1:
/* preds: block_0 block_4 */
vec1 ssa_2195 = phi block_0: ssa_2136, block_4: ssa_994
vec1 ssa_7321 = phi block_0: ssa_8195, block_4: ssa_7321
vec1 ssa_7324 = phi block_0: ssa_8198, block_4: ssa_7324
vec1 ssa_7327 = phi block_0: ssa_8174, block_4: ssa_7327
vec1 ssa_8139 = intrinsic load_uniform () () (232)
vec1 ssa_588 = ige ssa_2195, ssa_8139
/* succs: block_2 block_3 */
if ssa_588 {
block block_2:
/* preds: block_1 */
break
/* succs: block_5 */
} else {
block block_3:
/* preds: block_1 */
/* succs: block_4 */
}
block block_4:
/* preds: block_3 */
vec1 ssa_994 = iadd ssa_2195, ssa_2150
/* succs: block_1 */
}
where after removing the second, third, and fourth phi nodes, the loop becomes
entirely dead, and this patch will cause the loop to be deleted entirely.
No piglit regressions.
Shader-db results on bdw:
instructions in affected programs: 5824 -> 5664 (-2.75%)
total loops in shared programs: 2234 -> 2202 (-1.43%)
helped: 32
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Connor Abbott <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add a macro GL_LIB_NAME to hold the filename that configure comes up with
based on the --with-gl-lib-name and --enable-mangling options.
In driOpenDriver, use the GL_LIB_NAME macro instead of hard-coding
"libGL.so.1".
v2: Add an #ifndef/#define for GL_LIB_NAME so that non-autoconf builds will
work.
v3: Fix the library filename in the Makefile.
Signed-off-by: Kyle Brenneman <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When USE_MGL_NAMESPACE is defined, _glapi_get_stub will check for the "m"
prefix before trying to skip it, so that "glFoo" and "mglFoo" are
equivalent.
This should let it work with all the places where something calls
_glapi_get_proc_offset with a hard-coded name that starts with the normal
"gl" prefix.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Rearranged the GLX_ALIAS macro in glextensions.h so that it will pick up
the renames from glx_mangle.h.
Fixed the alias attribute for glXGetProcAddress when USE_MGL_NAMESPACE is
defined.
v2: Add a comment clarifying why GLX_ALIAS needs two macros.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55552
Signed-off-by: Kyle Brenneman <[email protected]>
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes following Piglit test:
member-invalid-binding-qualifier.frag
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When writing to a column of a row-major matrix, each component of the
vector is stored to non-consecutive memory addresses, so we generate
one instruction per component.
This patch skips the disabled components in the writemask, saving some
store instructions plus avoid storing wrong data on each disabled
component.
Signed-off-by: Samuel Iglesias Gonsalvez <[email protected]>
Reviewed-by: Iago Toral Quiroga <[email protected]>
|
|
|
|
|
|
|
|
| |
Fixes a failing subtest in:
ES31-CTS.shader_storage_buffer_object.negative-glsl-compileTime
Signed-off-by: Tapani Pälli <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
The result of POW for a negative base is undefined. Even when the result
is multiplied by zero (which is the case here whenever the base is
negative), the Inf and NaNs can propagate past that.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91342
Signed-off-by: Daniel Scharrer <[email protected]>
Cc: "10.6 11.0" <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
| |
Reading this output was really confusing. reg represents attribute
slots; reg_offset is the x/y/z/w component (0..3) within a vec4 slot.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The code for input lowering is going to get significantly more
complicated shortly, so I wanted to pull it out. Vertex shader inputs
are handled nearly identically regardless of vec4/scalar mode, so I
opted to not split that.
I thought about having each function actually do the lowering, but one
pass through nir_lower_io that handles all types (which weren't handled
earlier) is probably more efficient.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
| |
We may want to use different type_size functions for (e.g.) inputs
vs. uniforms. Passing in -1 for mode ignores this, handling all
modes as before.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
If the texture object exists, but the Name field is zero, it means
the object was created but never bound to a target. Trying to bind it
in _mesa_BindTextureUnit() should generate GL_INVALID_OPERATION.
Fixes piglit's arb_direct_state_access-bind-texture-unit test.
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This helper was only called from _mesa_BindTextureUnit(). It's simpler
to just inline it.
The error check / code / message in the helper was incorrect. It was
written for glBindTextures(), not glBindTextureUnit(). The correct
error for a bad texture unit number is GL_INVALID_VALUE. The error
message now reports the unit number rather than a GL_TEXTUREi enum.
Fixes a failure in piglit's arb_direct_state_access-bind-texture-unit test.
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Before, we were doing the actual _mesa_reference_texobj() call and
ctx->Driver.BindTexture() and misc housekeeping in three different
places. This consolidates the common code in a new bind_texture()
function.
Reviewed-by: Tapani Pälli <[email protected]>
|
| |
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
| |
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
| |
Get rid of "../glsl/" paths. Sort alphabetically.
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Since we store both in UniformBlocks, we can't just compute the index by
subtracting the array address start, we need to count the number of
buffers of the approriate type.
v2:
- Just fall back to calc_resource_index (Tapani)
Reviewed-by: Tapani Pälli <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
https://msdn.microsoft.com/en-us/library/ftsafwz3.aspx
v2: use _WIN32 instead of _MSC_VER (Brian Paul)
Signed-off-by: Tapani Pälli <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=92183
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The old code had some significant problems with respect to
sampler2DArray textures. The biggest problem was that some of the code
would use vec3 for the texture coordinate type, and other parts of the
code would use vec2. The resulting shader would not even compile.
Since there were not tests for this path, nobody noticed.
The input to the fragment shader is always treated as a vec3. If the
source data is only vec2, the vertex puller will supply 0 for the .z
component. The texture coordinate passed to the fragment shader is
always a vec2 that comes from the .xy part of the vertex shader input.
The layer, taken from the .z of the vertex shader input is passed
separately as a flat integer. If the generated fragment shader does not
use the layer integer, the GLSL linker will eliminate all the dead code
in the vertex shader.
Fixes the new piglit tests "blit-scaled samples=2 with
gl_texture_2d_multisample_array", etc. on i965.
Note for stable maintainer: This patch may depend on 46037237, and that
patch should be safe for stable.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: Topi Pohjolainen <[email protected]>
Cc: Jordan Justen <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Add comments that link the driver's miptree structures to the hardware
structures documented in the PRM. This provides sorely needed
orientation to developers new to the miptree code. And for miptree
veterans, this clarifies some of the more obscure miptree data.
For each driver struct field that closely corresponds to a
hardware struct field, add a PRM reference to that hardware field's
name. For example,
struct intel_mipmap_tree {
...
/**
* @brief One of GL_TEXTURE_2D, GL_TEXTURE_2D_ARRAY, etc.
*
* @see RENDER_SURFACE_STATE.SurfaceType
* @see RENDER_SURFACE_STATE.SurfaceArray
* @see 3DSTATE_DEPTH_BUFFER.SurfaceType
*/
GLenum target;
...
};
Also annotate the INTEL_MSAA_LAYOUT_* enums with the name of the PRM
sections that documents the layout.
v2: Replace "2D subimage" with "slice", and define what a "slice" is.
For Ben.
Reviewed-by: Anuj Phogat <[email protected]> (v1)
Reviewed-by: Ben Widawsky <[email protected]> (v1)
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The values of intel_mipmap_tree::align_w and ::align_h correspond to the
hardware enums HALIGN_* and VALIGN_*.
See the confusion?
align_h != HALIGN
align_h == VALIGN
Reduce the confusion by renaming the variables to match the hardware
enum names:
git ls-files |
xargs sed -i -e 's/align_w/halign/g' \
-e 's/align_h/valign/g'
Suggested-by: Kenneth Graunke <[email protected]>
Acked-by: Ben Widawsky <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Because that's what it is. It's an untiled, *linear* miptree.
v2:
- Add space after /*.
- Use one comment per function argument.
Reviewed-by: Anuj Phogat <[email protected]>
Acked-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The comment for intel_miptree_map::mode claimed that it was a bitmask of
GL_MAP_{READ,WRITE,INVALIDATE}_BIT. In reality, the bitmask may include
any of {GL,BRW}_MAP_*_BIT.
Reviewed-by: Anuj Phogat <[email protected]>
Acked-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
| |
Clarify that this bit extends the set of GL_MAP_*_BIT enums.
Also fix typo of "temporary".
Reviewed-by: Anuj Phogat <[email protected]>
Acked-by: Ben Widawsky <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Jason open coded this in 60befc63 when cleaning up some ugly code;
using our existing helper tidies it up a bit more.
v2: Drop inline (suggested by Matt).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Jason Ekstrand <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Bring over the following fix from i965:
commit fb3d62fe3d4fc40ba4ad9804d8b6f451316c9ae2
Author: Kenneth Graunke <[email protected]>
Date: Tue Aug 6 14:36:09 2013 -0700
i965: Remember to call intel_prepare_render() before blitting.
Fixes a crash in the following piglit tests:
bin/fbo-sys-blit -auto
bin/fbo-sys-sub-blit -auto
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
i915 fragment programs utilize the texture coordinate registers
for both texture coordinates and varyings. Unfortunately the
code doesn't check if the same index might be in use for both.
It just naively uses the index to pick a texture unit, which
could lead to collisions.
Add an extra mapping step to allocate non conflicting texture
units for both uses.
The issue can be reproduced with a pair of simple shaders like
these:
attribute vec4 in_mod;
varying vec4 mod;
void main() {
mod = in_mod;
gl_TexCoord[0] = gl_MultiTexCoord0;
gl_Position = gl_ModelViewProjectionMatrix * gl_Vertex;
}
varying vec4 mod;
uniform sampler2D tex;
void main() {
gl_FragColor = texture2D(tex, vec2(gl_TexCoord[0])) * mod;
}
Fixes many piglit tests on i915:
glsl-link-varyings-2
glsl-orangebook-ch06-bump
interpolation-none-gl_frontcolor-smooth-fixed
interpolation-none-gl_frontcolor-smooth-none
interpolation-none-gl_frontcolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-fixed
interpolation-none-gl_frontsecondarycolor-smooth-vertex
interpolation-none-gl_frontsecondarycolor-smooth-none
interpolation-none-other-flat-fixed
interpolation-none-other-flat-none
interpolation-none-other-flat-vertex
interpolation-none-other-smooth-fixed
interpolation-none-other-smooth-none
interpolation-none-other-smooth-vertex
v2 [idr]: Minor formatting tweaks.
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "11.0" <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
I830_UPLOAD_RASTER_RULES and I830_UPLOAD_TEX(0) are trying to occupy
the same bit. Move the texture bits upwards a bit to make room for
I830_UPLOAD_RASTER_RULES.
Now the driver will actually upload the raster rules which is rather
important to get the provoking vertex right. Fixes the appearance
of glxgears teeth on gen2.
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "10.6 11.0" <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Jordan Justen <[email protected]>
Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Commit 1665d29ee3125743fd6daf3c43fc715f543d5669 introduced an incorrect
format specifier that operates on GLintptr indirect within the function
_mesa_DispatchComputeIndirect().
This patch mitigates the introduced GCC warning:
src/mesa/main/compute.c: In function '_mesa_DispatchComputeIndirect':
src/mesa/main/compute.c:53:7: warning: format '%d' expects argument of type 'int', but argument 3 has type 'GLintptr' [-Wformat=]
_mesa_debug(ctx, "glDispatchComputeIndirect(%d)\n", indirect);
^
v2: Amend for Boyan Ding <[email protected]> feedback.
Signed-off-by: Rhys Kidd <[email protected]>
Reviewed-by: Jordan Justen <[email protected]>
|
|
|
|
|
|
| |
They are no longer used.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
| |
They haven't been used since 1bba29ed403e735ba0bf04ed8aa2e571884fcaaf so
there's no good reason to keep them around.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
| |
Reported-by: Ilia Mirkin <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
intel_update_winsys_renderbuffer_miptree() will release the existing
miptree when wrapping a new DRI2 buffer, so we can remove the early
release and so prevent a NULL mt dereference should importing the new
DRI2 name fail for any reason. (Reusing the old DRI2 name will result
in the rendering going astray, to a stale buffer, and not shown on the
screen, but it allows us to issue a warning and not crash much later in
innocent code.)
Signed-off-by: Chris Wilson <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=86281
Reviewed-by: Martin Peres <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|