| Commit message (Collapse) | Author | Age | Files | Lines |
|
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
When I initially generalized the vec4_visitor class in preparation for
geometry shaders, I assumed that the setup_attributes() function would
need to be different between vertex and geometry shaders, but its
caller, setup_payload(), could be shared. So I made
setup_attributes() a virtual function.
It turns out this isn't true; setup_payload() needs to be different
too, since the geometry shader payload sometimes includes an extra
register (primitive ID) that has to come before uniforms.
So setup_payload() needs to be the virtual function instead of
setup_attributes().
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
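A minimal sketch of the design point above, assuming simplified stand-in
classes (none of the names ending in _sketch exist in Mesa, and the register
counts are made up): the whole payload layout is the virtual hook, so the
geometry shader can place a primitive ID register ahead of the uniforms.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins; not the real Mesa declarations.
struct payload_sketch {
    std::vector<std::string> regs;   // regs[i] describes what lands in R<i>
};

class visitor_sketch {
public:
    virtual ~visitor_sketch() = default;
    // The whole payload layout is stage-specific, so this is the virtual hook.
    virtual payload_sketch setup_payload() const = 0;

protected:
    // Non-virtual helper shared by both stages.
    static void append(payload_sketch &p, const std::string &what, unsigned count) {
        for (unsigned i = 0; i < count; i++)
            p.regs.push_back(what);
    }
};

class vs_visitor_sketch : public visitor_sketch {
public:
    payload_sketch setup_payload() const override {
        payload_sketch p;
        append(p, "header", 1);      // R0
        append(p, "uniform", 2);     // uniforms start at R1
        append(p, "attribute", 3);
        return p;
    }
};

class gs_visitor_sketch : public visitor_sketch {
public:
    explicit gs_visitor_sketch(bool uses_primitive_id)
        : uses_primitive_id_(uses_primitive_id) {}

    payload_sketch setup_payload() const override {
        payload_sketch p;
        append(p, "header", 1);            // R0
        if (uses_primitive_id_)
            append(p, "primitive_id", 1);  // has to come before uniforms
        append(p, "uniform", 2);
        append(p, "attribute", 3);
        return p;
    }

private:
    bool uses_primitive_id_;
};

// Index of the first register that holds uniforms in a payload.
std::size_t first_uniform_reg(const payload_sketch &p) {
    for (std::size_t i = 0; i < p.regs.size(); i++)
        if (p.regs[i] == "uniform")
            return i;
    assert(!"payload has no uniform registers");
    return 0;
}
```

With only setup_attributes() virtual, the base class would fix the position
of the uniforms, which is exactly what the geometry shader case cannot accept.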
|
Both 3DSTATE_VS and 3DSTATE_GS have a dispatch_grf_start_reg control,
which determines the register where the hardware delivers data sourced
from the URB (push constants followed by per-vertex input data).
For vertex shaders, we always set dispatch_grf_start_reg to 1, since
R1 is always the first register available for push constants in vertex
shaders.
For geometry shaders, we'll need the flexibility to set
dispatch_grf_start_reg to different values depending on the behavior
of the geometry shader; if it accesses gl_PrimitiveIDIn, we'll need to
set it to 2 to allow the primitive ID to be delivered to the thread in
R1.
This patch eliminates the assumption that dispatch_grf_start_reg is
always 1. In vec4_visitor, we record the regnum that was passed to
vec4_visitor::setup_uniforms() in prog_data for later use. In
vec4_generator, we consult this value when converting an abstract
UNIFORM register to a concrete hardware register. And in the code
that emits 3DSTATE_VS, we set dispatch_grf_start_reg based on the
value recorded in prog_data.
This will allow us to set dispatch_grf_start_reg to the appropriate
value when compiling geometry shaders. Vertex shaders will continue
to always use a dispatch_grf_start_reg of 1.
v2: Make dispatch_grf_start_reg "unsigned" rather than "GLuint".
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
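A rough sketch of the UNIFORM-to-hardware-register conversion described
above. Only dispatch_grf_start_reg comes from the commit message; the struct
and function names are hypothetical, and the packing of two vec4 uniforms per
register is an assumption of this sketch.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the recorded program data.
struct prog_data_sketch {
    unsigned dispatch_grf_start_reg;  // first GRF holding push constants
};

// Hypothetical concrete register: GRF number plus float offset within it.
struct hw_reg_sketch {
    unsigned nr;
    unsigned subnr;
};

// Assumption: two vec4 uniforms are packed into each 8-float register.
hw_reg_sketch uniform_to_hw_reg(const prog_data_sketch &prog_data,
                                unsigned uniform_index)
{
    // In this sketch R0 is never available for push constants.
    assert(prog_data.dispatch_grf_start_reg >= 1);
    return { prog_data.dispatch_grf_start_reg + uniform_index / 2,
             (uniform_index % 2) * 4 };
}
```

A vertex shader would pass a start register of 1, while a geometry shader
that reads gl_PrimitiveIDIn would pass 2, shifting every uniform up by one
register without any other change to the conversion.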
|
This patch moves the following things into brw_vec4.{cpp,h}:
- struct brw_vec4_compile
- struct brw_vec4_prog_key
- brw_vec4_prog_data_compare()
- brw_vec4_prog_data_free()
This will allow us to avoid having to include brw_vs.h in
geometry-shader-specific files.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
The patch that follows will move the definition of struct
brw_vec4_prog_key from brw_vs.h to brw_vec4.h, making it necessary for
brw_vs.h to include brw_vec4.h (because brw_vs.h defines struct
brw_vs_prog_key, which contains brw_vec4_prog_key as a member). Since
brw_vs.h is included from C source files, that means that brw_vec4.h
will need to be safe to include from C. Same for brw_shader.h, since
it is included by brw_vec4.h.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
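One common way to make a header safe to include from C is sketched below;
whether the actual patch does it this way is an assumption. All struct,
field, function, and class names here are hypothetical.

```cpp
#include <assert.h>

#ifdef __cplusplus
extern "C" {
#endif

/* Plain C declarations: visible to both C and C++ translation units. */
struct vec4_key_sketch {
    unsigned program_string_id;
};

struct vs_key_sketch {
    /* Embedded by value, so C code needs the full definition above. */
    struct vec4_key_sketch base;
};

static inline unsigned vs_key_program_id(const struct vs_key_sketch *key)
{
    assert(key);
    return key->base.program_string_id;
}

#ifdef __cplusplus
} /* extern "C" */

/* C++-only declarations are hidden from C by the preprocessor guard. */
class vec4_visitor_guard_sketch {
public:
    explicit vec4_visitor_guard_sketch(const vec4_key_sketch &key) : key_(key) {}
    unsigned program_id() const { return key_.program_string_id; }

private:
    vec4_key_sketch key_;
};
#endif
```

Because the VS key contains the vec4 key as a member rather than a pointer,
a forward declaration is not enough, which is why the including header has to
be C-safe as a whole.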
|
This is backwards from what we are going to want in the long term, which is:
- brw_vec4.h declares general-purpose vec4 infrastructure needed by
both VS and GS
- brw_vs.h includes brw_vec4.h and adds VS-specific parts.
- brw_gs.h includes brw_vec4.h and adds GS-specific parts.
Note that at the moment brw_vec4.h contains a fair amount of
VS-specific declarations--I plan to address that in a later patch.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Otherwise any GS that requires lowering (e.g. one that uses
gl_ClipDistance as an input or output) will fail to work.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
|
This patch extracts the following logic from
validate_vertex_shader_executable():
(a) Generate an error if the shader writes to both gl_ClipDistance and
gl_ClipVertex.
(b) Record whether the shader writes to gl_ClipDistance in
gl_shader_program for use by the back-end.
(c) Record the size of gl_ClipDistance in gl_shader_program for use by
transform feedback logic.
And moves it into a function that is shared between vertex and
geometry shaders.
Strictly speaking we only need to have shared logic for (b) and (c)
right now (since (a) only matters in compatibility contexts, and we're
only implementing geometry shaders in core contexts right now). But
the three are closely related enough that it seems sensible to keep
them together.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
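The three steps (a) to (c) could be sketched as one shared helper along the
following lines. This is an illustration only: the struct layouts, field
names, and function name are hypothetical and do not match the real
gl_shader_program.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what one shader writes.
struct clip_writes_sketch {
    bool writes_clip_vertex;
    bool writes_clip_distance;
    unsigned clip_distance_size;  // declared array size of gl_ClipDistance
};

// Hypothetical subset of the per-program state mentioned in (b) and (c).
struct program_sketch {
    bool uses_clip_distance_out = false;
    unsigned clip_distance_array_size = 0;
    std::string error;
};

// Returns false and records an error for case (a); otherwise records the
// information for (b) and (c). Usable for any stage that can write clipping
// outputs.
bool analyze_clip_usage(const clip_writes_sketch &w, program_sketch &prog)
{
    assert(prog.error.empty());  // sketch precondition: no earlier link error

    if (w.writes_clip_vertex && w.writes_clip_distance) {
        prog.error = "shader writes to both gl_ClipVertex and gl_ClipDistance";
        return false;
    }
    prog.uses_clip_distance_out = w.writes_clip_distance;    // (b)
    prog.clip_distance_array_size = w.clip_distance_size;    // (c)
    return true;
}
```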
|
Enums were being converted twice, resulting in incorrect values.
The extra conversion and the redundant assert have both been removed.
Cc: 9.2 <[email protected]>
Signed-off-by: Timothy Arceri <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
The previous value of (GLuint64) ~0 has some problems:
GL_MAX_SERVER_WAIT_TIMEOUT is supposed to be a GLuint64 value, but has
to be queried via GetInteger64v(), which returns a GLint64. This means
that some applications are likely to treat it as a signed integer, where
~0 means -1. Negative values are nonsensical and problematic.
When interpreted correctly (the timeout is in nanoseconds), ~0 translates
to about 584 years, which seems rather excessive.
This patch changes it to 0x1fff7fffffff, which is about 9.8 hours.
This is still plenty long, and is the same as both an int64 and uint64.
Applications that accidentally store it in a 32-bit int/unsigned also
get a non-negative value, which is again the same as both int and
unsigned. This value was suggested by Ian Romanick.
v2: Add the ULL suffix on the constant (suggested by Ian).
Fixes Piglit's spec/!OpenGL 3.2/get-integer-64v.
Signed-off-by: Kenneth Graunke <[email protected]>
Cc: [email protected]
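The properties claimed for the new constant can be checked with a few lines
of C++. The variable and helper names below are made up for this sketch;
only the value 0x1fff7fffffff comes from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// The constant from the commit message, written with the ULL suffix.
constexpr std::uint64_t max_server_wait_timeout = 0x1fff7fffffffULL;

// Reads a 64-bit unsigned value as signed; only valid for values that fit.
std::int64_t as_signed64(std::uint64_t v)
{
    assert(v <= static_cast<std::uint64_t>(INT64_MAX));
    return static_cast<std::int64_t>(v);
}

// What an application sees after storing the value in a 32-bit unsigned
// variable: only the low 32 bits survive.
std::uint32_t truncate_to_32(std::uint64_t v)
{
    return static_cast<std::uint32_t>(v & 0xffffffffULL);
}

// The value is identical as int64 and uint64, and its low 32 bits are
// 0x7fffffff, which is non-negative as a 32-bit signed integer too.
static_assert(max_server_wait_timeout <= static_cast<std::uint64_t>(INT64_MAX),
              "fits in a signed 64-bit integer");
static_assert((max_server_wait_timeout & 0xffffffffULL) == 0x7fffffffULL,
              "low 32 bits have the sign bit clear");
```

By contrast, ~0 has every bit set, so both the 64-bit and the truncated
32-bit signed readings come out as -1.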
|
_mesa_meta_begin() sets up an orthographic projection and initializes the
viewport based on the current drawbuffer's width and height. This is
likely the window size, since it occurs before the meta operation binds
any temporary buffers.
decompress_texture_image needs the viewport to be the size of the image
it's trying to draw. Otherwise, it may only draw part of the image.
v2: Actually set the projection properly too.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68250
Signed-off-by: Kenneth Graunke <[email protected]>
Cc: Mak Nazecic-Andrlon <[email protected]>
|
Fixes inconsistent failure of gles2conform/GL2Tests/glUniform/glUniform.test
under gnome-shell. What follows is a description of the bug and its fix.
When intel_update_renderbuffers() allocates a miptree for a winsys
renderbuffer, it propagates the renderbuffer's format to become also the
miptree's format.
If the winsys color buffer format is SARGB, then, in the first call to
eglMakeCurrent, intel_gles3_srgb_workaround() changes the renderbuffer's
format to ARGB. That is, it changes the format from sRGB to non-sRGB.
However, it changes the renderbuffer's format *after*
intel_update_renderbuffers() has allocated the renderbuffer's miptree.
Therefore, when eglMakeCurrent returns, the miptree format (SARGB)
differs from the renderbuffer format (ARGB).
If the X server reallocates the color buffer,
intel_update_renderbuffers() will create a new miptree for the
renderbuffer. The new miptree's format (ARGB) will differ from the old
miptree's format (SARGB). This mismatch between old and new miptrees
causes bugs.
Fix the bug by moving intel_gles3_srgb_workaround() to occur *before*
intel_update_renderbuffers().
CC: "9.2" <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67934
Signed-off-by: Chad Versace <[email protected]>
|
Except for explicit derivs with cube maps which are very bogus anyway.
Just like explicit lod this is only used if no_quad_lod is set in
GALLIVM_DEBUG env var.
Minification is terrible on cpus which don't support true vector shifts
(but should work correctly). Cannot do the min/mag filter decision (if
they are different) per pixel though, only selecting different mip levels
works.
Reviewed-by: Jose Fonseca <[email protected]>
|
Just a copy & paste error.
Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409.
Note that the test passing before probably simply means it doesn't verify
clamping of the border color itself as required by the OpenGL spec.
Reviewed-by: Jose Fonseca <[email protected]>
|
block size depth is always 1 even for compressed formats (unless someone
invents true 3d compressed formats at least which we can't represent).
Nearest (and soa) path had it right.
Reviewed-by: Jose Fonseca <[email protected]>
|
They are defined as constant 0.0/0.0/1.0.
Three more little piglits.
Cc: [email protected]
Reviewed-by: Alex Deucher <[email protected]>
|
Same as PIPE_FORMAT_B10G10R10A2_UINT but without the swizzling.
Reviewed-by: Roland Scheidegger <[email protected]>
|
Used for example on stream out without geometry shader.
|
We have set up 3DSTATE_SBE (or 3DSTATE_SF on GEN6) in
ilo_shader_select_kernel_routing(). There is no need to pass the last shader
stage to the GPE function.
|
Command length is ORed to the wrong place. Since the ORed value is zero,
there is no real change.
|
Assert that gen6_emit_3DSTATE_CLIP is for GEN 6 and 7.
|
The Gallium implementation is apparently not ready for regular
consumption, so as much as I hate adding more build-time options, here's
another.
Acked-by: Brian Paul <[email protected]>
|
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=59648
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Ian Romanick <[email protected]>
|
No longer used.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
This same message is printed in the validate_matrix_layout_for_type
function.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
The variable means that UBO qualifiers are allowed in a particular
context (e.g., not allowed in a struct field declaration), rather than
that a particular set of UBO qualifiers is valid.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
|
This was invaluable when debugging the global copy propagation
algorithm. We may as well commit it in case someone needs to print
out the sets in the future.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|
Cc: 9.2 <[email protected]>
Tested-by: Brian Paul <brianp at vmware.com>
Reviewed-by: Brian Paul <brianp at vmware.com>
|
The (complicated!) math is all identical; there are just minimal differences
in how the sign bit is calculated, plus there's an additional subtraction for
the argument going into the polynomial for cos.
The logic stays 100% the same (with a small exception: the sign bit
calculation for sin is minimally simplified, applying the sign mask after
xoring the arguments instead of applying it to each argument).
Reviewed-by: Jose Fonseca <[email protected]>
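The sin simplification relies on a bitwise identity: masking each operand and
then xoring gives the same result as xoring first and masking once. A small
scalar sketch follows; the helper names are hypothetical, and the real code
operates on LLVM vector values rather than plain integers.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

constexpr std::uint32_t sign_mask = 0x80000000u;

// Reinterprets a float's bits as an unsigned integer.
std::uint32_t float_bits(float f)
{
    static_assert(sizeof(std::uint32_t) == sizeof(float), "32-bit float expected");
    std::uint32_t u;
    std::memcpy(&u, &f, sizeof u);
    return u;
}

// Two AND operations: the mask is applied to each argument before the xor.
std::uint32_t sign_mask_each(std::uint32_t a, std::uint32_t b)
{
    return (a & sign_mask) ^ (b & sign_mask);
}

// One AND operation: the mask is applied once, after the xor.
std::uint32_t sign_mask_after_xor(std::uint32_t a, std::uint32_t b)
{
    return (a ^ b) & sign_mask;
}

// Asserts that both forms agree for a pair of floats.
void check_equivalence(float x, float y)
{
    assert(sign_mask_each(float_bits(x), float_bits(y)) ==
           sign_mask_after_xor(float_bits(x), float_bits(y)));
}
```

The two forms agree for every input because AND distributes over XOR, so the
second form saves one operation without changing the result.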
|
Detected this hunting some other bug, not sure if it really needs fixing but
it is definitely wrong.
Reviewed-by: Jose Fonseca <[email protected]>
|
Was using wrong (undefined) vector element (the elements are at 0/2 position,
not 0/1).
Reviewed-by: Jose Fonseca <[email protected]>
|
IVB/BYT also has the same L3 cacheability control in MOCS as HSW,
so let's make use of it.
pts/xonotic and pts/reaction @ 1920x1080 gain ~4% on my IVB GT2. Most
other things show less gains/no regressions, except furmark which
loses some 10 points.
I didn't have a BYT at hand for testing.
v2: Don't check (brw->gen == 7) in gen7 functions. (chadv)
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
Just spotted these unpopulated MOCS fields when comparing the code
against BSpec. Set the MOCS to the same as everywhere else in Haswell:
L3-cacheable.
v2: Annotate state packet fields (chadv).
Signed-off-by: Ville Syrjälä <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
|
Writing to the source directory can cause multiple parallel builds
from the same source to fail. Create the temporary files in the
build directory.
Signed-off-by: Maarten Lankhorst <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "9.2" <[email protected]>
|
The NVIDIA driver doesn't expose them, and piglit's
arb_texture_compression-invalid-formats expects them to not be there.
This, with the previous commit, fixes piglit
arb_texture_compression-invalid-formats.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "9.2" <[email protected]>
|
There is no extension for this format in desktop GL, so an application
can't give the format back to glCompressedTexImage2D.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Cc: "9.2" <[email protected]>
|
This is required by the spec, and it's a bit tricky because the default
precision is scoped. As a result, I'm slightly abusing the symbol
table.
Fixes piglit no-default-float-precision.frag tests and the piglit
default-precision-nested-scope-0[1234].frag tests that are currently on
the piglit mailing list for review.
On IRC I got confirmation from cwabbot that ARM (Mali T6xx and T400)
enforces this requirement and from kusma that NVIDIA (Tegra2) enforces
this requirement. We should be safe from regressing shipping
applications.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "9.2" <[email protected]>
|
We never noticed this before because we previously didn't enforce GLSL ES
fragment shader requirements that precision be defined. There may also
have been some interaction here with the addition of
GL_ARB_shading_language_420pack, but it doesn't appear to me that it
added any new bugs (just perhaps uncovered some old ones).
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Cc: "9.2" <[email protected]>
|
This is used by the next patch.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Cc: "9.2" <[email protected]>
|
Reviewed-by: Christian König <[email protected]>
|
Reviewed-by: Christian König <[email protected]>
|
Going to need this soon (not going to bother with avx2 intrinsics at this time
but don't want to do workarounds for true vector shifts if llvm itself can use
them just fine and won't need the gazillion instruction emulation).
Not really tested other than my cpu returns 0 for these features...
(I have no idea if llvm actually would emit avx2/xop instructions either...)
Reviewed-by: Jose Fonseca <[email protected]>
|
Need to check the wrap mode of the actually used coords not a fixed 2.
While checking more than necessary would only potentially disable aos and
not cause any harm I'm pretty sure for 3d textures it could have caused
assertion failures (if s,t coords have simple filter and r not).
Reviewed-by: Jose Fonseca <[email protected]>
|
Turns out it is actually very complicated to figure out what a format really
is wrt range, as using channel information for determining unorm/snorm etc.
doesn't work for a bunch of cases - namely compressed, subsampled, other.
Also while here add clamping for uint/sint as well - d3d10 doesn't actually
need this (can only use ld with these formats hence no border) and we could
do this outside the shader for GL easily (due to the fixed texture/sampler
relation), but do it here too just so I can forget about it.
v2: move border color clamping out of fetch texel. Also change it to clamp
the whole border vector at once (and use vectorized load of border color),
which saves a couple of instructions - needs some different handling of
mixed signed/unsigned formats so skip the per channel stuff and just derive
this from first channel except for special formats.
Reviewed-by: Jose Fonseca <[email protected]>
|
There's a new debug value used to disable per-quad lod optimizations
in fragment shader (ignored for vs/gs as the results are just too wrong
typically). Also trying to detect if a supplied lod value is really a
scalar (if it's coming from immediate or constant file) in which case
sampler code can use this to stay on per-quad-lod path (in fact for
explicit lod could simplify even further and use same lod for both
quads in the avx case but this is not implemented yet).
Still need to actually implement per-element lod bias (and derivatives),
and need to handle per-element lod in size queries.
v2: fix comments, prettify.
Reviewed-by: Jose Fonseca <[email protected]>
|
Reviewed-by: Kenneth Graunke <[email protected]>
|
The rules were writing files to e.g. util/u_indices_gen.py, but in an
out-of-tree build this directory doesn't exist in the build directory. So,
create the directories just in case.
Cc: [email protected]
Reviewed-by: Matt Turner <[email protected]>
Signed-off-by: Ross Burton <[email protected]>
|