| Commit message (Collapse) | Author | Age | Files | Lines |
| |
GL_APPLE_object_purgeable creates a mechanism for marking OpenGL objects
as "purgeable" so they can be thrown away when system resources become
scarce. It specifically applies to buffer objects, textures, and
renderbuffers.
The intel_buffer_objects.c file provides core functionality for GL
buffer objects, such as MapBufferRange and CopyBufferSubData. Having
texture and renderbuffer functionality in that file is a bit strange.
The 2010 copyright on the new file is because Chris Wilson first added
this code in January 2010 (commit 755915fa).
v2: Actually remember to call the new dd table setup function.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Previously, INTEL_DEBUG=bat would dump messages like:
intel_mipmap_tree.c:1643: Batchbuffer flush with 456b used
This only reported the space used for command packets, and didn't
report any information on the space used for indirect state.
Now it dumps:
intel_context.c:366: Batchbuffer flush with 6128b (pkt) + 4288b (state)
= 10416b (31.8%)
This conveniently shows the breakdown of space used for packets vs.
state, as well as the percentage of batchbuffer space.
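The new message format can be sketched in a few lines of C. This is only an illustration, not the driver's code: the helper name format_flush_report() is hypothetical, and the 32768-byte batch size is an assumption inferred from the sample output (10416 / 32768 = 31.8%).

```c
#include <assert.h>
#include <stdio.h>

/* Assumed total batchbuffer size in bytes; inferred from the sample
 * output (10416 / 32768 = 31.8%), not taken from the driver source. */
#define ASSUMED_BATCH_SIZE 32768u

/* Hypothetical helper: writes the packet/state breakdown into buf and
 * returns the snprintf() result. */
int format_flush_report(char *buf, size_t len,
                        unsigned pkt_bytes, unsigned state_bytes)
{
    unsigned total = pkt_bytes + state_bytes;

    assert(total <= ASSUMED_BATCH_SIZE);
    return snprintf(buf, len,
                    "Batchbuffer flush with %ub (pkt) + %ub (state) = %ub (%.1f%%)",
                    pkt_bytes, state_bytes, total,
                    100.0 * total / ASSUMED_BATCH_SIZE);
}
```

Calling format_flush_report(buf, sizeof buf, 6128, 4288) reproduces the numbers quoted in the message.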
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
We emit these before configuring depth in the normal path, or actually
using the depth buffer in BLORP - we just failed to emit them when
disabling depth altogether.
On Sandybridge, this also requires the post_sync_nonzero flush.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Previously, copy propagation would cause bitcast_f2u(abs(float)) to
be performed in a single step, but the application of source modifiers
(abs, neg) happens after type conversion, leading to incorrect results.
That is, for bitcast_f2u(abs(float)) we would in fact generate code to
do abs(bitcast_f2u(float)).
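The difference between the two orderings can be shown on the CPU. The sketch below is an illustration only: the function names are hypothetical, and the "buggy" ordering is modelled (as an assumption) by taking an integer absolute value of the raw bits, which may not match exactly what the hardware does with a source modifier on a UD operand.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret the bits of a float as a 32-bit unsigned integer. */
uint32_t bitcast_f2u(float f)
{
    uint32_t u;

    assert(sizeof(float) == sizeof(uint32_t));
    memcpy(&u, &f, sizeof u);
    return u;
}

/* Intended order: absolute value as a float first, then the bitcast. */
uint32_t f2u_of_abs(float f)
{
    float a = f < 0.0f ? -f : f;   /* simple abs; ignores -0.0 and NaN */
    return bitcast_f2u(a);
}

/* Buggy order, modelled by applying an integer abs to the raw bits. */
uint32_t abs_of_f2u(float f)
{
    int32_t i;

    memcpy(&i, &f, sizeof i);
    /* Negate through unsigned arithmetic to avoid signed overflow. */
    return i < 0 ? 0u - (uint32_t)i : (uint32_t)i;
}
```

For -1.0f the first ordering yields 0x3F800000 (the bits of 1.0f), while the modelled second ordering yields 0x40800000, so the two are not interchangeable.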
For example, whereas bitcast_f2u(abs(float)) might result in a register
argument such as
(abs)g2.2<0,1,0>UD
v2: Set interfered = true and break in register_coalesce instead of
returning false.
Reviewed-by: Paul Berry <[email protected]>
| |
Necessary to avoid combining a bitcast and a modifier into a single
operation. Otherwise, if it is safe to do so, the MOV should be removed
by copy propagation or register coalescing.
With this and the next patch, there are only four changes in shader-db,
each a single extra instruction. The code does something like
mov a.w, -b.x
and copy propagation doesn't work because it only handles no-op
swizzles. Seems acceptable, given the known limitation of our copy
propagation.
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Currently, single-sample scaled blits with the GL_LINEAR filter fall
back to the meta path. This patch removes that limitation in the BLORP
engine and implements single-sample scaled blits with a bilinear filter.
No piglit or gles3 regressions were observed with this patch on Ivybridge.
V2: Use "sample" message to utilize the linear filtering functionality
built in to hardware.
V3: Define a bool variable (bilinear_filter) to handle the conditions
for GL_LINEAR blits.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
New function clamp_tex_coords() clamps the texture coordinates
to texture boundaries. This function will also be utilized later
for the BLORP implementation of single-sample scaled blit with
bilinear filter.
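As a rough CPU-side illustration of what clamping coordinates to the texture boundaries means, consider the sketch below. It is not the BLORP implementation: the name clamp_coords_sketch() and the assumption that valid coordinates span [0, max_x] x [0, max_y] are hypothetical.

```c
#include <assert.h>

/* Clamp a single value to the inclusive range [lo, hi]. */
float clamp_f(float v, float lo, float hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Hypothetical sketch: clamp (x, y) in place to [0, max_x] x [0, max_y]. */
void clamp_coords_sketch(float *x, float *y, float max_x, float max_y)
{
    assert(max_x >= 0.0f && max_y >= 0.0f);
    *x = clamp_f(*x, 0.0f, max_x);
    *y = clamp_f(*y, 0.0f, max_y);
}
```

Coordinates already inside the rectangle pass through unchanged; only out-of-range values are pulled back to the nearest edge.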
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
When we talk about both multi-sample and single-sample scaled blits,
rect_grid_{x1, y1} are more appropriate variable names as compared
to sample_grid_{x1, y1}. There are no functional changes in this patch.
It just prepares for the BLORP implementation of single-sample scaled
blit with bilinear filter.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
This patch fixes a case of framebuffer blitting with a renderbuffer
as the color attachment and the GL_LINEAR filter. The meta
implementation of glBlitFramebuffer() converts the source color buffer
to a texture and uses it to do the scaled blit into the destination
buffer. Using the exact source rectangle to create the texture results
in incorrect linear filtering along the edges. This patch extends the
texture edges by one pixel in the x and y directions, which ensures
correct linear filtering.
It fixes the failing piglit test fbo-attachments-blit-scaled-linear.
Signed-off-by: Anuj Phogat <[email protected]>
CC: "9.2" <[email protected]>
CC: "9.1" <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
The error code was changed from INVALID_VALUE to INVALID_OPERATION
in OpenGL 3.3. We should also generate an error when size is BGRA
and normalized is FALSE.
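The added check can be sketched as a small standalone function. This is an illustration rather than Mesa's validation code: check_bgra_normalized() is a hypothetical name, and the enum values are defined locally so the snippet does not depend on GL headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL enum values, defined locally for a self-contained sketch. */
#define GL_NO_ERROR 0
#define GL_INVALID_OPERATION 0x0502
#define GL_BGRA 0x80E1

/* Hypothetical check: a BGRA "size" requires normalized to be TRUE. */
unsigned check_bgra_normalized(int size, bool normalized)
{
    if (size == GL_BGRA && !normalized)
        return GL_INVALID_OPERATION;
    return GL_NO_ERROR;
}
```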
Reviewed-by: Kenneth Graunke <[email protected]>
| |
Sandybridge is the only platform that supports an IF instruction
with an embedded comparison. In this case, we need to emit a CMP
to go along with the SEL.
Fixes regressions in Piglit's glsl-fs-atan-3, fs-unpackHalf2x16,
fs-faceforward-float-float-float, isinf-and-isnan fs_basic, and
isinf-and-isnan fs_fbo.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68086
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Anuj Phogat <[email protected]>
Tested-by: lu hua <[email protected]>
| |
128 bpp formats are not allowed to be Y-tiled on any architectures
except Gen7.
+11 Piglits on Sandybridge (mostly regression fixes since the
switch to Y-tiling).
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=63867
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=64261
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Cc: "9.2" <[email protected]>
| |
It is only in OpenGL compatibility-style contexts where generic
attribute 0 and GL_VERTEX_ARRAY have a bizarre, aliasing relationship.
Moreover, it is only in OpenGL compatibility-style contexts and OpenGL
ES 1.x where one of these attributes provokes the vertex. In all other
APIs each implicit call to glArrayElement provokes a vertex regardless
of which attributes are enabled.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Robert Bragg <[email protected]>
Cc: "9.0 9.1 9.2" <[email protected]>
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=55503
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=66292
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=67548
| |
Fixes "Resource leak" defect reported by Coverity.
Tested on Haswell, no Piglit regressions.
v2: Apply to i965, not just i915. (chadv)
CC: "9.2, 9.1" <[email protected]>
Signed-off-by: Vinson Lee <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
| |
Should get rid of some float-to-int conversions (with negation).
No piglit regressions (with llvmpipe).
v2: fix bogus formatting spotted by Brian.
Reviewed-by: Brian Paul <[email protected]>
| |
There's no need to use a clip flag for NEGW on these gens, so
no reason we can't just enable 8 planes.
V2: - Bump (and document!) MAX_VERTS in the clip code.
- Fix clip flag masks in the clip unit state and in the shader
prolog
- Move this to the end of the series for less breakage.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
This does the same thing as we do for triangle clipping -- select the
appropriate source (either dot(hpos,fixed plane) or a clipdistance
slot).
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Nothing in the clipper uses gl_ClipVertex any more, so we don't care
where it is.
V2: Don't bother fishing out the clipvertex offset either.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
V2: Adjust explanation of load_clip_distance()
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Soon the dp4 is only going to be used for fixed clip planes.
V2: Remove old inaccurate comment about the behavior of this function;
add a better explanation above.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
V2: - Use the new VS_OPCODE_UNPACK_FLAGS_SIMD4X2 to correctly split the
flags for the two vertices being processed together.
- Don't apply bogus masking of clip flags. The set of plane enables
aren't included in the shader key, and we wouldn't want the
recompiles anyway.
V3: - Tidy up spurious instructions, name temps properly.
Signed-off-by: Chris Forbes <[email protected]>
[V2] Reviewed-by: Paul Berry <[email protected]>
| |
Splits the bottom 8 bits of f0.0 for further wrangling
in a SIMD4x2 program. The 4 bits corresponding to the channels in each
program flow are copied to the LSBs of dst.x visible to each flow.
This is useful for working with clipping flags in the VS.
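The splitting described above can be modelled on the CPU as follows. This is a sketch under assumptions, not the generated EU code: unpack_flags_sketch() is a hypothetical name, and mapping the low nibble to the first program flow and the high nibble to the second is assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Split the bottom 8 flag bits into two 4-bit groups, one per program
 * flow (i.e. one per vertex in a SIMD4x2 program). */
void unpack_flags_sketch(uint32_t f0, uint32_t out[2])
{
    assert(out != 0);
    out[0] = f0 & 0xFu;          /* channels 0-3: first flow (assumed)  */
    out[1] = (f0 >> 4) & 0xFu;   /* channels 4-7: second flow (assumed) */
}
```

Bits above the bottom 8 are discarded, so each flow only ever sees its own four channel flags in the LSBs.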
V3: - Fixup immediate types
- Teach scheduler about the hidden dep on flags
Signed-off-by: Chris Forbes <[email protected]>
V2: Reviewed-by: Paul Berry <[email protected]>
| |
We're about to have an instruction that depends on the flags but isn't
predicated. This lays the groundwork.
Signed-off-by: Chris Forbes <[email protected]>
| |
Previously we had disabled interpolation of the clip distances as a
special case, since they were unused.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
We need to produce clip flags for the vertex header on Gen4/5, so
clip plane lowering has to be done before we try to emit the flags/psiz
attribute.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
enabled.
V2: We don't particularly care where they fall in the VUE map, as long
as they are allocated somewhere, and occupy two contiguous slots. Don't
fiddle with the SF layout at all -- there's no need.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Commit 9f9ccf707c54156b4559a4b1206022c2ca2d45cd renamed
upload_3dstate_so_decl_list to gen7_upload_3dstate_so_decl_list but
forgot to update the caller.
| |
Move the arrays to the new header brw_multisample_state.h, which will be
shared with Broadwell code.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
| |
Place each array in the brw namespace by renaming it:
sample_positions_4x -> brw_multisample_positions_4x
sample_positions_8x -> brw_multisample_positions_8x
This prepares for moving the arrays to a header shared by gen6 and gen8.
Reviewed-by: Kenneth Graunke <[email protected]>
Signed-off-by: Chad Versace <[email protected]>
| |
We will reuse this for Broadwell.
v2: Prefix function name with 'gen7'. (chadv)
Reviewed-by: Chad Versace <[email protected]>
| |
We will reuse these for Broadwell.
Reviewed-by: Chad Versace <[email protected]>
| |
The functional change is that now invalidate_framebuffer is called if
the texture is actually detached from one of the currently bound FBOs.
Previously this was only done for renderbuffers.
The remaining changes make the texture delete path look more similar to
the renderbuffer delete path. This includes adding relevant spec
quotations to justify the behavior.
Fixes piglit fbo-incomplete "delete texture of bound FBO" test.
v2: Move 'fb->Attachment[i].Texture == att' check from previous patch to
this patch... where it was intended to be in the first place. Noticed
by Chad.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Cc: "9.2" <[email protected]>
| |
Also add a return value indicating whether any work was done.
This will be used by the next patch.
v2: Move 'fb->Attachment[i].Texture == att' check to the next
patch... where it was intended to be in the first place. Noticed by
Chad.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Cc: "9.2" <[email protected]>
| |
Fixes failures in oglconform fbo mipmap.manual.color,
mipmap.manual.colorAndDepth, mipmap.automatic, and
mipmap.manualIterateTexTargets subtests.
Signed-off-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Cc: "9.2" <[email protected]>
| |
CMP instructions use BRW_ARF_NULL as a destination. Prior to this
patch, dump_instruction() decoded the destination as "???".
Now it decodes BRW_ARF_NULL as "(null)" and other ARFs numerically.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
This resulted in printouts like:
246: cmp.cmod.f0.0
???, vgrf152, 0.000000f, (null),
With this patch, CMP is properly printed on one line.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Many GLSL shaders contain code of the form:
x = condition ? foo : bar
The compiler emits an ir_if tree for this, since each subexpression
might be a complex tree that could have side-effects and short-circuit
logic operations.
However, the common case is to simply pick one of two constants or
variable's values---which is exactly what SEL is for. Replacing IF/ELSE
with SEL also simplifies the control flow graph, making optimization
passes which work on basic blocks more effective.
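To make the IF/ELSE versus SEL distinction concrete, here is a CPU-side sketch of a select that involves no control flow. It only illustrates the idea; sel_u32() and select_branchy() are hypothetical names and nothing here is taken from the compiler.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Control-flow version: roughly what an IF/ELSE tree expresses. */
uint32_t select_branchy(bool condition, uint32_t foo, uint32_t bar)
{
    uint32_t x;

    if (condition)
        x = foo;
    else
        x = bar;
    return x;
}

/* SEL-like version: both operands are already available, and the
 * predicate only decides which one is written to the destination. */
uint32_t sel_u32(bool pred, uint32_t src0, uint32_t src1)
{
    uint32_t mask = 0u - (uint32_t)pred;   /* all ones if pred, else zero */
    return (src0 & mask) | (src1 & ~mask);
}
```

Both functions return the same value for every input; the second simply has a single basic block, which is the property that helps block-based optimization passes.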
The shader-db statistics:
total instructions in shared programs: 1655247 -> 1503234 (-9.18%)
instructions in affected programs: 949188 -> 797175 (-16.02%)
2,970 shaders were helped, none hurt. Gained 181 SIMD16 programs.
This helps Valve's Source Engine games (max -41.33%), The Cave
(max -33.33%), Serious Sam 3 (max -18.64%), Yo Frankie! (max -30.19%),
Zen Bound (max -22.22%), GStreamer (max -6.12%), and GLBenchmark 2.7
(max -1.94%).
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
The instruction
(+f0.0) SEL dst, src0, src1
will write either src0 or src1 to dst, depending on the predicate.
Unlike most predicated instructions, it always writes to dst.
fs_inst::is_partial_write() is supposed to return true if the whole
register is not guaranteed to be written. The !inst->predicated check makes
sense for most instructions, which might not write the whole register,
but SEL is a special case.
This caused live interval analysis to ignore the destination of
predicated SEL instructions when computing "def" information.
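The special case can be expressed with a deliberately simplified model. The struct, enum, and function below are hypothetical and ignore every other condition the real fs_inst::is_partial_write() may check; they only capture the rule that a predicated instruction may skip its write unless it is a SEL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, minimal instruction model. */
enum sketch_opcode { SKETCH_OP_MOV, SKETCH_OP_ADD, SKETCH_OP_SEL };

struct sketch_inst {
    enum sketch_opcode opcode;
    bool predicated;
};

/* A write is "partial" when the destination may be left untouched.
 * Predication normally implies that, but a predicated SEL always
 * writes either src0 or src1, so it counts as a full write. */
bool sketch_is_partial_write(const struct sketch_inst *inst)
{
    assert(inst != NULL);
    return inst->predicated && inst->opcode != SKETCH_OP_SEL;
}
```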
Requires the previous commit to avoid regressions.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
The existing inst->is_partial_write() already disallows predicated
instructions, so this has no functional change. However, it's worth
doing explicitly since the CSE pass does not consider the flag register.
This means it could blindly factor out operations that use the same
sources, but which have different condition codes set.
This prevents a regression in the next commit.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Usually, the driver creates both 8-wide and 16-wide variants of every
fragment shader. When 16-wide compilation fails, it logs a performance
warning explaining why only an 8-wide program exists.
However, when there are pull parameters, the driver won't even bother
trying the 16-wide compile (since it would fail). In this case, it
failed to emit a performance warning, leaving no explanation for the
missing 16-wide program.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Matt Turner <[email protected]>
| |
Makes this flag appear in the output for INTEL_DEBUG=state
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
INTEL_DEBUG=vue now emits a listing of each slot in the VUE map,
and the corresponding interpolation mode.
V2: Fix whitespace issues.
Signed-off-by: Chris Forbes <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
| |
From section E.1 (Profiles and Deprecated Features of OpenGL 3.0)
of the OpenGL 3.0 spec:
"LineWidth is not deprecated, but values greater than 1.0
will generate an INVALID VALUE error"
From context it is clear that values greater than 1.0 should only
generate an INVALID VALUE error in a forward-compatible context.
The code was correctly quoting this spec text, but it was disallowing
all line widths in forward-compatible contexts, instead of just widths
greater than 1.0.
This patch introduces the correct check, so that setting a line width
of 1.0 or less is permitted.
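The corrected rule can be sketched as a standalone check. This is an illustration, not the Mesa code: check_line_width() is a hypothetical name, the enum values are defined locally, and the rejection of non-positive widths reflects the general glLineWidth rule rather than anything stated in this message.

```c
#include <assert.h>
#include <stdbool.h>

/* Standard OpenGL enum values, defined locally for a self-contained sketch. */
#define GL_NO_ERROR 0
#define GL_INVALID_VALUE 0x0501

/* Hypothetical validation helper for a requested line width. */
unsigned check_line_width(float width, bool forward_compatible)
{
    /* Non-positive widths are invalid in any context. */
    if (width <= 0.0f)
        return GL_INVALID_VALUE;

    /* Only widths greater than 1.0 are rejected, and only in
     * forward-compatible contexts. */
    if (forward_compatible && width > 1.0f)
        return GL_INVALID_VALUE;

    return GL_NO_ERROR;
}
```

The earlier behaviour described above corresponds to returning an error for every width whenever forward_compatible is true.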
Reviewed-by: Matt Turner <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
| |
Otherwise, blits to the window system buffer may cause crashes,
since dst_irb->mt may be NULL.
This code is lifted straight out of brw_blorp_framebuffer()'s
try_blorp_blit() helper.
Fixes crashes in Piglit's fbo-sys-blit on systems without BLORP.
Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=65919
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
Reviewed-by: Chad Versace <[email protected]>
Cc: "9.2" <[email protected]>
| |
This command reads a value from memory and writes it to a register (the
opposite of MI_STORE_REGISTER_MEM). It's only available on Gen7+.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
This prevents a crash in a future patch.
_mesa_initialize_context() creates a default transform feedback object
by calling the NewTransformFeedbackObject() driver hook. Eventually,
we'll want to subclass that and allocate a buffer object. This means
passing brw->bufmgr to drm_intel_alloc_bo(), and crashing if it isn't
initialized yet.
The buffer manager is actually already initialized; we just hadn't
copied the pointer from intel_screen to intel_context quite early
enough.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
| |
Gen7+ supports four transform feedback streams. Using a function-like
macro makes it easy to access them by stream number or loop over them.
"GEN7_" prefixes are more common than "_IVB" suffixes, so use that.
Gen6 only supports a single stream, so the single #define should be
fine. However, SO_NUM_PRIM_STORAGE_NEEDED was a poor name. For one,
the word "NUM" doesn't appear in the actual name of the register.
It's also confusingly generic, as it doesn't exist on Gen7+. Add a
"GEN6_" prefix for clarity.
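The benefit of a function-like macro for per-stream registers can be seen in a small sketch. Everything named EXAMPLE_ below is hypothetical: the base offset and stride are placeholders chosen for illustration and are not the hardware register addresses or the driver's definitions.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder values for illustration only; not real register offsets. */
#define EXAMPLE_SO_REG_BASE   0x1000u
#define EXAMPLE_SO_REG_STRIDE 8u
#define EXAMPLE_NUM_STREAMS   4u

/* Function-like macro: register offset for transform feedback stream n. */
#define EXAMPLE_SO_REG(n) (EXAMPLE_SO_REG_BASE + (n) * EXAMPLE_SO_REG_STRIDE)

/* Looping over the streams is a one-liner with the macro, where four
 * separately named #defines would need four separate statements. */
void collect_stream_regs(uint32_t out[4])
{
    assert(out != 0);
    for (uint32_t n = 0; n < EXAMPLE_NUM_STREAMS; n++)
        out[n] = EXAMPLE_SO_REG(n);
}
```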
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Paul Berry <[email protected]>
|