| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
This adds a new TGSI property to represent the GLSL layout qualifier in TGSI.
|
|
|
|
|
| |
Fixes -Wimplicit-function-declaration for ffs with GCC. Spotted/tested
by Kai Wasserbäch.
|
|
|
|
|
|
|
|
|
| |
It sets the wrong values (GL_XXX_LEFT instead of GL_XXX), and no other
Mesa driver does this, given that Mesa sets the right draw/read buffers
provided the Mesa visual has the doublebuffer flag filled correctly
which is the case.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
| |
This avoids forming invalid pointers needlessly, which even if
never dereferenced is undefined behavior. It also makes
_mesa_validate_pbo_access() more comprehensible.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
| |
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
NULL as an error indicator is meaningless, since it will return NULL
on success anyway if the caller passes in zero as the image's address
and asks to calculate the offset of the first pixel. For example,
_mesa_validate_pbo_access() does this.
This also matches the code in the non-GL_BITMAP codepath, which
already has an assert like this.
v2: Per Brian Paul's review, remove the function call entirely
and tighten the assert to only accept the two formats compatible with
GL_BITMAP. They always have one component per pixel.
Reviewed-by: Brian Paul <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
| |
The return value here is a) always zero, b) never used.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
The bug, reported to me by Vadim Girlin on IRC, was causing overzealous
elimination of code in parallel if statements such as the following:
if (x) {
r = false;
}
if (y) {
r = true;
}
Before this commit, the assignment inside the first if block would be
misdetected as dead code and removed.
|
|
|
|
|
|
|
|
|
|
|
|
| |
Number of fragment shader variants is not very representative of the
memory used by LLVM, neither is number of shader instructions, as often
texture sampling constitutes most of the generated code.
This change adds an additional trim criteria: least recently used
fragment shader variants will be freed until the total number of LLVM IR
instruction falls below a specified threshold.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
variant list.
u_simple_list.h uses a sentinel element, and not a NULL element. So
ensure list is not empty when reducing the list of shader variants.
Something I noticed while trying to free variants more aggressively.
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
| |
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
| |
In a few places we need to allocate space for some number of generic
pixels. Use this new define instead of a magic number like 16 or
4 * sizeof(GLuint).
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
|
|
|
|
|
| |
We're now using the functions that live in swrast.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
We'll use the functions that live in main/
Plus, rename the remaining functions with "swrast_" prefix.
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
Copying these files is the first step in moving the software buffer
code from main/renderbuffer.c to swrast/s_renderbuffer.c
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
| |
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
The functions to allocate software color, depth, accum, etc buffers aren't
called from anywhere else.
Reviewed-by: Eric Anholt <[email protected]>
|
| |
|
| |
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Implemented in terms of renderbuffer mapping/unmapping and format
packing/unpacking functions.
The swrast and state tracker code for implementing accumulation are
unused and will be removed in the next commit.
v2: don't use memcpy() in _mesa_clear_accum_buffer()
v3: don't allocate MAX_WIDTH arrays, be more careful with mapping flags
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
| |
This code packs colors, Z, stencil, etc. in the various mesa pixel
formats. Will be used for things like glDrawPixels, glTexImage,
glAccum, etc.
|
|
|
|
|
|
|
| |
No driver implemented this and we always returned "True" for residence
queries.
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
| |
There's probably no reason to use a special version of memcpy() anymore.
|
|
|
|
|
|
|
|
|
| |
st_copy_texsubimage().
This has no piglit regressions on r600g and softpipe.
Signed-off-by: Henri Verbeet <[email protected]>
Reviewed-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Change and document the interpretation of the color conversion matrix
in order to make the function more versatile and to simplify the
generated shader.
Signed-off-by: Thomas Hellstrom <[email protected]>
Reviewed-by: Jakob Bornecrantz <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
In Gen6, transform feedback is accomplished by having the geometry
shader send vertex data to the data port using "Streamed Vertex Buffer
Write" messages, while simultaneously passing vertices through to the
rest of the graphics pipeline (if rendering is enabled).
This patch adds a geometry shader program that simply passes vertices
through to the rest of the graphics pipeline. The rest of transform
feedback functionality will be added in future patches.
To make the new geometry shader easier to test, I've added an
environment variable "INTEL_FORCE_GS". If this environment variable
is enabled, then the pass-through geometry shader will always be used,
regardless of whether transform feedback is in effect.
On my Sandy Bridge laptop, I'm able to enable INTEL_FORCE_GS with no
Piglit regressions.
Reviewed-by: Kenneth Graunke <[email protected]>
Acked-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
R02_PRIM_END and R02_PRIM_START don't actually refer to bits in DWORD
2 of R0 (as the name, and comments in the code, would seem to
indicate). Actually they refer to bits in DWORD 2 of the header for
URB_WRITE messages.
This patch renames the defines to reflect what they actually mean. It
also addes a define URB_WRITE_PRIM_TYPE_SHIFT, which previously was
just hardcoded in .c files.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Prior to this patch, in the Gen4 and Gen5 GS, we used GRF 0 (called
"R0" in the code) as a staging area to prepare the message header for
the FF_SYNC and URB_WRITE messages. This cleverly avoided an
unnecessary MOV operation (since the initial value of GRF 0 contains
data that needs to be included in the message header), but it made the
code confusing, since GRF 0 could no longer be relied upon to contain
its initial value once the GS started preparing its first message.
This patch avoids confusion by using a separate register ("header") as
the staging area, at the cost of one MOV instruction.
Worse yet, prior to this patch, the GS would completely overwrite the
contents of GRF 0 with the writeback data it received from a completed
FF_SYNC or URB_WRITE message. It did this because DWORD 0 of the
writeback data contains the new URB handle, and that neds to be
included in DWORD 0 of the next URB_WRITE message header. However,
that caused the rest of the message header to be corrupted either with
undefined data or zeros. Astonishingly, this did not produce any
known failures (probably by dumb luck). However, it seems really
dodgy--corrupting FFTID in particular seems likely to cause GPU hangs.
This patch avoids the corruption by storing the writeback data in a
temporary register and then copying just DWORD 0 to the header for the
next message. This costs one extra MOV instruction per message sent,
except for the final message.
Also, this patch moves the logic for overriding DWORD 2 of the header
(which contains PrimType, PrimStart, PrimEnd, and some other data that
we don't care about yet). This logic is now in the function
brw_gs_overwrite_header_dw2() rather than in brw_gs_emit_vue(). This
saves one MOV instruction in brw_gs_quads() and brw_gs_quad_strip(),
and paves the way for the Gen6 GS, which will need more complex logic
to override DWORD 2 of the header.
Finally, the function brw_gs_alloc_regs() contained a benign bug: it
neglected to increment the register counter when allocating space for
the "temp" register. This turned out not to have any effect because
the temp register wasn't used on Gen4 and Gen5, the only hardware
models (so far) to require a GS program. Now, all the registers
allocated by brw_gs_alloc_regs() are actually used, and properly
accounted for.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When the GS is not in use, the entire URB space is available for the
VS. When the GS is in use, we split the URB space 50/50.
The 50/50 split is probably not optimal--we'll probably want tune this
for performance in a future patch. For example, in most situations,
it's probably worth allocating more than 50% of the space to the VS,
since VS space is used for vertex caching. But for now this is good
enough.
Based on previous work by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
We never filled this in before because we didn't care.
I'm skeptical these are correct; my sources indicate that both the VS
and GS # of entries are 256 on both GT1 and GT2.
I'm also loathe to change it and break stuff.
Reviewed-by: Paul Berry <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Normally when outputting instructions in SPF (single program flow)
mode, we convert IF and ELSE instructions to conditional ADD
instructions applied to the IP register. On platforms prior to Gen6,
flow control instructions cause an implied thread switch, so this is a
significant savings.
However, according to the SandyBridge PRM (Volume 4 part 2, p79):
[Errata DevSNB{WA}] - When SPF is ON, IP may not be updated by
non-flow control instructions.
So we have to disable this optimization on Gen6.
On later platforms, there is no significant benefit to converting flow
control instructions to ADDs, so for the sake of consistency, this
patch disables the optimization on later platforms too.
The reason we never noticed this problem before is that so far we
haven't needed to use SPF mode on Gen6. However, later patches in
this series will introduce a Gen6 GS program which uses SPF mode.
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Previously, GS generation code contained a lookup table that mapped
primitive types POLYGON, TRISTRIP, and TRIFAN to TRILIST, mapped
LINESTRIP to LINELIST, and left all other primitives unchanged. This
was silly, because we never generate a GS program for those primitive
types anyhow.
This patch removes the unnecessary lookup table.
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch adds a new bit to the ctx->NewState bitfield,
_NEW_TRANSFORM_FEEDBACK, to track state changes that affect
ctx->TransformFeedback. This bit can be used by driver back-ends to
avoid expensive recomputations when transform feedback state has not
been modified.
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
When driCreateScreen calls driConvertConfigs to try to convert the
configs for swrast, it fails and returns NULL. Instead of checking,
it just clobbers psc->base.configs. Then, when the application asks
for the FBconfigs, there aren't any.
Instead, make the caller responsible for freeing the old modes lists
if both calls to driConvertConfigs succeed.
Without the second fix, glxinfo fails unless you run it with
LIBGL_ALWAYS_INDIRECT:
$ glxinfo
name of display: :0.0
Error: couldn't find RGB GLX visual or fbconfig
$ LIBGL_ALWAYS_INDIRECT=1 glxinfo
name of display: :0.0
display: :0 screen: 0
direct rendering: No (LIBGL_ALWAYS_INDIRECT set)
server glx vendor string: NVIDIA Corporation
server glx version string: 1.4
[...]
Signed-off-by: Aaron Plattner <[email protected]>
Reviewed-and-tested-by: Ian Romanick <[email protected]>
Signed-off-by: Brian Paul <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
This patch fixes the samplerCubeShadow support in GLSL shader compiler.
shader compiler was picking the 'r' texture coordinate for shadow comparison
when the expected behaviour is to use 'q' texture coordinate in case of cube
shadow maps.
Signed-off-by: Anuj Phogat <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
Reviewed-by: Ian Romanick <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes piglit tests fbo-array, fbo-depth-array, fbo-generatemipmap-array,
and array-texture, as well as the array variants of my new textureSize
and texelFetch tests.
Not a candidate for 7.11 because EXT_texture_array wasn't supported.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Fixes many crashes on Ivybridge due to upload_sf_state calling
brw_depthbuffer_format without an actual depth buffer. This was a
recent regression on master.
+3992 piglits on Ivybridge.
Signed-off-by: Kenneth Graunke <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
|
|
|
|
|
|
|
| |
This evolved over several commits, and I also wanted to document some
new information about how we handle formats.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
Now that all RBs have miptrees, and miptree mapping covered these last
two code paths, consistently use them.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
This mimics the MapRenderbuffer code, and should improve the
performance of glGetTexImage().
v2: Fix broken error handling.
|
|
|
|
|
| |
This gets the same performance win as the miptree maps did, and
removes a pile of code duplication.
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Right now the fake packed d/s RBs are creating two sub-renderbuffers
with their own storage, and the hardware setup and the mapping code
have been explicitly referencing them. By setting miptrees on them,
we'll be able to make our renderbuffer code for fake packed
depth/stencil more consistent with all our other renderbuffers.
The interesting new behavior here is that there is now a mt with a
non-depthstencil format (X8Z24) that has a stencil_mt field
associated. This looks like it should be safe, and we'll need to be
able to do this for floating point depth/stencil as well.
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Before, we had an uncached read of S8 to untile, then a RMW (so
uncached penalty) of the packed S8Z24 to store the value, then the
consumer would uncached read that once per pixel. If data was written
to the map, we would then have to uncached read the written data back
out and do the scatter to the tiled S8 buffer (also uncached access
penalties, since WC couldn't actually combine). So 3 or 5 uncached
accesses per pixel in the ROI (and we we were ignoring the ROI, so it
was the whole image).
Now we get an uncached read of S8 to untile, and an uncached read of
Z. The consumer gets to do cached accesses. Then if data was
written, we do streaming Z writes (WC success), and scattered S8
tiling writes (uncached penalty). So 2 or 3 uncached accesses per
pixel in the ROI.
This should be a performance win, to the extent that anybody is doing
software accesses of packed depth/stencil buffers.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
|
| |
We don't gripe about void * arithmetic for our driver, and this
prevents silly casting when assigning the result of mapping to
non-byte types.
Reviewed-by: Chad Versace <[email protected]>
|
|
|
|
|
|
|
| |
We're going to want to reuse this logic in mapping of fake packed
miptrees wrapping separate depth/stencil miptrees.
Reviewed-by: Chad Versace <[email protected]>
|