| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
| |
Check if a surface format can be used for the specified access type.
|
|
|
|
| |
Check if a surface format can be used as a VE format.
|
|
|
|
|
|
|
|
|
| |
Use the newly-introduced NV_VRAM_DOMAIN() macro to support alternative
VRAM domains for chips that do not have dedicated video memory.
Signed-off-by: Alexandre Courbot <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Martin Peres <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Some GPUs (e.g. GK20A, GM20B) do not embed VRAM of their own and use
the system memory as a backend instead. For such systems, allocating
objects in VRAM results in errors since the kernel will not allow
VRAM object allocations.
This patch adds a vram_domain member to struct nouveau_screen that can
optionally be initialized to an alternative domain to use for VRAM
allocations. If left untouched, NOUVEAU_BO_VRAM will be used for
systems that embed VRAM, and NOUVEAU_BO_GART will be used for VRAM-less
systems.
Code that uses GPU objects is then expected to use the NV_VRAM_DOMAIN()
macro in place of NOUVEAU_BO_VRAM to ensure correct behavior on
VRAM-less chips.
Signed-off-by: Alexandre Courbot <[email protected]>
Reviewed-by: Ilia Mirkin <[email protected]>
Reviewed-by: Martin Peres <[email protected]>
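
The domain selection described above can be sketched roughly as follows. This is a minimal illustration, not the driver code: `sketch_screen`, `sketch_screen_init()`, `SKETCH_VRAM_DOMAIN()` and the `SKETCH_BO_*` flag values are hypothetical stand-ins for `struct nouveau_screen`, its initialization path, `NV_VRAM_DOMAIN()` and the libdrm `NOUVEAU_BO_*` flags.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative placement flags (assumption); the real ones come from libdrm. */
#define SKETCH_BO_VRAM 0x1u
#define SKETCH_BO_GART 0x2u

/* Stand-in for struct nouveau_screen; only the new member is modelled. */
struct sketch_screen {
   uint32_t vram_domain;
};

/* Same idea as NV_VRAM_DOMAIN(): ask the screen instead of hard-coding VRAM. */
#define SKETCH_VRAM_DOMAIN(screen) ((screen)->vram_domain)

/* Hypothetical init helper: pick the domain once, when the screen is set up. */
static void
sketch_screen_init(struct sketch_screen *screen, int has_dedicated_vram)
{
   assert(screen != NULL);
   screen->vram_domain = has_dedicated_vram ? SKETCH_BO_VRAM : SKETCH_BO_GART;
}
```

Callers would then pass `SKETCH_VRAM_DOMAIN(screen)` wherever they previously passed the VRAM flag directly, so the same allocation code works on chips with and without dedicated video memory.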
|
|
|
|
|
| |
Replace gen6_idrt_data with ilo_state_compute, which has a bunch of
validations and is now preferred.
|
|
|
|
|
|
|
| |
This fixes a regression where r600 stopped working when
sampler views were pushed.
Signed-off-by: Dave Airlie <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
For query_levels, we generate a getinfo with writemask of (z), which RA
will consider as size==3. But we were still generating four fanouts,
which meant that RA would see it as two different register classes,
depending on the path to the definer. I.e. on the getinfo instruction itself
it would see size==3, but when chasing back through the fanouts it would
see size==4.
Easiest way to solve that is to just generate the chain of neighboring
fanouts to have the correct size in the first place.
Note: we may eventually want split_dest() to take start/end or wrmask
instead, since really we only need size==1. But RA is not clever enough
for that, query_levels is not that common, and the other two registers
that get allocated are never used so those register slots can be
immediately re-used. So bunch of work for probably no real gain.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Seems like a4xx gets this right.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
We get this information from NIR (which gets it from sview decl in tgsi
when translating from tgsi), so no need to maintain shader variants for
this.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This shuffles things around to allow the shader to have multiple basic
blocks. We drop the entire CFG structure from nir and just preserve the
blocks. At scheduling we know whether to schedule conditional branches
or unconditional jumps at the end of the block based on the # of block
successors. (Dropping jumps to the following instruction, etc.)
One slight complication is that variables (load_var/store_var, i.e.
arrays) are not in SSA form, so we have to figure out where to put the
phis ourselves. For this, we use the predecessor set information from
nir_block. (We could perhaps use NIR's dominance frontier information
to help with this?)
Signed-off-by: Rob Clark <[email protected]>
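
The rule for choosing a block terminator from the number of successors can be sketched as below. This is only an illustration under stated assumptions: `sketch_block`, `sketch_terminator` and `sketch_pick_terminator()` are hypothetical names, blocks are assumed to be emitted in index order, and a successor index of -1 means "no successor".

```c
#include <assert.h>

enum sketch_terminator {
   SKETCH_TERM_NONE,   /* fall through to the next block, or end of shader */
   SKETCH_TERM_JUMP,   /* unconditional jump */
   SKETCH_TERM_BRANCH  /* conditional branch */
};

/* Hypothetical block: its position in emission order plus up to two
 * successor indices (-1 when absent). */
struct sketch_block {
   int index;
   int succ[2];
};

static enum sketch_terminator
sketch_pick_terminator(const struct sketch_block *blk)
{
   /* A second successor without a first one would be malformed. */
   assert(blk->succ[0] >= 0 || blk->succ[1] < 0);

   if (blk->succ[0] >= 0 && blk->succ[1] >= 0)
      return SKETCH_TERM_BRANCH;

   if (blk->succ[0] >= 0) {
      /* A jump to the immediately following block is redundant, drop it. */
      if (blk->succ[0] == blk->index + 1)
         return SKETCH_TERM_NONE;
      return SKETCH_TERM_JUMP;
   }

   return SKETCH_TERM_NONE;
}
```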
|
|
|
|
|
|
|
| |
Without this, negative branch/jump offsets look like very large positive
offsets.
Signed-off-by: Rob Clark <[email protected]>
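
To illustrate the problem, here is a minimal sketch of sign-extending a narrow immediate field. The helper name `sketch_sign_extend()` and the field widths used in the example are assumptions; the commit does not say how wide the branch/jump offset field is.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 'bits' bits of 'raw' as a two's-complement value.
 * Without this step, an offset such as -2 stored in a 16-bit field
 * would be read back as 65534. */
static int32_t
sketch_sign_extend(uint32_t raw, unsigned bits)
{
   assert(bits >= 1 && bits <= 32);

   if (bits == 32)
      return (int32_t)raw;

   uint32_t mask = (1u << bits) - 1u;
   uint32_t sign = 1u << (bits - 1);

   raw &= mask;
   /* XOR then subtract the sign bit: leaves non-negative values unchanged
    * and propagates the sign into the upper bits for negative ones. */
   return (int32_t)((raw ^ sign) - sign);
}
```

For instance, `sketch_sign_extend(0xFFFE, 16)` yields -2 rather than 65534.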
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
| |
These belong in the shader, rather than the block. Mostly a lot of
churn and nothing too interesting. But splitting this out from the
rest of ir3_block reshuffling to cut down the noise in the later
patch.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Right now, just provides a cleaner way to get at the gpu-id, given the
separation between compiler and context. But we will need this also to
hold the reg-set for new register allocation.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
No longer used, or even possible, with NIR frontend.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
| |
Also remove ir3_flatten which was only used by tgsi f/e.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Use a more standard priority-queue based scheduling algo. It is simpler
and will make things easier once we have multiple basic blocks and flow
control.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Use standard list_head doubly-linked list and related iterators,
helpers, etc, rather than a weird combo of instruction array and next
pointers depending on stage. Now each block has an instrs_list. In
certain stages where we want to remove and re-add to the block's list
we just use list_replace() to copy the list to a new list_head.
Signed-off-by: Rob Clark <[email protected]>
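
The pattern of an intrusive doubly-linked list whose contents are handed over to a fresh head can be sketched as follows. All names here (`sketch_list`, `sketch_instr`, `sketch_list_move_all()`, and so on) are hypothetical; this is not Mesa's list_head implementation, and the hand-over helper below also resets the source list, which the real list_replace() may not do.

```c
#include <assert.h>
#include <stddef.h>

/* Minimal intrusive list node, in the spirit of a list_head. */
struct sketch_list {
   struct sketch_list *prev, *next;
};

static void
sketch_list_init(struct sketch_list *head)
{
   head->prev = head;
   head->next = head;
}

static int
sketch_list_empty(const struct sketch_list *head)
{
   return head->next == head;
}

static void
sketch_list_addtail(struct sketch_list *item, struct sketch_list *head)
{
   item->next = head;
   item->prev = head->prev;
   head->prev->next = item;
   head->prev = item;
}

/* Hand every node of 'from' over to 'to', leaving 'from' empty. */
static void
sketch_list_move_all(struct sketch_list *from, struct sketch_list *to)
{
   if (sketch_list_empty(from)) {
      sketch_list_init(to);
      return;
   }
   to->next = from->next;
   to->prev = from->prev;
   to->next->prev = to;
   to->prev->next = to;
   sketch_list_init(from);
}

/* Hypothetical instruction embedding the list node. */
struct sketch_instr {
   int opc;
   struct sketch_list node;
};

#define sketch_container_of(ptr, type, member) \
   ((type *)((char *)(ptr) - offsetof(type, member)))

/* Walk the list and add up the opcodes, to show iteration. */
static int
sketch_sum_opcs(struct sketch_list *head)
{
   int sum = 0;
   for (struct sketch_list *n = head->next; n != head; n = n->next)
      sum += sketch_container_of(n, struct sketch_instr, node)->opc;
   return sum;
}
```

A pass that wants to rebuild a block's instruction list can move the old contents to a temporary head, then re-add instructions to the now-empty block list in the new order.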
|
|
|
|
|
|
|
|
| |
At least for now. Right now the instruction and instruction list
printing should suffice, and the re-working of ir3_block would require
a lot of changes in that code.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
|
|
|
| |
Use ir3_MOV() builder in a couple of spots, rather than open-coding the
instruction construction. Also add ir3_NOP() builder and use that
instead of open coding.
Signed-off-by: Rob Clark <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Kenneth Graunke <[email protected]>
|
|
|
|
|
|
|
|
| |
v2: rebased on using SVIEW to hold type information
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Eric Anholt <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
| |
Some hardware needs to know the sampler type. Update the blit related
shaders to include SVIEW decl.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
To allow for shaders which use SVIEW decls for TEX* instructions, we
need to preserve the constraint that the shader either has no SVIEW's or
it has one matching SVIEW for each SAMP.
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
TODO single return_type (use enum)
v2: single return_type arg, and use enum
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
Based on mailing list discussion here:
http://lists.freedesktop.org/archives/mesa-dev/2014-November/071583.html
Signed-off-by: Rob Clark <[email protected]>
Reviewed-by: Roland Scheidegger <[email protected]>
Reviewed-by: Jose Fonseca <[email protected]>
|
|
|
|
|
| |
This doesn't fix the broken 1D cases of texsubimage, but it does prevent
segfaulting when dumping the QIR code generated in fbo-1d.
|
|
|
|
|
|
|
|
| |
We need to make sure that when we store the aligned box, we've got
initialized contents in the border. We could potentially just load the
border area, but for now let's get text rendering working in X (and fix
the GL_TEXTURE_2D errors in piglit's texsubimage test and
gl-2.1-pbo/test_tex_image)
|
|
|
|
| |
Core is more self-contained now.
|
|
|
|
| |
ilo_ib_state is not in core.
|
|
|
|
| |
It does not belong to core.
|
|
|
|
| |
It serves the same purpose as ilo_state_vertex_buffer does.
|
|
|
|
| |
It serves the same purpose as ilo_state_vertex_buffer does.
|
|
|
|
|
|
| |
Being a parameter-like state, we may want to get rid of
ilo_state_vertex_buffer_info or ilo_state_vertex_buffer eventually. But we
want them now as they are how we do cross-validation right now.
|
|
|
|
|
|
| |
3DSTATE_VF_INSTANCING specifies instancing enable and step rate. They are
specified along with 3DSTATE_VERTEX_BUFFERS instead prior to Gen8. Both
commands are added.
|
|
|
|
|
|
| |
3DSTATE_VF specifies cut index enable and cut index. Cut index enable is
specified in 3DSTATE_INDEX_BUFFER instead prior to Gen7.5. Both commands are
added.
|
|
|
|
| |
Make it obvious that we save a copy of pipe_index_buffer.
|
|
|
|
| |
Add missing parentheses in SURFTYPE_NULL initialization.
|
|
|
|
|
|
| |
ilo_shader.c: In function ‘ilo_shader_select_kernel_sbe’:
ilo_shader.c:1140:27: warning: ‘src_skip’ may be used uninitialized in this
function [-Wmaybe-uninitialized]
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
If the driver says PIPE_CAP_VERTEX_ELEMENT_SRC_OFFSET_4BYTE_ALIGNED_ONLY=1,
the driver should never receive a pipe_vertex_element::src_offset value
that's not a multiple of four. But the vbuf code wasn't actually adjusting
the src_offset value when creating the vertex element state object.
We just need to align the src_offset values put in the driver_attribs[]
array.
See the piglit gl-1.5-vertex-buffer-offsets test.
Reviewed-by: Marek Olšák <[email protected]>
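
A minimal sketch of the alignment step follows. The helper names are hypothetical, and rounding up to the next multiple of four is an assumption about how the offsets are adjusted; the message above only says the values must be aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Round 'value' up to the next multiple of four (assumed direction). */
static uint32_t
sketch_align4(uint32_t value)
{
   return (value + 3u) & ~3u;
}

/* Apply the alignment to every offset handed to the driver. */
static void
sketch_align_src_offsets(uint32_t *src_offsets, size_t count)
{
   assert(src_offsets != NULL || count == 0);
   for (size_t i = 0; i < count; i++)
      src_offsets[i] = sketch_align4(src_offsets[i]);
}
```

After this step, a driver that reports the 4-byte-aligned-only cap never sees an offset such as 2 or 7.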
|
|
|
|
|
| |
Remove trailing whitespace, move some braces, 78-column wrapping.
Trivial.
|
|
|
|
|
|
|
|
|
|
| |
There are three possible return values (not two): WGL_SWAP_COPY_ARB,
WGL_SWAP_EXCHANGE_EXT and WGL_SWAP_UNDEFINED_ARB.
VMware bug 1431184
Reviewed-by: Jose Fonseca <[email protected]>
Reviewed-by: Charmaine Lee <[email protected]>
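
The three-way result can be sketched as a simple mapping. The `sketch_swap_behaviour` enum and `sketch_wgl_swap_method()` are hypothetical; the token values are the ones listed in the WGL pixel format extension specifications.

```c
#include <assert.h>

/* Token values from the WGL_ARB_pixel_format / WGL_EXT_pixel_format specs. */
#ifndef WGL_SWAP_EXCHANGE_EXT
#define WGL_SWAP_EXCHANGE_EXT  0x2028
#endif
#ifndef WGL_SWAP_COPY_ARB
#define WGL_SWAP_COPY_ARB      0x2029
#endif
#ifndef WGL_SWAP_UNDEFINED_ARB
#define WGL_SWAP_UNDEFINED_ARB 0x202A
#endif

/* Hypothetical internal description of what a buffer swap does. */
enum sketch_swap_behaviour {
   SKETCH_SWAP_UNDEFINED,
   SKETCH_SWAP_COPY,
   SKETCH_SWAP_EXCHANGE
};

/* Report one of three values rather than collapsing to two. */
static int
sketch_wgl_swap_method(enum sketch_swap_behaviour behaviour)
{
   switch (behaviour) {
   case SKETCH_SWAP_COPY:
      return WGL_SWAP_COPY_ARB;
   case SKETCH_SWAP_EXCHANGE:
      return WGL_SWAP_EXCHANGE_EXT;
   case SKETCH_SWAP_UNDEFINED:
   default:
      return WGL_SWAP_UNDEFINED_ARB;
   }
}
```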
|
|
|
|
|
|
|
| |
Also, print a warning if we do return NULL from wglGetProcAddress() to
help spot this sort of problem in the future.
Reviewed-by: José Fonseca <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Viewperf 12 calls wglGetProcAddress() to get pointers to some unsupported
DSA and half-float functions. We return NULL but Viewperf doesn't check
for null before trying to jump through the pointer. That causes a crash.
This patch adds no-op functions to call instead (used by the next patch).
This avoids the crash but the rendering is incorrect.
Some DSA functions are being added to Mesa at this time so we may be
able to remove some of these no-ops in the future.
More no-op functions may be added as needed.
VMware PR1383421
Reviewed-by: José Fonseca <[email protected]>
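
The fallback idea can be sketched as a lookup that returns a no-op for known-but-unsupported names instead of NULL. Everything here is hypothetical, including the function names in the tables. The sketch also uses a single argument-less no-op for brevity; real no-ops would need to match each entry point's signature and calling convention.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef void (*sketch_proc)(void);

/* Stand-in for a function the driver really implements. */
static void
sketch_real_entry(void)
{
}

/* Does nothing; callers that never check for NULL can still jump here. */
static void
sketch_noop(void)
{
}

struct sketch_entry {
   const char *name;
   sketch_proc proc;
};

/* Hypothetical tables: supported entry points, and names we only stub. */
static const struct sketch_entry sketch_supported[] = {
   { "sketchSupportedFunc", sketch_real_entry },
};

static const char *const sketch_stubbed[] = {
   "sketchUnsupportedDsaFunc",
   "sketchUnsupportedHalfFloatFunc",
};

static sketch_proc
sketch_get_proc_address(const char *name)
{
   size_t i;

   assert(name != NULL);

   for (i = 0; i < sizeof(sketch_supported) / sizeof(sketch_supported[0]); i++) {
      if (strcmp(name, sketch_supported[i].name) == 0)
         return sketch_supported[i].proc;
   }

   for (i = 0; i < sizeof(sketch_stubbed) / sizeof(sketch_stubbed[0]); i++) {
      if (strcmp(name, sketch_stubbed[i]) == 0)
         return sketch_noop;
   }

   /* Truly unknown names still yield NULL. */
   return NULL;
}
```

As the message notes, this only avoids the crash; anything drawn through a stubbed function is silently skipped, so rendering stays incorrect.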
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
WGL_CONTEXT_PROFILE_MASK_ARB doesn't apply to desktop OpenGL versions
less than 3.2 -- applications can't specify whether they want a core or
a compat 3.1 context -- instead they are supposed to check whether the
returned context advertises the GL_ARB_compatibility extension.
Mesa doesn't support compatibility contexts for versions higher than 3.1,
so we used to return a core profile context, but this makes several Windows
applications unhappy, because they just assume they got a compatibility
context without checking.
So it seems safer on Windows to never return a core profile for 3.1,
i.e., just fail the context creation.
VMware PR1365920.
Reviewed-by: Brian Paul <[email protected]>
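
The resulting policy might be sketched like this. The function and enum names are hypothetical, and the handling of versions other than 3.1 (compatibility below 3.1, failure when a compatibility profile is requested for 3.2 and later) is an assumption inferred from the description above rather than a statement of what the code does.

```c
#include <assert.h>

/* Bit values from the WGL_ARB_create_context_profile spec. */
#define SKETCH_CORE_PROFILE_BIT   0x1u
#define SKETCH_COMPAT_PROFILE_BIT 0x2u

enum sketch_ctx_result {
   SKETCH_CTX_FAIL,
   SKETCH_CTX_COMPAT,
   SKETCH_CTX_CORE
};

static enum sketch_ctx_result
sketch_choose_profile(int major, int minor, unsigned profile_mask)
{
   int version = major * 10 + minor;

   assert(major >= 1 && minor >= 0 && minor <= 9);

   /* Below 3.1 there is no profile distinction (assumed: plain compat). */
   if (version < 31)
      return SKETCH_CTX_COMPAT;

   /* 3.1: the mask does not apply and only core could be provided,
    * so fail instead of surprising applications that expect compat. */
   if (version == 31)
      return SKETCH_CTX_FAIL;

   /* 3.2 and later: the mask applies; compat is unavailable (assumed). */
   if (profile_mask & SKETCH_COMPAT_PROFILE_BIT)
      return SKETCH_CTX_FAIL;

   return SKETCH_CTX_CORE;
}
```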
|
|
|
|
|
|
|
| |
To allow sampling from the surface for things like glCopyPixels
or glCopyTexSubImage.
Reviewed-by: Charmaine Lee <[email protected]>
|