| Commit message (Collapse) | Author | Age | Files | Lines |
|
|
|
|
|
| |
Fulfills an unused internal interface
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Architecture benefits from having more threads/work outstanding.
Patch by Jan Zielinski.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Swr caches fb contents in tiles. Those tiles are stored on a per-context
basis.
When switching contexts that share resources we need to make sure that
the tiles of the old context are being stored and the tiles of the new
context are being invalidated (marked as invalid, hence contents need
to be reloaded).
The context does not get any dirty bits to identify this case. This has
to be, then, coordinated by the resources that are being shared between
the contexts.
Add a "curr_pipe" hook in swr_resource that will allow us to identify a
MakeCurrent of the above form during swr_update_derived(). At that time,
we invalidate the tiles of the new context. The old context, will need to
have already store its tiles by that time, which happens during glFlush().
glFlush() is being called at the beginning of MakeCurrent.
So, the sequence of operations is:
- At the beginning of glXMakeCurrent(), glFlush() will store the tiles
of all bound surfaces of the old context.
- After the store, a fence will guarantee that the all tile store make
it to the surface
- During swr_update_derived(), when we validate the new context, we check
all resources to see what changed, and if so, we invalidate the
current tiles.
Fixes rendering problems with CEI/Ensight.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
swr used to build and link the rasterizer to the driver, and to support
multiple architectures we needed to have multiple versions of the
driver/rasterizer combination, which needed to link in much of mesa.
Changing to having one instance of the driver and just building
architecture specific versions of the rasterizer gives a large reduction
in disk space.
libGL.so 6464 Kb -> 7000 Kb
libswrAVX.so 10068 Kb -> 5432 Kb
libswrAVX2.so 9828 Kb -> 5200 Kb
Total 26360 Kb -> 17632 Kb
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Use the SWR rasterizer API through the table returned from
SwrGetInterface rather than referencing the functions directly.
This will allow us to move to a model of having the driver dynamically
load the appropriate swr architecture library.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Tag pStat field in swr_draw_context structure so gen_llvm_types.py
can deal with the actual structure type instead of using void.
Code cleanup, no functional change.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
v3: list piglit tests fixed by this patch. Fixed typo Tim pointed out.
v2: Reword commit message to more closely adhere to community
guidelines.
This patch moves msaa resolve down into core/StoreTiles where the
surface format conversion routines are available. The previous
"experimental" resolve was limited to 8-bit unsigned render targets.
This fixes a number of piglit msaa tests by adding resolve support for
all the render target formats we support.
Specifically:
layered-rendering/gl-layer-render: fail->pass
layered-rendering/gl-layer-render-storage: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg: crash->pass
multisample-formats *[2,4,8,16] gl_ext_texture_snorm: crash->pass
multisample-formats *[2,4,8,16] gl_arb_texture_float: fail->pass
multisample-formats *[2,4,8,16] gl_arb_texture_rg-float: fail->pass
MSAA is still disabled by default, but can be enabled with
"export SWR_MSAA_MAX_COUNT=4" (1,2,4,8,16 are options)
The default is 0, which is disabled.
This patch improves the number of multisample-formats supported by swr,
and fixes several crashes currently in the 17.1 branch. Therefore, it
should be considered for inclusion in the 17.1 stable release. Being
disabled by default, it poses no risk to most users of swr.
Reviewed-by: Tim Rowley <[email protected]>
cc: [email protected]
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This patch enables multisample antialiasing in the OpenSWR software renderer.
MSAA is a proof-of-concept/work-in-progress with bug fixes and performance
on the way. We wanted to get the changes out now to allow several customers
to begin experimenting with MSAA in a software renderer. So as not to
impact current customers, MSAA is turned off by default - previous
functionality and performance remain intact. It is easily enabled via
environment variables, as described below.
It has only been tested with the glx-lib winsys. The intention is to
enable other state-trackers, both Windows and Linux and more fully support
FBOs.
There are 2 environment variables that affect behavior:
* SWR_MSAA_FORCE_ENABLE - force MSAA on, for apps that are not designed
for MSAA... Beware, results will vary. This is mainly for testing.
* SWR_MSAA_MAX_SAMPLE_COUNT - sets maximum supported number of
samples (1,2,4,8,16), or 0 to disable MSAA altogether.
(The default is currently 0.)
Reviewed-by: George Kyriazis <[email protected]>
|
|
|
|
|
|
|
|
|
| |
The context now contains SIMD vectors which must be aligned (specifically
samplePositions in the rastState in the derived state). Failure to align
can result in segv crash on unaligned memory access in vector
instructions.
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
| |
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Passes corresponding piglit tests.
Reviewed-by: Edward O'Callaghan <[email protected]>
|
|
|
|
|
|
| |
the guards have been added to the header files that needed them.
Reviewed-by: Ilia Mirkin <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
Notes:
- make sure the default size is large enough to handle all state trackers
- pipe wrappers don't receive transfer calls from stream_uploader, because
pipe_context::stream_uploader points directly to the underlying driver's
stream_uploader (to keep it simple for now)
v2: add error handling to nv50, nvc0, noop
v3: set const_uploader
Reviewed-by: Nicolai Hähnle <[email protected]>
Tested-by: Edmondo Tommasina <[email protected]> (v1)
Tested-by: Charmaine Lee <[email protected]>
|
|
|
|
|
|
|
| |
Work can now be added to fences and triggered by fence completion. This
allows for deferred resource deletion, and other asynchronous tasks.
Reviewed-by: George Kyriazis <[email protected]>
|
|
|
|
|
| |
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
| |
We now support clearing these, and actually rendering to multiple layers
would require GS support, which will fail in much more spectacular ways
for now. Once that is hooked up, there won't be anything else to do
here.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
This is a bit of a mega-commit, but unfortunately there's no great way
to break this up since a lot of different pieces have to match up. Here
we do the following:
- change surface layout to match swr's Load/StoreTile expectations
- fix sampler settings to respect all sampler view parameters
- fix stencil sampling to read from secondary resource
- respect pipe surface format, level, and layer settings
- fix resource map/unmap based on the new layout logic
- fix resource map/unmap to copy proper parts of stencil values in and
out of the matching depth texture
These fix a massive quantity of piglits, including all the
tex-miplevel-selection ones.
Note that the swr native miptree layout isn't extremely space-efficient,
and we end up using it for all textures, not just the renderable ones. A
back-of-the-envelope calculation suggests about 10%-25% increased memory
usage for miptrees, depending on the number of LODs. Single-LOD textures
should be unaffected.
There are a handful of regressions as a result of this change:
- Some textureGrad tests, these failures match llvmpipe. (There are
debug settings allowing improved gallivm sampling accurancy.)
- Some layered clearing tests as swr doesn't currently support that. It
was getting lucky before because enough other things were broken.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
|
| |
Reorder header files so that we have a chance to defined NOMINMAX before
mesa include files include windows.h
v3: split from bigger patch
Reviewed-by: Emil Velikov <[email protected]>
|
|
|
|
|
|
|
| |
Fixes the texsubimage piglit and lets the copyteximage one get further.
Signed-off-by: Ilia Mirkin <[email protected]>
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Previous fundamental change in stats gathering added a temporary
SwrWaitForIdle to begin_query and end_query. Code has been reworked to
remove stall.
Reviewed-by: George Kyriazis <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
| |
Separated FE stats out into its own structure. There are 17 FE vs 3 BE
stat fields. Since there is only one FE thread per DC then we don't have
to loop over all threads and sum up FE stats over all the worker threads.
This also reduces size of DC since we only need to store one copy of the
FE stats and not one per worker. Finally, we can use the new FE callback
mechanism to update these.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
| |
Add a per draw stats callback to update driver stats.
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
| |
1. SoWriteOffset is no longer treated as a stat
2. Added callback from core to update streamout write offset
Signed-off-by: Tim Rowley <[email protected]>
|
|
|
|
|
|
|
|
| |
required by glClientWaitSync (GL 4.5 Core spec) that can optionally flush
the context
Reviewed-by: Rob Clark <[email protected]>
Reviewed-by: Nicolai Hähnle <[email protected]>
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| |
to reduce the call indirections with u_resource_vtbl.
The worst call tree you could get was:
- u_transfer_inline_write_vtbl
- u_default_transfer_inline_write
- u_transfer_map_vtbl
- driver_transfer_map
- u_transfer_unmap_vtbl
- driver_transfer_unmap
That's 6 indirect calls. Some drivers only had 5. The goal is to have
1 indirect call for drivers that care. The resource type can be determined
statically at most call sites.
The new interface is:
pipe_context::buffer_subdata(ctx, resource, usage, offset, size, data)
pipe_context::texture_subdata(ctx, resource, level, usage, box, data,
stride, layer_stride)
v2: fix whitespace, correct ilo's behavior
Reviewed-by: Nicolai Hähnle <[email protected]>
Acked-by: Roland Scheidegger <[email protected]>
|
|
|
|
|
|
|
|
|
| |
A pipe pointer in the screen allows for access to current device context
in flush_frontbuffer and resource_destroy. This wasn't tracking current
context in multi-context situations.
v2: More caffeine. Corrected compare, removed unnecessary set of
screen-pipe in create_context, and added a few comments.
|
|
|
|
| |
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
| |
Fixes resource memory leaks.
Reviewed-by: Ilia Mirkin <[email protected]>
|
| |
|
|
|
|
|
|
|
|
|
|
| |
Removed bound_to_context. We now pick up the context from the screen
instead of the resource itself. The resource could be out-of-date
and point to a pipe that is already freed.
Fixes manywin mesa xdemo.
Reviewed-by: Bruce Cherniak <[email protected]>
|
|
|
|
|
|
|
|
| |
Better tracking of resource state and synchronization.
A follow on commit will clean up resource functions into a new
swr_resource.cpp file.
Reviewed-By: George Kyriazis <[email protected]>
|
|
OpenSWR is a new software rasterizer for x86 processors designed
for high performance and high scalablility on visualization workloads.
Acked-by: Roland Scheidegger <[email protected]>
Acked-by: Jose Fonseca <[email protected]>
|