aboutsummaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4/vc4_state.c
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Respect a sampler view's first_layer field.Eric Anholt2018-08-071-1/+3
| | | | | | | Fixes texturing from EGL images created from cubemap faces, as in dEQP-EGL.functional.image.create.gles2_cubemap_negative_x_rgba_texture Cc: [email protected]
* vc4: use util_copy_framebuffer_stateRob Clark2018-05-151-12/+2
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* broadcom/vc4: Allow binding non-zero constant buffers.Eric Anholt2018-03-091-3/+4
| | | | | We're going to use UBO loads for implementing YUV linear-to-T-format blits.
* broadcom/vc4: Add support for YUV textures using unaccelerated blits.Eric Anholt2018-02-231-1/+2
| | | | | Previously we would assertion fail about having no hardware format. This is enough to get kmscube -M nv12-2img working.
* broadcom/vc4: Fix double-unrefcounting of prsc->next with shadows.Eric Anholt2018-02-231-6/+11
| | | | | | | When we set up the shadow resource we were copying the original resource as the template, including its prsc->next field. When we shadowed the first YUV plane's resource for linear-to-tiled conversion, we would end up unbalancing the refcount on the shadow resource's destruction.
* broadcom/vc4: Use pipe_resource_reference in sampler views.Eric Anholt2018-02-231-2/+2
| | | | Improves u_debug_refcount output.
* broadcom/vc4: Expose PIPE_CAP_TILE_RASTER_ORDEREric Anholt2017-10-101-0/+12
| | | | | | | | | | | | | | Because vc4 can control the order that tiles are rasterized in, we can use it to implement overlapping blits using normal drawing and GL_ARB_texture_barrier, as long as we can tell the kernel what order to render the tiles in. v2: Fix on the simulator. v3: Add the cap (disabled) to other drivers, add rst docs for the cap. v4: Rebase on PIPE_CAP_TGSI_ANY_REG_AS_ADDRESS v5: Split from the core gallium commit, drop some unnecessary code related to glBlitFramebuffer(), fix a crash with clears before state has been bound.
* vc4: Add labels to BOs for debug builds or with VC4_DEBUG=surf set.Eric Anholt2017-09-271-0/+3
| | | | | | | This has proven to be incredibly useful for debugging CMA allocation failures and driving memory management improvements. However, we don't want to burden entry and exit from the BO cache with the labeling ioctl's overhead on release builds.
* broadcom/vc4: Keep pipe_sampler_view->texture matching the original texture.Eric Anholt2017-09-261-13/+21
| | | | | | | | | | | I was overwriting view->texture with the shadow resource when we need to do shadow copies (retiling or baselevel rebase), but that tripped up some critical new sanity checking in state_tracker (making sure that stObj->pt hasn't changed from view->texture through TexImage-related paths). To avoid that, move the shadow resource to the vc4_sampler_view struct. Fixes: f0ecd36ef8e1 ("st/mesa: add an entirely separate codepath for setting up buffer views")
* vc4: Move rasterizer state packing to CSO creation time.Eric Anholt2017-06-301-3/+14
| | | | | | | | | | This gets our vc4_emit.c size back down a bit: before: 1020 0 0 1020 3fc src/gallium/drivers/vc4/.libs/vc4_emit.o after: 968 0 0 968 3c8 src/gallium/drivers/vc4/.libs/vc4_emit.o
* vc4: Drop the u_resource_vtbl no-op layer.Eric Anholt2017-05-171-2/+2
| | | | | We only ever attached one vtbl, so it was a waste of space and indirections.
* gallium: remove pipe_index_buffer and set_index_bufferMarek Olšák2017-05-101-19/+0
| | | | | | | | | | | | | | pipe_draw_info::indexed is replaced with index_size. index_size == 0 means non-indexed. Instead of pipe_index_buffer::offset, pipe_draw_info::start is used. For indexed indirect draws, pipe_draw_info::start is added to the indirect start. This is the only case when "start" affects indirect draws. pipe_draw_info::index is a union. Use either index::resource or index::user depending on the value of pipe_draw_info::has_user_indices. v2: fixes for nine, svga
* gallium: s/uint/enum pipe_shader_type/ for set_constant_buffer()Brian Paul2017-03-081-1/+2
| | | | Reviewed-by: Edward O'Callaghan <[email protected]>
* vc4: Add miptree/texture state support for ETC1 compressed textures.Eric Anholt2016-11-031-0/+3
| | | | | The format isn't flagged as enabled at runtime yet, because we need kernel validation support.
* vc4: Implement job shufflingEric Anholt2016-09-141-35/+1
| | | | | | | | | | | | | | | Track rendering to each FBO independently and flush rendering only when necessary. This lets us avoid the overhead of storing and loading the frame when an application momentarily switches to rendering to some other texture in order to continue rendering the main scene. Improves glmark -b desktop:effect=shadow:windows=4 by 27% Improves glmark -b desktop:blur-radius=5:effect=blur:passes=1:separable=true:windows=4 by 17% While I haven't tested other apps, this should help X rendering a lot, and I've heard GLBenchmark needed it too.
* vc4: Move the render job state into a separate structure.Eric Anholt2016-09-141-12/+13
| | | | | This is a preparation step for having multiple jobs being queued up at the same time.
* gallium: Use enum pipe_shader_type in set_sampler_views()Kai Wasserbäch2016-08-291-2/+3
| | | | | Signed-off-by: Kai Wasserbäch <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: Use enum pipe_shader_type in bind_sampler_states() (v2)Kai Wasserbäch2016-08-291-1/+1
| | | | | | | | | | | v1 → v2: - Fixed indentation (noted by Brian Paul) - Removed second assert from nouveau's switch statements (suggested by Brian Paul) Signed-off-by: Kai Wasserbäch <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* vc4: Zero-initialize the hardware sampler view structure.Eric Anholt2016-07-311-1/+1
| | | | | Fixes failure to initialize the force_first_level flag, causing failures in piglit levelclamp.
* vc4: Speed up glGenerateMipmaps by avoiding shadow baselevel.Eric Anholt2016-07-151-2/+7
| | | | | | | | | | | | | To support general GL_TEXTURE_BASE_LEVEL we have to copy to a temporary miptree. However, if a single level is being selected, we can use the existing miptree and force all the sampling to be from that particular level. This avoids a ton of software fallbacks in glGenerateMipmaps(), which uses base levels in the blit implementation in gallium. Improves "glmark2 -b terrain" from 2 fps to 3 (perhaps some more precision would be useful?), and cuts its CPU usage during the benchmarking from ~30% to ~10% (total CPU time from 8.8s to 7.6s).
* vc4: Drop VC4_DIRTY_TEXSTATE in favor of the per-stage flags.Eric Anholt2016-07-151-4/+0
| | | | | | The compiler uses the per-stage flags already, so it didn't need this. vc4_uniforms was using it, so just replace it with both of the stage flags for now.
* vc4: Remove dead dirty_samplers field.Eric Anholt2016-07-151-4/+0
| | | | We use a big VC4_DIRTY_FRAGTEX/VC4_DIRTY_VERTEX on the stage, instead.
* gallium: make constant_buffer constRob Clark2016-06-201-1/+1
| | | | | Signed-off-by: Rob Clark <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* vc4: Don't consider nr_samples==1 surfaces to be MSAA.Eric Anholt2015-12-151-2/+2
| | | | | | This is apparently a weirdness of gallium -- nr_samples==1 is occasionally used and means the same thing as nr_samples==0. Fixes a bunch of ARB_framebuffer_srgb blit cases in piglit.
* vc4: Add support for drawing in MSAA.Eric Anholt2015-12-081-0/+19
|
* vc4: Add support for loading sample mask.Eric Anholt2015-12-041-1/+1
|
* vc4: Avoid loading undefined (newly-allocated) FBO contents.Eric Anholt2015-11-091-0/+17
| | | | | | | Since X has undefined contents in new pixmaps, it will allocate new textures for an FBO and draw to them without an explicit clear. For VC4, it's much faster to emit a clear than the load of the actual undefined memory contents, so just do that instead.
* vc4: Return NULL when we can't make our shadow for a sampler view.Eric Anholt2015-11-091-0/+4
| | | | | | | I'm not sure what the caller does is appropriate (just have a NULL sampler at this slot), but it fixes the immediate crash. Cc: "11.0" <[email protected]>
* vc4: Allow user index buffers, to avoid slow readback for shadow IBs.Eric Anholt2015-10-291-1/+1
| | | | | Improves low-settings openarena performance by 31.9975% +/- 0.659931% (n=7).
* vc4: Convert blending to being done in 4x8 unorm normally.Eric Anholt2015-10-231-1/+3
| | | | | | | | | | | | | We can't do this all the time, because you want blending to be done in linear space, and sRGB would lose too much precision being done in 4x8. The win on instructions is pretty huge when you can, though. total uniforms in shared programs: 32065 -> 32168 (0.32%) uniforms in affected programs: 327 -> 430 (31.50%) total instructions in shared programs: 92644 -> 89830 (-3.04%) instructions in affected programs: 15580 -> 12766 (-18.06%) Improves openarena performance at 1920x1080 from 10.7fps to 11.2fps.
* vc4: Cache the texture p1 for the sampler.Eric Anholt2015-07-141-1/+54
| | | | | Cuts another 12% of vc4_uniforms.o, in exchange for computing it at CSO creation time.
* vc4: Cache texture p0/p1 setup for the sampler view.Eric Anholt2015-07-141-11/+24
| | | | | In exchange for a bit of space and computation in CSO setup, we cut vc4_uniform.c (draw time) code size by 4.8%.
* vc4: Fix some -Wdouble-promotion warnings.Eric Anholt2015-07-141-1/+1
| | | | | No code generation changes from this, but it'll be useful to have this next time I go checking -Wdouble-promotion.
* vc4: Just stream out fallback IB contents.Eric Anholt2015-05-271-18/+2
| | | | | | | | | | | | | | | The idea I had when I wrote the original shadow code was that you'd see a set_index_buffer to the IB, then a bunch of draws out of it. What's actually happening in openarena is that set_index_buffer occurs at every draw, so we end up making a new shadow BO every time, and converting more of the BO than is actually used in the draw. While I could maybe come up with a better caching scheme, for now just do the simple thing that doesn't result in a new shadow IB allocation per draw. Improves performance of isosurf in drawelements mode by 58.7967% +/- 3.86152% (n=8).
* vc4: Don't forget to make our raster shadow textures non-raster.Eric Anholt2015-05-271-0/+3
| | | | | | Not sure what happened in my testing that made the previous shadow code fix glxgears swapbuffering, but this also fixes lots of CopyArea in X (like dragging xlogo around in metacity).
* vc4: Update the shadow texture for public textures on every draw.Eric Anholt2015-04-151-6/+1
| | | | | We don't know who else has written to it, so we'd better update it every time. This makes the gears spin in X again.
* vc4: When asked to sample from a raster texture, make a shadow tiled copy.Eric Anholt2015-04-131-2/+9
| | | | | | | | | | | So, it turns out my simulator doesn't *quite* match the hardware. And the errata about raster textures tells you most of what's wrong, but there's still stuff wrong after that. Instead, if we're asked to sample from raster, we'll just blit it to a tiled temporary. Raster textures should only be screen scanout, and word is that it's faster to copy to tiled using the tiling engine first than to texture from an entire raster texture, anyway.
* vc4: Add support for enabling early Z discards.Eric Anholt2014-12-161-0/+18
| | | | This is the same basic logic from the original Broadcom driver.
* vc4: Don't throw out the index offset in the shadow index buffer path.Eric Anholt2014-12-111-2/+1
| | | | | When we upload shadow indices at draw time, we need the source offset. Fixes the piglit draw-elements test.
* vc4: Fix stencil writemask handling.Eric Anholt2014-10-211-2/+2
| | | | | | | If the writemask doesn't compress, then we want to put in the uncompressed writemask, not the compressed writemask failure value (all-on). Fixes glean's stencil2 and fbo-clear-formats on stencil.
* vc4: Don't look at back stencil state unless two-sided stencil is enabled.Eric Anholt2014-10-211-2/+6
| | | | | Fixes regressions in the next bugfix, because gallium util stuff leaves the back stencil state as 0 if !back->enabled.
* vc4: Translate 4-byte index buffers to 2 bytes.Eric Anholt2014-10-191-4/+21
| | | | Fixes assertion failures in 14 piglit tests (half of which now pass).
* vc4: Add support for rebasing texture levels so firstlevel == 0.Eric Anholt2014-10-191-1/+25
| | | | | | GLES2 doesn't have GL_TEXTURE_BASE_LEVEL, so the hardware doesn't. Fixes piglit levelclamp, tex-miplevel-selection, and texture-storage/2D mipmap rendering.
* vc4: Add support for user clip plane and gl_ClipVertex.Eric Anholt2014-10-151-1/+3
| | | | Fixes about 15 piglit tests about interpolation and clipping.
* vc4: Fix render target NPOT alignment at small miplevels.Eric Anholt2014-10-141-3/+12
| | | | | | | | The texturing hardware takes the POT level 0 width/height and minifies those. This is different from what we were doing, for example, for 273-wide's level 5: POT(273>>5) == 8, while POT(273)>>5 == 16. Fixes piglit-depthstencil-render-miplevels 273.
* vc4: Mostly fix offset calculation for NPOT mipmap levels.Eric Anholt2014-10-091-1/+12
| | | | | | | | | | | | | | The non-base NPOT levels are stored as POT-aligned images. We get that POT alignment by minifying the POT-aligned base level. This means that level strides are also POT aligned, so we have to tell the rendering mode config that our resource is larger than the actual requested area. Fixes the fbo-generatemipmap-formats NPOT cases. Regresses depthstencil-render-miplevels 273 * -- the texture presentation now works (where it was completely broken before), it looks like there's some overflow of image bounds happening at the lower miplevels.
* vc4: Add support for point size setting.Eric Anholt2014-09-241-2/+4
| | | | This is the support for both the global and per-vertex modes.
* vc4: Actually add support for polygon offset.Eric Anholt2014-09-241-1/+11
| | | | | Setting the bit without setting the offset values is kind of useless. Fixes piglit polygon-offset (but not polygon-mode-offset).
* vc4: Add support for flat shading.Eric Anholt2014-09-231-0/+7
| | | | | | | This is just the GL 1.1 flat shading of colors -- we don't need to support TGSI constant interpolation bits, because we don't do GLSL 1.30. Fixes 7 piglit tests.
* vc4: Add support for stencil operations.Eric Anholt2014-09-181-0/+71
| | | | | | | While depth test state is passed through the fragment shader as sideband, data, the stencil test state has to be set by the fragment shader itself. Many tests are still failing, but this gets most of hiz/ passing.