Name

    MESA_swap_frame_usage

Name Strings

    GLX_MESA_swap_frame_usage

Contact

    Ian Romanick, IBM, idr at us.ibm.com

Status

    Deployed in DRI drivers post-XFree86 4.3.

Version

    Date: 5/1/2003   Revision: 1.1

Number

    ???

Dependencies

    GLX_SGI_swap_control affects the definition of this extension.
    GLX_MESA_swap_control affects the definition of this extension.
    GLX_OML_sync_control affects the definition of this extension.

    Based on WGL_I3D_swap_frame_usage version 1.3.

Overview

    This extension allows an application to determine what portion of the
    swap period has elapsed since the last swap operation completed.  The
    "usage" value is a floating point value on the range [0,max] which is
    calculated as follows:

                              td
                   percent = ----
                              tf

    where td is the time measured from the last completed buffer swap (or
    call to enable the statistic) to when the next buffer swap completes, tf
    is the entire time for a frame which may be multiple screen refreshes
    depending on the swap interval as set by the GLX_SGI_swap_control or
    GLX_OML_sync_control extensions.

    The value, percent, indicates the amount of time spent between the
    completion of the two swaps.  If the value is in the range [0,1], the
    buffer swap occurred within the time period required to maintain a
    constant frame rate.  If the value is in the range (1,max], a constant
    frame rate was not achieved.  The value indicates the number of frames
    required to draw.

    This definition of "percent" differs slightly from
    WGL_I3D_swap_frame_usage.  In WGL_I3D_swap_frame_usage, the measurement
    is taken from the completion of one swap to the issuance of the next.
    This representation may not be as useful as measuring between
    completions, as a significant amount of time may pass between the
    issuance of a swap and the swap actually occurring.

    There is also a mechanism to determine whether a frame swap was
    missed.

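The usage ratio from the Overview can be illustrated with a minimal sketch. The helper names frame_usage and met_frame_rate are hypothetical and are not part of this extension; the sketch only assumes that td and tf are available as times in seconds.

```c
#include <assert.h>

/* Hypothetical helper, not part of the extension: usage = td / tf,
 * where td is the time between two swap completions and tf is the
 * swap period, both in seconds.
 */
static double frame_usage(double td_seconds, double tf_seconds)
{
    assert(tf_seconds > 0.0);
    return td_seconds / tf_seconds;
}

/* Returns 1 when the usage lies in [0,1], i.e. the swap completed
 * within one swap period, and 0 otherwise.
 */
static int met_frame_rate(double usage)
{
    return usage >= 0.0 && usage <= 1.0;
}
```

For example, with an assumed 60 Hz refresh and a swap interval of one, tf is 1/60 s. A frame that takes 12 ms between completions gives a usage of about 0.72 and meets the frame rate, while one that takes 25 ms gives about 1.5 and does not.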
New Procedures and Functions

    int glXGetFrameUsageMESA(Display *dpy,
                             GLXDrawable drawable,
                             float *usage)

    int glXBeginFrameTrackingMESA(Display *dpy,
                                  GLXDrawable drawable)

    int glXEndFrameTrackingMESA(Display *dpy,
                                GLXDrawable drawable)

    int glXQueryFrameTrackingMESA(Display *dpy,
                                  GLXDrawable drawable,
                                  int64_t *swapCount,
                                  int64_t *missedFrames,
                                  float *lastMissedUsage)

New Tokens

    None

Additions to Chapter 2 of the 1.4 GL Specification (OpenGL Operation)

    None

Additions to Chapter 3 of the 1.4 GL Specification (Rasterization)

    None

Additions to Chapter 4 of the 1.4 GL Specification (Per-Fragment Operations
and the Framebuffer)

    None

Additions to Chapter 5 of the 1.4 GL Specification (Special Functions)

    None

Additions to Chapter 6 of the 1.4 GL Specification (State and State Requests)

    None

Additions to the GLX 1.3 Specification

    The frame usage is measured as the percentage of the swap period elapsed
    between two buffer-swap operations being committed.  In unextended GLX the
    swap period is the vertical refresh time.  If SGI_swap_control or
    MESA_swap_control are supported, the swap period is the vertical refresh
    time multiplied by the swap interval (or one if the swap interval is set
    to zero).

    If OML_sync_control is supported, the swap period is the vertical
    refresh time multiplied by the divisor parameter to
    glXSwapBuffersMscOML.  The frame usage in this case is less than 1.0 if
    the swap is committed before target_msc, and is greater than or equal to
    1.0 otherwise.  The actual usage value is based on the divisor and is
    never less than 0.0.

       int glXGetFrameUsageMESA(Display *dpy,
                                GLXDrawable drawable,
                                float *usage)

    glXGetFrameUsageMESA returns a floating-point value in <usage>
    that represents the current swap usage, as defined above.

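The swap-period rule for SGI_swap_control and MESA_swap_control stated earlier in this section can be sketched as follows. The helper name swap_period is hypothetical and is not part of any GLX extension; it assumes the vertical refresh time is known in seconds.

```c
#include <assert.h>

/* Hypothetical helper, not part of the extension: swap period in
 * seconds under SGI_swap_control / MESA_swap_control.  A swap
 * interval of zero is treated as one, as the text above describes.
 */
static double swap_period(double refresh_seconds, unsigned swap_interval)
{
    unsigned effective = (swap_interval == 0) ? 1u : swap_interval;

    assert(refresh_seconds > 0.0);
    return refresh_seconds * (double)effective;
}
```

With an assumed 10 ms refresh time, a swap interval of two yields a 20 ms swap period, and a swap interval of zero yields the same 10 ms period as an interval of one.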
    Missed frame swaps can be tracked by calling the following function:

       int glXBeginFrameTrackingMESA(Display *dpy,
                                     GLXDrawable drawable)

    glXBeginFrameTrackingMESA resets a "missed frame" count and
    synchronizes with the next frame vertical sync before it returns.
    If a swap is missed based on the rate control specified by the
    <interval> set by glXSwapIntervalSGI or the default swap of once
    per frame, the missed frame count is incremented.

    The current missed frame count and total number of swaps since
    the last call to glXBeginFrameTrackingMESA can be obtained by
    calling the following function:

       int glXQueryFrameTrackingMESA(Display *dpy,
                                     GLXDrawable drawable,
                                     int64_t *swapCount,
                                     int64_t *missedFrames,
                                     float *lastMissedUsage)

    The location pointed to by <swapCount> will be updated with the
    number of swaps that have been committed.  This value may not match the
    number of swaps that have been requested since swaps may be
    queued by the implementation.  This function can be called at any
    time and does not synchronize to vertical blank.

    The location pointed to by <missedFrames> will contain the number
    of swaps that missed the specified frame.  The frame usage for the
    last missed frame is returned in the location pointed to by
    <lastMissedUsage>.

    Frame tracking is disabled by calling the function

       int glXEndFrameTrackingMESA(Display *dpy,
                                   GLXDrawable drawable)

    This function will not return until all swaps have occurred.  The
    application can call glXQueryFrameTrackingMESA for a final swap and
    missed frame count.

    If these functions are successful, zero is returned.  If the context
    associated with dpy and drawable is not a direct context,
    GLX_BAD_CONTEXT is returned.

Errors

    If the function succeeds, zero is returned.  If the function
    fails, one of the following error codes is returned:

       GLX_BAD_CONTEXT         The current rendering context is not a direct
                               context.

GLX Protocol

    None.  This extension only extends to direct rendering contexts.

New State

    None

New Implementation Dependent State

    None

Revision History

    1.1,  5/1/03   Added contact information.
    1.0,  3/17/03  Initial version based on WGL_I3D_swap_frame_usage.
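As an illustration of the counters reported by glXQueryFrameTrackingMESA, the following is a minimal sketch of the bookkeeping, not the driver's implementation. The struct and function names are hypothetical, and the sketch assumes that a swap counts as missed when its usage exceeds 1.0.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of the frame-tracking counters. */
struct frame_tracking {
    int64_t swap_count;       /* swaps committed since tracking began */
    int64_t missed_frames;    /* swaps that missed their frame */
    float last_missed_usage;  /* usage of the most recent missed swap */
};

/* Mirrors the reset described for glXBeginFrameTrackingMESA. */
static void tracking_begin(struct frame_tracking *t)
{
    assert(t != NULL);
    t->swap_count = 0;
    t->missed_frames = 0;
    t->last_missed_usage = 0.0f;
}

/* Records one committed swap.  Assumption: usage > 1.0 means the
 * swap missed its frame.
 */
static void tracking_record_swap(struct frame_tracking *t, float usage)
{
    assert(t != NULL);
    t->swap_count++;
    if (usage > 1.0f) {
        t->missed_frames++;
        t->last_missed_usage = usage;
    }
}
```

Recording swaps with usages 0.5, 1.5 and 0.75 after tracking_begin leaves a swap count of three, one missed frame, and a last missed usage of 1.5.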
/*
 * Copyright (c) 2014 Scott Mansell
 * Copyright © 2014 Broadcom
 *
 * Permission is hereby granted, free of charge, to any person obtaining a
 * copy of this software and associated documentation files (the "Software"),
 * to deal in the Software without restriction, including without limitation
 * the rights to use, copy, modify, merge, publish, distribute, sublicense,
 * and/or sell copies of the Software, and to permit persons to whom the
 * Software is furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice (including the next
 * paragraph) shall be included in all copies or substantial portions of the
 * Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.  IN NO EVENT SHALL
 * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
 * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
 * IN THE SOFTWARE.
 */

#include "util/u_prim.h"
#include "util/u_format.h"
#include "util/u_pack_color.h"
#include "indices/u_primconvert.h"

#include "vc4_context.h"
#include "vc4_resource.h"

static void
vc4_get_draw_cl_space(struct vc4_context *vc4)
{
        /* Binner gets our packet state -- vc4_emit.c contents,
         * and the primitive itself.
         */
        cl_ensure_space(&vc4->bcl, 256);

        /* Nothing for rcl -- that's covered by vc4_context.c */

        /* shader_rec gets up to 12 dwords of reloc handles plus a maximally
         * sized shader_rec (104 bytes base for 8 vattrs, plus the 8 * 32
         * bytes reserved below for vattr stride).
         */
        cl_ensure_space(&vc4->shader_rec, 12 * sizeof(uint32_t) + 104 + 8 * 32);

        /* Uniforms are covered by vc4_write_uniforms(). */

        /* There could be up to 16 textures per stage, plus misc other
         * pointers.
         */
        cl_ensure_space(&vc4->bo_handles, (2 * 16 + 20) * sizeof(uint32_t));
        cl_ensure_space(&vc4->bo_pointers,
                        (2 * 16 + 20) * sizeof(struct vc4_bo *));
}

/**
 * Does the initial binning command list setup for drawing to a given FBO.
 */
static void
vc4_start_draw(struct vc4_context *vc4)
{
        if (vc4->needs_flush)
                return;

        vc4_get_draw_cl_space(vc4);

        uint32_t width = vc4->framebuffer.width;
        uint32_t height = vc4->framebuffer.height;
        uint32_t tilew = align(width, 64) / 64;
        uint32_t tileh = align(height, 64) / 64;

        /* Tile alloc memory setup: We use an initial alloc size of 32 bytes
         * per tile.  The hardware then aligns that to 256 bytes (we use 4096,
         * because all of our BO allocations align to that anyway), then for
         * some reason the simulator wants an extra page available, even if
         * you have overflow memory set up.
         *
         * XXX: The binner only does 28-bit addressing math, so the tile alloc
         * and tile state should be in the same BO and that BO needs to not
         * cross a 256MB boundary, somehow.
         */
        uint32_t tile_alloc_size = 32 * tilew * tileh;
        tile_alloc_size = align(tile_alloc_size, 4096);
        tile_alloc_size += 4096;
        uint32_t tile_state_size = 48 * tilew * tileh;
        if (!vc4->tile_alloc || vc4->tile_alloc->size < tile_alloc_size) {
                vc4_bo_unreference(&vc4->tile_alloc);
                vc4->tile_alloc = vc4_bo_alloc(vc4->screen, tile_alloc_size,
                                               "tile_alloc");
        }
        if (!vc4->tile_state || vc4->tile_state->size < tile_state_size) {
                vc4_bo_unreference(&vc4->tile_state);
                vc4->tile_state = vc4_bo_alloc(vc4->screen, tile_state_size,
                                               "tile_state");
        }

        /* Tile state data is 48 bytes per tile.  I think it can be thrown
         * away as soon as binning is finished.
         */
        cl_start_reloc(&vc4->bcl, 2);
        cl_u8(&vc4->bcl, VC4_PACKET_TILE_BINNING_MODE_CONFIG);
        cl_reloc(vc4, &vc4->bcl, vc4->tile_alloc, 0);
        cl_u32(&vc4->bcl, vc4->tile_alloc->size);
        cl_reloc(vc4, &vc4->bcl, vc4->tile_state, 0);
        cl_u8(&vc4->bcl, tilew);
        cl_u8(&vc4->bcl, tileh);
        cl_u8(&vc4->bcl,
              VC4_BIN_CONFIG_AUTO_INIT_TSDA |
              VC4_BIN_CONFIG_ALLOC_BLOCK_SIZE_32 |
              VC4_BIN_CONFIG_ALLOC_INIT_BLOCK_SIZE_32);

        /* START_TILE_BINNING resets the statechange counters in the hardware,
         * which are what is used when a primitive is binned to a tile to
         * figure out what new state packets need to be written to that tile's
         * command list.
         */
        cl_u8(&vc4->bcl, VC4_PACKET_START_TILE_BINNING);

        /* Reset the current compressed primitives format.  This gets modified
         * by VC4_PACKET_GL_INDEXED_PRIMITIVE and
         * VC4_PACKET_GL_ARRAY_PRIMITIVE, so it needs to be reset at the start
         * of every tile.
         */
        cl_u8(&vc4->bcl, VC4_PACKET_PRIMITIVE_LIST_FORMAT);
        cl_u8(&vc4->bcl, (VC4_PRIMITIVE_LIST_FORMAT_16_INDEX |
                          VC4_PRIMITIVE_LIST_FORMAT_TYPE_TRIANGLES));

        vc4->needs_flush = true;
        vc4->draw_call_queued = true;
}

static void
vc4_update_shadow_textures(struct pipe_context *pctx,
                           struct vc4_texture_stateobj *stage_tex)
{
        for (int i = 0; i < stage_tex->num_textures; i++) {
                struct pipe_sampler_view *view = stage_tex->textures[i];
                if (!view)
                        continue;
                struct vc4_resource *rsc = vc4_resource(view->texture);
                if (rsc->shadow_parent)
                        vc4_update_shadow_baselevel_texture(pctx, view);
        }
}

static void
vc4_draw_vbo(struct pipe_context *pctx, const struct pipe_draw_info *info)
{
        struct vc4_context *vc4 = vc4_context(pctx);

        if (info->mode >= PIPE_PRIM_QUADS) {
                util_primconvert_save_index_buffer(vc4->primconvert, &vc4->indexbuf);
                util_primconvert_save_rasterizer_state(vc4->primconvert, &vc4->rasterizer->base);
                util_primconvert_draw_vbo(vc4->primconvert, info);
                perf_debug("Fallback conversion for %d %s vertices\n",
                           info->count, u_prim_name(info->mode));
                return;
        }

        /* Before setting up the draw, do any fixup blits necessary. */
        vc4_update_shadow_textures(pctx, &vc4->verttex);
        vc4_update_shadow_textures(pctx, &vc4->fragtex);

        vc4_get_draw_cl_space(vc4);

        struct vc4_vertex_stateobj *vtx = vc4->vtx;
        struct vc4_vertexbuf_stateobj *vertexbuf = &vc4->vertexbuf;

        if (vc4->prim_mode != info->mode) {
                vc4->prim_mode = info->mode;
                vc4->dirty |= VC4_DIRTY_PRIM_MODE;
        }

        vc4_start_draw(vc4);
        vc4_update_compiled_shaders(vc4, info->mode);

        vc4_emit_state(pctx);
        vc4->dirty = 0;

        vc4_write_uniforms(vc4, vc4->prog.fs,
                           &vc4->constbuf[PIPE_SHADER_FRAGMENT],
                           &vc4->fragtex);
        vc4_write_uniforms(vc4, vc4->prog.vs,
                           &vc4->constbuf[PIPE_SHADER_VERTEX],
                           &vc4->verttex);
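        /* The coordinate shader (cs) reads the same vertex-stage constant
         * buffer and textures as the vertex shader.
         */
        vc4_write_uniforms(vc4, vc4->prog.cs,
                           &vc4->constbuf[PIPE_SHADER_VERTEX],
                           &vc4->verttex);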

        /* The simulator throws a fit if VS or CS don't read an attribute, so
         * we emit a dummy read.
         */
        uint32_t num_elements_emit = MAX2(vtx->num_elements, 1);
        /* Emit the shader record. */
        cl_start_shader_reloc(&vc4->shader_rec, 3 + num_elements_emit);
        cl_u16(&vc4->shader_rec,
               VC4_SHADER_FLAG_ENABLE_CLIPPING |
               ((info->mode == PIPE_PRIM_POINTS &&
                 vc4->rasterizer->base.point_size_per_vertex) ?
                VC4_SHADER_FLAG_VS_POINT_SIZE : 0));
        cl_u8(&vc4->shader_rec, 0); /* fs num uniforms (unused) */
        cl_u8(&vc4->shader_rec, vc4->prog.fs->num_inputs);
        cl_reloc(vc4, &vc4->shader_rec, vc4->prog.fs->bo, 0);