summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/vc4
Commit message (Collapse)AuthorAgeFilesLines
* vc4: Move all of our fixed function fragment color handling to NIR.Eric Anholt2015-08-146-388/+538
| | | | | | | | | | This massively reduces our dependency on VC4-specific optimization passes. shader-db: total uniforms in shared programs: 32077 -> 32067 (-0.03%) uniforms in affected programs: 149 -> 139 (-6.71%) total instructions in shared programs: 98208 -> 98182 (-0.03%) instructions in affected programs: 2154 -> 2128 (-1.21%)
* vc4: Add a helper for making driver-specific NIR load_uniform for GL stateEric Anholt2015-08-142-2/+30
| | | | | | | | In order to move more of our lowering into NIR, we need the ability to reference various pipeline state (like texture rectangle scaling factors or blend colors), so we just set those up as a load_uniform with a big offset to indicate that it's not within the shader's uniform storage and is one of our state values.
* nir: Add a nir_opt_undef() to handle csels with undef.Eric Anholt2015-08-141-0/+1
| | | | | | | | | | | | | | | | | | | We may find a cause to do more undef optimization in the future, but for now this fixes up things after if flattening. vc4 was handling this internally most of the time, but a GLB2.7 shader that did a conditional discard and assign gl_FragColor in the else was still emitting some extra code. total instructions in shared programs: 100809 -> 100795 (-0.01%) instructions in affected programs: 37 -> 23 (-37.84%) v2: Use nir_instr_rewrite_src() to update def/use on src[0] (by Thomas Helland). v3: Make sure to flag metadata dirties, and copy the swizzle and abs/neg over to src[0], too (by anholt). Reviewed-by: Thomas Helland <[email protected]> (v2) Tested-by: Thomas Helland <[email protected]> (v2)
* gallium: add an interface for EXT_depth_bounds_testMarek Olšák2015-08-141-0/+1
| | | | | Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* gallium: add support for GLES texture float extensions (v3)Marek Olšák2015-08-141-0/+2
| | | | | | | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=74329 v2: add a CAP for half floats drivers should not expose the CAPs if they don't support the formats v3: update relnotes Reviewed-by: Edward O'Callaghan <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* vc4: add missing nir include, to fix the buildEmil Velikov2015-08-071-0/+1
| | | | | | Cc: 10.6 <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: automake: remove unused includeEmil Velikov2015-08-071-1/+0
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* vc4: Use nir_lower_load_const_to_scalar().Eric Anholt2015-08-041-0/+1
|
* vc4: Don't bother de-SSAing values that aren't part of phi webs.Eric Anholt2015-08-041-15/+44
| | | | We can just support them the same way we do load_const's SSA values.
* vc4: Don't bother saturating the dst color for blending.Eric Anholt2015-08-041-8/+2
| | | | | | | | | Since we just pulled it out of the destination as 8-bit unorm, we know it's in [0, 1] already. shader-db: total instructions in shared programs: 100040 -> 98208 (-1.83%) instructions in affected programs: 14084 -> 12252 (-13.01%)
* vc4: Make r4-writes implicitly move to a temp, and allocate temps to r4.Eric Anholt2015-08-048-107/+106
| | | | | | | | | | | Previously, SFU values always moved to a temporary, and TLB color reads and texture reads always lived in r4. Instead, we can have these results just be normal temporaries, and the register allocator can leave the values in r4 when they don't interfere with anything else using r4. shader-db results: total instructions in shared programs: 100809 -> 100040 (-0.76%) instructions in affected programs: 42383 -> 41614 (-1.81%)
* vc4: Drop a dead prototype.Eric Anholt2015-08-041-8/+0
|
* vc4: Lower uniform loads to scalar in NIR.Eric Anholt2015-07-302-31/+81
| | | | | This also moves the vec4-to-byte-addressing math into NIR, so that algebraic has a chance at it.
* vc4: Move some FS input lowering into NIR.Eric Anholt2015-07-302-35/+50
|
* vc4: Move program keys to the header file.Eric Anholt2015-07-302-47/+49
| | | | | I want to be able to inspect them from other files for lowering passes in NIR.
* vc4: Lower NIR inputs to scalar as well.Eric Anholt2015-07-302-4/+44
| | | | | For now this is just scalarizing, but it also means we'll get to dump a bunch of QIR-based lowering in a moment.
* vc4: Start adding a NIR-based output lowering pass.Eric Anholt2015-07-304-7/+137
| | | | | | For now, this just splits up store_output intrinsics to be scalars, and drops unused outputs in the coordinate shader. My goal is to be able to drop a bunch of my VC4-specific optimization by letting NIR handle it.
* vc4: Mark our shaders as single-threaded.Eric Anholt2015-07-302-0/+6
| | | | | I had my understanding of this bit flipped. We're using the full register space, so we need to say so.
* vc4: Avoid leaking indirect array access UBOs.Eric Anholt2015-07-301-0/+2
|
* vc4: Avoid overflowing various static tables.Eric Anholt2015-07-304-4/+4
|
* vc4: Fix return values from recent validation changes.Eric Anholt2015-07-301-4/+4
|
* vc4: Skip re-emitting the shader_rec if it's unchanged.Eric Anholt2015-07-285-43/+158
| | | | | | | | It's a bunch of work for us to emit it (and its uniforms), more work for the kernel to validate it, and additional work for the CLE to read it. Improves es2gears framerate by about 50%. Signed-off-by: Eric Anholt <[email protected]>
* vc4: Drop unused vpm_offset value.Eric Anholt2015-07-281-3/+0
| | | | | It's been dead since we started doing VS/CS attr offset setup during shader compile.
* vc4: Simplify vc4_use_bo and make sure it's not a shader.Eric Anholt2015-07-284-39/+26
| | | | | | | Since the conversion to keeping validated shaders around for the BO's lifetime, we haven't been checking that rendering doesn't happen to shaders. Make vc4_use_bo check that always, and just don't use it for the VC4_MODE_SHADER case (so now modes are unused)
* vc4: Keep the validated shader around for the simulator execution.Eric Anholt2015-07-283-13/+17
| | | | This more closely matches the kernel behavior on shader validation now.
* vc4: Make the object be the return value from vc4_use_bo().Eric Anholt2015-07-283-23/+25
| | | | Drops 40 bytes of code from validation.
* vc4: Ensure that the bin CL is properly capped by increment/flush.Eric Anholt2015-07-283-26/+36
| | | | | | | | We don't want anything to appear after we've kicked off the render (and thus job flush), since that might then get written out to the tile allocation state. Signed-off-by: Eric Anholt <[email protected]>
* vc4: Drop NV shader reloc validation.Eric Anholt2015-07-282-120/+60
| | | | It wasn't validating enough, and we don't need the packet.
* vc4: Fix raster surface shadow updates under DRI2.Eric Anholt2015-07-281-0/+6
| | | | | | | Glamor asks GBM for the handle of the BO, then flinks it itself. We were marking the bo non-private in the flink and dmabuf (DRI3) paths, but not the GEM handle path. As a result, non-pageflipping DRI2 swapbuffers (EGL apps, in particular) were never updating the texture.
* vc4: Fix bus errors on dumping CL on hardware.Eric Anholt2015-07-281-1/+1
| | | | | The kernel can't fixup unaligned float traps for us, so deref as a uint32_t first.
* vc4: Add support for ARB_draw_elements_base_vertex.Eric Anholt2015-07-271-1/+3
| | | | | | Gallium exposes it unconditionally, so do our best to support it. It fails on the negative index cases, but those seem unlikely to be used in the wild.
* gallium: replace INLINE with inlineIlia Mirkin2015-07-211-3/+3
| | | | | | | | | | | | | | | | Generated by running: git grep -l INLINE src/gallium/ | xargs sed -i 's/\bINLINE\b/inline/g' git grep -l INLINE src/mesa/state_tracker/ | xargs sed -i 's/\bINLINE\b/inline/g' git checkout src/gallium/state_trackers/clover/Doxyfile and manual edits to src/gallium/include/pipe/p_compiler.h src/gallium/README.portability to remove mentions of the inline define. Signed-off-by: Ilia Mirkin <[email protected]> Acked-by: Marek Olšák <[email protected]>
* vc4: Switch to using a separate ioctl for making shaders.Eric Anholt2015-07-174-12/+78
| | | | | | | | | This gives the kernel a chance to validate and lock down the data, without having to deal with mmap zapping. With this, GLBenchmark stops on a texture relocations, because we'd recycled a shader BO as another shader and failed to revalidate, since we weren't clearing the cached validation state on mmap faults.
* vc4: Fix printing of shader-db debug when shader-db isn't turned on.Eric Anholt2015-07-171-4/+6
|
* vc4: Add debugging on texture relocation validation failures.Eric Anholt2015-07-171-7/+13
|
* vc4: Also consider uniform 0 in uniform lowering.Eric Anholt2015-07-171-3/+3
| | | | The hash table considers key 0 to be the empty key.
* vc4: Use the pure/const attributes on a bunch of our QPU functions.Eric Anholt2015-07-172-18/+18
| | | | | | On a release build, this makes the rest of vc4_qpu_validate.c go away (the compiler didn't know that our qpu helper function calls had no side effects).
* gallium: add PIPE_CAP_MAX_SHADER_PATCH_VARYINGSMarek Olšák2015-07-161-0/+1
| | | | Reviewed-by: Ilia Mirkin <[email protected]>
* vc4: Cache the texture p1 for the sampler.Eric Anholt2015-07-143-49/+69
| | | | | Cuts another 12% of vc4_uniforms.o, in exchange for computing it at CSO creation time.
* vc4: Cache texture p0/p1 setup for the sampler view.Eric Anholt2015-07-143-28/+43
| | | | | In exchange for a bit of space and computation in CSO setup, we cut vc4_uniform.c (draw time) code size by 4.8%.
* vc4: Move uniforms handling to a separate file.Eric Anholt2015-07-143-314/+341
| | | | | The rest of vc4_program.c is about compiling, while this is about uniform emit at draw time.
* vc4: Fix some -Wdouble-promotion warnings.Eric Anholt2015-07-143-6/+6
| | | | | No code generation changes from this, but it'll be useful to have this next time I go checking -Wdouble-promotion.
* vc4: Fix compiler warnings on release builds.Eric Anholt2015-07-144-7/+14
|
* vc4: Add better debug for register allocation failure.Eric Anholt2015-07-141-1/+5
|
* vc4: Drop reloc_count tracking for debug asserts on non-debug builds.Eric Anholt2015-07-141-0/+10
| | | | Cuts another 88 bytes of compiled code.
* vc4: Rework cl handling to be friendlier to the compiler.Eric Anholt2015-07-146-152/+203
| | | | | Drops 680 bytes of code, from avoiding a bunch of extra updates to the next pointer in the struct.
* vc4: Make a helper function for getting the current offset in the CL.Eric Anholt2015-07-144-20/+21
| | | | | | I needed to rewrite this a bit for safety checking in the next commit. Despite being a static inline of the same thing that was being done, we lose 36 bytes of code for some reason.
* vc4: Drop separate cl*_reloc_hindex().Eric Anholt2015-07-141-18/+6
| | | | | | Now that RCL generation is in the kernel, we don't have any other callers. Oddly, the compiler generates another 8 bytes of code for this, but the simplification is worth it.
* vc4: Store reloc pointers as pointers, not offsets.Eric Anholt2015-07-141-5/+5
| | | | | | | Now that we don't resize the CL as we build (it's set up at the top by vc4_start_draw()), we can store the pointers instead of offsets from the base. Saves a bit of math in emitting relocs (about 60 bytes of code).
* vc4: Add perf debug for when we wait on BOs.Eric Anholt2015-07-144-43/+72
|