path: root/src/gallium/drivers/freedreno
Commit message | Author | Age | Files | Lines
* freedreno: add support for laying out MRTs in gmem | Ilia Mirkin | 2015-04-02 | 2 | -16/+43
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: add core infrastructure support for MRTs | Ilia Mirkin | 2015-04-02 | 4 | -8/+14
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add support for FS_COLOR0_WRITES_ALL_CBUFS property | Ilia Mirkin | 2015-04-02 | 2 | -1/+10
    This will enable the driver to tell which regids to link up to which MRT outputs.
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add independent blend function support | Ilia Mirkin | 2015-04-02 | 2 | -8/+9
    This is needed for MRT support
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: remove alpha key from ir3_shader | Ilia Mirkin | 2015-04-02 | 9 | -42/+8
    This complication is unnecessary and makes MRTs more complicated and likely to generate tons of variants.
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add support for point sprite coordinate replacement | Ilia Mirkin | 2015-03-28 | 4 | -30/+28
    This does not (yet) support different coordinate origins, so the tests still fail due to fbo flipping.
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: make vs-set point size work | Ilia Mirkin | 2015-03-28 | 3 | -2/+10
    This appears to need the A2XX version of the point list, so select it at draw time if necessary. Experimentally, always using the A2XX version causes hangs when PSIZE isn't actually emitted.
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: point size should not be divided by 2 | Ilia Mirkin | 2015-03-28 | 2 | -5/+5
    The division is probably a holdover from the days when the fixed point inline functions generated by headergen were broken. Also reduce the maximum point size to 4092 (vs 4096), which is what the blob does.
    Cc: "10.4 10.5" <[email protected]>
    Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: fix 3d texture layout | Ilia Mirkin | 2015-03-28 | 2 | -7/+16
    The SZ2 field contains the layer size of a lower miplevel. It only contains 4 bits, which limits the maximum layer size it can describe. In situations where the next miplevel would be too big, the hardware appears to keep minifying the size until it hits one of that size.
    Unfortunately the hardware's ideas about sizes can differ from freedreno's, which can still lead to issues. Minimize those by stopping minification as soon as possible.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
* freedreno/a3xx: LAYERSZ2 appears to have no effect on arrays | Ilia Mirkin | 2015-03-28 | 1 | -2/+1
    Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: implement get_device_vendor() for existing drivers | Giuseppe Bilotta | 2015-03-23 | 1 | -0/+8
    The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor.
    Signed-off-by: Giuseppe Bilotta <[email protected]>
    Reviewed-by: Tom Stellard <[email protected]>
* freedreno/ir3: fix infinite recursion in sched | Rob Clark | 2015-03-18 | 1 | -1/+1
    One more case we need to handle. One of the src instructions for the indirect could also end up being ourself.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix spelling | Rob Clark | 2015-03-18 | 1 | -1/+1
    Signed-off-by: Rob Clark <[email protected]>
* gallium: add FMA and DFMA opcodes (v3) | Marek Olšák | 2015-03-16 | 1 | -0/+1
    Needed by ARB_gpu_shader5.
    v2: select DMAD for FMA with double precision
    v3: add and select DFMA
    Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno: update generated headers | Rob Clark | 2015-03-15 | 5 | -6/+6
    Fix a3xx texture layer-size.
    Signed-off-by: Rob Clark <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
* freedreno/ir3: remove old compiler | Rob Clark | 2015-03-15 | 7 | -1574/+10
    Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: avoid scheduler deadlock | Rob Clark | 2015-03-15 | 3 | -0/+45
    Deadlock can occur if we schedule an address register write, yet some instructions which depend on that address register value also depend on other unscheduled instructions that depend on a different address register value. To solve this, before scheduling an address register write, ensure that all the other dependencies of the instructions which consume this address register are already scheduled.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: bit of cleanup | Rob Clark | 2015-03-15 | 3 | -19/+23
    Add an array_insert() macro to simplify inserting into dynamically sized arrays, add a comment, and remove unused prototype inherited from the original freedreno.git/fdre-a3xx test code, etc.
    Signed-off-by: Rob Clark <[email protected]>
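
The commit above mentions an array_insert() macro for dynamically sized arrays. A minimal sketch of how such a macro could look follows. This is an illustration only, not the macro from the ir3 source: the name array_insert_sketch, the `_count`/`_sz` field-naming convention, the doubling growth policy, and the int_list struct are all assumptions made for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical sketch: append `val` to the array `arr`, growing it when
 * full. Assumes that next to `arr` there are counters named `arr_count`
 * and `arr_sz` (built here by token pasting). */
#define array_insert_sketch(arr, val) do { \
        if (arr ## _count == arr ## _sz) { \
            /* start at 16 entries, then double */ \
            arr ## _sz = arr ## _sz ? 2 * arr ## _sz : 16; \
            arr = realloc(arr, arr ## _sz * sizeof(arr[0])); \
            assert(arr != NULL); \
        } \
        arr[arr ## _count++] = (val); \
    } while (0)

/* Illustrative container following the assumed naming convention. */
struct int_list {
    int *items;
    unsigned items_count;
    unsigned items_sz;
};

/* Append the values 0..n-1 and return the resulting element count. */
static unsigned fill(struct int_list *l, unsigned n)
{
    for (unsigned i = 0; i < n; i++)
        array_insert_sketch(l->items, (int)i);
    return l->items_count;
}
```

Wrapping the body in do { } while (0) lets the macro be used like a single statement, including inside an unbraced if/else.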
* freedreno: fix slice pitch calculations | Ilia Mirkin | 2015-03-13 | 1 | -1/+1
    For example if width were 65, the first slice would get 96 while the second would get 32. However the hardware appears to expect the second pitch to be 64, based on halving the 96 (and aligning up to 32).
    This fixes texelFetch piglit tests on a3xx below a certain size. Going higher they break again, but most likely due to unrelated reasons.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
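
The arithmetic in the commit message above can be reproduced with a small sketch. The helper names (align_up, minify, pitch_from_width, pitch_from_prev_pitch) are hypothetical and are not taken from the freedreno source; only the width of 65 and the alignment of 32 come from the message. The first pitch function models the behaviour described as wrong, the second the halve-then-align behaviour the hardware appears to expect.

```c
#include <assert.h>
#include <stdint.h>

/* Round v up to a multiple of a; a must be a power of two. */
static uint32_t align_up(uint32_t v, uint32_t a)
{
    assert(a != 0 && (a & (a - 1)) == 0);
    return (v + a - 1) & ~(a - 1);
}

/* Halve a dimension, never going below 1. */
static uint32_t minify(uint32_t v)
{
    return v > 1 ? v >> 1 : 1;
}

/* Behaviour described as broken: minify the unaligned width per level,
 * then align the result. */
static uint32_t pitch_from_width(uint32_t width0, unsigned level)
{
    uint32_t w = width0;
    for (unsigned i = 0; i < level; i++)
        w = minify(w);
    return align_up(w, 32);
}

/* Behaviour the hardware appears to expect: halve the previous level's
 * aligned pitch, then align again. */
static uint32_t pitch_from_prev_pitch(uint32_t width0, unsigned level)
{
    uint32_t pitch = align_up(width0, 32);
    for (unsigned i = 0; i < level; i++)
        pitch = align_up(minify(pitch), 32);
    return pitch;
}
```

For a width of 65, both functions give 96 at level 0. At level 1 the first gives 32 (65 halves to 32) and the second gives 64 (96 halves to 48, which aligns up to 64).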
* freedreno/a3xx: use the same layer size for all slices | Ilia Mirkin | 2015-03-13 | 1 | -1/+8
    We only program in one layer size per texture, so that means that all levels must share one size. This makes the piglit test
        bin/texelFetch fs sampler2DArray
    have the same breakage as its non-array version instead of being completely off, and makes
        bin/ext_texture_array-gen-mipmap
    start passing.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
* freedreno/ir3: get the # of miplevels from getinfo | Ilia Mirkin | 2015-03-09 | 1 | -0/+20
    This fixes ARB_texture_query_levels to actually return the desired value.
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
* freedreno/ir3: fix array count returned by TXQ | Ilia Mirkin | 2015-03-09 | 1 | -2/+42
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
* freedreno: move fb state copy after checking for size change | Ilia Mirkin | 2015-03-09 | 1 | -2/+2
    Fixes: 1f3ca56b ("freedreno: use util_copy_framebuffer_state()")
    Signed-off-by: Ilia Mirkin <[email protected]>
    Reviewed-by: Rob Clark <[email protected]>
    Cc: "10.4 10.5" <[email protected]>
* freedreno: replace glsl130 debug flag with glsl120 | Rob Clark | 2015-03-08 | 2 | -12/+10
    Now that relative-dst works, we should never fall back to the old compiler. (Which is almost true, other than a couple edge case sched fails in piglit). So replace glsl130 flag to force GLSL 130 and integers on a3xx/a4xx with a glsl120 flag to force GLSL 120 and !integers.
    If this commit breaks any game/app/etc use FD_MESA_DEBUG=glsl120 as a workaround and please let me know.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: relative dst | Rob Clark | 2015-03-08 | 5 | -42/+244
    To simplify RA, assign arrays that are written to first. Enough dependency information is in the graph to preserve the order of reads and writes of an array, so all SSA names for the array collapse into one; just assign the entire thing by array-id.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split out array_fanin() helper | Rob Clark | 2015-03-08 | 1 | -17/+30
    We'll need this too for relative dst..
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: drop deref nodes | Rob Clark | 2015-03-08 | 7 | -80/+53
    The meta-deref instruction doesn't really do what we need for relative destination. Instead, since each instruction can reference at most a single address value, track the dependency on the address register via instr->address. This lets us express the dependency regardless of whether it is used for dst and/or src.
    The foreach_ssa_src{_n} iterator macros now also iterate the address register so, at least in SSA form, the address register behaves as an additional virtual src to the instruction. Which is pretty much what we want, as far as scheduling/etc.
    TODO: For now, the foreach_src{_n} iterators are unchanged. We could wrap the address in an ir3_register and make the foreach_src{_n} iterators behave the same way. But that seems unnecessary at this point, since we mainly care about the address dependency when in SSA form.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: helpful iterator macros | Rob Clark | 2015-03-08 | 9 | -111/+109
    I remembered that we are using c99.. which makes some sugary iterator macros easier. So introduce iterator macros to iterate all src registers and all SSA src instructions. The _n variants also return the src #, since there are a handful of places that need this.
    Signed-off-by: Rob Clark <[email protected]>
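
C99 helps here because it allows a variable to be declared inside the for statement, so an iterator macro can carry its own loop counter. The sketch below illustrates the idea of an "_n" style iterator that yields both the source register and its index. Everything in it is hypothetical: the sketch_ names, the struct layout, and the assumption that slot 0 of the register array holds the destination are made up for the example and are not the ir3 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a register and an instruction. */
struct sketch_reg {
    int num;
};

struct sketch_instr {
    unsigned regs_count;          /* dst + srcs */
    struct sketch_reg *regs[4];   /* regs[0] assumed to be the dst */
};

/* Iterate the src registers of `instr`; `n` is the src index (0-based),
 * declared by the macro itself thanks to C99 for-loop declarations.
 * NULL src slots are skipped. */
#define sketch_foreach_src_n(srcreg, n, instr) \
    for (unsigned n = 0; n + 1 < (instr)->regs_count; n++) \
        if (((srcreg) = (instr)->regs[n + 1]) != NULL)

/* Sum the register numbers of all srcs and report the last src index. */
static int sum_srcs(struct sketch_instr *instr, unsigned *last_n)
{
    struct sketch_reg *src;
    int sum = 0;

    sketch_foreach_src_n(src, i, instr) {
        sum += src->num;
        *last_n = i;
    }
    return sum;
}
```

The trailing if inside the macro means the caller's braces become the body of that if, which is what lets the macro read like a built-in loop construct.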
* freedreno/ir3: fix register usage calculations | Rob Clark | 2015-03-08 | 1 | -7/+14
    For cat1 instructions, use reg() as well for relative src, to ensure proper accounting of register usage. Also, for relative instructions, use reg->size rather than reg->wrmask to determine the number of components read/written.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: couple tweaks for cmdline compiler | Rob Clark | 2015-03-08 | 1 | -1/+4
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split up ssa_dst | Rob Clark | 2015-03-08 | 1 | -17/+25
    And a couple other trivial renames, to prepare for relative dst.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix failed assert in grouping | Rob Clark | 2015-03-08 | 1 | -27/+44
    Turns out there are scenarios where we need to insert mov's in "front" of an input. Triggered by shaders like:
        VERT
        DCL IN[0]
        DCL IN[1]
        DCL OUT[0], POSITION
        DCL OUT[1], GENERIC[9]
        DCL SAMP[0]
        DCL TEMP[0], LOCAL
          0: MOV TEMP[0].xy, IN[1].xyyy
          1: MOV TEMP[0].w, IN[1].wwww
          2: TXF TEMP[0], TEMP[0], SAMP[0], 1D_ARRAY
          3: MOV OUT[1], TEMP[0]
          4: MOV OUT[0], IN[0]
          5: END
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix silly typo for binning pass shaders | Rob Clark | 2015-03-05 | 1 | -1/+1
    Was resulting in gl_PointSize write being optimized out, causing particle system type shaders to hang if hw binning enabled. Fixes neverball, OGLES2ParticleSystem, etc.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix old compiler after f6b2e8af742 | Rob Clark | 2015-03-04 | 1 | -0/+1
    If first_driver_param is left as zero (calloc'd struct), the result is c0 getting clobbered.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: re-enable int (conditional on glsl130) | Rob Clark | 2015-03-03 | 1 | -1/+1
    Re-enable integer, now that we can handle flat varyings. Still, ofc, conditional on FD_MESA_DEBUG=glsl130, until we can deprecate _old compiler..
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle flat bypass for a4xx | Rob Clark | 2015-03-03 | 8 | -5/+99
    We may not need this for later a4xx patchlevels, but we do at least need this for patchlevel 0. Bypass bary.f for fetching varyings when flat shading is needed (rather than configure via cmdstream). This requires a special dummy bary.f w/ (ei) flag to signal to scheduler when all varyings are consumed. And requires shader variants based on rasterizer flatshade state to handle TGSI_INTERPOLATE_COLOR.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add support for memory (cat6) instructions | Rob Clark | 2015-03-03 | 3 | -4/+8
    Scheduled basically the same as texture (cat5) instructions, using (sy) flag for synchronization.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix up cat6 instruction encodings | Rob Clark | 2015-03-03 | 3 | -139/+121
    I think there is at least one more sub-encoding, but these two should be enough to cover the common load/store instructions.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx,a4xx: silence some warnings | Rob Clark | 2015-03-03 | 2 | -5/+2
        fd3_emit.c: In function ‘fd3_emit_vertex_bufs’:
        fd3_emit.c:377:11: warning: unused variable ‘semantic’ [-Wunused-variable]
           uint8_t semantic = sem2name(vp->inputs[i].semantic);
    and
        fd4_emit.c: In function ‘fd4_emit_vertex_bufs’:
        fd4_emit.c:304:11: warning: unused variable ‘semantic’ [-Wunused-variable]
           uint8_t semantic = sem2name(vp->inputs[i].semantic);
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: drop ARRAY_SIZE macro | Rob Clark | 2015-02-25 | 1 | -2/+0
    Since now ARRAY_SIZE has been added to util/macros.h. Fixes a bunch of:
        freedreno_util.h:79:0: warning: "ARRAY_SIZE" redefined
         #define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
         ^
        In file included from ../../../../src/gallium/include/pipe/p_compiler.h:36:0,
                         from ../../../../src/gallium/include/pipe/p_context.h:31,
                         from freedreno_context.h:32,
                         from freedreno_context.c:29:
        ../../../../src/util/macros.h:29:0: note: this is the location of the previous definition
         # define ARRAY_SIZE(x) (sizeof(x) / sizeof(*(x)))
         ^
    Signed-off-by: Rob Clark <[email protected]>
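
The warning quoted above appears because the two definitions of ARRAY_SIZE use different token sequences, and C only permits redefining a macro when the replacement is identical. The commit resolves this by deleting the driver-local copy. The sketch below shows the macro itself plus an #ifndef guard, which is a different, commonly used way to avoid the clash and is not what the commit does; sample_regs and sample_count are made-up names for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Only define the macro if no earlier header has already provided it. */
#ifndef ARRAY_SIZE
#define ARRAY_SIZE(arr) (sizeof(arr) / sizeof((arr)[0]))
#endif

/* Illustrative table; the macro must see a real array, not a pointer. */
static const int sample_regs[] = { 1, 2, 3, 4, 5 };

/* The element count is a compile-time constant, so it can be checked
 * without running anything (C11 static_assert from <assert.h>). */
static_assert(ARRAY_SIZE(sample_regs) == 5, "unexpected element count");

static size_t sample_count(void)
{
    return ARRAY_SIZE(sample_regs);
}
```

Removing the duplicate, as the commit does, has the advantage that only one definition exists to maintain.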
* freedreno/a4xx: aniso filtering | Rob Clark | 2015-02-24 | 1 | -4/+6
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2015-02-24 | 5 | -5/+20
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: add ARB_instanced_arrays support | Rob Clark | 2015-02-24 | 2 | -5/+4
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: handle index_bias (i.e. base_vertex) | Rob Clark | 2015-02-24 | 1 | -1/+1
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: add support for vertexid and instanceid sysvals | Rob Clark | 2015-02-24 | 2 | -11/+24
    ir3 bits of it already in place from a3xx patch..
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: pass number of instances to draw | Rob Clark | 2015-02-24 | 3 | -6/+7
    a4xx has its own draw packet, so needs equivalent update to what a3xx already got.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: set PC_PRIM_VTX_CNTL.VAROUT properly | Rob Clark | 2015-02-21 | 1 | -1/+6
    Fixes xonotic, some webgl stuff, and really pretty much anything with more than 4 varyings.
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: update generated headers | Rob Clark | 2015-02-21 | 7 | -16/+44
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a4xx: bit of cleanup | Rob Clark | 2015-02-21 | 4 | -33/+27
    Signed-off-by: Rob Clark <[email protected]>
* freedreno: implement fence | Rob Clark | 2015-02-21 | 4 | -74/+65
    I never actually implemented the stubbed out fence stuff back in the early days. Fix that. We'll need a few libdrm_freedreno changes to handle timeout properly, so ignore that for now to avoid a libdrm_freedreno dependency bump.
    Signed-off-by: Rob Clark <[email protected]>