summaryrefslogtreecommitdiffstats
path: root/src/gallium/drivers/freedreno
Commit message (Collapse)AuthorAgeFilesLines
* freedreno/ir3/nir: couple little fixesRob Clark2015-04-111-2/+10
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: handle system valuesRob Clark2015-04-111-3/+50
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: handle txs and query_levels tex opsRob Clark2015-04-111-4/+81
| | | | | | | | These correspond to the tgsi TXQ opcode (plus sneak in a fix for two-sided color) Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: split out tex helpersRob Clark2015-04-111-34/+72
| | | | | | We'll need these in one or two other spots. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: simplify emit_tex()Rob Clark2015-04-112-61/+66
| | | | | | | | Just build up arrays for src0/src1, and use create_collect().. Also add back missing .3d flag for 3d/cube textures. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cp: handle indirect properlyRob Clark2015-04-111-13/+20
| | | | | | | | | I noticed some cases where we where trying to copy-propagate indirect src's into places they cannot go, like 2nd src for cat3 (mad, etc). Expand out valid_flags() to be aware of relativ flag, and fix up a few related spots. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/sched: avoid getting stuck on addr conflictsRob Clark2015-04-111-32/+42
| | | | | | | | | | | | | | | | | | When we get in a scenario where we cannot schedule any more instructions due to address register conflict, clone the instruction that writes the address register, and switch the remaining unscheduled users for the current address register over to the new clone. This is simpler and more robust than the previous attempt (which tried and sometimes failed to ensure all other dependencies of users of the address register were scheduled first).. hint it would try to schedule instructions that were not actually needed for any output value. We probably need to do the same with predicate register, although so far it isn't so heavily used so we aren't running into problems with it (yet). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: add variable-indexing supportRob Clark2015-04-111-16/+204
| | | | | | | | A bit fugly.. try and make this cleaner.. note if we hoist all the get_addr() out of the loop we can drop the hashtable and just use create_addr().. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/asm: change assert to warningRob Clark2015-04-111-1/+4
| | | | | | | | It probably *should* be an assert, but for now TGSI f/e isn't very good about dealing w/ CONST vs ABS/NEG. So for debug builds, print a warning instead of crashing with an assert for now. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/nir: set first_driver_paramRob Clark2015-04-111-0/+2
| | | | | | | Without this, a3xx breaks.. a4xx would too if it had already implemented support for passing driver params. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cp: support to swap mad src'sRob Clark2015-04-114-9/+43
| | | | | | | | | | | | | | | | For a normal MAD (ie. not MADSH), if first source is gpr and second source is const, we can swap the first two sources to avoid needing a mov instruction. This gives back the biggest advantage TGSI f/e had over NIR f/e for common shaders, since TGSI f/e had this logic in the f/e. Note that doing this in copy-prop step has the advantage that it will also work for cases like: MOV TEMP[b], CONST[x] MAD TEMP[d], TEMP[a], TEMP[b], TEMP[c] Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add NIR compilerRob Clark2015-04-057-5/+1762
| | | | | | | | | | | | | | | | The NIR compiler frontend is an alternative to the TGSI f/e, producing the same ir3 IR and using the same backend passes for scheduling, etc. It is not enabled by default yet, as there are still some regressions. To enable, use 'FD_MESA_DEBUG=nir'. It is enough to use with, for example, xonotic or supertuxkart. With the NIR f/e, scalarizing and a number of other lowering steps happen in NIR, so we don't have to do them in ir3. Which simplifies the f/e and allows the lowered instructions to pass through other optimization stages. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: don't decode srgb on mem2gmemIlia Mirkin2015-04-051-6/+12
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: pass sprite coord mode through to program emitIlia Mirkin2015-04-053-1/+4
| | | | | | | | Use the correct sprite replacement depending on the flip of the coord mode, using either T or 1-T depending on whether we have an upper-left or lower-left coordinate origin. This fixes all the point sprite piglits. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add UBO supportIlia Mirkin2015-04-056-38/+132
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: insert nop between sfu/mem operationsIlia Mirkin2015-04-051-1/+6
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: dirty context when reallocating a bound boIlia Mirkin2015-04-051-0/+40
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: keep track of buffer valid rangesIlia Mirkin2015-04-052-0/+27
| | | | | | | | Copies nouveau_buffer and radeon_buffer. This allows a write to proceed to an uninitialized part of a buffer even when the GPU is using the previously-initialized portions. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: mark resources as being read so that writes flush the queueIlia Mirkin2015-04-055-1/+59
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: don't bother setting resource timestampsIlia Mirkin2015-04-051-9/+0
| | | | | | | Waiting on a bo being ready is handled in fd_bo_cpu_prep. No need to keep separate timestamps around. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: add a reading flag to indicate gpu is reading rscIlia Mirkin2015-04-052-2/+3
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: fix resource flushing confusionIlia Mirkin2015-04-051-14/+10
| | | | | | | | | A resource flush is an upload of a hypothetically-staging texture to the GPU. For a UMA system, this will largely be a no-op or cache-maintenance. Move the render flush logic into transfer_map where it belongs, and clear out the transfer_flush function. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: remove tex_resourceIlia Mirkin2015-04-059-11/+3
| | | | | | | pipe_sampler_view already contains a texture, remove the redundant tex_resource member which pointed at the same thing. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: handle FRAG IN's without interpolation specifiedRob Clark2015-04-051-7/+15
| | | | | | Fallback to picking based on semantic name. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cmdline: add @const headers for immediatesRob Clark2015-04-051-0/+9
| | | | | | | | | | Since NIR f/e currently encodes immediates in instructions (rather than passing via const), we need to ensure that when const's are used the get initialized to the proper values. Otherwise comparing NIR to TGSI compiler, it will use proper immediate values in one case, and randomly initialize values in the other. Which confuses ir3test. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3/cmdline: remove hack for old compilerRob Clark2015-04-051-23/+0
| | | | | | Since we dropped the old compiler, we don't need this hack anymore. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: handle const/immed/abs/neg in cpRob Clark2015-04-053-31/+314
| | | | | | | | | | | | | Be smarter about propagating copies from const or immed, or with abs/neg modifiers. Also, realize that absneg.s and absneg.f are really "fancy" mov instructions. This opens up the possibility to remove more copies. It helps the TGSI frontend a bit, but will be really needed for the NIR f/e which builds everything up in SSA form (ie. will *always* insert a mov from const or immediate). Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: split float/int abs/negRob Clark2015-04-055-64/+213
| | | | | | | | | | | | Even though in the end, they map to the same bits, the backend will need to be able to differentiate float abs/neg vs integer abs/neg. Rather than making the backend figure it out based on instruction opcode (which when combined with mov/absneg instructions, can be awkward), just split out different flags for each so the frontend can signal it's intentions more clearly. Also, since (neg) for bitwise op's is actually a bitwise- not, split it out into bnot flag. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: add ir3 builder helpersRob Clark2015-04-053-4/+162
| | | | | | | | Add helpers for constructing SSA forms of instructions. Only partial cat5/cat6 coverage.. but we can add stuff as needed. Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: fix sam argument order commentRob Clark2015-04-051-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* xa: support for drivers which use NIRRob Clark2015-04-051-0/+4
| | | | | | | | | | We need to pull in libnir.la and it's dependency libglsl_util.la. Also, _mesa_error_no_memory() must be defined. Fortunately with libnir.la (vs pulling in all of libglsl.la) we don't also need libstdc++. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: add MRT supportIlia Mirkin2015-04-028-138/+219
| | | | | | | The hardware only supports 4 MRTs. It should be possible to emulate support for 8, but doesn't seem worth the trouble. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: convert blit program to array for each number of rtsIlia Mirkin2015-04-0212-21/+45
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: add support for laying out MRTs in gmemIlia Mirkin2015-04-022-16/+43
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: add core infrastructure support for MRTsIlia Mirkin2015-04-024-8/+14
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/ir3: add support for FS_COLOR0_WRITES_ALL_CBUFS propertyIlia Mirkin2015-04-022-1/+10
| | | | | | | This will enable the driver to tell which regids to link up to which MRT outputs. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add independent blend function supportIlia Mirkin2015-04-022-8/+9
| | | | | | This is needed for MRT support Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno: remove alpha key from ir3_shaderIlia Mirkin2015-04-029-42/+8
| | | | | | | This complication is unnecessary and makes MRTs more complicated and likely to generate tons of variants. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: add support for point sprite coordinate replacementIlia Mirkin2015-03-284-30/+28
| | | | | | | This does not (yet) support different coordinate origins, so the tests still fail due to fbo flipping. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: make vs-set point size workIlia Mirkin2015-03-283-2/+10
| | | | | | | | | | This appears to need the A2XX version of the point list, so select it at draw time if necessary. Experimentally, always using the A2XX version causes hangs when PSIZE isn't actually emitted. Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: point size should not be divided by 2Ilia Mirkin2015-03-282-5/+5
| | | | | | | | | | | The division is probably a holdover from the days when the fixed point inline functions generated by headergen were broken. Also reduce the maximum point size to 4092 (vs 4096), which is what the blob does. Cc: "10.4 10.5" <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]>
* freedreno/a3xx: fix 3d texture layoutIlia Mirkin2015-03-282-7/+16
| | | | | | | | | | | | | The SZ2 field contains the layer size of a lower miplevel. It only contains 4 bits, which limits the maximum layer size it can describe. In situations where the next miplevel would be too big, the hardware appears to keep minifying the size until it hits one of that size. Unfortunately the hardware's ideas about sizes can differ from freedreno's which can still lead to issues. Minimize those by stopping to minify as soon as possible. Signed-off-by: Ilia Mirkin <[email protected]> Cc: "10.4 10.5" <[email protected]>
* freedreno/a3xx: LAYERSZ2 appears to have no effect on arraysIlia Mirkin2015-03-281-2/+1
| | | | Signed-off-by: Ilia Mirkin <[email protected]>
* gallium: implement get_device_vendor() for existing driversGiuseppe Bilotta2015-03-231-0/+8
| | | | | | | | | The only hackish ones are llvmpipe and softpipe, which currently return the same string as for get_vendor(), while ideally they should return the CPU vendor. Signed-off-by: Giuseppe Bilotta <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* freedreno/ir3: fix infinite recursion in schedRob Clark2015-03-181-1/+1
| | | | | | | One more case we need to handle. One of the src instructions for the indirect could also end up being ourself. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix spellingRob Clark2015-03-181-1/+1
| | | | Signed-off-by: Rob Clark <[email protected]>
* gallium: add FMA and DFMA opcodes (v3)Marek Olšák2015-03-161-0/+1
| | | | | | | | | Needed by ARB_gpu_shader5. v2: select DMAD for FMA with double precision v3: add and select DFMA Reviewed-by: Ilia Mirkin <[email protected]>
* freedreno: update generated headersRob Clark2015-03-155-6/+6
| | | | | | | Fix a3xx texture layer-size. Signed-off-by: Rob Clark <[email protected]> Cc: "10.4 10.5" <[email protected]>
* freedreno/ir3: remove old compilerRob Clark2015-03-157-1574/+10
| | | | | | | Now that piglit is no longer falling back to old compiler for any tests, we can remove it. Hurray \o/ Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: avoid scheduler deadlockRob Clark2015-03-153-0/+45
| | | | | | | | | | | Deadlock can occur if we schedule an address register write, yet some instructions which depend on that address register value also depend on other unscheduled instructions that depend on a different address register value. To solve this, before scheduling an address register write, ensure that all the other dependencies of the instructions which consume this address register are already scheduled. Signed-off-by: Rob Clark <[email protected]>