summaryrefslogtreecommitdiffstats
path: root/src/gallium
Commit message (Collapse)AuthorAgeFilesLines
* gallivm: don't calculate square root of rho if we use accurate rho methodRoland Scheidegger2013-08-301-39/+74
| | | | | | | | | | | | | | | | | | | | While a sqrt here and there shouldn't hurt much (depending on the cpu) it is possible to completely omit it since rho is only used for calculating lod and there log2(x) == 0.5*log2(x^2). Depending on the exact path taken for calculating lod this means we get a simple mul instead of sqrt (in case of nearest mip filter in fact we don't need to replace the sqrt with something else at all), only in some not very useful path this doesn't work (combined brilinear calculation of int level and fractional lod, accurate rho calc but brilinear filtering seems odd). Apart from being faster as an added bonus this should increase our crappy fractional accuracy of lod, since fast_log2 is only good for ~3bits and this should increase accuracy by one bit (though not used if dimension is just one as we'd need an extra mul there as we never had the squared rho in the first place). v2: use separate ilog2_sqrt function if we have squared rho. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: refactor num_lods handlingRoland Scheidegger2013-08-304-131/+169
| | | | | | | | | | | | | This is just preparation for per-pixel (or per-quad in case of multiple quads) min/mag filter since some assumptions about number of miplevels being equal to number of lods no longer holds true. This change does not change behavior yet (though theoretically when forcing per-element path it might be slower with different min/mag filter since the code will respect this setting even when there's no mip maps now in this case, so some lod calcs will be done per-element just ultimately still the same filter used for all pixels). Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: Early return if no depth or stencil on release builds.Vinson Lee2013-08-291-0/+1
| | | | | | | Fixes "Missing break in switch" defect reported by Coverity. Signed-off-by: Vinson Lee <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* freedreno: pipe loader for either kgsl or msmRob Clark2013-08-294-10/+39
| | | | | | | | The downstream android kernel driver is "kgsl", the upstream drm/kms driver is called "msm". Since libdrm_freedreno handles the differences between the two, we need to load the same thing for either device. Signed-off-by: Rob Clark <[email protected]>
* freedreno: updates for msm drm/kms driverRob Clark2013-08-298-30/+55
| | | | | | | There where some small API tweaks in libdrm_freedreno to enable support for msm drm/kms driver. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: handle sync flags betterRob Clark2013-08-291-16/+34
| | | | | | | | We need to set the flag on all the .xyzw components that are written by the instruction, not just on .x. Otherwise a later use of rN.y (for example) will not trigger the appropriate sync bit to be set. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: better const handlingRob Clark2013-08-291-90/+121
| | | | | | | | | Seems like most/all instructions have some restrictions about const src registers. In seems like the 2 src (cat2) instructions can take at most one const, and the 3 src (cat3) instructions can take at most one const in the first 2 arguments. And so on. Handle this properly now. Signed-off-by: Rob Clark <[email protected]>
* radeonsi: Make sure libdrm_radeon headers are picked up from the right placeJonathan Gray2013-08-292-2/+3
| | | | | | And remove libdrm/ from a winsys include statement. Signed-off-by: Jonathan Gray <[email protected]>
* draw: fix point/line/triangle determination in draw_need_pipeline()Brian Paul2013-08-291-25/+6
| | | | | | The previous point/line/triangle() functions didn't handle GS primitives. Reviewed-by: Roland Scheidegger <[email protected]>
* radeon/uvd: fix MPEG2/4 ref frame index limitChristian König2013-08-291-2/+2
| | | | | | Otherwise the first few frames have an incorrect reference index. Signed-off-by: Christian König <[email protected]>
* nouveau: Copy m4x4 and m8x8 separately.Vinson Lee2013-08-281-1/+2
| | | | | | Silences Coverity "Out-of-bounds access" defect. Signed-off-by: Vinson Lee <[email protected]>
* r300g: enable MSAA on r300-r400, be careful about using color compressionMarek Olšák2013-08-274-5/+14
| | | | | | | | | | MSAA was tested by one user on RS690 and it works for him with color compression (CMASK) disabled. Our theory is that his chipset lacks CMASK RAM. Since we don't have hardware documentation about which chipsets actually have CMASK RAM, I had to take a guess based on the presence of HiZ. Reviewed-by: Alex Deucher <[email protected]>
* draw: clean up setting stream out information a bitRoland Scheidegger2013-08-279-34/+39
| | | | | | | | | | | | | | | | | In particular noone is interested in the vertex count, so drop that, and also drop the duplicated num_primitives_generated / so.primitives_storage_needed variables in drivers. I am unable for now to figure out if primitives_storage_needed in SO stats (used for d3d10) should increase if SO is disabled, though the equivalent num_primitives_generated used for OpenGL definitely should increase. In any case we were only counting when SO is active both in softpipe and llvmpipe anyway so don't pretend there's an independent num_primitives_generated counter which would count always. (This means the PIPE_QUERY_PRIMITIVES_GENERATED count will still be wrong just as before, should eventually fix this by doing either separate counting for this query or adjust the code so it always counts this even if SO is inactive depending on what's correct for d3d10.) Reviewed-by: Brian Paul <[email protected]>
* llvmpipe: support nested/overlapping queries for all query typesRoland Scheidegger2013-08-273-18/+20
| | | | | | | There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <[email protected]>
* softpipe: support nested/overlapping queries for all query typesRoland Scheidegger2013-08-272-18/+17
| | | | | | | There's just no way resetting the counters is working with nested/overlapping queries. Reviewed-by: Brian Paul <[email protected]>
* clover: Don't use PIPE_TRANSFER_UNSYNCHRONIZED for blocking copiesTom Stellard2013-08-261-1/+1
| | | | | | CC: "9.2" <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* st/clover: Add event to deps even if it has been triggeredNiels Ole Salscheider2013-08-261-1/+1
| | | | | | | | The command is submitted once the event has been triggered, but it might not have completed yet. Therefore, we have to add it to deps in order to wait on it. Signed-off-by: Niels Ole Salscheider <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* st/clover: Profiling supportNiels Ole Salscheider2013-08-263-18/+142
| | | | | Signed-off-by: Niels Ole Salscheider <[email protected]> Acked-by: Francisco Jerez <[email protected]>
* tgsi_build: fix order of arguments for ind register buildDave Airlie2013-08-271-1/+1
| | | | | | | This was broken when arrayid was added. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* tgsi: finish declaration parsing for arrays.Dave Airlie2013-08-271-1/+31
| | | | | | | | | I previously fixed this partly in 9e8400f4c95bde1f955c7977066583b507159a10, however I didn't go far enough in testing it, now when I parse a TGSI shader with arrays in it my iterator can see the ArrayID set to the proper value. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* svga: replace 0 with PIPE_OK in a few placesBrian Paul2013-08-263-5/+5
|
* radeonsi: Also set the depth component mask bit for stencil-only exportsMichel Dänzer2013-08-261-1/+4
| | | | | | | | The stencil values come out wrong without this for some reason. 50 more little piglits. Cc: [email protected]
* r600g: Implement the new float comparison instructions for Cayman as well.Henri Verbeet2013-08-251-4/+4
| | | | | | | | I assume this should have been part of commit 7727fbb7c5d64348994bce6682e681d6181a91e9. This (obviously) fixes a lot tests. Signed-off-by: Henri Verbeet <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nv30: add forgotten PIPE_CAP_CUBE_MAP_ARRAY cap to listIlia Mirkin2013-08-251-0/+1
| | | | | Signed-off-by: Ilia Mirkin <[email protected]> Cc: "9.2" <[email protected]>
* nouveau/video: avoid overwriting base codec init with templateIlia Mirkin2013-08-252-2/+2
| | | | | | | | | | Commit 53e20b8b introduced the use of a template to initialize some common fields. Move this copying of fields to before the common vp3 fields are initialized. Reported-by: Martin Peres <[email protected]> Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Christian König <[email protected]>
* freedreno/a3xx: don't leak so muchRob Clark2013-08-241-0/+11
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: fix SGT/SLT/etcRob Clark2013-08-241-29/+125
| | | | | | | | | | The cmps.f.* instruction doesn't actually seem to give a float 1.0 or 0.0 output. It either needs a cov.u16f16 or add.s + sel.f16. This makes SGT/SLT/etc more similar to CMP, so handle them in trans_cmp(). This fixes a bunch of piglit tests. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: bit of re-arrange/cleanupRob Clark2013-08-241-61/+71
| | | | | | | | It seems there are a number of cases where instructions have limitations about taking reading src's from const register file, so make get_unconst() a bit easier to use. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: make compiler errors more usefulRob Clark2013-08-242-17/+33
| | | | | | | | | | | | | | We probably should get rid of assert() entirely, but at this stage it is more useful for things to crash where we can catch it in a debugger. With compile_error() we have a single place to set an error flag (to bail out and return an error on the next instruction) so that will be a small change later when enough of the compiler bugs are sorted. But re-arrange/cleanup the error/assert stuff so we at least get a dump of the TGSI that triggered it. So we see some useful output in piglit logs. Signed-off-by: Rob Clark <[email protected]>
* freedreno: fix segfault when no color buffer boundRob Clark2013-08-247-18/+40
| | | | | | | Don't crash when no color buffer bound. Something caught when starting to run piglit, fixes a hanful of piglit tests. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: cat4 cannot use const reg as srcRob Clark2013-08-241-10/+27
| | | | | | | | | | | | | Category 4 instructions (rsq, rcp, sqrt, etc) seem to be unable to take a const register as src. In these cases we need to move the src to a temporary gpr first. This is the second case of such a restriction, where the instruction encoding appears to support a const src, but in fact the hw appears to ignore that bit. So split things out into a helper that can be re-used for any instructions which have this limitation. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: use max_reg rather than file_countRob Clark2013-08-241-7/+7
| | | | | | | | | | Our current (rather naive) register assignment is based on mapping different register files (INPUT, OUTPUT, TEMP, CONST, etc) based on the max register index of the preceding file. But in some cases, the lowest used register in a file might not be zero. In which case file_count[file] != file_max[file] + 1. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: handle saturate on dstRob Clark2013-08-241-0/+49
| | | | | | | | Sometimes things other than color dst need saturating, like if there is a 'clamp(foo, 0.0, 1.0)'. So for saturated dst add the extra instructions to fix up dst. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx/compiler: fix CMPRob Clark2013-08-241-2/+2
| | | | | | | | | | | | | | | | The 1st src to add.s needs (r) flag (repeat), otherwise it will end up: add.s dst.xyzw, tmp.xxxx -1 instead of: add.s dst.xyzw, tmp.xyzw, -1 Also, if we are using a temporary dst to avoid clobbering one of the src registers, we actually need to use that as the dst for the sel instruction. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: some texture fixesRob Clark2013-08-241-1/+24
| | | | | | Stop hard coding bits that indicate texture type (2d/3d/cube/etc). Signed-off-by: Rob Clark <[email protected]>
* freedreno: update register headersRob Clark2013-08-248-111/+758
| | | | | | resync w/ rnndb database Signed-off-by: Rob Clark <[email protected]>
* freedreno: add debug option to disable scissor optimizationRob Clark2013-08-243-14/+22
| | | | | | Useful for testing and debugging. Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix viewport on gmem->mem resolveRob Clark2013-08-241-0/+8
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/a3xx: fix color inversion on mem->gmem restoreRob Clark2013-08-241-3/+3
| | | | Signed-off-by: Rob Clark <[email protected]>
* radeonsi: Handle additional PIPE_COMPUTE_CAP_*Niels Ole Salscheider2013-08-231-1/+14
| | | | | | | | | | | This patch adds support for: PIPE_COMPUTE_CAP_MAX_INPUT_SIZE PIPE_COMPUTE_CAP_MAX_LOCAL_SIZE Return the values reported by the closed source driver for now. Signed-off-by: Niels Ole Salscheider <[email protected]> Reviewed-by: Tom Stellard <[email protected]>
* radeonsi: copy r600_get_timestampNiels Ole Salscheider2013-08-231-0/+9
| | | | | Signed-off-by: Niels Ole Salscheider <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: Implement PIPE_QUERY_TIMESTAMPNiels Ole Salscheider2013-08-234-2/+46
| | | | | Signed-off-by: Niels Ole Salscheider <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* gallivm: fix min/mag switchover point for nearest/none mip filterRoland Scheidegger2013-08-235-66/+81
| | | | | | | | | | | | | | Previously, the min/mag switchover point when using nearest/none mip filter was effectively -0.5 which can't be right. Looks like new OpenGL thinks it's ok if it's always 0.0 (older versions required 0.5 in some cases), let's hope everybody else thinks that's fine too. Refactor this slightly and get the per-quad/per-pixel min/mag decision values further down to sampling, though still only the first component is used yet. While here also fix code trying to skip lod bias application etc. when mipfilter is none, as this is still needed for determining min/mag filter. Reviewed-by: Jose Fonseca <[email protected]>
* gallium/osmesa: Link, not copy, the shared library to the LIB_DIR.Jon Severinsson2013-08-231-1/+1
| | | | | | | Just like all other mesa libraries... CC: "9.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* gallium/osmesa: Always link with the c++ linker.Jon Severinsson2013-08-231-9/+2
| | | | | | | Just like all other gallium targets... CC: "9.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* gallium/osmesa: Make and install an osmesa.pc.Jon Severinsson2013-08-232-3/+14
| | | | | | | | | As of "2f142d59 build: Add --enable-gallium-osmesa flag." the pkgconfig file from classic osmesa is no longer installed when building gallium osmesa, so copy it to gallium osmesa and install the copy instead. CC: "9.2" <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* gallivm: do per-element lod for lod bias and explicit derivs tooRoland Scheidegger2013-08-222-31/+74
| | | | | | | | | | | | Except for explicit derivs with cube maps which are very bogus anyway. Just like explicit lod this is only used if no_quad_lod is set in GALLIVM_DEBUG env var. Minification is terrible on cpus which don't support true vector shifts (but should work correctly). Cannot do the min/mag filter decision (if they are different) per pixel though, only selecting different mip levels works. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix int/uint border color clampingRoland Scheidegger2013-08-221-2/+2
| | | | | | | | | Just a copy & paste error. Fixes https://bugs.freedesktop.org/show_bug.cgi?id=68409. Note that the test passing before probably simply means it doesn't verify clamping of the border color itself as required by the OpenGL spec. Reviewed-by: Jose Fonseca <[email protected]>
* gallivm: (trivial) fix linear aos sampling of 3d compressed formatsRoland Scheidegger2013-08-221-2/+2
| | | | | | | | block size depth is always 1 even for compressed formats (unless someone invents true 3d compressed formats at least which we can't represent). Nearest (and soa) path had it right. Reviewed-by: Jose Fonseca <[email protected]>
* radeonsi: Fix y/z/w component values of TGSI_SEMANTIC_FOG pixel shader inputsMichel Dänzer2013-08-221-0/+18
| | | | | | | | | They are defined as constant 0.0/0.0/1.0. Three more little piglits. Cc: [email protected] Reviewed-by: Alex Deucher <[email protected]>