path: root/src/gallium
Commit message | Author | Age | Files | Lines
* svga: fix shader variant memory leak | Brian Paul | 2015-09-10 | 2 | -0/+6
  Fixes a small leak in a seldom-hit corner case for VS/FS compilation. Found with Coverity.
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: remove useless MAX2() call | Brian Paul | 2015-09-10 | 1 | -1/+1
  The sum of two unsigned ints is always >= 0. Found with Coverity.
  Reviewed-by: Charmaine Lee <[email protected]>
* winsys/svga: remove useless assertion | Brian Paul | 2015-09-10 | 1 | -1/+0
  An unsigned int is always >= 0. Found with Coverity.
  Reviewed-by: Charmaine Lee <[email protected]>
* softpipe: Implement and enable textureQueryLod | Krzesimir Nowak | 2015-09-10 | 2 | -2/+55
  Passes the shader piglit tests and introduces no regressions. This commit finally makes use of the refactoring in previous commits.
  v2:
  - adapted the code to changes in previous commits (renames, need_cube_convert stuff)
  - split overly long lines
  Reviewed-by: Brian Paul <[email protected]>
* tgsi: Add code for handling lodq opcode | Krzesimir Nowak | 2015-09-10 | 2 | -0/+56
  This introduces a new vfunc in tgsi_sampler just for this opcode. I decided against extending the get_samples vfunc to return the mipmap level and LOD: the function's prototype is already too scary, and doing the sampling for textureQueryLod would be a waste of time.
  v2:
  - split overly long lines
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Add functions for computing relative mipmap level | Krzesimir Nowak | 2015-09-10 | 2 | -0/+120
  These functions will be used by textureQueryLod.
  v2:
  - renamed mip_level_* funcs to mip_rel_level_* to indicate that these functions return the mip level relative to the base level, and documented them
  - renamed a level member in the sp_filter_funcs struct to relative_level
  - changed mip_rel_level_none and mip_rel_level_nearest to return the mip level relative to the base level; mip_rel_level_linear already did that
  - documented the clamp_lod function
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Split 3D to 2D coords conversion into separate function | Krzesimir Nowak | 2015-09-10 | 2 | -51/+45
  This is to avoid tying the conversion to the sampling: textureQueryLod will need to do the conversion too, but it does not do any sampling. So instead of a "get_samples" vfunc, there is just a bool saying whether the conversion is needed or not. This solution keeps the nice property of not adding any overhead for the common case (2D textures).
  v2:
  - replaced the "convert_coords" vfunc with a "need_cube_convert" boolean to avoid the overhead of copying arrays in the common case
  - removed an unused typedef
  - split overly long lines in convert_cube
  - const fixes in convert_cube
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Split code getting a filter into separate function | Krzesimir Nowak | 2015-09-10 | 1 | -17/+41
  This function will later be used by textureQueryLod. The img_filter_func are optional, because textureQueryLod will not need them.
  v2:
  - adapted to changes in previous commit (renames)
  - simplified conditions a bit
  - updated docs
  - split overly long lines
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Put mip_filter_func inside a struct | Krzesimir Nowak | 2015-09-10 | 2 | -12/+38
  Putting this function pointer into a struct enables grouping of several related functions in a single place. For now it is just a single function, but the struct will later be extended with a mip_level_func for returning the relative mip level.
  v2:
  - renamed sp_mip struct to sp_filter_funcs
  - renamed sp_filter_funcs instances from mip_foo to funcs_foo
  - split overly long lines
  - sp_sampler now holds a pointer to sp_filter_funcs instead of an instance of it
  - some const fixes
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Split compute_lambda_lod into two functions | Krzesimir Nowak | 2015-09-10 | 1 | -17/+40
  textureQueryLod returns a vec2 with mipmap information and a LOD. The latter must not be clamped.
  v2:
  - changed the "not_clamped" part to "unclamped"
  - corrected "clamp into" to "clamp to"
  - split overly long lines
  Reviewed-by: Brian Paul <[email protected]>
* softpipe: Fix textureLod with nonzero GL_TEXTURE_LOD_BIAS value | Krzesimir Nowak | 2015-09-10 | 1 | -1/+1
  The level-of-detail bias wasn't simply added in the explicit LOD case. This case seems to be tested only in piglit's fs-texturequerylod-nearest-biased test, which is currently skipped, as softpipe does not support textureQueryLod at the moment.
  Reviewed-by: Brian Paul <[email protected]>
* tgsi: Remove trailing backslash in comment | Krzesimir Nowak | 2015-09-10 | 1 | -1/+1
  It is clearly there by accident.
  Reviewed-by: Brian Paul <[email protected]>
* gallium/radeon: handle PIPE_TRANSFER_FLUSH_EXPLICIT | Marek Olšák | 2015-09-10 | 3 | -22/+44
  Basically, do the same thing as for buffer_unmap, but use the explicit range instead. It's for apps which want to map a whole buffer and mark touched ranges explicitly.
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: don't update polygon offset state if it has no effect | Marek Olšák | 2015-09-10 | 2 | -1/+4
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: decrease the size of si_pm4_state | Marek Olšák | 2015-09-10 | 1 | -3/+2
  Acked-by: Michel Dänzer <[email protected]>
* radeonsi/compute: add buffers to the CS directly | Marek Olšák | 2015-09-10 | 1 | -7/+11
  Packets are emitted immediately anyway.
  Acked-by: Michel Dänzer <[email protected]>
* radeonsi: only use new versions of LLVM image and sample intrinsics | Marek Olšák | 2015-09-10 | 1 | -283/+186
  Just a cleanup I had made a long time ago and forgot about.
  v2: use tgsi_is_shadow_target
  Reviewed-by: Tom Stellard <[email protected]>
* gallium/radeon: drop support for LLVM 3.4 | Marek Olšák | 2015-09-10 | 6 | -25/+7
  This allows using the new tex intrinsics unconditionally.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600/llvm: remove dead code for LLVM 3.3 | Marek Olšák | 2015-09-10 | 1 | -106/+0
  LLVM 3.3 has been unsupported for quite a while.
  Reviewed-by: Michel Dänzer <[email protected]>
* r600g: use pipe_resource::width0 instead of pb_buffer::size | Marek Olšák | 2015-09-10 | 2 | -6/+6
  pb_buffer::size was aligned by 29aaab2b5f55cc6d9a84f58ce2bb8607e76a9dde, which broke the CMASK code, I think.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91881
  Cc: 11.0 <[email protected]>
  Reviewed-by: Dave Airlie <[email protected]>
  Reviewed-by: Michel Dänzer <[email protected]>
* radeonsi: enable VGPR spilling on VI | Marek Olšák | 2015-09-10 | 1 | -3/+1
  This fixes corruption in Unigine Heaven on VI.
  Cc: 11.0 <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
* winsys/amdgpu: calculate the maximum number of compute units | Marek Olšák | 2015-09-10 | 1 | -2/+13
  Required for register spilling.
  Cc: 11.0 <[email protected]>
  Reviewed-by: Alex Deucher <[email protected]>
* clover: Avoid using typename to allow compilation of clover by clang | Albert Freeman | 2015-09-10 | 1 | -1/+1
  When parsing a variable declaration qualified with the typename keyword, clang attempted to declare a variable with the type of the non-type member "enum type type" of module::argument (within the header file clover/core/module.hpp) instead of the type member of module::argument, "enum type". Replaced "typename" with "enum" to force clang to declare the variable marg_type with the type "enum type" of module::argument.
  CC: "11.0" <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
  Signed-off-by: Albert Freeman <[email protected]>
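The name clash described above can be reproduced in a reduced form. The following is a minimal sketch, not the clover header: the `sketch` namespace, the enumerator names, and `is_scalar` are hypothetical. It only shows how an elaborated type specifier (`enum ...::type`) selects the enumeration when a data member has the same name.

```cpp
namespace sketch {

// Reduced stand-in for a struct whose nested enum and data member
// share the name `type` (names here are illustrative only).
struct argument {
   enum type { scalar, constant, global };   // the type member
   enum type type;                            // the non-type (data) member
};

} // namespace sketch

// Hypothetical helper: the `enum` keyword makes name lookup ignore the
// data member `type`, so marg_type is unambiguously declared with the
// enumeration type.
inline bool is_scalar(const sketch::argument &arg)
{
   enum sketch::argument::type marg_type = arg.type;
   return marg_type == sketch::argument::scalar;
}
```

Using `enum` rather than `typename` in the declaration keeps the code independent of how a particular compiler resolves the qualified name outside a template.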
* nv50/ir: don't fold immediate into mad if registers are too high | Ilia Mirkin | 2015-09-10 | 1 | -0/+4
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* nv50/ir: fix emission of 8-byte wide interp instruction | Ilia Mirkin | 2015-09-10 | 1 | -5/+6
  This can come up if the target register number is > 63, which is fairly rare.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91551
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* nv50/ir: r63 is only 0 if we are using less than 63 registers | Ilia Mirkin | 2015-09-10 | 1 | -1/+4
  It is advantageous to use r63 instead of r127 since r63 can fit into the shorter encoding. However, if we've RA'd over 63 registers, we must use r127 as the replacement instead.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* nv50/ir: make edge splitting fix up phi node sources | Ilia Mirkin | 2015-09-10 | 1 | -13/+77
  Unfortunately nv50_ir phi nodes aren't directly connected to the CFG, so the mapping between source and the actual BB is by inbound edge order.
  So when manipulating edges one has to be extremely careful. We were insufficiently careful when splitting critical edges, which resulted in the phi nodes being confused as to where their sources were coming from.
  This primarily manifests itself with the TXL-lowering logic on nv50, when it is inside of a conditional. I've been unable to trigger the issue anywhere else so far.
  This resolves rendering failures in a number of games like Two Worlds 2, Trine: Enchanted Edition, Trine 2, XCOM: Enemy Unknown, Stacking.
  It also improves the situation in Hearthstone, Sonic Generations, and The Raven: Legacy of a Master Thief. However, more work needs to be done there (splitting a lot more edges solves it, so it's some other sort of RA-related issue).
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=90887
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* nvc0: remove BGRA4 format support | Ilia Mirkin | 2015-09-09 | 1 | -0/+2
  Something is wrong with the support somewhere. I couldn't get the blob driver to use it either, although it happily used RGB5_A1. teximage-colors works, but WoW seems to fail in the menus for drawing text.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91526
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
* gallium/ttn: fix cursor handling vs builder | Rob Clark | 2015-09-09 | 1 | -8/+6
  After inserting instructions the cursor.option becomes _after_instr (even if it started life as an _after_block). So we cannot simply stash the current cursor on the if/loop_stack. Otherwise we end up inserting instructions after the endif/endloop in the block preceding the if/loop.
  Signed-off-by: Rob Clark <[email protected]>
  Reviewed-by: Kenneth Graunke <[email protected]>
* nvc0: keep track of cb bindings per buffer, use for upload settings | Ilia Mirkin | 2015-09-09 | 7 | -12/+58
  CB updates to bound buffers need to go through the CB_DATA endpoints, otherwise the shader may not notice that the updates happened. Furthermore, these updates have to go to the same address as the bound buffer, otherwise, again, the shader may not notice updates.
  So we keep track of all the places where a constbuf is bound, and iterate over all of them when updating data. If a binding is found that encompasses the region to be updated, then we use the settings of that binding for the upload. Otherwise we upload as a regular data update.
  This fixes piglit 'arb_uniform_buffer_object-rendering offset' as well as blurriness in Witcher2.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=91890
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
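The "find a binding that encompasses the region" step can be sketched roughly as follows. This is only an illustration under assumptions: `cb_binding`, its fields, and `find_covering_binding` are hypothetical names, and the real driver tracks bindings in its own structures.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical record of one place where a constant buffer is bound.
struct cb_binding {
   int stage;              // shader stage the buffer is bound to
   int slot;               // constant buffer slot within that stage
   std::uint32_t offset;   // start of the bound range within the buffer
   std::uint32_t size;     // length of the bound range
};

// Returns the first binding whose range fully contains
// [offset, offset + size), or nullptr if no binding covers the region.
// In the nullptr case the caller would fall back to a regular data upload.
inline const cb_binding *
find_covering_binding(const std::vector<cb_binding> &bindings,
                      std::uint32_t offset, std::uint32_t size)
{
   // 64-bit arithmetic avoids wraparound when computing range ends.
   const std::uint64_t end = std::uint64_t(offset) + size;
   for (const cb_binding &b : bindings) {
      const std::uint64_t b_end = std::uint64_t(b.offset) + b.size;
      if (offset >= b.offset && end <= b_end)
         return &b;
   }
   return nullptr;
}
```

A caller would use the returned binding's stage and slot to route the update through the bound path, so the write lands at the address the shader is reading from.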
* nv30: Disable msaa unless requested from the env by NV30_MAX_MSAA | Hans de Goede | 2015-09-09 | 2 | -1/+21
  Some modern apps try to use msaa without keeping in mind the restrictions on videomem of older cards, resulting in dmesg saying:
  [ 1197.850642] nouveau E[soffice.bin[3785]] fail ttm_validate
  [ 1197.850648] nouveau E[soffice.bin[3785]] validating bo list
  [ 1197.850654] nouveau E[soffice.bin[3785]] validate: -12
  This is because we are running out of video memory, after which the program using the msaa visual freezes, and eventually the entire system freezes.
  To work around this we do not allow msaa visuals by default and allow the user to override this via NV30_MAX_MSAA.
  Signed-off-by: Hans de Goede <[email protected]>
  [imirkin: move env var lookup to screen so that it's only done once]
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
* nv30: Fix color resolving for nv3x cards | Hans de Goede | 2015-09-09 | 1 | -1/+37
  We do not have a generic blitter on nv3x cards, so we must use the sifm object for color resolving.
  This commit divides the source and dest surfaces into tiles which match the constraints of the sifm object, so that color resolving will work properly on nv3x cards.
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* gallium/docs: clarify dmabuf fd ownership | Rob Clark | 2015-09-09 | 1 | -0/+8
  Since debugging issues w/ fd's close()d at the wrong time can be quite fun, this should probably be made more explicit in the docs.
  Signed-off-by: Rob Clark <[email protected]>
* android: radeonsi: add support for sid_tables.h generated sources | Mauro Rossi | 2015-09-09 | 3 | -3/+15
  This patch is necessary to avoid a build error on Android, due to missing sid_tables.h generated sources.
  v2: [Emil Velikov] Correctly split the lists.
  Fixes: fbbebeae10f (radeonsi: inline si_cmd_context_control)
  Signed-off-by: Emil Velikov <[email protected]>
* nouveau: android: add space before PRIx64 macro | Mauro Rossi | 2015-09-09 | 1 | -1/+1
  Otherwise the Android build fails with:
  error: unable to find string literal operator ‘operator"" PRIx64’
  There are several resources referring to the problem, which is related to C++11, in our case used when building mesa for lollipop.
  http://comments.gmane.org/gmane.comp.graphics.opensg.user/5883
  I've not investigated all the semantics, and some people even suggested a bug in the gcc compiler. I just saw that the build error was solved with one little space for lollipop, with no side effect when C++11 is not used.
  v2: [Emil Velikov] add an alternative commit message from Mauro.
  Cc: 11.0 <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
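The effect of the missing space can be shown with a short sketch. Since C++11, an identifier written directly after a string literal is lexed as a user-defined literal suffix, so `"%"PRIx64` is no longer seen as a literal followed by a macro. The helper `format_hex64` below is hypothetical and is not the nouveau code that was changed.

```cpp
#include <cinttypes>
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical helper that formats a 64-bit value as zero-padded hex.
inline std::string format_hex64(std::uint64_t value)
{
   char buf[32];
   // Note the space between the string literal and PRIx64. Writing
   // "0x%016"PRIx64 would be parsed in C++11 as a literal with the
   // suffix PRIx64, leading to the "string literal operator" error.
   std::snprintf(buf, sizeof(buf), "0x%016" PRIx64, value);
   return std::string(buf);
}
```

With the space in place the macro is expanded first, and adjacent string literals are concatenated as usual, in both C and C++.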
* svga: pick all the files into the tarball | Emil Velikov | 2015-09-09 | 1 | -5/+26
  Signed-off-by: Emil Velikov <[email protected]>
* auxiliary: rework the python generated sources rules | Emil Velikov | 2015-09-09 | 1 | -12/+17
  There are a few bits this commit aims to resolve:
  One can generalise the mkdir rule to a simple MKDIR_P $(@D), which will expand appropriately even if we change the subdir name and/or add new rules.
  We can also drop the explicit $(srcdir) prefix for the dependency rules, as they are not strictly required, nor used elsewhere in mesa.
  Finally, replace $< with an explicit filename to be consistent throughout the file, and honour PYTHON_FLAGS.
  v2: Add comprehensive commit summary/message (Ian, Matt)
  Cc: 11.0 <[email protected]>
  Signed-off-by: Emil Velikov <[email protected]>
* r600: don't use shader key without verifying shader type (v2) | Dave Airlie | 2015-09-09 | 1 | -7/+12
  Since 7a32652231f96eac14c4bfce02afe77b4132fb77 ("r600: Turn 'r600_shader_key' struct into union") we were accessing key fields that might be aliased in the union with other fields, so we should check what shader type we are compiling for before using key values from it.
  v1.1: make it compile
  v2: have caffeine, make it work - we don't set type until later, so don't reference it until we've set it.
  Reviewed-by: Edward O'Callaghan <[email protected]>
  Cc: "11.0" <[email protected]>
  Signed-off-by: Dave Airlie <[email protected]>
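The aliasing hazard can be sketched with a small tagged-union example. Everything here is hypothetical (the enum, the union layout, the field names, and `uses_two_side_color`); it is not the r600 key definition, only an illustration of checking the shader type before reading a per-stage field.

```cpp
// Hypothetical shader stages for this sketch.
enum shader_type { SHADER_VERTEX, SHADER_FRAGMENT };

// Hypothetical key: the per-stage structs share the same storage, so a
// field of one stage aliases whatever bits another stage stored there.
union shader_key {
   struct {
      unsigned as_es : 1;
   } vs;
   struct {
      unsigned nr_cbufs : 4;
      unsigned color_two_side : 1;
   } ps;
};

// Only interpret the fragment-shader part of the key once the shader is
// known to be a fragment shader. The type check comes first, so the ps
// fields are never read for other stages.
inline bool uses_two_side_color(shader_type type, const shader_key &key)
{
   return type == SHADER_FRAGMENT && key.ps.color_two_side;
}
```

The short-circuit evaluation matters here: for a vertex shader the `ps` member is never touched, so vertex-stage bits cannot be misread as a fragment-stage flag.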
* nvc0: always emit a full shader colormask | Ilia Mirkin | 2015-09-08 | 1 | -1/+1
  Indications are that if the colormask indicates a single bit set on fermi, that value will always be read from $r0 instead of a potentially higher register (if e.g. green is set). Not to upset the counting logic, always set the header up with a full color mask for each RT. Such a situation can basically only ever happen with generated blit shaders.
  Fixes the following piglit on Fermi (Kepler is unaffected):
  fbo-stencil blit GL_DEPTH32F_STENCIL8
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
* nv30: Fix max width / height checks in nv30 sifm code | Hans de Goede | 2015-09-07 | 1 | -2/+2
  The sifm object has a limit of 1024x1024 for its input size and 2048x2048 for its output. The code checking this was trying to be clever, resulting in it seeing a surface of e.g. 1024x256 as being outside of the input size limit.
  This commit fixes this.
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>
  Cc: "10.6 11.0" <[email protected]>
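The commit message does not say what the "clever" check looked like. As a pure assumption for illustration, the sketch below uses one common shortcut that fails in exactly this way: OR-ing width and height together before comparing against the limit. Both function names are hypothetical.

```cpp
// ASSUMED form of a too-clever check (not confirmed by the source):
// combining both dimensions with a bitwise OR and comparing once.
// 1024 | 256 == 1280, so a valid 1024x256 surface is rejected.
inline bool fits_sifm_input_buggy(unsigned w, unsigned h)
{
   return (w | h) <= 1024;
}

// Straightforward check: each dimension is compared on its own against
// the 1024x1024 input limit stated in the commit message.
inline bool fits_sifm_input(unsigned w, unsigned h)
{
   return w <= 1024 && h <= 1024;
}
```

The OR shortcut only works when the limit is one less than a power of two (for example 1023), which is why per-dimension comparisons are the safer form for an inclusive limit of 1024.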
* svga: Fix surface view error handling | Thomas Hellstrom | 2015-09-07 | 1 | -22/+26
  Make sure errors are correctly propagated. Also don't flush during state emission if emission fails.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* xa: add xa_surface_from_handle2 v2 | Rob Clark | 2015-09-07 | 3 | -11/+45
  Like xa_surface_from_handle(), but takes a handle type, rather than hard-coding 'shared' handle. This is needed to fix bugs seen with xf86-video-freedreno with xrandr rotation, for example. The root issue is that doing a GEM_OPEN ioctl on a bo that already has a GEM handle associated with the drm_file will result in two unique handles for the same bo, which causes all sorts of follow-on failures.
  v2:
  - Add support for fd handles.
  - Avoid duplicating code.
  - Bump xa version minor.
  Signed-off-by: Rob Clark <[email protected]>
  Signed-off-by: Thomas Hellstrom <[email protected]>
* nouveau: don't mark full range as used on unmap with explicit flush | Ilia Mirkin | 2015-09-05 | 1 | -5/+7
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* nv50: avoid using inline vertex data submit when gl_VertexID is used | Ilia Mirkin | 2015-09-05 | 4 | -2/+14
  The hardware only generates vertexid when vertices come from a VBO. This fixes:
  vertexid-drawelements
  vertexid-drawarrays
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: "11.0" <[email protected]>
* nv50: don't flush vertex arrays when index buffer changes | Ilia Mirkin | 2015-09-05 | 1 | -4/+0
  The index buffer is fed in inline over a pushbuf. It's not related to vertices or any caching that might be done on them.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* nv50: rebind bo to bufctx when invalidating idxbuf storage | Ilia Mirkin | 2015-09-05 | 1 | -1/+5
  There is nothing to be done on a dirty idxbuf, but the bo may have changed, so we have to rebind it to the bufctx.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* nv50: clear buffer status on all vertex bufs, not just the first one | Ilia Mirkin | 2015-09-05 | 1 | -1/+0
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* nv50: fix drawing from tfb, direct-to-pushbuf submits | Ilia Mirkin | 2015-09-05 | 4 | -14/+15
  The stride was being set to 0, which is illegal (and also nonsensical). Also, we must wait for the buffer to become available for reading, as otherwise a wrong value may be prefetched. Since we must wait for the buffer anyway, and it's mapped and in GART, we may as well avoid the annoyance of the indirect pushbuf submit.
  Signed-off-by: Ilia Mirkin <[email protected]>
  Cc: [email protected]
* llvmpipe: convert double to long long instead of unsigned long long | Oded Gabbay | 2015-09-04 | 1 | -1/+1
  round(val*dscale) produces a double result, as val and dscale are double. However, LLVMConstInt receives unsigned long long, so there is an implicit conversion from double to unsigned long long. This is undefined behavior. Therefore, we need to first explicitly convert the round result to long long, and then let the compiler handle the conversion from that to unsigned long long.
  This bug manifests itself on POWER, where all IMM values of -1 are being converted to 0 implicitly, causing wrong LLVM IR output.
  Signed-off-by: Oded Gabbay <[email protected]>
  CC: "10.6 11.0" <[email protected]>
  Reviewed-by: Tom Stellard <[email protected]>
  Reviewed-by: Roland Scheidegger <[email protected]>
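The two-step conversion described above can be sketched in isolation. `imm_to_bits` is a hypothetical helper, not the llvmpipe function that was changed; it only demonstrates the order of the casts.

```cpp
#include <cmath>

// Hypothetical helper mirroring the conversion order from the commit
// message. Converting a negative double directly to unsigned long long
// is undefined behavior. Going through long long first is well defined
// for values in range, and the signed-to-unsigned step then wraps
// modulo 2^64, so -1 becomes the all-ones bit pattern.
inline unsigned long long imm_to_bits(double val, double dscale)
{
   const long long rounded = static_cast<long long>(std::round(val * dscale));
   return static_cast<unsigned long long>(rounded);
}
```

For example, `imm_to_bits(-1.0, 1.0)` yields the all-ones 64-bit value on every platform, instead of a result that depends on how the target happens to handle the out-of-range conversion.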
* nv30: Implement color resolve for msaa | Hans de Goede | 2015-09-04 | 2 | -14/+8
  Note this is not ideal. Since the sifm can only do source sizes up to 1024x1024, we end up using the blitter on nv4x, which is not that fast. And on nv3x we end up using the cpu, which is really slow.
  Cc: "10.6 11.0" <[email protected]>
  Signed-off-by: Hans de Goede <[email protected]>
  Reviewed-by: Ilia Mirkin <[email protected]>