path: root/src/gallium
Commit message | Author | Age | Files | Lines
* etnaviv: replace translate_clear_color with util_pack_color | Lucas Stach | 2017-06-16 | 2 | -48/+12
  This replaces the open-coded etnaviv version of the color pack with the common util_pack_color.
  Fixes piglits: arb_color_buffer_float-clear, fcc-front-buffer-distraction, fbo-clearmipmap
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: remove bogus assert | Lucas Stach | 2017-06-16 | 1 | -2/+0
  etna_resource_copy_region handles resources with multiple samples by falling back to the software path. There is no need to kill the application there.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: use padded width/height for resource copies | Lucas Stach | 2017-06-16 | 1 | -2/+2
  When copying a resource fully we can just blit the whole level. This allows using the RS even for level sizes not aligned to the RS min alignment. This is especially useful, as etna_copy_resource is part of the software fallback paths (used in etna_transfer) that are used for doing unaligned copies.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* etnaviv: don't try RS blit if blit region is unaligned | Lucas Stach | 2017-06-16 | 1 | -1/+2
  If the blit region is not aligned to the RS min alignment, don't try to execute the blit, but fall back to the software path.
  Fixes: c9e8b49b ("etnaviv: gallium driver for Vivante GPUs")
  Cc: [email protected]
  Signed-off-by: Lucas Stach <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
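The alignment test described in the commit above can be sketched as follows. This is an illustration only, not the etnaviv code: `struct blit_box`, `is_aligned` and `blit_box_is_aligned` are hypothetical names, the alignment values are supplied by the caller, and the alignment is assumed to be a power of two.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical blit rectangle; not the driver's own type. */
struct blit_box {
   unsigned x, y, width, height;
};

/* True if v is a multiple of align. Assumes align is a power of two. */
static bool
is_aligned(unsigned v, unsigned align)
{
   assert(align != 0 && (align & (align - 1)) == 0);
   return (v & (align - 1)) == 0;
}

/* True if the origin and the size of the box all meet the given alignment.
 * A caller would take the hardware path only when this returns true and
 * use the software fallback otherwise. */
static bool
blit_box_is_aligned(const struct blit_box *box,
                    unsigned align_w, unsigned align_h)
{
   return is_aligned(box->x, align_w) && is_aligned(box->width, align_w) &&
          is_aligned(box->y, align_h) && is_aligned(box->height, align_h);
}
```

Checking before issuing the blit, rather than asserting, is what lets an unaligned request degrade to the slower path instead of aborting.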
* r600g,compute: provide local copy of functions from ac_binary.c | Jan Vesely | 2017-06-16 | 7 | -46/+199
  This is a verbatim copy of the code. The functions can be cleaned up since r600 does not use all the stuff that gcn does. The symbol names have been changed since we still use the ac_binary.h header (for struct definition).
  v2: Add ifdef guard around r600_binary_clean call (Aaron)
      Remove stray comment
  Signed-off-by: Jan Vesely <[email protected]>
  Tested-By: Aaron Watry <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* r600: android: amdgpu_common is only required when building OpenCL | Jan Vesely | 2017-06-16 | 1 | -5/+0
  v2: split off Android changes
  Signed-off-by: Jan Vesely <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* svga: Relax the format checks for copy_region_vgpu10 somewhat | Thomas Hellstrom | 2017-06-16 | 1 | -2/+26
  The new generic checks were actually more restrictive than the previous svga-specific tests and not vice versa. So bypass the common format checks for copy_region_vgpu10.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Neha Bhende <[email protected]>
* svga: Fix incorrect format conversion blit destination | Thomas Hellstrom | 2017-06-16 | 1 | -1/+3
  The blit.dst.resource member that was used as destination was modified earlier in the function, effectively making us try to blit the content onto itself. Fix this and also add a debug printout when the format conversion blits fail.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Neha Bhende <[email protected]>
* svga: Fix srgb copy_region regression | Thomas Hellstrom | 2017-06-16 | 1 | -1/+4
  This fixes a tf2 srgb copy_region regression from "svga: Rework the blit and resource_copy_region functionality v3".
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
* svga: Prefer accelerated blits over cpu copy region | Thomas Hellstrom | 2017-06-16 | 1 | -5/+3
  This reduces the number of cpu copy_region fallbacks on an Nvidia system running the piglit command
    ./publish/bin/piglit run -1 -t copy -t blit tests/quick
  from 64789 to 780.
  Previously this caused a regression in piglit test spec@!opengl [email protected], but I'm currently not able to reproduce that regression.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Support accelerated conditional blitting | Thomas Hellstrom | 2017-06-16 | 4 | -43/+62
  The blitter has functions to save and restore the conditional rendering state, but we currently don't save the needed info. Since the copy_region_vgpu10 path also supports conditional blitting, we instead use the same function as the clearing routines and move that function to svga_pipe_query.c.
  Note that we still haven't implemented conditional blitting with the software fallbacks.
  Fixes piglit nv_conditional_render::copyteximage
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Use utility functions to help determine whether we can use copy_region | Thomas Hellstrom | 2017-06-16 | 1 | -6/+3
  It seems like the SVGA tests are in general more stringent than the utility tests, but they also miss some blitter features like filters and window rectangles, and if new blitter features are added in the future, we might forget to add tests for those. So in addition to the SVGA tests, use the utility tests to restrict the situations where we can use copy_region.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: Rework the blit and resource_copy_region functionality v3 | Thomas Hellstrom | 2017-06-16 | 1 | -201/+445
  This work was initially triggered by the fact that imported surfaces may be backed by other SVGA3D formats than the default. Therefore some fixes were needed to avoid using the copy_region_vgpu10() functionality for incompatible SVGA3D formats where the pipe formats were OK. This situation happens when using dri3.
  Also in some situations, for example where a R8G8_UNORM surface is backed by an SVGA3D_NV12 format, we can't use the copy_region functionality at all and thus need to fall back to the quad blitter also for the resource_copy_region function. This situation doesn't happen currently, but will if we start using video textures.
  The patch makes the blit- and copy_region paths similar and the decision whether to use a certain gpu command should now be easy to locate. Probably the resource_copy_region path will suffer from a minor additional cpu overhead, but on the other hand there are more cases now that we accelerate, since we try harder before falling back to cpu copies / blits.
  v2: Addressed review comments and fixed up piglit failures by sometimes preferring cpu_copy_region() over blit().
  v3: Removed a stray test statement. Updated commit message.
  Signed-off-by: Thomas Hellstrom <[email protected]>
  Reviewed-by: Brian Paul <[email protected]>
  Reviewed-by: Charmaine Lee <[email protected]>
* draw: check for line_width != 1.0f in validate_pipeline() | Brian Paul | 2017-06-15 | 1 | -3/+4
  We shouldn't use the wide line stage if the line width is 1. This check isn't strictly needed because all drivers are (now) specifying a line wide threshold of at least 1.0 pixels, but let's play it safe.
  Reviewed-by: Charmaine Lee <[email protected]>
* svga: clamp device line width to at least 1 to fix HWv8 line stippling | Brian Paul | 2017-06-15 | 1 | -4/+4
  The line stipple fallback code for virtual HW version 8 didn't work. With HW version 8, we were getting zero when querying the max line widths (AA and non-AA). This means we were setting the draw module's wide line threshold to zero. This caused the wide line stage to always get enabled. That caused the line stipple module to fail because the wide line stage was clobbering the rasterization state with a state object setting the line stipple pattern to 0xffff.
  Now the wide_lines variable in draw's validate_pipeline() will not be incorrectly set.
  Also improve debug output.
  BTW, this also fixes several other piglit tests: polygon-mode, primitive-restart-draw-mode, and line-flat-clip-color since they all use the draw module fallback.
  See VMware bug 1895811.
  Reviewed-by: Charmaine Lee <[email protected]>
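The chain of cause and effect in the two line-width commits above can be sketched as below. This is a simplified model, not the svga or draw module source: `clamp_max_line_width` and `needs_wide_line_stage` are hypothetical names, and the exact comparison used in validate_pipeline() is an assumption.

```c
#include <assert.h>
#include <stdbool.h>

/* Driver side: never report a maximum line width below 1 pixel, even if
 * the device query returned zero. */
static float
clamp_max_line_width(float queried_max)
{
   return queried_max < 1.0f ? 1.0f : queried_max;
}

/* Draw side: a line of width exactly 1 never needs the wide line stage,
 * whatever threshold the driver supplied. */
static bool
needs_wide_line_stage(float line_width, float wide_line_threshold)
{
   assert(wide_line_threshold >= 0.0f);
   return line_width != 1.0f && line_width > wide_line_threshold;
}
```

With a threshold of zero and only the `line_width > wide_line_threshold` test, a 1-pixel line would select the wide line stage; either fix alone is enough to avoid that in this model.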
* draw: whitespace and formatting fixes | Brian Paul | 2017-06-15 | 2 | -60/+58
  Trivial.
* gallium: Add renderonly-based support for pl111+vc4. | Eric Anholt | 2017-06-15 | 22 | -4/+320
  This follows the model of imx (display) and etnaviv (render): pl111 is a display-only device, so when asked to do GL for it, we see if we have a vc4 renderer, make the vc4 screen, and have vc4 call back to pl111 to do scanout allocations.
  The difference from etnaviv is that we share the same BO between vc4 and pl111, rather than having a vc4 bo and a pl111 bo and copies between the two. The only mismatch between their requirements is that vc4 requires 4-pixel (at 32bpp) stride alignment, while pl111 requires that stride match width. The kernel will reject any modesets to an incorrect stride, so the 3D driver doesn't need to worry about that.
  v2: Rebase on Android rework, drop unused include.
  v3: Fix another Android bug, from Rob Herring's build-testing.
  Reviewed-by: Christian Gmeiner <[email protected]>
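The stride mismatch mentioned in the commit above can be made concrete with a little arithmetic. The sketch below assumes a 32bpp format (4 bytes per pixel); the function names are hypothetical and nothing here is taken from the vc4 or pl111 code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BYTES_PER_PIXEL     4u  /* 32bpp, as in the commit message */
#define STRIDE_ALIGN_PIXELS 4u  /* renderer's stride alignment, per the commit message */

/* Stride when it must match the width exactly (the display requirement). */
static uint32_t
packed_stride(uint32_t width)
{
   return width * BYTES_PER_PIXEL;
}

/* Stride when the width is first rounded up to the alignment
 * (the renderer requirement). */
static uint32_t
aligned_stride(uint32_t width)
{
   uint32_t padded = (width + STRIDE_ALIGN_PIXELS - 1) / STRIDE_ALIGN_PIXELS *
                     STRIDE_ALIGN_PIXELS;
   return padded * BYTES_PER_PIXEL;
}

/* A shared buffer satisfies both requirements only when the two agree. */
static bool
strides_agree(uint32_t width)
{
   assert(width > 0);
   return packed_stride(width) == aligned_stride(width);
}
```

In this model the two requirements only conflict for widths that are not a multiple of 4 pixels, which is the case the commit leaves to the kernel's modeset check.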
* etnaviv: Only use renderonly_get_handle for GEM handles. | Eric Anholt | 2017-06-15 | 2 | -1/+3
  Note that for requests for Prime FDs or flink names, we return handles to the etnaviv BO, not the scanout BO. This is at least better than the previous behavior of returning GEM handles for a request for an FD or flink name.
  And add an assert that renderonly_get_handle is only used for getting the GEM handle.
  Signed-off-by: Eric Anholt <[email protected]>
  Reviewed-by: Christian Gmeiner <[email protected]>
* android: r600/eg: add support for tracing IBs after a hang. | Mauro Rossi | 2017-06-15 | 1 | -0/+10
  The rules to generate egd_tables.h are added in Android makefile.
  Fixes: f42fb00 "r600/eg: add support for tracing IBs after a hang."
  Reviewed-by: Emil Velikov <[email protected]>
* svga: fix git_sha1.h include path in Android.mk (v3) | Mauro Rossi | 2017-06-15 | 1 | -0/+2
  Adds libmesa_git_sha1 static (dummy) library to generate git_sha1.h with some polishing to header dependency on .git/HEAD and scripted rules.
  The now redundant generation rules are removed from Android.gen.mk.
  libmesa_git_sha1 whole static dependency is added to libmesa_pipe_svga, libmesa_dricore and libmesa_st_mesa modules.
  Fixes the following build error:
    external/mesa/src/gallium/drivers/svga/svga_screen.c:26:10: fatal error: 'git_sha1.h' file not found
    1 error generated.
  Fixes: 1ce3a27 ("svga: Add the ability to log messages to vmware.log on the host.")
  Reviewed-by: Emil Velikov <[email protected]>
* gallium/radeon: fix initialization of new resource bindless fields | Samuel Pitoiset | 2017-06-15 | 1 | -0/+2
  r600_resource objects are not calloc'd.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* gallium/util: Break recursion in pipe_resource_reference | Michel Dänzer | 2017-06-15 | 1 | -2/+8
  It calling itself recursively prevented it from being inlined, resulting in a copy being generated in every compilation unit referencing it. This bloated the text segment of the Gallium mega-driver *_dri.so by ~4%, and might also have impacted performance.
  Fixes: ecd6fce2611e ("mesa/st: support lowering multi-planar YUV")
  v2: * Add comment above pipe_resource_next_reference [Samuel Pitoiset]
  v3: * Use loop to unreference the full chain of resources referenced via the next members [Timothy Arceri]
  v4: * Stop chasing ->next chain at the first sub-resource which isn't destroyed [Nicolai Hähnle]
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
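The loop described in the v3 and v4 notes above can be sketched with a toy reference-counted chain. This is not the Gallium implementation: `struct resource`, the plain integer refcount and every function name below are hypothetical stand-ins chosen to show the control flow only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy resource with a refcount and a link to a sub-resource. */
struct resource {
   int refcount;
   struct resource *next;
};

static int destroyed_count; /* bookkeeping for the example only */

static struct resource *
resource_create(struct resource *next)
{
   struct resource *res = malloc(sizeof(*res));
   assert(res != NULL);
   res->refcount = 1;
   res->next = next;
   return res;
}

static void
resource_destroy(struct resource *res)
{
   destroyed_count++;
   free(res);
}

/* Drop one reference along the chain without recursion. The walk stops at
 * the first resource that is still referenced elsewhere, because its own
 * sub-resources must stay alive with it. */
static void
resource_release_chain(struct resource *res)
{
   while (res != NULL && --res->refcount == 0) {
      struct resource *next = res->next;
      resource_destroy(res);
      res = next;
   }
}
```

Because the function no longer calls itself, a compiler is free to inline it at each call site, which is the property the commit message is after.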
* radeon/winsys: Limit max allocation size to 70% of VRAM | Aaron Watry | 2017-06-14 | 1 | -0/+2
  The CL CTS queries the max allocation size, and then attempts to allocate buffers of that size. If not enough contiguous RAM/VRAM is available, this causes errors in the radeon kernel module due to inability to allocate the required memory.
  It's a bit of a hack, but experimentally on my system, I can use ~3/4 of the card's VRAM for a single global/constant buffer allocation given current GUI/compositor use.
  For a 1GB Pitcairn (HD7850) this gets me from the reported clinfo values of:
    Global memory size         2143076352 (1.996GiB)
    Max memory allocation      1500153446 (1.397GiB)
    Max constant buffer size   1500153446 (1.397GiB)
  To:
    Global memory size         2143076352 (1.996GiB)
    Max memory allocation      751619276 (716MiB)
    Max constant buffer size   751619276 (716MiB)
  Fixes: OpenCL CTS test/conformance/api/min_max_mem_alloc_size, OpenCL CTS test/conformance/api/min_max_constant_buffer_size
  Signed-off-by: Aaron Watry <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
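The figures quoted in the commit above can be checked with integer arithmetic. The sketch below is not the winsys code; `seventy_percent` and `cap_max_alloc` are hypothetical names, and reading "1GB" as 1073741824 bytes is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* 70% of a byte count, rounded down. */
static uint64_t
seventy_percent(uint64_t bytes)
{
   assert(bytes <= UINT64_MAX / 7); /* guard against overflow in bytes * 7 */
   return bytes * 7 / 10;
}

/* Cap an existing maximum allocation size at 70% of VRAM. */
static uint64_t
cap_max_alloc(uint64_t current_max, uint64_t vram_bytes)
{
   uint64_t limit = seventy_percent(vram_bytes);
   return current_max < limit ? current_max : limit;
}
```

Under that assumption, 70% of 1073741824 bytes is 751619276 bytes, matching the new "Max memory allocation" value, and the old value of 1500153446 is 70% of the 2143076352-byte global memory size.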
* tgsi/scan: add missing 'static' to tgsi_is_bindless_image_file() | Samuel Pitoiset | 2017-06-14 | 1 | -1/+1
  This should fix compilation errors in some situations.
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101418
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: enable ARB_bindless_texture | Samuel Pitoiset | 2017-06-14 | 1 | -1/+3
  This has only been tested on RX480.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add support for loading bindless images | Samuel Pitoiset | 2017-06-14 | 1 | -7/+21
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add support for loading bindless samplers | Samuel Pitoiset | 2017-06-14 | 1 | -3/+12
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: invalidate buffers which are made resident if needed | Samuel Pitoiset | 2017-06-14 | 1 | -0/+34
  When a buffer becomes resident, check if it has been invalidated; if so, update the descriptor and the dirty flag.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: upload new descriptors when resident buffers are invalidated | Samuel Pitoiset | 2017-06-14 | 3 | -0/+152
  When texture buffers are invalidated, the addr in the resident descriptor has to be updated, but we can't create a new descriptor because the resident handle has to be the same. Instead, use the WRITE_DATA packet, which allows updating memory directly, but graphics/compute have to be idle in case the GPU is reading the descriptor.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only decompress resident textures/images when used | Samuel Pitoiset | 2017-06-14 | 1 | -2/+11
  When the currently bound shaders don't use any bindless textures or images, it's useless to decompress the resident resources.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: track use of bindless samplers/images from tgsi_shader_info | Samuel Pitoiset | 2017-06-14 | 5 | -5/+46
  This adds some new helper functions to know if the current draw call (or dispatch compute) is using bindless samplers/images, based on TGSI analysis.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: decompress resident textures/images before graphics/compute | Samuel Pitoiset | 2017-06-14 | 3 | -0/+114
  Similar to the existing decompression code path except that it loops over the list of resident textures/images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: decompress DCC for resident textures/images | Samuel Pitoiset | 2017-06-14 | 2 | -0/+83
  Analogous to bound textures/images. We should also update the resident descriptors and disable COMPRESSION_EN to avoid useless DCC fetches, but I postpone this optimization for a separate series.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: only add descriptors in presence of resident handles | Samuel Pitoiset | 2017-06-14 | 1 | -0/+6
  This won't help much except for applications that use a ton of resident handles. Though, this will reduce the winsys overhead a little bit.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add all resident buffers to the current CS | Samuel Pitoiset | 2017-06-14 | 3 | -0/+52
  Resident buffers have to be added to every new command stream. Though, this could be slightly improved when current shaders don't use any bindless textures/images but usually applications tend to use bindless for almost every draw call, and the winsys thread might help when buffers are added early.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: implement ARB_bindless_texture | Samuel Pitoiset | 2017-06-14 | 3 | -0/+285
  This implements the Gallium interface. Decompression of resident textures/images will follow in the next patches.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add a slab allocator for bindless descriptors | Samuel Pitoiset | 2017-06-14 | 4 | -0/+119
  For each texture/image handle, we need to allocate a new buffer for the bindless descriptor. But when the number of buffers added to the current CS becomes high, the overhead in the winsys (and in the kernel) is significant.
  To reduce this bottleneck, the idea is to suballocate the bindless descriptors using a slab similar to the one used in the winsys.
  Currently, a buffer can hold 1024 bindless descriptors but this limit is arbitrary and could be changed in the future.
  Once a slab is allocated the "base" buffer is added to a per-context list.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
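The saving from suballocation described above comes down to simple counting. The sketch below only models the bookkeeping and is not the radeonsi allocator: the function names are hypothetical, and assigning handles to slabs in sequential order is an assumption made for illustration.

```c
#include <assert.h>

#define DESCRIPTORS_PER_SLAB 1024u /* figure taken from the commit message */

/* One backing buffer per handle: the scheme the commit moves away from. */
static unsigned
buffers_without_slabs(unsigned num_handles)
{
   return num_handles;
}

/* One backing buffer per slab of 1024 descriptors, rounded up. */
static unsigned
buffers_with_slabs(unsigned num_handles)
{
   return (num_handles + DESCRIPTORS_PER_SLAB - 1) / DESCRIPTORS_PER_SLAB;
}

/* Which slab, and which slot inside it, a sequentially numbered handle
 * would land in under this simplified model. */
static unsigned
slab_of_handle(unsigned handle_index)
{
   return handle_index / DESCRIPTORS_PER_SLAB;
}

static unsigned
slot_of_handle(unsigned handle_index)
{
   return handle_index % DESCRIPTORS_PER_SLAB;
}
```

In this model 5000 resident handles need 5 base buffers in the per-context list instead of 5000 separate buffers added to each command stream.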
* radeonsi: add si_set_shader_image_desc() helper | Samuel Pitoiset | 2017-06-14 | 1 | -32/+47
  To share some common code between bound and bindless images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_set_sampler_view_desc() helper | Samuel Pitoiset | 2017-06-14 | 1 | -43/+52
  To share some common code between bound and bindless textures.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_init_descriptor_list() helper | Samuel Pitoiset | 2017-06-14 | 1 | -0/+15
  This will be used in order to initialize resident descriptors for bindless textures/images.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi/scan: record bindless samplers/images usage | Samuel Pitoiset | 2017-06-14 | 2 | -0/+49
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* tgsi/ureg: accept TGSI_FILE_{CONSTANT,INPUT} for dst registers | Samuel Pitoiset | 2017-06-14 | 1 | -2/+0
  For example, TGSI_OPCODE_STORE for bindless images might use a constant buffer or a shader input.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* tc: add ARB_bindless_texture support | Samuel Pitoiset | 2017-06-14 | 3 | -1/+133
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* trace: add ARB_bindless_texture support | Samuel Pitoiset | 2017-06-14 | 1 | -0/+112
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* ddebug: add ARB_bindless_texture support | Samuel Pitoiset | 2017-06-14 | 1 | -0/+60
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add ARB_bindless_texture interface | Samuel Pitoiset | 2017-06-14 | 2 | -0/+79
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* gallium: add PIPE_CAP_BINDLESS_TEXTURE | Samuel Pitoiset | 2017-06-14 | 17 | -0/+18
  Whether bindless texture operations are supported by the underlying driver.
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
  Reviewed-by: Nicolai Hähnle <[email protected]>
* clover/device: Get device/host unified memory from pipe driver | Aaron Watry | 2017-06-13 | 3 | -1/+7
  clinfo no longer reports my discrete GCN card as unified memory.
  Signed-off-by: Aaron Watry <[email protected]>
  Reviewed-by: Francisco Jerez <[email protected]>
* gallium/radeon: Include the family name in the renderer string if it's not equal to the marketing name. | Henri Verbeet | 2017-06-13 | 1 | -14/+18
  The "family" name is often more informative than the "marketing" name. More importantly, applications, like for example Wine, may recognise GPUs based on the existing "family" names.
  Reviewed-by: Marek Olšák <[email protected]>
  Signed-off-by: Henri Verbeet <[email protected]>
* gallium/docs: clarify TGSI_SEMANTIC_SAMPLEMASK, again | Brian Paul | 2017-06-13 | 1 | -4/+11
  I've since discovered the fragment shader sample mask system value (which corresponds to gl_SampleMaskIn).
  v2: It's a system value, not a shader input.
  Reviewed-by: Nicolai Hähnle <[email protected]>