summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* gallium/radeon: disable the shader cache if dumping shadersMarek Olšák2017-03-131-0/+5
| | | | | | otherwise, cached shaders aren't dumped. Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: mark all bound shader buffer ranges as initializedMarek Olšák2017-03-131-0/+3
| | | | | | | | This should prevent cases when a buffer was incorrectly mapped without synchronization just because this wasn't done. Cc: 13.0 17.0 <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* st/mesa: disable the shader cache if dumping shadersMarek Olšák2017-03-131-4/+4
| | | | | | otherwise, cached shaders aren't dumped. Reviewed-by: Timothy Arceri <[email protected]>
* anv: Use vk_outarray in vkGetPhysicalDeviceQueueFamilyPropertiesChad Versace2017-03-131-55/+18
| | | | | | No intended change in behavior. Just a refactor. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Use vk_outarray in vkEnumeratePhysicalDevices (v2)Chad Versace2017-03-131-27/+4
| | | | | | | | | No intended change in behavior. Just a refactor. v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For Jason. Reviewed-by: Jason Ekstrand <[email protected]>
* util/vulkan: Add vk_outarray (v2)Chad Versace2017-03-131-0/+140
| | | | | | | | | | | This is a wrapper for a Vulkan output array. A Vulkan output array is one that follows the convention of the parameters to vkGetPhysicalDeviceQueueFamilyProperties(). v2: Replace vk_outarray_is_incomplete() with vk_outarray_status(). For Jason. Reviewed-by: Jason Ekstrand <[email protected]>
* intel: genxml: prevent missing ; with address fields dwordsLionel Landwerlin2017-03-131-28/+26
| | | | | | | | | | | | | | | | Before this change, the generator could print this kind of things : const uint32_t v0 = __gen_uint(values->ValidBit, 0, 0) | __gen_uint(values->FaultType, 1, 2) | __gen_uint(values->SRCIDofFault, 3, 10) | __gen_uint(values->GTTSEL, 11, 1) | dw[0] = __gen_combine_address(data, &dw[0], values->VirtualAddressofFault, v0); This change fix the trailing '|'. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* gallium/hud: check NULL return from u_upload_allocJulien Isorce2017-03-131-0/+5
| | | | | | | | | | | | | | | | | | | | Fixes the following segmentation fault: signal SIGSEGV: invalid address (fault address: 0x0) frame #0: 0x00007fffe718e117 radeonsi_dri.so hud_draw_background_quad hud_context.c:170 167 168 assert(hud->bg.num_vertices + 4 <= hud->bg.max_num_vertices); 169 -> 170 vertices[num++] = (float) x1; 171 vertices[num++] = (float) y1; 172 173 vertices[num++] = (float) x1; (lldb) bt * frame #0: 0x00007fffe718e117 radeonsi_dri.so`hud_draw_background_quad frame #1: 0x00007fffe718f458 radeonsi_dri.so`hud_draw frame #2: 0x00007fffe712967f radeonsi_dri.so`dri_flush Signed-off-by: Marek Olšák <[email protected]>
* winsys/radeon: check null return from radeon_cs_create_fence in cs_flushJulien Isorce2017-03-131-11/+13
| | | | | | | | | | | | Follow-up of patch: "radeon_cs_create_fence: check null return from radeon_winsys_bo_create" radeon_drm_cs_flush radeon_cs_create_fence radeon_winsys_bo_create Signed-off-by: Julien Isorce <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* winsys/radeon: check null in radeon_cs_create_fenceJulien Isorce2017-03-131-0/+3
| | | | | | | | | | | | | | Fixes the following segmentation fault: radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c -> if (!bo->handle) (gdb) bt 0 radeon_drm_cs_add_buffer (bo=0x0) at radeon_drm_cs.c 1 0x00007fffe73575de in radeon_cs_create_fence radeon_drm_cs.c 2 0x00007fffe7358c48 in radeon_drm_cs_flush radeon_drm_cs.c Signed-off-by: Julien Isorce <[email protected]> Signed-off-by: Marek Olšák <[email protected]>
* vulkan/wsi: include builddir for generated headersJuan A. Suarez Romero2017-03-131-0/+1
| | | | | | | | wayland-drm-client-protocol.h is generated in builddir, so when builddir != srcdir the header is not found, and compilation of wsi_common_wayland.c will fail. Reviewed-by: Emil Velikov <[email protected]>
* anv: Use on-the-fly surface states for dynamic buffer descriptorsJason Ekstrand2017-03-137-245/+86
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We have a performance problem with dynamic buffer descriptors. Because we are currently implementing them by pushing an offset into the shader and adding that offset onto the already existing offset for the UBO/SSBO operation, all UBO/SSBO operations on dynamic descriptors are indirect. The back-end compiler implements indirect pull constant loads using what basically amounts to a texelFetch instruction. For pull constant loads with constant offsets, however, we use an oword block read message which goes through the constant cache and reads a whole cache line at a time. Because of these two things, direct pull constant loads are much faster than indirect pull constant loads. Because all loads from dynamically bound buffers are indirect, the user takes a substantial performance penalty when using this "performance" feature. There are two potential solutions I have seen for this problem. The alternate solution is to continue pushing offsets into the shader but wire things up in the back-end compiler so that we use the oword block read messages anyway. The only reason we can do this because we know a priori that the dynamic offsets are uniform and 16-byte aligned. Unfortunately, thanks to the 16-byte alignment requirement of the oword messages, we can't do some general "if the indirect offset is uniform, use an oword message" sort of thing. This solution, however, is recommended for a few of reasons: 1. Surface states are relatively cheap. We've been using on-the-fly surface state setup for some time in GL and it works well. Also, dynamic offsets with on-the-fly surface state should still be cheaper than allocating new descriptor sets every time you want to change a buffer offset which is really the only requirement of the dynamic offsets feature. 2. This requires substantially less compiler plumbing. Not only can we delete the entire apply_dynamic_offsets pass but we can also avoid having to add architecture for passing dynamic offsets to the back- end compiler in such a way that it can continue using oword messages. 3. We get robust buffer access range-checking for free. Because the offset and range are baked into the surface state, we no longer need to pass ranges around and do bounds-checking in the shader. 4. Once we finally get UBO pushing implemented, it will be much easier to handle pushing chunks of dynamic descriptors if the compiler remains blissfully unaware of dynamic descriptors. This commit improves performance of The Talos Principle on ULTRA settings by around 50% and brings it nicely into line with OpenGL performance. Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Stall before fast-clear operationsJason Ekstrand2017-03-131-6/+19
| | | | | | | | | | | | | | | During initial CCS bring-up, I discovered that you have to do a full CS stall prior to doing a CCS resolve as well as afterwards. It appears that the same is needed for fast-clears as well. This fixes rendering corruptions on The Talos Principle on Sky Lake GT4. The issue hasn't been demonstrated on any other hardware however, given that this appears to be a "too many things in the pipe" problem, having it be easier to reproduce on a system with more EUs makes sense. The issues with resolves is demonstrable on a GT3 or GT2 so this is probably also a problem on all GTs. Reviewed-by: Topi Pohjolainen <[email protected]> Cc: "13.0 17.0" <[email protected]>
* anv: Accurately advertise dynamic descriptor limitsJason Ekstrand2017-03-131-2/+2
| | | | | | | | | | The number of dynamic descriptors is limited by both the number of descriptors and the total number of dynamic things. Because there isn't a single "maximum dynamic things" limit, we need to divide by two so that they can create the maximum of both UBOs and SSBOs. Reviewed-by: Eduardo Lima Mitev <[email protected]> Cc: "17.0 13.0" <[email protected]>
* anv: Add a helper for working with VK_WHOLE_SIZE for buffersJason Ekstrand2017-03-134-11/+28
| | | | Reviewed-by: Plamena Manolova <[email protected]>
* freedreno/ir3: fragz cannot be half precisionRob Clark2017-03-131-0/+6
| | | | Signed-off-by: Rob Clark <[email protected]>
* freedreno/ir3: optimize less in glslRob Clark2017-03-131-1/+1
| | | | | | | | | | | | | | | | | | | Rely on nir for optimization, to reduce compile times. Very minimal impact on shader-db: total instructions in shared programs: 104170 -> 104199 (0.03%) total dwords in shared programs: 209664 -> 209728 (0.03%) total full registers used in shared programs: 7156 -> 7161 (0.07%) total half registers used in shader programs: 109 -> 109 (0.00%) total const registers used in shared programs: 24222 -> 24224 (0.01%) half full const instr dwords helped 12 107 103 112 98 hurt 11 104 105 115 102 But shader db runtime dropped from ~29.3s user to ~20.4s user. Signed-off-by: Rob Clark <[email protected]>
* aubinator/genxml: use gzipped files to store embedded genxmlLionel Landwerlin2017-03-134-18/+66
| | | | | | | | | | | | This reduces the size of the aubinator binary from ~1.4Mb to ~700Kb. With can now drop the checks on xxd in configure. v2: Fix incorrect makefile dependency (Lionel) v3: use $(PYTHON2) (Emil) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* intel: genxml: add script to generate gzipped genxmlLionel Landwerlin2017-03-132-0/+48
| | | | | | | | | | | | | v2 (from Dylan): Add main function Add missing Copyright Use print_function v3: Add actually license (Dylan) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* util/u_thread.h: Include stdint.h for int64_t definition.Jose Fonseca2017-03-131-0/+2
| | | | Fixes MinGW build. Trivial.
* intel: fix compiler buildIago Toral Quiroga2017-03-132-8/+7
| | | | | | | | | compiler/brw_vec4_gs_visitor.cpp:744:39: error: ‘GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES’ was not declared in this scope output_vertex_size_bytes <= GEN7_MAX_GS_OUTPUT_VERTEX_SIZE_BYTES); Fixes: d0d4a5f43b4 ("i965: split EU defines to brw_eu_defines.h") Reviewed-by: Emil Velikov <[email protected]>
* svga: handle P016 format as wellChristian König2017-03-131-0/+1
| | | | | | Fixes: 62cff793785 ("gallium: add P016 format") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100180 Reviewed-by: Emil Velikov <[email protected]>
* configure.ac: require pthread-stubs only where availableEmil Velikov2017-03-131-2/+3
| | | | | | | | | | | | | | | | | | The project is a thing only for BSD platforms. Or in other words - for any other platforms building/installing pthread-stubs results only in a pthread-stub.pc file. And even where it provides a DSO, there's a fundamental design issue with it - see the pthread-stubs mailing list for the specifics. v2: Update comment above the switch statement (Jon Turney). Reviewed-by: Jeremy Huddleston Sequoia <[email protected]> Acked-by: Gary Wong <[email protected]> Tested-by: Eric Engestrom <[email protected]> Acked-by: Randy Fishel <[email protected]> Cc: Niveditha Rau <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* configure.ac: do not require the i965 driver for ANVEmil Velikov2017-03-131-3/+2
| | | | | | | | | | | As of last few commits we have the two split, thus we no longer require the i965 in order to have the ANV driver. Even though ANV does not link against libdrm nor libdrm_intel, we still require those as dependencies due to the headers they provide. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/vulkan: Get rid of recursive makeJason Ekstrand2017-03-139-307/+305
| | | | | | | | v2 [Emil Velikov] - Various fixes and initial stab at the Android build. - Keep the generation rules/EXTRA_DIST outside the conditional Reviewed-by: Jason Ekstrand <[email protected]>
* intel/tools: Use a makefile included from intel/Makefile.amJason Ekstrand2017-03-134-42/+19
| | | | | | Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: whitespace cleanupsEmil Velikov2017-03-132-5/+0
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: link all tests again gtest, even test_eu_compact"Emil Velikov2017-03-132-27/+13
| | | | | | | | | | | | At the moment all the tests but test_eu_compact are actual C++ gtests. To simplify things, we can move the gtest.la to the common TEST_LIBS. As we're here, we can rename change the test extension [to .cpp] to avoid using the confusing dummy.cpp. Add a nice comment in the makefile for posterity. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove i965_symbols_test reference from .gitignoreEmil Velikov2017-03-131-1/+0
| | | | | | | | | | The test/binary was removed back in 2012. With that one gone, we can drop the .gitignore file all together. Cc: Eric Anholt <[email protected]> Fixes: c8850394423 ("i965: Drop the missing symbols link test.") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Move the back-end compiler to src/intel/compilerJason Ekstrand2017-03-13138-270/+306
| | | | | | | | | | | | | | | | | | | | | | Mostly a dummy git mv with a couple of noticable parts: - With the earlier header cleanups, nothing in src/intel depends files from src/mesa/drivers/dri/i965/ - Both Autoconf and Android builds are addressed. Thanks to Mauro and Tapani for the fixups in the latter - brw_util.[ch] is not really compiler specific, so it's moved to i965. v2: - move brw_eu_defines.h instead of brw_defines.h - remove no-longer applicable includes - add missing vulkan/ prefix in the Android build (thanks Tapani) v3: - don't list brw_defines.h in src/intel/Makefile.sources (Jason) - rebase on top of the oa patches [Emil Velikov: commit message, various small fixes througout] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: split EU defines to brw_eu_defines.hEmil Velikov2017-03-1319-1198/+1264
| | | | | | | | | | | | | | | | | | | Split out the EU defines from the 'generic' ones, as the former are more compiler oriented. With a later commit we'll move brw_eu_defines.h alongside the compiler infra to src/intel/. Pulling all the defines in there seems overzealous. Some defines are used by both i965 and the i965 compiler. Those are moved to brw_eu_defines.h, and annotated accordingly. The i965 users were updated to have the extre include to indicate that. With future work we might provide a better, split but for now this seems reasonable. Cc: Kenneth Graunke <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* util/bitscan: use correct signature for ffs/ffsllEmil Velikov2017-03-132-6/+6
| | | | | | | | | | | | | Otherwise we'll get errors such as error: conflicting types for ‘ffs’ error: conflicting types for ‘ffsll’ We might want to improve the heuristics and provide a definition only when a native one is missing. We can address that at a later stage. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: add missing brw_defines.h include in brw_program.cEmil Velikov2017-03-131-0/+1
| | | | | | | | File is using MI_LOAD_REGISTER_IMM, GEN7_CACHE_MODE_1 and others as defined in the header. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: add missing brw_defines.h include in brw_program.cEmil Velikov2017-03-131-0/+1
| | | | | | | File is using the PIPE_CONTROL_* macros as defined in the header. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: add missing #include <assert.h> in brw_inst.hEmil Velikov2017-03-131-0/+1
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: move brw_define.h ifndef guard to the topEmil Velikov2017-03-131-3/+3
| | | | | | Cc: [email protected] Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove unused macros from brw_defines.hEmil Velikov2017-03-131-19/+1
| | | | | | | | | | | The follow three groups are not used by neither the DRI module nor the compiler. BRW_POLYGON_*_FACING BRW_POLYGON_FACING_* BRW_STATELESS_BUFFER_* Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove unused brw_program.h includeEmil Velikov2017-03-137-7/+0
| | | | | | | | | | | | Neither of the changed files requires the brw_program.h include. Since we're about to move them [to src/intel/compiler] with the next commit there's no point in having the include. Let alone the very confusing compiler include directive [-I${top_srcdir}/src/mesa/drivers/dri/i965/] that one would have to use. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove duplicate declaration of brw_mark_surface_usedEmil Velikov2017-03-131-4/+0
| | | | | | | | | Function was made static and moved to another header with earlier commit. Fixes: 760c8a1d950 ("i965: Make mark_surface_used a static inline in brw_compiler.h") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove dead brw_new_shader() declarationEmil Velikov2017-03-131-2/+0
| | | | | | | Cc: Timothy Arceri <[email protected]> Fixes: 194537ebe44 ("mesa/glsl/i965: remove Driver.NewShader()") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove unused brw_cs.h includeEmil Velikov2017-03-131-1/+0
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Stop including brw_context.hJason Ekstrand2017-03-131-1/+1
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* intel/isl: Stop linking libi965_compiler.la into testsJason Ekstrand2017-03-131-1/+0
| | | | | Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan/wsi: Generate wayland protocol headers separately from EGLJason Ekstrand2017-03-136-15/+23
| | | | | | | | | | | | | | | | Previously, we were depending on EGL for generating the headers and providing the protocol symbols. However, since neither Vulkan driver actually wants to link against EGL, this is kind of pointless. It also creates a weird build dependency. v2 [Jason] - Add missing wsi/ prefix, MKDIR_GEN v3 [Emil Velikov] - include BUILT_SOURCES/generation rules outside of conditional Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv/wsi: Don't include wayland headersEmil Velikov2017-03-131-3/+0
| | | | | | | | | Unused and we'll rework the way wayland-drm-client-protocol.h is generated with later commit. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Acked-by: Dave Airlie <[email protected]>
* anv/wsi: Don't include wayland headersJason Ekstrand2017-03-131-3/+0
| | | | | | | | | | | Unused and we'll rework the way wayland-drm-client-protocol.h is generated with later commit. v2 [Emil] - Also remove wayland-client.h Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* configure.ac: provide a fall-back define for WAYLAND_SCANNEREmil Velikov2017-03-131-2/+2
| | | | | | | | | In some cases, we can end up calling WAYLAND_SCANNER even when there's no binary. Do follow the other's approach set by AX_PROG_FLEX/BISON and set the variable to : Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* wayland: move .gitignore where applicableEmil Velikov2017-03-131-0/+0
| | | | | | | | Strictly speaking things work as-is, but let's move the file alongside the artefacts it references. Analogous to all other places in mesa. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/va: add config support for 10bit decoding v2Christian König2017-03-132-4/+21
| | | | | | | | | Advertise 10bpp support if the driver supports decoding to a P016 surface. v2: Advertise 10bpp for the decoder as well. Signed-off-by: Christian König <[email protected]> Signed-off-by: Mark Thompson <[email protected]>
* st/va: add support for allocating 10bpp surfacesChristian König2017-03-131-9/+15
| | | | | | | We support P010 and P016 as targets for 10bpp video decoding. Signed-off-by: Christian König <[email protected]> Reviewed-by: Mark Thompson <[email protected]>