summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* python: Stabilize some script outputsMathieu Bridon2018-07-055-5/+8
| | | | | | | | | | | | In Python, dictionaries and sets are unordered, and as a result their is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by: Eric Engestrom <[email protected]>
* intel: tools: remove drm-uapi definesLionel Landwerlin2018-07-051-29/+1
| | | | | | | We already embed the headers, no need to redefine defines/structs. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: intel_dump_gpu: use simulator id in capturesLionel Landwerlin2018-07-052-2/+2
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: devinfo: add simulator idLionel Landwerlin2018-07-052-4/+48
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: dump-gpu: dump 48-bit addressesScott D Phillips2018-07-052-167/+151
| | | | | | | | | | | | | For gen8+, write out PPGTT tables in aub files so that full 48-bit addresses can be serialized. v2: Fix handling of `end` index in map_ppgtt v3: Correctly mark GGTT entry as present (Rafael) Signed-off-by: Scott D Phillips <[email protected]> Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: import intel_aubdumpLionel Landwerlin2018-07-054-0/+1440
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]>
* intel: tools: update intel_aub.hLionel Landwerlin2018-07-051-0/+26
| | | | | | | Scott added new stuff in IGT. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: batch-decoder: add missing return lineLionel Landwerlin2018-07-051-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: batch-decoder: don't asks for constant BO until decodingLionel Landwerlin2018-07-051-6/+11
| | | | | | | | | | With PPGTT mappings, our aubinator implementation can be quite slow if we request a buffer that doesn't exist. Instead of doing a PPGTT walk for invalid addresses (0 lengths), wait until we're sure we want to decode the data. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/batch-decoder: handle non-contiguous binding table / surface stateScott D Phillips2018-07-051-4/+14
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools/aubinator: aubinate ppgtt aubsScott D Phillips2018-07-051-1/+72
| | | | | | | | | | | | v2: by Lionel Fix memfd_create compilation issue Fix pml4 address stored on 32 instead of 64bits Return no buffer if first ppgtt page is not mapped v3: Drop additional memfd_create() (Rafael) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: handle GGTT mappingsLionel Landwerlin2018-07-051-13/+244
| | | | | | | | | | We use memfd to store physical pages as they get read/written to and the GGTT entries translating virtual address to physical pages. Based on a commit by Scott Phillips. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* util: rb-tree: A simple, invasive, red-black treeJason Ekstrand2018-07-054-0/+694
| | | | | | | | | | | | | | | | | | | | | | | | This is a simple, invasive, liberally licensed red-black tree implementation. It's an invasive data structure similar to the Linux kernel linked-list where the intention is that you embed a rb_node struct the data structure you intend to put into the tree. The implementation is mostly based on the one in "Introduction to Algorithms", third edition, by Cormen, Leiserson, Rivest, and Stein. There were a few other key design points: * It's an invasive data structure similar to the [Linux kernel linked list]. * It uses NULL for leaves instead of a sentinel. This means a few algorithms differ a small bit from the ones in "Introduction to Algorithms". * All search operations are inlined so that the compiler can optimize away the function pointer call. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: aubinator: drop the 1Tb GTT mappingLionel Landwerlin2018-07-051-55/+75
| | | | | | | | | | | | | | | Now that we're softpinning the address of our BOs in anv & i965, the addresses selected start at the top of the addressing space. This is a problem for the current implementation of aubinator which uses only a 40bit mmapped address space. This change keeps track of all the memory writes from the aub file and fetch them on request by the batch decoder. As a result we can get rid of the 1<<40 mmapped address space and only rely on the mmap aub file \o/ Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: rework register writes handlingLionel Landwerlin2018-07-051-28/+54
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: remove standard input processing optionLionel Landwerlin2018-07-051-90/+12
| | | | | | | | | | | On a follow up commit in this series, we stop copying the data from the mmap'ed file into our big gtt mmap, and start referencing data in it directly. So reallocating the read buffer and adding more data from stdin wouldn't work. For that reason, let's stop supporting stdin process. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: remove unused variablesLionel Landwerlin2018-07-051-5/+0
| | | | | | | These memory offsets are stored in the gen_batch_decode_ctx. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* gallium/auxiliary: Fix string matchingMathieu Bridon2018-07-051-1/+1
| | | | | | | | | | | | | | | | | | | | | | Commit f69bc797e15fe6beb9e439009fab55f7fae0b7f9 did the following: - if format.layout in ('bptc', 'astc'): + if format.layout in ('astc'): The intention was to go from matching either 'bptc' or 'astc' to matching only 'astc'. But the new code doesn't respect this intention any more, because in Python `('astc')` is not a tuple containing a string, it is just the string. (the parentheses are simply ignored) That means we now match any substring of 'astc', for example 'a'. This commit fixes the test to respect the original intention. Fixes: f69bc797e15fe6beb9e4 "gallium/auxiliary: Add helper support for bptc format compress/decompress" Reviewed-by: Eric Engestrom <[email protected]>
* radv: optimize vkCmd{Set,Reset}Event() a little bitSamuel Pitoiset2018-07-051-8/+38
| | | | | | | | | | | | Always emitting a bottom-of-pipe event is quite dumb. Instead, start to optimize these functions by syncing PFP for the top-of-pipe and syncing ME for the post-index-fetch event. This can still be improved by emitting EOS events for syncing PS and CS stages. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: optimize radv_CmdWaitEvents()Samuel Pitoiset2018-07-051-43/+60
| | | | | | | | | | | | This introduces radv_barrier() (same as the draw/dispatch codepath). This helper is used for merging the code from CmdWaitEvents() and CmdPipelineBarrier because it's quite similar. We do ignore the source stage mask for CmdWaitEvents because it's irrelevant when event objects are used. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nir/linker: fix msvc buildRoland Scheidegger2018-07-051-1/+1
| | | | | | | | | Empty initializer braces aren't valid c (it's a gnu extension, and it's valid in c++). Hopefully fixes appveyor / msvc build... Fixes 6677e131b806b10754adcb7cf3f427a7fcc2aa09 Reviewed-by: Timothy Arceri <[email protected]>
* r600: compare structure elements instead of doing a memcmpGert Wollny2018-07-051-1/+4
| | | | | | | | | | | | | | | | | | | | | | | | | Structures might be padded by the compiler and these padding bytes remain un-initialized which in turn makes memcmp return a difference where from the logical point of view there is none. Fixes valgrind: Conditional jump or move depends on uninitialised value(s) at 0x4C32CBA: __memcmp_sse4_1 (vg_replace_strmem.c:1099) by 0xB8D2537: r600_set_vertex_buffers (r600_state_common.c:573) by 0xB71D44A: u_vbuf_set_driver_vertex_buffers (u_vbuf.c:1129) by 0xB71F7BB: u_vbuf_draw_vbo (u_vbuf.c:1153) by 0xB3B92CB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3948FD: st_update_array (st_atom_array.c:388) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* r600: Add R4G4B4A4 and A1B5G5R5 to supported vertex formatsGert Wollny2018-07-051-0/+15
| | | | | | | | | | | | | | | | | | | | | | | | | | Below tests would fail with an error message "Vertex format (R4G4B4A4|R5G5B5A1) not supported." Add the formate to the translation routine to enable these formats. Fixes: dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.teximage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgba4_cube dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_2d dEQP-GLES3.functional.texture.specification.texsubimage2d_pbo.rgb5_a1_cube dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.teximage3d_pbo.rgb5_a1_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgba4_3d dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_2d_array dEQP-GLES3.functional.texture.specification.texsubimage3d_pbo.rgb5_a1_3d Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* r600: force LOD range to be only one value when mip.min filter is NONEGert Wollny2018-07-051-1/+9
| | | | | | | | | | | | | | | | For a texture that has only one LOD defined, but for which GL_TEXTURE_MAX_LEVEL is the default (1000) and GL_TEXTURE_MIN_LOD != GL_TEXTURE_MAX_LOD the reading from the texture does not properly resolve the LOD level and texture lookup might fail. Hence, when no mipmap filter is given (indicating that no mip-mapping takes place), force the LOD range to contain only value. Fixes: dEQP-GLES3.functional.shaders.texture_functions.texture*.(i|u)sampler2d* dEQP-GLES3.functional.texture.format.sized.cube.rgb* out of VK_GL_CTS/android/cts/master/gles3-master.txt Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]>
* mesa/st: draw_vbo: initialize restart_index tooGert Wollny2018-07-051-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | restart_index is later always used in a comparison, so it should be initialized properly. Fixes valgrind warning: Conditional jump or move depends on uninitialised value(s) at 0xB8D682F: r600_draw_vbo (r600_state_common.c:2153) by 0xB71F743: u_vbuf_draw_vbo (u_vbuf.c:1156) by 0xB3B92DB: st_draw_vbo (st_draw.c:235) by 0xB36B1AE: vbo_draw_arrays (vbo_exec_array.c:391) by 0xB36BB0D: vbo_exec_DrawArrays (vbo_exec_array.c:550) by 0x10A989: piglit_display (textureSize.c:157) by 0x4F8F174: run_test (piglit_fbo_framework.c:52) by 0x4F7BA12: piglit_gl_test_run (piglit-framework-gl.c:229) by 0x10A60A: main (textureSize.c:71) Uninitialised value was created by a stack allocation at 0xB3B90B0: st_draw_vbo (st_draw.c:143) Signed-off-by: Gert Wollny <[email protected]> Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* mesa: enable ARB_direct_state_access in OpenGL 4.5 compat profileTimothy Arceri2018-07-053-197/+197
| | | | | | | | | Its unlikely anyone will add proper ARB_direct_state_access compat support before we branch 18.2. Enabling the extension in 4.5 at least allows users to make use of MESA_GL_VERSION_OVERRIDE=4.5COMPAT for games like No Mans Sky. Reviewed-by: Marek Olšák <[email protected]>
* util/drirc: turn on force_glsl_extensions_warn for No Mans SkyTimothy Arceri2018-07-051-0/+4
| | | | | | | | | The game forgets to enable multiple extensions in its shaders, one of those extesions is EXT_texture_array. But enabling this config entry fixes at least one other rendering issue that enabling EXT_texture_array on its own doesn't fix. Reviewed-by: Marek Olšák <[email protected]>
* util/queue: remove leftover debug codeMarek Olšák2018-07-041-1/+0
|
* Shorten u_queue namesMarek Olšák2018-07-046-7/+7
| | | | | | | | There is a 15-character limit for thread names shared by the queue name and process name. Shorten the thread name to make space for the process name. Reviewed-by: Timothy Arceri <[email protected]>
* kutil/queue: add a process name into a thread nameMarek Olšák2018-07-042-3/+31
| | | | | | | v2: simplifications Reviewed-by: Timothy Arceri <[email protected]> (v1) Reviewed-by: Eric Engestrom <[email protected]> (v1)
* gallium/os: use util_get_process_name when possibleMarek Olšák2018-07-042-14/+4
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: extract get_process_name from xmlconfig.cMarek Olšák2018-07-045-84/+156
| | | | | Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* ac: fold LLVMContext creation into ac_llvm_context_initMarek Olšák2018-07-044-14/+9
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: reorder code in si_llvm_context_initMarek Olšák2018-07-041-13/+13
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: use ac_compile_module_to_binary to reduce compile timesMarek Olšák2018-07-042-31/+4
| | | | | | | Compile times of simple shaders are reduced by ~20%. Compile times of prologs and epilogs are reduced by up to 40%. Reviewed-by: Dave Airlie <[email protected]>
* ac: add reusable helpers for direct LLVM compilationMarek Olšák2018-07-043-4/+76
| | | | | | | | | | | | | | | This is basically LLVMTargetMachineEmitToMemoryBuffer inlined and reworked. struct ac_compiler_passes (opaque type) contains the main pass manager. ac_create_llvm_passes -- the result can go to thread local storage ac_destroy_llvm_passes -- can be called by a destructor in TLS ac_compile_module_to_binary -- from LLVMModuleRef to ac_shader_binary The motivation is to do the expensive call addPassesToEmitFile once per context or thread. Reviewed-by: Dave Airlie <[email protected]>
* nvc0: implement multisampled images on Maxwell+Rhys Perry2018-07-046-39/+48
| | | | | | | | | | Changes in v2: - make loadSuInfo32() protected without making the rest protected - move NVC0_SU_INFO_* into nv50_ir_lowering_nvc0.h instead of duplicating NVC0_SU_INFO_MS Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* i965: Fix output register sizes when variable ranges are interleavedNeil Roberts2018-07-041-7/+18
| | | | | | | | | | | | | | | | | | | | | | | | | | | | In 6f5abf31466aed this code was fixed to calculate the maximum size of an attribute in a seperate pass and then allocate the registers to that size. However this wasn’t taking into account ranges that overlap but don’t have the same starting location. For example: layout(location = 0, component = 0) out float a[4]; layout(location = 2, component = 1) out float b[4]; Previously, if ‘a’ was processed first then it would allocate a register of size 4 for location 0 and it wouldn’t allocate another register for location 2 because it would already be covered by the range of 0. Then if something tries to write to b[2] it would try to write past the end of the register allocated for ‘a’ and it would hit an assert. This patch changes it to scan for any overlapping ranges that start within each range to calculate the maximum extent and allocate that instead. Fixed Piglit’s arb_enhanced_layouts/execution/component-layout/ vs-fs-array-interleave-range.shader_test Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Fixes: 6f5abf31466 "i965: Fix output register sizes when multiple variables share a slot."
* r600/sb: cleanup if_conversion iterator to be legal C++Dave Airlie2018-07-041-7/+4
| | | | | | | | | | | | | | The current code causes: /usr/include/c++/8/debug/safe_iterator.h:207: Error: attempt to copy from a singular iterator. This is due to the iterators getting invalidated, fix the reverse iterator to use the return value from erase, and cast it properly. (used Mathias suggestion) Cc: <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* radeonsi: fix compiler breakageMarek Olšák2018-07-041-0/+1
| | | | Broken by d853d3a59bd5f8720a5b021bcd64a193d370b623.
* ac: make some fns staticDave Airlie2018-07-042-13/+6
| | | | | | | Some of the compiler functions are no longer called outside the util file. Reviewed-by: Marek Olšák <[email protected]>
* ac/radv: move llvm compiler info to struct and init in one placeDave Airlie2018-07-046-34/+31
| | | | | | | | | | | | This ports radv to the shared code, however due to a bug in LLVM version prior to 7, radv cannot add target info at this stage, as it would leak one for every shader compile, however I'd prefer to keep this llvm damage in the shared code, since it isn't the driver at fault here. We just add a flag to denote if the driver can support leaking the target info or not, and the common code does the right thing depending on the llvm version. Reviewed-by: Marek Olšák <[email protected]>
* ac/radeonsi: port compiler init/destroy out of radeonsi.Dave Airlie2018-07-043-25/+51
| | | | | | | | | | We want to share this code with radv in the future, so port it out of radeonsi. Add a return value as radv will want that to know if this succeeds Reviewed-by: Marek Olšák <[email protected]>
* radv/radeonsi: add a check ir tm optionsDave Airlie2018-07-043-3/+7
| | | | | | | This doesn't do much yet, but it makes it easier to move the code to a common shared code base. Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: rename si_compiler -> ac_llvm_compilerDave Airlie2018-07-049-36/+37
| | | | | | | | As precursor to moving init to common code, just rename the struct and move it. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* ac: add target library info helpersDave Airlie2018-07-042-0/+15
| | | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: create/destroy passmgr at the higher level.Dave Airlie2018-07-043-10/+14
| | | | | | | This is prep work for moving this to a per-thread struct Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* radv: port to use common passmgr code.Dave Airlie2018-07-042-25/+6
| | | | | | | | This adds a inline always pass, but otherwise should work the same. Reviewed-by: Marek Olšák <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* ac/radeonsi: refactor out pass manager init to common code.Dave Airlie2018-07-043-25/+34
| | | | Reviewed-by: Marek Olšák <[email protected]>
* radv: drop copy of ac_create_target_machine.Dave Airlie2018-07-041-31/+1
| | | | | | Once we split the init once stuff out, this can be shared again. Reviewed-by: Marek Olšák <[email protected]>