summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* vbo: Move vbo_bind_arrays into a dd_driver_functions draw callback.Mathias Fröhlich2018-03-224-46/+27
| | | | | | | | | | | | Factor out that common call into the almost single place. Remove the _mesa_set_drawing_arrays call from vbo_{exec,save}_draw code paths as the function is now called through vbo_bind_arrays. Prepare updating the list of struct gl_vertex_array entries via calling _vbo_update_inputs for being pushed into those drivers that finally work on that long list of gl_vertex_array pointers. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* mesa: Move vbo draw functions into dd_function_table.Mathias Fröhlich2018-03-229-48/+162
| | | | | | | | Move vbo draw functions into struct dd_function_table. For now just wrap the underlying vbo functions. Reviewed-by: Brian Paul <[email protected]> Signed-off-by: Mathias Fröhlich <[email protected]>
* clover/llvm: Fix build against LLVM/Clang 4.0Aaron Watry2018-03-212-1/+3
| | | | | | | | The opencl 1.0 langstandard was renamed in 5.0+ v2: Move preprocessor check into compat.hpp Reviewed-by: Francisco Jerez <[email protected]>
* ac/nir_to_llvm: add frexp supportTimothy Arceri2018-03-221-0/+11
| | | | | | | | | | | | | | Fixes CTS tests: KHR-GL40.gpu_shader_fp64.builtin.frexp_double KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec2 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec3 KHR-GL40.gpu_shader_fp64.builtin.frexp_dvec4 And piglit test: tests/spec/arb_gpu_shader_fp64/execution/built-in-functions/fs-frexp-dvec4.shader_test Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* nir: add frexp_exp and frexp_sig opcodesTimothy Arceri2018-03-222-0/+5
| | | | Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* anv/pipeline: don't pass constant view index in multiviewCaio Marcelo de Oliveira Filho2018-03-211-6/+11
| | | | | | | | | | | If view mask has only one bit set, view index is effectively a constant, so doesn't need to be passed to the next stages, just always set it. Part of this was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pipeline: use less instructions for multiviewCaio Marcelo de Oliveira Filho2018-03-211-1/+1
| | | | | | | | | | | | | | | The view_index is encoded in the remainder of dividing instance id by the number of views in the view mask (n). In the general case (handled by the else clause), there is a need to map from 0..n-1 into the number of the view being masked. For that a map is encoded. In the case only the first n bits in the mask are set, the mapping is trivial, 0..n-1 already represent what view is being referred to. That case was in the original patch that added anv_nir_lower_multiview.c but disabled. Reviewed-by: Jason Ekstrand <[email protected]>
* broadcom/vc5: Fix up the NIR types of FS outputs generated by NIR-to-TGSI.Eric Anholt2018-03-214-0/+54
| | | | | | | | | | Unfortunately TGSI doesn't record the type of the FS output like GLSL does, but VC5's TLB writes depend on the output's base type. Just record the type in the key at variant compile time when we've got a TGSI input and then fix it up. Fixes KHR-GLES3.packed_pixels.pbo_rectangle.rgba32i/ui and apparently a GPU hang that breaks most tests that come after it.
* spirv: Add a 64-bit implementation of FrexpNeil Roberts2018-03-211-4/+56
| | | | | | | | | | The implementation is inspired by lower_instructions_visitor::dfrexp_sig_to_arith. This has been tested against the arb_gpu_shader_fp64/fs-frexp-dvec4 test using the ARB_gl_spirv branch. Reviewed-by: Jason Ekstrand <[email protected]>
* aubinator_error_decode: Compare only the class_name of the ring.Rafael Antognolli2018-03-211-1/+1
| | | | | | | | | | | ring_name is "<class_name> + <instance_id>" (e.g. rcs0). So we need to first compare the class name only, then get the instance id. Without this, INSTDONE is not being decoded. Signed-off-by: Rafael Antognolli <[email protected]> Cc: Chris Wilson <[email protected]> Reviewed-by: Chris Wilson <[email protected]>
* nir: Migrate nir_dce to instr worklistThomas Helland2018-03-211-35/+18
| | | | | | | | | Shader-db runtime change avarage of five runs: Before 125,77 seconds (+/- 0,09%) After 124,48 seconds (+/- 0,07%) Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>
* nir: Initial implementation of a nir_instr_worklistThomas Helland2018-03-211-0/+67
| | | | | | | | | | Make a simple worklist by basically just wrapping u_vector. This is intended used in nir_opt_dce to reduce the number of calls to ralloc, as we are currenlty spamming ralloc quite bad. It should also give better cache locality and much lower memory usage. Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de> Reviewed-by: Eric Anholt <eric at anholt.net>
* intel/tools: aubinator: Catch gen11 "enhanced execlist" submissionScott D Phillips2018-03-211-6/+20
| | | | | | | | | Different registers are used for execlist submission in gen11, so also watch those. This code only watches element zero of the submit queue, which is all aubdump currently writes. Tested-by: Rafael Antognolli <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* radeonsi: fix a snprintf warning on gcc 7.3.0Marek Olšák2018-03-211-1/+1
|
* radeonsi/gfx9: print the swizzle mode for testdmaMarek Olšák2018-03-211-2/+16
| | | | Tested-by: Dieter Nützel <[email protected]>
* ac/surface: compute tile swizzle for GFX9Marek Olšák2018-03-214-3/+91
| | | | Tested-by: Dieter Nützel <[email protected]>
* broadcom/vc5: Don't skip job submit just because everything is scissored.Eric Anholt2018-03-212-10/+7
| | | | | | | | The coordinate shaders may now have side effects in the form of transform feedback. Part of fixing GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_misc
* broadcom/vc5: Handle sparsely populated SO target array.Eric Anholt2018-03-211-7/+14
| | | | | Fixes GTF-GLES3.gtf.GL3Tests.transform_feedback.transform_feedback_state_variables
* broadcom/vc5: Fix 3D miplevel limit to match other texture targets.Eric Anholt2018-03-211-2/+1
| | | | | | Fixes segfault in GTF-GLES3.gtf.GL3Tests.texture_storage.texture_storage_texture_levels on level 13.
* broadcom/vc5: Clamp the instance divisor to 16 bits.Eric Anholt2018-03-211-1/+2
| | | | | | | Fixes debug assert on GTF-GLES3.gtf.GL3Tests.instanced_arrays.instanced_arrays_divisor Signed-off-by: Eric Anholt <[email protected]>
* i965: fix android buildLionel Landwerlin2018-03-211-2/+5
| | | | | | | | | | | | | This is the equivalent of commit 5770e1d89e0eb49eb3c9547e8657d636b6e7e5d7 for android. v2: fix xml files path and file given to --header Signed-off-by: Lionel Landwerlin <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Fixes: 2d2b15fbcab ("i965: fix autotools/android build") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105634 Reviewed-by: Emil Velikov <[email protected]>
* docs: fix typo in 17.3.6 release notesJuan A. Suarez Romero2018-03-211-1/+1
| | | | | | | | Title is about 17.3.5, when it must be about 17.3.6. CC: Emil Velikov <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* nir/dead_cf: also remove useless ifsCaio Marcelo de Oliveira Filho2018-03-211-14/+30
| | | | | | | | | | | | | | | | | | | | | | | | | | Generalize the code for remove dead loops to also remove dead if nodes. The conditions are the same in both cases, if the node (and it's children) don't have side-effects AND the nodes after it don't use the values produced by the node. The only difference is when evaluating side effects: loops consider only return jumps as a side-effect -- they can stop execution of nodes after it; 'if' nodes outside loops should consider all kinds of jumps (return, break, continue) since all of them can cause execution of nodes after it to be skipped. After this patch, empty ifs (those which both then and else blocks are empty) will be removed by nir_opt_dead_cf. It caused no change to shader-db, in part because the removal of empty ifs is currently covered by nir_opt_peephole_select. v2: Improve the identification of cases where break/continue can cause side-effects. (Jason) v3: Move code comment changes to a different patch. (Jason) Reviewed-by: Jason Ekstrand <[email protected]>
* nir/dead_cf: rephrase definition of a dead loop nodeCaio Marcelo de Oliveira Filho2018-03-211-8/+7
| | | | Reviewed-by: Jason Ekstrand <[email protected]>
* docs: update calendar, add news and link release notes to 17.3.7Juan A. Suarez Romero2018-03-213-7/+8
| | | | Signed-off-by: Juan A. Suarez Romero <[email protected]>
* docs: add sha256 checksums for 17.3.7Juan A. Suarez Romero2018-03-211-1/+2
| | | | | Signed-off-by: Juan A. Suarez Romero <[email protected]> (cherry picked from commit 13dd6016d749c07bfe2f20206a0bb8929ee585e8)
* docs: add release notes for 17.3.7Juan A. Suarez Romero2018-03-211-0/+311
| | | | | Signed-off-by: Juan A. Suarez Romero <[email protected]> (cherry picked from commit 8a51f3857c22cfa5feab8e72abcdab8802e711df)
* radeon/vce: move feedback command inside of destroy functionLeo Liu2018-03-213-9/+12
| | | | | | | | | | | | | | | | | On the CI family, firmware requires the destory command have to be the last command in the IB, moving feedback command after destroy is causing issues on CI cards, so we have to keep the previous logic that moves destroy back to the last command. But as the original issue fixed previously, with the newer family like Vega10, feedback command have to be included inside of the task info command along with destroy command. Fixes: 6d74cb25("radeon/vce: move destroy command before feedback command") Signed-off-by: Leo Liu <[email protected]> Acked-by: Christian König <[email protected]> Cc: [email protected]
* egl: pull update from Khronos and drop local defineEric Engestrom2018-03-212-7/+1
| | | | | | | | | | | Added in Khronos in 2b6bb4ee45cc46c89d4a "EGL_MESA_drm_image: add EGL_DRM_BUFFER_USE_CURSOR_MESA to egl.xml" [1] as part of PR #36 [2]. [1] https://github.com/KhronosGroup/EGL-Registry/commit/2b6bb4ee45cc46c89d4a4349f2ca94e80d77cd97 [2] https://github.com/KhronosGroup/EGL-Registry/pull/36 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl: align the formatting of Haiku section of eglplatform.h with Khronos'Eric Engestrom2018-03-211-4/+6
| | | | | Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* egl: add Ozone section to eglplatform.hEric Engestrom2018-03-211-0/+6
| | | | | | | | | | | | This pulls in commit a93f559e9c11fa53fb5f1cc255b8f75433f85d2a "Add Ozone section to eglplatform.h" from Khronos [1] added by Brian Anderson [2] a few months ago. [1] https://github.com/KhronosGroup/EGL-Registry/commit/a93f559e9c11fa53fb5f1cc255b8f75433f85d2a [2] https://github.com/KhronosGroup/EGL-Registry/pull/26 Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* clover: Dynamically calculate __OPENCL_VERSION__ and CLC language versionAaron Watry2018-03-211-2/+5
| | | | | | | | | | | | | | | | Use get_language_version to calculate default cl standard based on device capabilities and -cl-std specified in build options. v5; move dev_clc_version declaration from an earlier patch v4: Squash the __OPENCL_VERSION__ and CLC language version patches v3: (Jan) Allow device_version up to 2.2 while device_clc_version only goes to 2.0 Use get_cl_version to calculate version instead v2: Split out from the previous patch (Pierre) Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> CC: Jan Vesely <[email protected]>
* clover/llvm: Add get_[cl|language]_version, validation and some helpersAaron Watry2018-03-211-0/+88
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Used to calculate the default CLC language version based on the --cl-std in build args and the device capabilities. According to section 5.8.4.5 of the 2.0 spec, the CL C version is chosen by: 1) If you have -cl-std=CL1.1+ use the version specified 2) If not, use the highest 1.x version that the device supports Curiously, there is no valid value for -cl-std=CL1.0 Validates requested cl-std against device_clc_version Signed-off-by: Aaron Watry <[email protected]> Reviewed-by: Pierre Moreau <[email protected]> v7: (Pierre) Split cl/clc versions into separate lists and make more references const. v6: (Pierre) Add more const and fix some whitespace v5: (Aaron) Use a collection of cl versions instead of switch cases Consolidates the string, numeric version, and clc langstandard::kind v4: (Pierre) Split get_language_version addition and use into separate patches Squash patches that add the helpers and validate the language standard v3: Change device_version to device_clc_version v2: (Pierre) Move create_compiler_instance changes to correct patch to prevent temporary build breakage. Convert version_str into unsigned and use it to find language version Add build_error for unknown language version string Whitespace fixes
* docs: add 17.3.{8,9} in the release calendarJuan A. Suarez Romero2018-03-211-1/+13
| | | | | | | | | | | Mesa 18.0 series has not been released yet, so let's extend 17.3 lifetime. v2: add 17.3.9 in the calendar (Andres Gomez) CC: Andres Gomez <[email protected]> CC: Emil Velikov <[email protected]> Reviewed-by: Andres Gomez <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* intel/blorp: Fix compiler warning about num_layers.Eric Anholt2018-03-201-1/+1
| | | | | | | | | | The compiler doesn't notice that the condition for num_layers to be undefined already defined it above (as our assert checked in a debug build). v2: Move the pair of assignments to one outside of the block. Reviewed-by: Lionel Landwerlin <[email protected]>
* radv: add support for VK_EXT_depth_range_unrestrictedSamuel Pitoiset2018-03-202-0/+23
| | | | | | | | | | | | | | | | This extension removes the restrictions on minDepth/maxDepth, minDepthBounds/maxDepthBounds and VkClearDepthStencilValue::depth. The following CTS tests now pass: dEQP-VK.glsl.builtin_var.fragdepth.line_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.point_list_d32_sfloat_large_depth dEQP-VK.glsl.builtin_var.fragdepth.triangle_list_d32_sfloat_large_depth dEQP-VK.draw.inverted_depth_ranges.nodepthclamp_depth_range_unrestricted dEQP-VK.draw.inverted_depth_ranges.depthclamp_depth_range_unrestricted Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only enable one channel when exporting prim idSamuel Pitoiset2018-03-201-1/+1
| | | | | | | It's a 32-bit integer like the layer. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965: fix out of tree autotools buildLionel Landwerlin2018-03-201-1/+4
| | | | | | | Fixes: 2d2b15fbcab ("i965: fix autotools/android build") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Mathias Fröhlich <[email protected]>
* virgl: Implement seamless cube mapsStéphane Marchesin2018-03-212-1/+3
| | | | | | | | | | This was previously ignored. Along with the virglrenderer patch, this fixes ~100 dEQP tests: dEQP-GLES3.functional.texture.filtering.cube.* Signed-off-by: Stéphane Marchesin <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* i965: annotate brw_oa.py's --header and --code as requiredEmil Velikov2018-03-201-21/+14
| | | | | | | | | | | | As of earlier commit, the --header was made a hard requirement when using --code. Hence - annotate both as required and drop a few no longer needed checks. Fixes: 035cc7a12dc0 ("i965: perf: reduce i965 binary size") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: pipecontrol: add LRI write immediate flagLionel Landwerlin2018-03-201-0/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: genxml: add INSTPM/CS_DEBUG_MODE2 registersLionel Landwerlin2018-03-207-0/+46
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: fix autotools/android buildLionel Landwerlin2018-03-202-13/+6
| | | | | | | | | | | | | Autotools/android builds generate the header & code files in 2 steps, but the code generation requires the name of the header file to include it. This change generates both files in one command. Fixes: 035cc7a12dc ("i965: perf: reduce i965 binary size") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* dri3: Fix typo in version checkDaniel Stone2018-03-201-1/+1
| | | | | | | | | | | | | The have-new-DRI3 codepaths would never actually properly trigger, since there was a typo in configure.ac which broke the version check. This went unnoticed but for an error in config.log if you looked closely enough. Signed-off-by: Daniel Stone <[email protected]> Reported-by: Lukas F. Hartmann <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Fixes: 7aeef2d4efdc ("dri3: allow building against older xcb (v3)") Cc: Dave Airlie <[email protected]>
* meson: Don't build svga by default on ARM/AArch64Daniel Stone2018-03-201-1/+1
| | | | | | | | VMware has no (published) support for Arm-architecture guests. Signed-off-by: Daniel Stone <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reported-by: Dylan Baker <[email protected]>
* meson: Add default DRI drivers for ARM/AArch64Daniel Stone2018-03-201-0/+2
| | | | | | | | | | | | On all Arm architectures (ARMv7 and below as 'arm', ARMv8 and above as 'aarch64'), only build swrast for DRI drivers. The only classic drivers which could be used are r200 and NV20 cards, which seems unlikely enough that it shouldn't be the default. Signed-off-by: Daniel Stone <[email protected]> Reported-by: Javier Jardón <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* st/mesa: add compiler/nir/ prefix for nir includesEmil Velikov2018-03-201-2/+2
| | | | | | | | | | | | Stay consistent with the rest of the codebase, effectively fixing the autotools build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621 Fixes: ffa4bbe4665 ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker") Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* anv: off-by-one in GetDescriptorSetLayoutSupportScott D Phillips2018-03-201-1/+1
| | | | | | | | | Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: ddc4069122 ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: perf: reduce i965 binary sizeLionel Landwerlin2018-03-206-221/+334
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance metric numbers are calculated the following way : - out of the 256 bytes long OA reports, we accumulate the deltas into an array of uint64_t - the equations' generated code reads the accumulated uint64_t deltas and normalizes them for a particular platform Our hardware is such that a number of counters in the OA reports always return the same values (i.e. they're not programmable), and they return the same values even across generations, and as a result a number of equations are identical in different metric sets across different generations. Up to now we've kept the generated code of the equations separated in different files (per generation/GT), and didn't apply any factorization of the common equations. We could have make some improvement by reusing equations within a given metrics file, but we can go even further and reuse across generations (i.e. all files). This change changes the code generation to emit a single file in which we reuse equations emitted code based on the hash of equations' strings. Here are the savings in a meson build : Before(.old)/after : $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old 43M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so 47M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old text data bss dec hex filename 13054002 409424 671856 14135282 d7aff2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so 14550386 409552 671856 15631794 ee85b2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old As a side comment here is the size of the drivers if we remove all of the metrics from the build : $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so 40M build/src/mesa/drivers/dri/libmesa_dri_drivers.so v2: Fix an issue with hashing of counter equations (Lionel) Build system rework (Emil) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (build system part) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: fix a counter return type on hswLionel Landwerlin2018-03-201-1/+1
| | | | | | | | The equation code computes a float (percentage) yet the return type was an uint64_t. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>