summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* nir: Add an array copy optimizationJason Ekstrand2018-08-234-0/+415
| | | | | | | | | | | | This peephole optimization looks for a series of load/store_deref or copy_deref instructions that copy an array from one variable to another and turns it into a copy_deref that copies the entire array. The pattern it looks for is extremely specific but it's good enough to pick up on the input array copies in DXVK and should also be able to pick up the sequence generated by spirv_to_nir for a OpLoad of a large composite followed by OpStore. It can always be improved later if needed. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/nir: Use nir_shrink_vec_array_varsJason Ekstrand2018-08-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15176765 (<.01%) instructions in affected programs: 4259 -> 3419 (-19.72%) helped: 1 HURT: 0 total spills in shared programs: 10954 -> 10855 (-0.90%) spills in affected programs: 295 -> 196 (-33.56%) helped: 1 HURT: 0 total fills in shared programs: 22222 -> 22117 (-0.47%) fills in affected programs: 417 -> 312 (-25.18%) helped: 1 HURT: 0 The helped shader is from the OglCSDof synmark test. On my Kaby Lake laptop, the actual framerate of the benchmark didn't appear to improve beyond the noise. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add a array-of-vector variable shrinking passJason Ekstrand2018-08-232-0/+718
| | | | | | | This pass looks for variables with vector or array-of-vector types and narrows the type to only the components used. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/nir: Use the new structure and array splitting passesJason Ekstrand2018-08-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | We call structure splitting once because it is guaranteed to split all the structures in the entire shader in one go. We call array splitting in the loop in case future optimizations turn indirects into direct dereferences and we can split more arrays. Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15177605 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 This is unsurprising because nir_lower_vars_to_ssa already effectively does structure and array splitting internally. It doesn't actually split the variables but it's ability to reason about aliasing in the presence of arrays and structures and pick out scalars or vectors to be lowered to SSA values is fairly advanced. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add an array splitting passJason Ekstrand2018-08-232-0/+584
| | | | | | | | | | | | | | | | | | | | | | | This pass looks for array variables where at least one level of the array is never indirected and splits it into multiple smaller variables. This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through arrays of arrays and can detect indirects on just one level or even see that arr[i][0][5] does not alias arr[i][1][j]. This pass exists to help other passes more easily see through arrays of arrays. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. v2 (Jason Ekstrand): - Better comments and naming (some from Caio) - Rework to use one hash map instead of two v2.1 (Jason Ekstrand): - Fix a couple of bugs that were added in the rework including one which basically prevented it from running Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add a structure splitting passJason Ekstrand2018-08-234-0/+278
| | | | | | | | | | | This pass doesn't really do much now because nir_lower_vars_to_ssa can already see through structures and considers them to be "split". This pass exists to help other passes more easily see through structure variables. If a back-end does implement arrays using scratch or indirects on registers, having more smaller arrays is likely to have better memory efficiency. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir/types: Add array_or_matrix helpersJason Ekstrand2018-08-232-0/+17
| | | | Reviewed-by: Thomas Helland<[email protected]>
* i965: don't include compute resources in "Combined" limitsKenneth Graunke2018-08-231-11/+11
| | | | | | | | | | | The combined limits should only include shader stages that can be active at the same time. We don't need to include compute. See also cff290df4c09547cd2cb3b129ec59bdebdadba90 for st/mesa. Unbreaks i965 from assert failing on driver load since Marek's 45f87a48f94148b484961f18a4f1ccf86f066b1c, which dropped the core Mesa capabilities before adjusting driver limits down to match.
* radeonsi: increase the maximum UBO size to 2 GBMarek Olšák2018-08-231-1/+1
| | | | | | | | | Same as the closed driver. This causes a failure in GL45-CTS.compute_shader.max, which has a trivial bug. Tested-by: Dieter Nützel <[email protected]>
* radeonsi: bump MAX_GS_INVOCATIONSMarek Olšák2018-08-232-3/+3
| | | | | | same as the closed driver Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_SHADER_BUFFER_SIZEMarek Olšák2018-08-2319-1/+39
| | | | Tested-by: Dieter Nützel <[email protected]>
* gallium: add PIPE_CAP_MAX_GS_INVOCATIONSMarek Olšák2018-08-2319-1/+40
| | | | Tested-by: Dieter Nützel <[email protected]>
* tgsi/ureg: don't call tgsi_sanity when it's too slowMarek Olšák2018-08-231-1/+12
| | | | Tested-by: Dieter Nützel <[email protected]>
* st/mesa: fix up uniform limits to be able to expose large UBOsMarek Olšák2018-08-231-8/+23
| | | | Tested-by: Dieter Nützel <[email protected]>
* st/mesa: don't include compute resources in "Combined" limitsMarek Olšák2018-08-231-6/+3
| | | | | | | The combined limits should only include shader stages that can be active at the same time. Tested-by: Dieter Nützel <[email protected]>
* st/mesa: set ctx->Const.SubPixelBitsMarek Olšák2018-08-231-0/+1
| | | | Tested-by: Dieter Nützel <[email protected]>
* glsl: fix error checking against MAX_UNIFORM_LOCATIONSMarek Olšák2018-08-231-2/+6
| | | | Tested-by: Dieter Nützel <[email protected]>
* mesa: make MaxCombinedUniformComponents 64-bit to allow large UBOsMarek Olšák2018-08-232-7/+7
| | | | Tested-by: Dieter Nützel <[email protected]>
* mesa: add ctx->Const.MaxGeometryShaderInvocationsMarek Olšák2018-08-235-3/+7
| | | | | | | radeonsi wants to report a different value Reviewed-by: Ian Romanick <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* mesa: don't include compute resources in MAX_COMBINED_* limitsMarek Olšák2018-08-231-9/+13
| | | | | | | | | | 5 is the maximum number of shader stages that can be used by 1 execution call at the same time (e.g. a draw call). The limit ensures that each stage can use all of its binding points. Compute is separate and doesn't need the 5x multiplier. Tested-by: Dieter Nützel <[email protected]>
* mesa: bump GL_MAX_ELEMENTS_INDICES and GL_MAX_ELEMENTS_VERTICESMarek Olšák2018-08-232-2/+5
| | | | | | | | same number as our closed GL driver v2: don't use MaxArrayLockSize Tested-by: Dieter Nützel <[email protected]>
* mesa: remove incorrect change for EXT_disjoint_timer_queryMarek Olšák2018-08-231-2/+1
| | | | | Reviewed-by: Tapani Pälli <[email protected]> Tested-by: Dieter Nützel <[email protected]>
* glapi: actually implement GL_EXT_robustness for GLESMarek Olšák2018-08-231-0/+32
| | | | | | | | | | | | The extension was exposed but not the functions. This fixes: dEQP-GLES31.functional.debug.negative_coverage.get_error.buffer.readn_pixels dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformfv dEQP-GLES31.functional.debug.negative_coverage.get_error.state.get_nuniformiv Cc: 18.1 18.2 <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* intel/decoder: Decode SFIXED values.Kenneth Graunke2018-08-231-3/+7
| | | | | | This lets us example SAMPLER_STATE's LOD Bias field, among other things. Reviewed-by: Lionel Landwerlin <[email protected]>
* travis: use python3 for the autoconf buildsEmil Velikov2018-08-231-1/+11
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* configure: allow building with python3Emil Velikov2018-08-2327-39/+37
| | | | | | | | | | | | Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* bin/git_sha1_gen.py: remove execute bit/shebangEmil Velikov2018-08-231-2/+0
| | | | | | | | The script is executed explicitly via the build system, that uses PYTHON/prog_python and equivalent. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* vk/wsi: avoid reading uninitialised memoryEric Engestrom2018-08-231-2/+2
| | | | | | | | | | | | It will be ignored by x11_swapchain_result() anyway (because reaching the `fail` label without setting `result` means the swapchain status was already a hard error), but the compiler still complains about reading uninitialised memory. While at it, drop the unused assignment right before returning. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* egl: drop unused _EGL_BUILT_IN_DRIVER_DRI2Eric Engestrom2018-08-233-4/+1
| | | | | | | Unused since b174a1ae720cb404738c "egl: Simplify the "driver" interface". Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Eric Anholt <[email protected]>
* radv/gfx9: implement coherent shaders for VK_ACCESS_SHADER_READ_BITSamuel Pitoiset2018-08-231-1/+20
| | | | | | | | Single-sample color and single-sample depth (not stencil) are coherent with shaders. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]
* bin/install_megadrivers.py: Remove shebang and executable bitMathieu Bridon2018-08-231-1/+0
| | | | | | | | | | | | Since the script is never executed directly, but launched by Meson as an argument to the Python interpreter, those are not needed any more. In addition, they are the reason this script was missed when I moved the Meson buildsystem to Python 3, so removing them helps avoiding future confusion. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: Run the install script with Python 3Mathieu Bridon2018-08-235-0/+5
| | | | | | | | The script was being run directly as an executable, and it has a Python 2 shebang. Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* glsl: remove execute bit and shebang from python testsEmil Velikov2018-08-233-3/+0
| | | | | | | | | | | | | Just like the rest of the tree - these should be run either as part of the build system check target, or at the very least with an explicitly versioned python executable. Fixes: db8cd8e3677 ("glcpp/tests: Convert shell scripts to a python script") Fixes: 97c28cb0823 ("glsl/tests: Convert optimization-test.sh to pure python") Fixes: 3b52d292273 ("glsl/tests: reimplement warnings-test in python") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* docs: update required mako versionEmil Velikov2018-08-231-1/+1
| | | | | | | | | | | The requirement was bumped a while back, but we forgot to update the docs. Fixes: ed871af91c2 ("configure.ac: raise Mako required version to 0.8.0") Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* configure: use distutils in ax_check_python_mako_moduleEmil Velikov2018-08-231-2/+3
| | | | | | | | Handling the version comparison by hand is a bad idea. Python has a handy module distutils for that - use it. Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* configure: enforce python 2.7 with AM_PATH_PYTHONEmil Velikov2018-08-232-3/+6
| | | | | | | | | | | | | | | Currently we use AC_CHECK_PROGS looking for python2.7, python2 and finally python. That is due to the varying names used across the different OS. Use the handy AM_PATH_PYTHON which finds the correct name and checks for the version. Note: python2.7 has been an unofficial requirement for quite some time. Update the docs to reflect that. Cc: Dylan Baker <[email protected]> Signed-off-by: Emil Velikov <[email protected]>
* i965: Enable INTEL_shader_atomic_float_minmax on Gen9+Ian Romanick2018-08-221-0/+1
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965: Sort Gen9+ extension enablesIan Romanick2018-08-221-3/+3
| | | | | | | | | This is a strictly alphabetic sort, as is done in extensions_table.h There are other options. We should pick one and document it. Right now, this file is chaos. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Implement untyped atomic float min, max, and compare-swap ↵Ian Romanick2018-08-2214-1/+261
| | | | | | | | | | dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Expand untyped atomic message type field by a bitIan Romanick2018-08-223-4/+9
| | | | | | | | | | | This is necessary for a new Gen9 message type that will be added in the next patch. There are also Gen8 message types that need the extra bit (mostly for bindless). v2: Split off from the next patch. Suggested by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Silence unused parameter warningsIan Romanick2018-08-225-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | src/intel/compiler/brw_disasm_info.c: In function ‘nir_print_instr’: src/intel/compiler/brw_disasm_info.c:30:61: warning: unused parameter ‘instr’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~~~~ src/intel/compiler/brw_disasm_info.c:30:74: warning: unused parameter ‘fp’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~ src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: src/intel/compiler/brw_disasm.c:850:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ src/intel/compiler/brw_fs_surface_builder.cpp: In function ‘void brw::surface_access::emit_byte_scattered_write(const brw::fs_builder&, const fs_reg&, const fs_reg&, const fs_reg&, unsigned int, unsigned int, unsigned int, brw_predicate)’: src/intel/compiler/brw_fs_surface_builder.cpp:193:57: warning: unused parameter ‘size’ [-Wunused-parameter] unsigned dims, unsigned size, ^~~~ v2: Update commit message. brw_fs_generator.cpp warnings were already fixed by another patch. Noticed by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add floating point atomic min, max, and compare-swap instrinsicsIan Romanick2018-08-224-8/+50
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: Add floating point atomic add instrinsicsIan Romanick2018-08-225-5/+22
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add support for lowering shared-variable float atomicsIan Romanick2018-08-221-3/+3
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add support for lowering SSBO float atomicsIan Romanick2018-08-221-3/+3
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for INTEL_shader_atomic_float_minmaxIan Romanick2018-08-221-1/+32
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: Extension boilerplate for INTEL_shader_atomic_float_minmaxIan Romanick2018-08-224-0/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* docs: Initial version of INTEL_shader_atomic_float_minmax specIan Romanick2018-08-221-0/+200
| | | | | | | | | | | | | | | v2: Describe interactions with the capabilities added by SPV_INTEL_shader_atomic_float_minmax v3: Remove 64-bit float support. v4: Explain NaN issues. Explain issues with atomicMin(-0, +0) and atomicMax(-0, +0). v5: Fix whitespace issues noticed by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* glsl: Add built-in functions for NV_shader_atomic_floatIan Romanick2018-08-221-3/+48
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: Extension boilerplate for NV_shader_atomic_floatIan Romanick2018-08-224-0/+5
| | | | | Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>