Commit message | Author | Age | Files | Lines
* pan/midgard: Add LOD bias/clamp lowering | Alyssa Rosenzweig | 2019-11-22 | 4 | -1/+103
|   We fetch the info with the new intrinsic and lower with ALU ops for txl instructions, which seemingly correspond to "TEXGRD" instructions (what we call textureLod).
|   Signed-off-by: Alyssa Rosenzweig <[email protected]>
|   Reviewed-by: Tomeu Vizoso <[email protected]>
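The lowering described above can be pictured with a minimal C sketch. It assumes the usual order (add the bias, then clamp to the sampler's range); the struct and function names are hypothetical and are not taken from the Mesa sources, which emit NIR ALU instructions rather than plain C.

```c
/* Hypothetical container for the <min_lod, max_lod, lod_bias> triple
 * that the new intrinsic is described as loading. */
struct sampler_lod_params {
    float min_lod;
    float max_lod;
    float lod_bias;
};

/* Assumed lowering for an explicit-LOD fetch: bias first, then clamp. */
float lower_explicit_lod(float lod, struct sampler_lod_params p)
{
    float biased = lod + p.lod_bias;

    if (biased < p.min_lod)
        return p.min_lod;
    if (biased > p.max_lod)
        return p.max_lod;
    return biased;
}
```

With a sampler of min_lod 1.0, max_lod 4.0 and lod_bias 0.5, a requested LOD of 2.0 becomes 2.5, while requests of 0.0 and 9.0 are pinned to the two ends of the range.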
* pan/midgard: Implement load_sampler_lod_paramaters_pan | Alyssa Rosenzweig | 2019-11-22 | 3 | -1/+42
|   We can stuff this information in as parametrized system values, like we currently do texture size and SSBO addresses.
|   Signed-off-by: Alyssa Rosenzweig <[email protected]>
|   Reviewed-by: Tomeu Vizoso <[email protected]>
* nir: Add load_sampler_lod_paramaters_pan intrinsic | Alyssa Rosenzweig | 2019-11-22 | 1 | -0/+4
|   This loads in the <min_lod, max_lod, lod_bias> settings for a given sampler, which is necessary for lowering clamps/biases on certain Midgard chips.
|   Signed-off-by: Alyssa Rosenzweig <[email protected]>
|   Reviewed-by: Tomeu Vizoso <[email protected]>
* mapi/glapi: Generate sizeof() helpers instead of fixed sizes. | Markus Wick | 2019-11-21 | 1 | -4/+11
|   Generating source code with a fixed size leads to issues with platform-dependent types. We either hard-code 4 or 8 bytes there, and both are wrong on the other platform. So this patch solves this issue by generating e.g. sizeof(GLsizeiptr), which is valid on both 32-bit and 64-bit platforms.
|   Signed-off-by: Marek Olšák <[email protected]>
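As an illustration of the problem, here is a small C sketch. It uses ptrdiff_t as a stand-in for GLsizeiptr (an assumption made so the example does not need the GL headers), and the two helper names are hypothetical rather than anything the generator emits.

```c
#include <stddef.h>

/* Stand-in for GLsizeiptr: a pointer-sized signed integer type. */
typedef ptrdiff_t example_sizeiptr;

/* Hard-coded variant: only correct where the type happens to be 4 bytes. */
size_t param_size_fixed(void)
{
    return 4;
}

/* Generated-sizeof variant: the compiler supplies the right value
 * for whichever platform the code is built on. */
size_t param_size_generated(void)
{
    return sizeof(example_sizeiptr);
}
```

On a typical 64-bit build the fixed helper reports 4 while the type occupies 8 bytes; the sizeof-based helper stays correct on both kinds of platform.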
* intel/fs: Disable conditional discard optimization on Gen4 and Gen5 | Ian Romanick | 2019-11-21 | 1 | -1/+8
|   The CMP instruction on Gen4 and Gen5 generates one bit (the LSB) of valid data and 31 bits of junk. Results of comparisons that are used as Boolean values need to have a fixup applied to generate the proper 0/~0 values. Calling fs_visitor::nir_emit_alu with need_dest=false prevents the fixup code from being generated. This results in a sequence like:
|
|       cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */
|       ...
|       cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */
|       (+f0.1) or.z.f0.1(16) null<1>UD g4<8,8,1>UD g8<8,8,1>UD
|
|   instead of
|
|       cmp.l.f0.0(16) g8<1>F g14<8,8,1>F 0x0F /* 0F */
|       ...
|       cmp.l.f0.0(16) g4<1>F g6<8,8,1>F 0x0F /* 0F */
|       or(16) g4<1>UD g4<8,8,1>UD g8<8,8,1>UD
|       (+f0.1) and.z.f0.1(16) null<1>UD g4<8,8,1>UD 1UD
|
|   I examined a couple of the shaders hurt by this change, and ALL of them would have been affected by this bug. :(
|
|   Reviewed-by: Tapani Pälli <[email protected]>
|   Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/1836
|   Fixes: 0ba9497e66a ("intel/fs: Improve discard_if code generation")
|
|   Iron Lake
|   total instructions in shared programs: 8122757 -> 8122957 (<.01%)
|   instructions in affected programs: 8307 -> 8507 (2.41%)
|   helped: 0
|   HURT: 100
|   HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
|   HURT stats (rel) min: 0.84% max: 6.67% x̄: 2.81% x̃: 2.76%
|   95% mean confidence interval for instructions value: 2.00 2.00
|   95% mean confidence interval for instructions %-change: 2.58% 3.03%
|   Instructions are HURT.
|
|   total cycles in shared programs: 188510100 -> 188510376 (<.01%)
|   cycles in affected programs: 76018 -> 76294 (0.36%)
|   helped: 0
|   HURT: 55
|   HURT stats (abs) min: 2 max: 12 x̄: 5.02 x̃: 4
|   HURT stats (rel) min: 0.07% max: 3.75% x̄: 0.86% x̃: 0.56%
|   95% mean confidence interval for cycles value: 4.33 5.71
|   95% mean confidence interval for cycles %-change: 0.60% 1.12%
|   Cycles are HURT.
|
|   GM45
|   total instructions in shared programs: 4994403 -> 4994503 (<.01%)
|   instructions in affected programs: 4212 -> 4312 (2.37%)
|   helped: 0
|   HURT: 50
|   HURT stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
|   HURT stats (rel) min: 0.84% max: 6.25% x̄: 2.76% x̃: 2.72%
|   95% mean confidence interval for instructions value: 2.00 2.00
|   95% mean confidence interval for instructions %-change: 2.45% 3.07%
|   Instructions are HURT.
|
|   total cycles in shared programs: 128928750 -> 128928982 (<.01%)
|   cycles in affected programs: 67442 -> 67674 (0.34%)
|   helped: 0
|   HURT: 47
|   HURT stats (abs) min: 2 max: 12 x̄: 4.94 x̃: 4
|   HURT stats (rel) min: 0.09% max: 3.75% x̄: 0.75% x̃: 0.53%
|   95% mean confidence interval for cycles value: 4.19 5.68
|   95% mean confidence interval for cycles %-change: 0.50% 1.00%
|   Cycles are HURT.
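The effect of the junk bits can be modelled in a few lines of C. This is only a sketch of the arithmetic described in the message, not compiler code: the function names are hypothetical, and each uint32_t stands for one channel of a raw CMP result whose only meaningful bit is the LSB.

```c
#include <stdbool.h>
#include <stdint.h>

/* Boolean fixup: turn "LSB valid, 31 bits of junk" into a clean 0 or ~0. */
uint32_t fixup_bool(uint32_t raw)
{
    return (uint32_t)0 - (raw & 1u);
}

/* Models the broken sequence: or.z over the raw results, so the zero
 * test also looks at the junk bits. */
bool neither_set_buggy(uint32_t a, uint32_t b)
{
    return (a | b) == 0;
}

/* Models the corrected sequence: or, then and.z with 1, so only the
 * valid bit takes part in the zero test. */
bool neither_set_fixed(uint32_t a, uint32_t b)
{
    return ((a | b) & 1u) == 0;
}
```

For two "false" results that carry junk in the upper bits, such as 0xdeadbee0 and 0x12345670, the buggy test reports that something was set while the fixed test correctly reports that neither comparison was true.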
* docs: update calendar, add news item and link release notes for 19.2.6 | Dylan Baker | 2019-11-21 | 3 | -4/+4
|
* docs: Add SHA256 sum for 19.2.6 | Dylan Baker | 2019-11-21 | 1 | -1/+1
|
* docs: Add release notes for 19.2.6 | Dylan Baker | 2019-11-21 | 1 | -0/+87
|
* nir/serialize: do ctx = {0} instead of manual initializations | Marek Olšák | 2019-11-21 | 1 | -4/+2
|   Reviewed-by: Connor Abbott <[email protected]>
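For readers unfamiliar with the idiom in the subject line, the following C sketch contrasts the two styles. The struct and its fields are hypothetical; they are not the real serializer context.

```c
#include <stddef.h>

/* Hypothetical context struct, for illustration only. */
struct example_ctx {
    const void *shader;
    size_t next_index;
    int strip;
};

/* Manual style: every field must be listed, and a newly added field is
 * easy to forget. */
struct example_ctx make_ctx_manual(void)
{
    struct example_ctx ctx;

    ctx.shader = NULL;
    ctx.next_index = 0;
    ctx.strip = 0;
    return ctx;
}

/* Aggregate style: "= {0}" zero-initializes every member, including
 * any added later. */
struct example_ctx make_ctx_zeroed(void)
{
    struct example_ctx ctx = {0};

    return ctx;
}
```

Both helpers produce the same values today; the second keeps doing so when the struct grows.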
* nir: strip as we serialize to remove the nir_shader_clone call | Marek Olšák | 2019-11-21 | 5 | -134/+34
|   Serializing stripped NIR is faster now.
|   Reviewed-by: Connor Abbott <[email protected]>
* etnaviv: add drm-shim | Christian Gmeiner | 2019-11-21 | 4 | -0/+269
|   Signed-off-by: Christian Gmeiner <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* vk_util: drop duplicate formats in vk_format_map[] | Eric Engestrom | 2019-11-21 | 1 | -2/+0
|   Signed-off-by: Eric Engestrom <[email protected]>
|   Reviewed-by: Bas Nieuwenhuizen <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* turnip: implement UBWC | Jonathan Marek | 2019-11-21 | 9 | -125/+325
|   This enables UBWC for everything except 3D textures.
|   It breaks many image_to_image copies but those aren't important and it can be worked around later (image_to_image copy needs to be done in two steps, decode from the source format and then encode to the destination format).
|   Signed-off-by: Jonathan Marek <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* freedreno/regs: update UBWC related bits | Jonathan Marek | 2019-11-21 | 3 | -7/+11
|   Signed-off-by: Jonathan Marek <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* swr: Fix build with llvm-10.0. | Vinson Lee | 2019-11-21 | 1 | -1/+4
|   Fix build error after llvm-10.0 commit 1dfede3122ee ("Move CodeGenFileType enum to Support/CodeGen.h").
|
|       ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp: In member function ‘void JitManager::DumpAsm(llvm::Function*, const char*)’:
|       ../src/gallium/drivers/swr/rasterizer/jitter/JitManager.cpp:428:45: error: ‘CGFT_AssemblyFile’ is not a member of ‘llvm::TargetMachine’
|        *pMPasses, filestream, nullptr, TargetMachine::CGFT_AssemblyFile);
|        ^
|
|   Signed-off-by: Vinson Lee <[email protected]>
|   Reviewed-by: Jan Zielinski <[email protected]>
* aco: fix copy+paste error | Rhys Perry | 2019-11-21 | 1 | -2/+2
|   Signed-off-by: Rhys Perry <[email protected]>
|   Reviewed-by: Daniel Schürmann <[email protected]>
* aco: improve waitcnt insertion around loops | Rhys Perry | 2019-11-21 | 1 | -45/+108
|   Do this by repeating processing of loops until no progress is made.
|
|   Totals from affected shaders:
|   SGPRS: 162576 -> 162576 (0.00 %)
|   VGPRS: 145228 -> 145228 (0.00 %)
|   Spilled SGPRs: 668 -> 668 (0.00 %)
|   Spilled VGPRs: 0 -> 0 (0.00 %)
|   Private memory VGPRs: 0 -> 0 (0.00 %)
|   Scratch size: 0 -> 0 (0.00 %) dwords per thread
|   Code Size: 15778640 -> 15771336 (-0.05 %) bytes
|   LDS: 146 -> 146 (0.00 %) blocks
|   Max Waves: 6087 -> 6087 (0.00 %)
|
|   v2: use block_kind_loop_header/block_kind_loop_exit to repeat at the end of loops instead of at each continue
|   Signed-off-by: Rhys Perry <[email protected]>
|   Reviewed-by: Daniel Schürmann <[email protected]>
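"Repeating processing of loops until no progress is made" is a fixed-point iteration. The C sketch below shows the general shape under strong simplifying assumptions: the state is a plain bitmask of outstanding events, each block only adds events, and all names are hypothetical. The actual pass tracks far richer state than this.

```c
#include <stdbool.h>
#include <stdint.h>

#define LOOP_BLOCKS 3

/* Re-run the loop body until the state entering the loop header stops
 * changing. gen[i] is the set of events block i leaves outstanding. */
uint32_t process_loop(uint32_t entry_state,
                      const uint32_t gen[LOOP_BLOCKS],
                      unsigned *iterations)
{
    uint32_t header_in = entry_state;
    bool progress;

    *iterations = 0;
    do {
        uint32_t state = header_in;

        /* One pass over the blocks of the loop body. */
        for (int i = 0; i < LOOP_BLOCKS; i++)
            state |= gen[i];

        /* Merge the back edge into the header's incoming state. */
        uint32_t merged = header_in | state;
        progress = merged != header_in;
        header_in = merged;
        (*iterations)++;
    } while (progress);

    return header_in;
}
```

The first pass discovers what the back edge carries into the header; the second pass confirms that nothing changes any more, at which point the result is safe to use for every iteration of the loop.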
* freedreno/perfctrs/fdperf: periodically restore counters | Rob Clark | 2019-11-21 | 1 | -1/+31
|   When GPU is idle and suspends, the currently selected countables will all reset to the first one. So periodically restore the selected countables.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs: add fdperf | Rob Clark | 2019-11-21 | 2 | -0/+1082
|   Port from the envytools tree, but converted to use the .c tables for describing the perfcounter groups/countables, rather than using rnndec to get this at runtime from the register xml.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs/a6xx: remove RBBM counters | Rob Clark | 2019-11-21 | 1 | -1/+1
|   Currently these are getting blocked by the kernel. These counters don't seem to be the most useful ones, and to use them we'd have to somehow probe the kernel by submitting cmdstream to write the selector regs and see if that triggers a GPU fault. So let's just skip them.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfctrs/a2xx: move CP to be first group | Rob Clark | 2019-11-21 | 1 | -1/+1
|   fdperf expects this, to find the ALWAYS_COUNT counter
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs: add accessor to get per-gen tables | Rob Clark | 2019-11-21 | 8 | -24/+66
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs: move to shared location | Rob Clark | 2019-11-21 | 13 | -12/+91
|   This should eventually be useful for VK_KHR_performance_query as well. And in the more near term, for fdperf.
|   Attempt to not break android build is best-effort and untested.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs: remove gallium dependencies | Rob Clark | 2019-11-21 | 5 | -9/+75
|   Prep work to move to a shared location.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* freedreno/perfcntrs: small cleanup | Rob Clark | 2019-11-21 | 4 | -82/+31
|   When we had one gen supporting performance counters, it made sense to have these builder macros in the .c file with the table. But time has come to de-duplicate.
|   Signed-off-by: Rob Clark <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
* nir: fix deref offset builder | Dave Airlie | 2019-11-22 | 1 | -1/+1
|   Use the correct bit size
|   Reviewed-by: Jason Ekstrand <[email protected]>
* vtn/opencl: add clz support | Dave Airlie | 2019-11-22 | 2 | -0/+10
|   This is needed for OpenCL
|   Reviewed-by: Jason Ekstrand <[email protected]>
* nouveau: request ufind_msb64 lowering in the frontend. | Dave Airlie | 2019-11-22 | 1 | -1/+1
|   This passes the piglit CL builtin-ulong-clz-1.0.generated.cl test.
|   Acked-by: Jason Ekstrand <[email protected]>
|   Reviewed-by: Karol Herbst <[email protected]>
* nir: add 64-bit ufind_msb lowering support. (v2) | Dave Airlie | 2019-11-22 | 2 | -0/+24
|   This adds the option to lower 64-bit ufind_msb opcodes.
|   v2: use split_x/y removes component loops (Jason)
|   Reviewed-by: Jason Ekstrand <[email protected]>
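A 64-bit find-MSB can be built from two 32-bit ones by splitting the value into halves. The C sketch below shows that arithmetic only; it is an assumption about the general approach, not a copy of the NIR lowering, and the function names are hypothetical. It follows the convention that the result is -1 when no bit is set.

```c
#include <stdint.h>

/* Index of the most significant set bit of a 32-bit value, or -1 if
 * the value is zero. */
int ufind_msb32(uint32_t v)
{
    int msb = -1;

    while (v) {
        v >>= 1;
        msb++;
    }
    return msb;
}

/* 64-bit version expressed through the two 32-bit halves: prefer the
 * high half, and fall back to the low half when the high half is zero. */
int ufind_msb64(uint64_t v)
{
    uint32_t lo = (uint32_t)v;
    uint32_t hi = (uint32_t)(v >> 32);
    int hi_msb = ufind_msb32(hi);

    return hi_msb >= 0 ? hi_msb + 32 : ufind_msb32(lo);
}
```

For example, a value with only bit 32 set yields 32 through the high half, while 0xffffffff yields 31 through the low half.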
* spirv/nir/opencl: handle some multiply instructions. | Dave Airlie | 2019-11-22 | 2 | -0/+55
|   This adds support for some missing 24-bit and hi multiply variants.
|   Reviewed-by: Jason Ekstrand <[email protected]>
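For context, the two families of operations can be sketched in C as follows. Only the unsigned forms are shown, the function names are hypothetical, and masking the operands to 24 bits is an assumption about how out-of-range inputs are treated; the commit does not say which exact variants it adds.

```c
#include <stdint.h>

/* High 32 bits of the full 64-bit product of two 32-bit values. */
uint32_t umul_hi32(uint32_t a, uint32_t b)
{
    return (uint32_t)(((uint64_t)a * b) >> 32);
}

/* 24-bit multiply: only the low 24 bits of each operand take part, and
 * the low 32 bits of the product are returned. */
uint32_t umul24(uint32_t a, uint32_t b)
{
    return (a & 0xffffffu) * (b & 0xffffffu);
}
```

Multiplying 0xffffffff by 2 gives a high word of 1, and umul24 ignores anything above bit 23 of its inputs, so 0x01000003 times 5 gives 15.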
* spirv: get the correct type for function returns. | Dave Airlie | 2019-11-22 | 1 | -1/+4
|   This needs to be derived from the address format, not always 1/32.
|   Suggested by Jason
|   Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: don't store 0 to cs.ptr_size for non kernel stages. | Dave Airlie | 2019-11-22 | 1 | -1/+0
|   cs is a union so storing this there is wrong.
|   Reviewed-by: Jason Ekstrand <[email protected]>
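Why a store through one union member is harmful for other stages can be shown with a small C sketch. The struct layout and field names here are hypothetical, chosen only to make the aliasing visible; they do not mirror the real shader info structure.

```c
#include <stdint.h>

/* Hypothetical per-stage info: the stage-specific parts share storage. */
struct example_info {
    int is_kernel;
    union {
        struct { uint32_t ptr_size; } cs;
        struct { uint32_t uses_discard; } fs;
    } u;
};

/* Buggy shape: writes cs.ptr_size even for non-kernel stages, which
 * overwrites whatever another stage keeps in the same bytes. */
void set_ptr_size_buggy(struct example_info *info, uint32_t kernel_ptr_size)
{
    if (info->is_kernel)
        info->u.cs.ptr_size = kernel_ptr_size;
    else
        info->u.cs.ptr_size = 0;
}

/* Fixed shape: only touch the cs member when it is the active one. */
void set_ptr_size_fixed(struct example_info *info, uint32_t kernel_ptr_size)
{
    if (info->is_kernel)
        info->u.cs.ptr_size = kernel_ptr_size;
}
```

With a non-kernel stage whose fs.uses_discard is 1, the buggy helper silently resets it to 0, while the fixed helper leaves it untouched.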
* util: add missing R8G8B8A8_SRGB format to vk_format_map | Jonathan Marek | 2019-11-21 | 1 | -0/+1
|   Signed-off-by: Jonathan Marek <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]>
* docs: fix ascii html representation | Elie Tournier | 2019-11-21 | 1 | -1/+1
|   v2 (Eric): Use more readable ascii version
|   Signed-off-by: Elie Tournier <[email protected]>
|   Reviewed-by: Eric Engestrom <[email protected]>
|   Reviewed-by: Dylan Baker <[email protected]>
* Docs: remove duplicate meson docs for windows | Elie Tournier | 2019-11-21 | 1 | -12/+0
|   This block is duplicated; we already have the Windows instructions above.
|   Signed-off-by: Elie Tournier <[email protected]>
|   Reviewed-by: Eric Engestrom <[email protected]>
|   Reviewed-by: Dylan Baker <[email protected]>
* ci: Move freedreno's parallelism to the runner instead of gitlab-ci jobs. | Eric Anholt | 2019-11-21 | 1 | -3/+1
|   I set the runners to concurrency=1, so they serve only one gitlab-ci job at a time. Swap over to using the parallel runner now to keep the runners busy, more efficiently than spawning many docker containers and downloading artifacts multiple times, and producing easier-to-understand results for browsing on the web.
|   This bumps the a306 runners to 4x parallel instead of 2x like before, but cheza gles3 drops from 6 to 4.
|   Current rough timings of the jobs (if no container download):
|       db410c-gles2: 5:00
|       a630-gles2: 1:30
|       a630-gles3: 6:00
|       a630-gles31: 5:30
|   a630-gles3 is a bit longer than I like, but it should come back down once I can sort out the NIR algebraic rewinding.
* glsl: add missing initialization of the location path field | Iago Toral Quiroga | 2019-11-21 | 1 | -0/+2
|   This was apparently missed in 67b32190f3c95, which added support for ARB_shading_language_include to #line, including the 'path' field for the location.
|   Fixes crashes in CTS with all drivers as they attempt to access an uninitialized path string during parsing.
|   Fixes: 67b32190f3c95 ("glsl: add ARB_shading_language_include support to #line")
|   Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2132
|   Reviewed-by: Timothy Arceri <[email protected]>
|   Reviewed-by: Jose Maria Casanova <[email protected]>
* docs: update features.txt for RADV | Rhys Perry | 2019-11-21 | 1 | -2/+2
|   [skip ci]
|   Signed-off-by: Rhys Perry <[email protected]>
|   Reviewed-by: Samuel Pitoiset <[email protected]>
* gitlab-ci: Directly use host-mapped directory for ccache | Michel Dänzer | 2019-11-21 | 1 | -9/+3
|   Use hardcoded /cache/mesa/ccache for the cache, so it will be shared by all jobs of all Mesa projects running on the same runner host. This should increase the hit rate and decrease the worst case storage used.
|   Further benefits of directly using a host-mapped directory:
|   * Saves up to ~1 minute per job for restoring and saving the cache contents via the GitLab CI cache mechanism
|   * Cache contents generated by failed jobs are no longer lost
|   * Jobs running in parallel on the same runner host can get hits from each other
|   Also enable compression, so the default maximum cache size of 5G might be sufficient.
|   v2:
|   * Move CCACHE_DIR variable to the .build-linux template
|   Suggested-by: Eric Anholt <[email protected]>
|   Reviewed-by: Eric Anholt <[email protected]> # v1
* gitlab-ci: remove now useless meson-swr-glvnd build job | Samuel Pitoiset | 2019-11-21 | 1 | -24/+0
|   All things are already part of meson-main.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: build GLVND in meson-clang | Samuel Pitoiset | 2019-11-21 | 1 | -1/+2
|   Building GLVND in meson-main doesn't work because this disables libEGL and it's needed for running shader-db.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: build swr in meson-main | Samuel Pitoiset | 2019-11-21 | 1 | -2/+2
|   Now that debugoptimized isn't set and that all test jobs depend on meson-testing, enabling swr shouldn't slow down the CI.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: do not build with debugoptimized for meson-main | Samuel Pitoiset | 2019-11-21 | 1 | -1/+0
|   This should reduce compile time because optimizations are costly.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: add a job that only build things needed for testing | Samuel Pitoiset | 2019-11-21 | 1 | -4/+21
|   For turnip and RADV testing, we will need a debugoptimized build without UBSAN. This introduces meson-testing which builds only the things that are needed by the test stage.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: fix ldd check for Vulkan drivers | Samuel Pitoiset | 2019-11-21 | 1 | -1/+1
|   The 'dri' directory isn't created when building Vulkan drivers.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: move building piglit into a separate script | Samuel Pitoiset | 2019-11-21 | 2 | -10/+14
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
* pipe-loader: check that the pointer to driconf_xml isn't NULL | Samuel Pitoiset | 2019-11-21 | 1 | -1/+1
|   This happens when mesa is built with only swrast. The default driver being kmsro and the default driconf file being v3d, it's NULL and then strdup crashes.
|   This fixes a crash with piglit spec/egl_mesa_query_driver/conformance.
|   Signed-off-by: Samuel Pitoiset <[email protected]>
|   Reviewed-by: Michel Dänzer <[email protected]>
|   Reviewed-by: Kristian H. Kristensen <[email protected]>
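The class of crash described above comes from handing a NULL string to a copy routine. The following C sketch shows a guarded copy; the function name is hypothetical, the pointer-to-pointer parameter is an assumption based on the subject line ("the pointer to driconf_xml"), and malloc plus memcpy stands in for strdup so that the example stays within standard C.

```c
#include <stdlib.h>
#include <string.h>

/* Copy the XML string only if both the outer pointer and the string it
 * points at are non-NULL. The caller owns (and frees) the result. */
char *copy_driconf_xml(const char *const *driconf_xml)
{
    if (!driconf_xml || !*driconf_xml)
        return NULL;

    size_t len = strlen(*driconf_xml) + 1;
    char *copy = malloc(len);

    if (copy)
        memcpy(copy, *driconf_xml, len);
    return copy;
}
```

A caller can then treat a NULL return as "no driconf XML available" instead of crashing inside the string copy.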
* panfrost: Add the lod_bias field | Alyssa Rosenzweig | 2019-11-21 | 3 | -1/+6
|   Enough trial and error ... just think even *more* Midgard about where this field might be!
|   Signed-off-by: Alyssa Rosenzweig <[email protected]>
* compiler: move build definition of pp_standalone_scaffolding.c | Timothy Arceri | 2019-11-21 | 2 | -2/+3
|   This should fix android build issues while still allowing scons to build the standalone compiler.
|   Closes: https://gitlab.freedesktop.org/mesa/mesa/issues/2129
|   Reviewed-by: Mark Janes <[email protected]>
* nir/validate: validate num_components on registers and intrinsics | Karol Herbst | 2019-11-21 | 1 | -8/+16
|   Also make 8 and 16 components invalid. We will enable that again later when we actually support it.
|   v2: fix validation of nir_intrinsic_instr::num_components
|       correct validation of instr->num_components
|   Signed-off-by: Karol Herbst <[email protected]>
|   Reviewed-by: Jason Ekstrand <[email protected]>