aboutsummaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* i965: fix autotools/android buildLionel Landwerlin2018-03-202-13/+6
| | | | | | | | | | | | | Autotools/android builds generate the header & code files in 2 steps, but the code generation requires the name of the header file to include it. This change generates both files in one command. Fixes: 035cc7a12dc ("i965: perf: reduce i965 binary size") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* st/mesa: add compiler/nir/ prefix for nir includesEmil Velikov2018-03-201-2/+2
| | | | | | | | | | | | Stay consistent with the rest of the codebase, effectively fixing the autotools build. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105621 Fixes: ffa4bbe4665 ("st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state tracker") Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]>
* anv: off-by-one in GetDescriptorSetLayoutSupportScott D Phillips2018-03-201-1/+1
| | | | | | | | | Loop was accessing one more than bindingCount elements from pBindings, accessing uninitialized memory. Fixes: ddc4069122 ("anv: Implement VK_KHR_maintenance3") Reviewed-by: Nanley Chery <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: perf: reduce i965 binary sizeLionel Landwerlin2018-03-206-221/+334
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Performance metric numbers are calculated the following way : - out of the 256 bytes long OA reports, we accumulate the deltas into an array of uint64_t - the equations' generated code reads the accumulated uint64_t deltas and normalizes them for a particular platform Our hardware is such that a number of counters in the OA reports always return the same values (i.e. they're not programmable), and they return the same values even across generations, and as a result a number of equations are identical in different metric sets across different generations. Up to now we've kept the generated code of the equations separated in different files (per generation/GT), and didn't apply any factorization of the common equations. We could have make some improvement by reusing equations within a given metrics file, but we can go even further and reuse across generations (i.e. all files). This change changes the code generation to emit a single file in which we reuse equations emitted code based on the hash of equations' strings. Here are the savings in a meson build : Before(.old)/after : $ du -h ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old 43M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so 47M ./build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old $ size build/src/mesa/drivers/dri/libmesa_dri_drivers.so build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old text data bss dec hex filename 13054002 409424 671856 14135282 d7aff2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so 14550386 409552 671856 15631794 ee85b2 build/src/mesa/drivers/dri/libmesa_dri_drivers.so.old As a side comment here is the size of the drivers if we remove all of the metrics from the build : $ du -sh build/src/mesa/drivers/dri/libmesa_dri_drivers.so 40M build/src/mesa/drivers/dri/libmesa_dri_drivers.so v2: Fix an issue with hashing of counter equations (Lionel) Build system rework (Emil) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]> (build system part) Reviewed-by: Kenneth Graunke <[email protected]>
* i965: perf: fix a counter return type on hswLionel Landwerlin2018-03-201-1/+1
| | | | | | | | The equation code computes a float (percentage) yet the return type was an uint64_t. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* mesa: fix leaking ParameterValueOffsetTapani Pälli2018-03-201-0/+1
| | | | | | | | | | | | | ==15115== 48 bytes in 1 blocks are definitely lost in loss record 16 of 66 ==15115== at 0x4C2EC15: realloc (vg_replace_malloc.c:785) ==15115== by 0x8602C3E: _mesa_reserve_parameter_storage (prog_parameter.c:212) ==15115== by 0x8602D1E: _mesa_add_parameter (prog_parameter.c:252) ==15115== by 0x86032C4: _mesa_add_sized_state_reference (prog_parameter.c:384) ==15115== by 0x8603324: _mesa_add_state_reference (prog_parameter.c:409) Fixes: edded12376 "mesa: rework ParameterList to allow packing" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* dri3: Don't fail on version mismatchDaniel Stone2018-03-205-9/+20
| | | | | | | | | | | | | The previous commit to make DRI3 modifier support optional, breaks with an updated server and old client. Make sure we never set multibuffers_available unless we also support it locally. Make sure we don't call stubs of new-DRI3 functions (or empty branches) which will never succeed. Signed-off-by: Daniel Stone <[email protected]> Reviewed-by: Dave Airlie <[email protected]> Fixes: 7aeef2d4efdc ("dri3: allow building against older xcb (v3)")
* radv: don't lower indirects until after opts have runTimothy Arceri2018-03-201-1/+8
| | | | | | | | Noticed while passing by. Not sure if it impacts anything, but likely to impact GFX9 more than anything else since we lower inputs, outputs and locals there. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* st/nir: fix atomic lowering for gallium driversTimothy Arceri2018-03-204-8/+14
| | | | | | | | | | | | | | | i965 and gallium handle the atomic buffer index differently. It was just by luck that the single piglit test for this was passing. For gallium we use the atomic binding so that we match the handling in st_bind_atomics(). On radeonsi this fixes the CTS test: KHR-GL43.shader_storage_buffer_object.advanced-write-fragment It also fixes tressfx hair rendering in Tomb Raider. Reviewed-by: Marek Olšák <[email protected]>
* st/radeonsi: enable uniform packing in NIR backendTimothy Arceri2018-03-202-9/+7
| | | | Reviewed-by: Marek Olšák <[email protected]>
* st: add uniform packing support to lower_uniforms_to_ubo()Timothy Arceri2018-03-203-7/+14
| | | | Reviewed-by: Marek Olšák <[email protected]>
* gallium: add packed uniform CAPTimothy Arceri2018-03-2019-0/+22
| | | | Reviewed-by: Marek Olšák <[email protected]>
* st/nir/radeonsi: move nir_lower_uniforms_to_ubo() to the state trackerTimothy Arceri2018-03-209-14/+18
| | | | | | | | | | | | | This will only ever be used by gallium drivers so it probably doesn't belong in the nir toolkit. Also we want to pass it some non NIR things in the following patch. To avoid regressions we wrap the lowering calls that have been moved to st_glsl_to_nir with a quick hack so that they are only called for radeonsi, we will replace the hack with a check for uniform packing in a following patch. Reviewed-by: Marek Olšák <[email protected]>
* st: add st_glsl_type_dword_size() helperTimothy Arceri2018-03-202-0/+44
| | | | | | This will be used to support uniform packing. Reviewed-by: Marek Olšák <[email protected]>
* st/glsl_to_nir: add support for packed builtin uniformsTimothy Arceri2018-03-201-5/+37
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: add _mesa_add_sized_state_reference() helperTimothy Arceri2018-03-202-14/+27
| | | | | | This will be used for adding packed builtin uniforms. Reviewed-by: Marek Olšák <[email protected]>
* mesa: add support propagate uniform support for packed uniformsTimothy Arceri2018-03-201-2/+18
| | | | Reviewed-by: Marek Olšák <[email protected]>
* mesa: allow for uniform packing when adding uniforms to param listTimothy Arceri2018-03-201-5/+27
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: add packing support for setting uniform handlesTimothy Arceri2018-03-201-3/+13
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: add packing support for setting uniformsTimothy Arceri2018-03-201-19/+53
| | | | Reviewed-by: Nicolai Hähnle <[email protected]>
* mesa: create copy uniform to storage helpersTimothy Arceri2018-03-201-63/+91
| | | | | | | | These will be used in the following patch to allow copying directly to the param list when packing is enabled. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* mesa: rework ParameterList to allow packingTimothy Arceri2018-03-2019-60/+125
| | | | | | | | | | | | | | Currently everything is padded to 4 components. Making the list more flexible will allow us to do uniform packing. V2 (suggestions from Nicolai): - always pass existing calls to _mesa_add_parameter() true for padd_and_align - fix bindless param value offsets - remove left over wip logic from pad and align code - zero out param value padding - whitespace fix Reviewed-by: Marek Olšák <[email protected]>
* mesa: add PackedDriverUniformStorage constTimothy Arceri2018-03-201-0/+3
| | | | | | | Will be used to determine whether to take packing code paths or not. Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* broadcom/vc5: Don't annotate dumps with stale live intervals.Eric Anholt2018-03-194-2/+8
| | | | | As you're debugging register allocation, you may have changed the intervals and not recomputed yet. Just skip the dump in that case.
* broadcom/vc5: Add support for register spilling.Eric Anholt2018-03-197-11/+306
| | | | | | | | | | | | | | | Our register spilling support is nice to have since vc4 couldn't at all, but we're still very restricted due to needing to not spill during a TMU operation, or during the last segment of the program (which would be nice to spill a value of, when there's a long-lived value being passed through with little modification from the start to the end). We could do better by emitting unspills for the last-segment values just before the last thrsw, since the last segment is probably not the maximum interference area. Fixes GTF uniform_buffer_object_arrays_of_all_valid_basic_types and 3 others.
* broadcom/vc5: Remove redundant last_inst lookup.Eric Anholt2018-03-191-1/+0
| | | | The point was to get the MOV, which the MOV_dest already returned.
* broadcom/vc5: On QPU pack error, dump the instruction and return cleanly.Eric Anholt2018-03-191-1/+7
| | | | This is nice for debugging when you've made a bad instruction.
* broadcom/vc5: Add cursors to the compiler infrastructure, like NIR's.Eric Anholt2018-03-193-8/+73
| | | | | This will let me do lowering late in compilation using the same instruction builder as we use in nir_to_vir.
* broadcom/vc5: Move the umul macro to a header.Eric Anholt2018-03-192-8/+8
| | | | Anywhere we want to multiply, we probably want this.
* broadcom/vc5: Correct the arg count of TIDX/EIDX.Eric Anholt2018-03-191-2/+2
|
* broadcom/vc5: Re-do live variables after removing thrsws.Eric Anholt2018-03-192-3/+14
| | | | Otherwise our start/ends ips won't line up with the actual instructions.
* broadcom/vc5: Add a QPU helper for instructions using the TLB.Eric Anholt2018-03-192-0/+23
| | | | This will be used for detecting last thread segment in register spilling.
* broadcom/vc5: Introduce v3d_qpu_reads_vpm()/v3d_qpu_writes_vpm().Eric Anholt2018-03-192-4/+35
| | | | | These helpers will be used in register spilling to determine where to add a last thrsw if needed, and might help refactor QPU scheduling.
* broadcom/vc5: The ldvpm signal also a case of using the VPM.Eric Anholt2018-03-191-0/+3
| | | | | The QPU scheduling code calling this function already separately checked this signal.
* broadcom/vc5: Extract v3d_qpu_writes_tmu() helper.Eric Anholt2018-03-193-6/+12
| | | | This will be reused in register spilling.
* radv: don't export NULL layer.Dave Airlie2018-03-191-1/+1
| | | | | | | | | | | | We have some cases where in subpass we want the layer but having it be 0 and loaded in the frag shader without the vertex shader exporting it is fine. So don't export the layer if we don't have a value to put in it. Fixes: d4c74aed7a8 (radv/multiview: mark layer_input if we have input attachments.) Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* mesa: adjust incorrect comment in texture_buffer_rangeMarek Olšák2018-03-191-2/+2
|
* nir: Don't compare b2f or b2i with zeroIan Romanick2018-03-191-0/+5
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | All of the shaders that had loops changed were in Tomb Raider. The one shader that lost SIMD16 is one of those. Skylake total instructions in shared programs: 14391653 -> 14390468 (<.01%) instructions in affected programs: 111891 -> 110706 (-1.06%) helped: 501 HURT: 0 helped stats (abs) min: 1 max: 155 x̄: 2.37 x̃: 1 helped stats (rel) min: 0.05% max: 21.54% x̄: 1.61% x̃: 1.01% 95% mean confidence interval for instructions value: -3.23 -1.50 95% mean confidence interval for instructions %-change: -1.77% -1.45% Instructions are helped. total cycles in shared programs: 532793024 -> 532776598 (<.01%) cycles in affected programs: 987682 -> 971256 (-1.66%) helped: 348 nnHURT: 41 helped stats (abs) min: 1 max: 3074 x̄: 54.91 x̃: 18 helped stats (rel) min: 0.05% max: 32.24% x̄: 3.36% x̃: 1.68% HURT stats (abs) min: 1 max: 422 x̄: 65.39 x̃: 24 HURT stats (rel) min: 0.09% max: 39.29% x̄: 9.50% x̃: 2.02% 95% mean confidence interval for cycles value: -64.08 -20.38 95% mean confidence interval for cycles %-change: -2.78% -1.23% Cycles are helped. total loops in shared programs: 4854 -> 4829 (-0.52%) loops in affected programs: 27 -> 2 (-92.59%) helped: 18 HURT: 0 LOST: 1 GAINED: 0 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radv: lower constant initializers on output variables earlierDave Airlie2018-03-191-0/+5
| | | | | | | | | | | | | | | | | If a shader only writes to an output via a constant initializer we need to lower it before we call nir_remove_dead_variables so that this pass sees the stores from the initializer and doesn't kill the output. Fixes test failures in new work-in-progress CTS tests: dEQP-VK.spirv_assembly.instruction.graphics.variable_init.output.float This is ported from anv: 99b57daf4a anv/pipeline: lower constant initializers on output variables earlier from Iago Toral Quiroga <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview timestamp queries.Dave Airlie2018-03-191-36/+43
| | | | | | | | For each view bit we need to emit a timestamp query. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: handle multiview queries properly. (v3)Dave Airlie2018-03-191-0/+19
| | | | | | | | | | | | | | | | | | | For multiview we need to emit a number of sequential queries depending on the view mask. This avoids dEQP-VK.multiview.queries.15 waiting forever on the CPU for query results that are never coming. We only really want to emit one query, and the rest should be blank (amdvlk does the same), so we emit begin/end pairs for all the others except the first query. v2: fix tests v3: split out patch. Fixes: dEQP-VK.multiview.queries* Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/query: split out begin/end query emissionDave Airlie2018-03-191-41/+57
| | | | | | | This just splits out the begin/end query hw emissions, it makes it easier to add multiview support for queries. Reviewed-by: Samuel Pitoiset <[email protected]>
* radv/multiview: mark layer_input if we have input attachments.Dave Airlie2018-03-191-1/+3
| | | | | | | | This fixes: dEQP-VK.multiview.input_attachments* Reviewed-by: Bas Nieuwenhuizen <[email protected]> Reviewed-by: Samuel Pitoiset <[email protected]>
* anv/pipeline: set active_stages earlyCaio Marcelo de Oliveira Filho2018-03-192-3/+10
| | | | | | | | | | | | | | Since the intermediate states of active_stages are not used, i.e. active_stages is read only after all stages were set into it, just set its value before compiling the shaders. This will allow to conditionally run certain passes based on what other shaders are being used, e.g. a certain pass might only be applicable to the vertex shader if there's no geometry or tessellation shader being used. v2: Use vk_to_mesa_shader_stage. (Lionel) Reviewed-by: Lionel Landwerlin <[email protected]>
* anv/pipeline: fail if TCS/TES compile failCaio Marcelo de Oliveira Filho2018-03-191-7/+9
| | | | | | | v2: Add Fixes tag. (Lionel) Fixes: e50d4807a35e679 ("anv: Compile TCS/TES shaders.") Reviewed-by: Lionel Landwerlin <[email protected]>
* main/program_binary: In ProgramBinary set link status as LINKING_SKIPPEDJordan Justen2018-03-191-1/+1
| | | | | | | | | | | | | This change allows the disk shader cache to work with programs loaded with ProgramBinary. Drivers check for LINKING_SKIPPED, and if set, then they try to use the shader cache. Since the program loaded by ProgramBinary is similar to loading the shader from the disk cache, this is probably more appropriate. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Allow disk shader cache usage with LINKING_SUCCESS statusJordan Justen2018-03-191-3/+0
| | | | | | | | | | | | | | | | | | | | Currently, we only look in the disk shader cache if we see that the shader program is in the cache during the link step. If the shader cache entry isn't found during the program link, there are still some (fairly unlikely) scenarios where later it might be useful to search the cache for gen binary programs. 1. If the cache evicts the serialized glsl cache, there might still be valid gen program entries in the disk cache. 2. If two applications are running in parallel, then it is possible that one may write out the cached gen program item which the other application can then make use of. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* glsl/serialize: Save shader program metadata sha1Jordan Justen2018-03-191-0/+4
| | | | | | | | | | | | | | | When the shader cache is used, this can be generated. In fact, the shader cache uses this sha1 to lookup the serialized GL shader program. If a GL shader program is restored with ProgramBinary, the shaders are not available, and therefore the correct sha1 cannot be generated. If this is restored, then we can use the shader cache to restore the binary programs to the program that was loaded with ProgramBinary. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* glsl: Remove api_enabled tracking for transform feedbackJordan Justen2018-03-192-5/+0
| | | | | | | | | | We used this to prevent usage of the disk shader cache when transform feedback was enabled via the GL API. This is no longer used. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965: Allow disk shader cache usage with transform feedbackJordan Justen2018-03-191-8/+0
| | | | | | | Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105444 Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>