summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* r600/sb: fix crash in fold_alu_op3Roland Scheidegger2018-07-091-0/+2
| | | | | | | | | | | | | | | | | | | | fold_assoc() called from fold_alu_op3() can lower the number of src to 2, which then leads to an invalid access to n.src[2]->gvalue(). This didn't seem to have caused much harm in the past, but on Fedora 28 it will crash (presumably because -D_GLIBCXX_ASSERTIONS is used, although with libstdc++ 4.8.5 this didn't do anything, -D_GLIBCXX_DEBUG was needed to show the issue). An alternative fix would be to instead call fold_alu_op2() from within fold_assoc() when the number of src is reduced and return always TRUE from fold_assoc() in this case, with the only actual difference being the return value from fold_alu_op3() then. I'm not sure what the return value actually should be in this case (or whether it even can make a difference). https://bugs.freedesktop.org/show_bug.cgi?id=106928 Cc: [email protected] Reviewed-by: Dave Airlie <[email protected]>
* vulkan: Update the XML and headers to 1.1.80Jason Ekstrand2018-07-081-49/+231
| | | | Acked-by: Lionel Landwerlin <[email protected]>
* i965: fix clear color bo address relocationLionel Landwerlin2018-07-071-1/+1
| | | | | | Fixes: 7987d041fda0c9 ("i965/surface_state: Emit the clear color address instead of value.") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* radv: winsys/amdgpu: include missing pthread.h headerMauro Rossi2018-07-071-0/+1
| | | | | | | | | | | | | | | | | | pthread types are used in some files without explicitely including pthread.h. This leads to compile errors on Android 7.x nougat-x86 e.g. in src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.c:31: In file included from external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_bo.h:32: external/mesa/src/amd/vulkan/winsys/amdgpu/radv_amdgpu_winsys.h:52:2: error: unknown type name 'pthread_mutex_t' pthread_mutex_t global_bo_list_lock; ^ 1 error generated. Including pthread.h explicitely solves the building error Signed-off-by: Mauro Rossi <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* nv50/ir: fix Instruction::isActionEqual for PHI instructionsKarol Herbst2018-07-071-0/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | phi instructions don't have the same results by simply having the same sources. They need to be inside the same BasicBlock or share an equal condition resulting into a path through the shader selecting equal sources as well. short example: cond = ...; const0 = 0; const1 = 1; if (cond) { ssa_1 = const0; } else { ssa_2 = const1; } ssa_3 = phi ssa_1 ssa_2; if (!cond) { ssa_4 = const0; } else { ssa_5 = const1; } ssa_6 = phi ssa_4 ssa_5; allthough both phis actually have sources with equal results, merging them would be wrong due to having a different condition selecting which source to take. For now we also stick an assert into GlobalCSE, because it should never end up having to merge phi instructions. Reviewed-by: Ilia Mirkin <[email protected]>
* nvc0/ir: use the combined tid special registerRhys Perry2018-07-079-0/+61
| | | | | | | | | | | | | | total instructions in shared programs : 5804448 -> 5804690 (0.00%) total gprs used in shared programs : 670065 -> 670065 (0.00%) total shared used in shared programs : 548832 -> 548832 (0.00%) total local used in shared programs : 21068 -> 21068 (0.00%) local shared gpr inst bytes helped 0 0 0 5 5 hurt 0 0 0 191 191 Signed-off-by: Rhys Perry <[email protected]> Reviewed-by: Karol Herbst <[email protected]>
* nir/print: Print texture and sampler indicesJason Ekstrand2018-07-071-0/+11
| | | | | | | | Commit 5fb69daa6076e56b deleted support from nir_print for printing the texture and sampler indices on texture instructions. This commit just brings it back as best as we can. Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* intel/compiler: Relax mixed type restriction for saturating immediatesIan Romanick2018-07-062-4/+22
| | | | | | | | | | | | | | | | | At the time of commit 7bc6e455e23 (i965: Add support for saturating immediates.) we thought mixed type saturates would be impossible. We were only thinking about type converting moves from D to F, for example. However, type converting moves w/saturate from F to DF are definitely possible. This change minimally relaxes the restriction to allow cases that I have been able trigger via piglit tests. Fixes new piglit tests: - arb_gpu_shader_fp64/execution/built-in-functions/fs-sign-sat-neg-abs.shader_test - arb_gpu_shader_fp64/execution/built-in-functions/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <[email protected]> Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/vec4: Properly handle sign(-abs(x))Ian Romanick2018-07-061-1/+17
| | | | | | | | | | | | | | This is achived by copying the sign(abs(x)) optimization from the FS backend. On Gen7 an earlier platforms, this fixes new piglit tests: - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <[email protected]> Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/fs: Properly handle sign(-abs(x))Ian Romanick2018-07-061-3/+12
| | | | | | | | | | | | | Fixes new piglit tests: - glsl-1.10/execution/fs-sign-neg-abs.shader_test - glsl-1.10/execution/fs-sign-sat-neg-abs.shader_test - glsl-1.10/execution/vs-sign-neg-abs.shader_test - glsl-1.10/execution/vs-sign-sat-neg-abs.shader_test Signed-off-by: Ian Romanick <[email protected]> Cc: [email protected] Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* vulkan: utils: handle hexadecimal values in registryLionel Landwerlin2018-07-061-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* st/dri: fix a crash in server_wait_syncMarek Olšák2018-07-061-0/+6
| | | | | | | | | | Ported from i965 including the comment. This fixes: dEQP-EGL.functional.reusable_sync.valid.wait_server Cc: 18.1 <[email protected]> Reviewed-by: Michel Dänzer <[email protected]>
* python: Stop using the Python 2 exception syntaxMathieu Bridon2018-07-065-7/+7
| | | | | | | | | | | | | | We could have made this compatible with Python 3 by using: except Exception as e: But since none of this code actually uses the exception objects, let's just drop them entirely. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* python: Use spaces, not tabsMathieu Bridon2018-07-061-4/+4
| | | | | | | | | Python 3 doesn't allow mixing spaces and tabs in a script, contrarily to Python 2. Signed-off-by: Mathieu Bridon <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* python: Use the print functionMathieu Bridon2018-07-0638-1595/+1652
| | | | | | | | | | | | In Python 2, `print` was a statement, but it became a function in Python 3. Using print functions everywhere makes the script compatible with Python versions >= 2.6, including Python 3. Signed-off-by: Mathieu Bridon <[email protected]> Acked-by: Eric Engestrom <[email protected]> Acked-by: Dylan Baker <[email protected]>
* vma/tests: Fix compilation if limits.h defines PAGE_SIZE (v2)Jon Turney2018-07-061-8/+8
| | | | | | | | | | per POSIX, limits.h may define PAGE_SIZE when the value is not indeterminate v2: just change the variable name, since there's no intended correlation here between this value and the machine's actual page size. Signed-off-by: Jon Turney <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* radv: fix emitting the view index on GFX9Samuel Pitoiset2018-07-061-1/+2
| | | | | | | | For merged shaders, VS as HS for example. Cc: <[email protected]> Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* i965/vec4: Make the vec4_visitor::nir_emit_instr default case unreachableIan Romanick2018-07-051-2/+1
| | | | | | | | | | | | | The bug fixed by the previous commit went undetected because extra stderr messages are not flagged by the CI. Copy the solution from fs_visitor::nir_emit_instr and mark the default case unreachable. An alternate solution is to delete the default case so that the compiler will issue a warning. That may require more work since there are other (impossible) cases that exist. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: More DCE after loweringIan Romanick2018-07-051-0/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Some of the lowering passes, nir_lower_locals_to_regs for example, can cause some previously live code to be dead. This pass in particular leaves a bunch of nir_instr_type_deref instructions floating around. This causes shader-db runs on Gen5 through Haswell to spew tons of messages like: VS instruction not yet implemented by NIR->vec4 UnrealEngine4/EffectsCaveDemo/239.shader_test is one shader that generates these messages. Cleaning up the dead code fixes that. To verify, I did a shader-db before and after. Even though all the messages are gone, the results make my brain hurt. :( Haswell total cycles in shared programs: 411890163 -> 411891145 (<.01%) cycles in affected programs: 57016 -> 57998 (1.72%) helped: 3 HURT: 11 helped stats (abs) min: 2 max: 154 x̄: 96.67 x̃: 134 helped stats (rel) min: 0.08% max: 2.23% x̄: 1.42% x̃: 1.96% HURT stats (abs) min: 18 max: 686 x̄: 115.64 x̃: 20 HURT stats (rel) min: 0.81% max: 7.12% x̄: 1.87% x̃: 0.93% 95% mean confidence interval for cycles value: -51.39 191.67 95% mean confidence interval for cycles %-change: -0.14% 2.46% Inconclusive result (value mean confidence interval includes 0). Ivy Bridge total cycles in shared programs: 259114802 -> 259115032 (<.01%) cycles in affected programs: 24034 -> 24264 (0.96%) helped: 1 HURT: 9 helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2 helped stats (rel) min: 0.08% max: 0.08% x̄: 0.08% x̃: 0.08% HURT stats (abs) min: 18 max: 48 x̄: 25.78 x̃: 20 HURT stats (rel) min: 0.80% max: 1.94% x̄: 1.08% x̃: 0.80% 95% mean confidence interval for cycles value: 12.42 33.58 95% mean confidence interval for cycles %-change: 0.54% 1.38% Cycles are HURT. Signed-off-by: Ian Romanick <[email protected]> Fixes: 5a02ffb733e nir: Rework lower_locals_to_regs to use deref instructions Reviewed-by: Jason Ekstrand <[email protected]>
* v3d: Fix leak of the default attributes BOs.Eric Anholt2018-07-051-1/+10
| | | | The GLES3 CTS makes a lot more progress on a run now.
* v3d: Fix leak of the spill BO on context destruction.Eric Anholt2018-07-051-0/+2
|
* nir: Apply fragment color clamping to gl_FragData[] as well.Eric Anholt2018-07-051-7/+2
| | | | | | | | | | | | | | From the ARB_color_buffer_float spec: 35. Should the clamping of fragment shader output gl_FragData[n] be controlled by the fragment color clamp. RESOLVED: Since the destination of the FragData is a color buffer, the fragment color clamp control should apply. Fixes arb_color_buffer_float-mrt mixed on v3d. Reviewed-by: Rob Clark <[email protected]>
* v3d: Skip emitting per-RT blend state for RTs with blend disabled.Eric Anholt2018-07-051-2/+8
| | | | | | Cleans up the CL of fbo-drawbuffers2-blend a bit. We could do better on more complicated cases by noticing if multiple RTs have the same blend state and emitting them in a single packet.
* v3d: Add proper support for GL_EXT_draw_buffers2's blending enables.Eric Anholt2018-07-054-25/+46
| | | | | I had flagged it as enabled on V3D 4.x, but not actually implemented the per-RT enables. Fixes piglit fbo_drawbuffers2-blend.
* v3d: Add support for GL_SAMPLE_ALPHA_TO_ONE.Eric Anholt2018-07-051-0/+3
| | | | Fixes piglit ext_framebuffer_multisample-draw-buffers-alpha-to-one
* v3d: Respect swap_color_rb for the f32_color_rb case.Eric Anholt2018-07-051-5/+7
| | | | | We don't actually set the two flags together, but I want to use the r/g/b/a reordered fields in the next commit.
* st/nir: Disable varying packing when doing transform feedback.Eric Anholt2018-07-051-1/+9
| | | | | | | | | | | | | | | | The varying packing would result in st_nir_assign_var_locations() picking new driver_locations, despite the pipe_stream_output already being set up for the old driver location. This left the gallium driver with no way to work back to what varying was referenced by pipe_stream_output. Fixes these tests on V3D: dEQP-GLES3.functional.transform_feedback.random.separate.points.3 dEQP-GLES3.functional.transform_feedback.random.separate.points.7 dEQP-GLES3.functional.transform_feedback.random.separate.points.9 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.3 dEQP-GLES3.functional.transform_feedback.random.separate.triangles.8 Reviewed-by: Timothy Arceri <[email protected]>
* radv/winsys: make use of radeon_emit()Samuel Pitoiset2018-07-051-11/+12
| | | | | Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only flush CB meta in pipeline image barriers when neededSamuel Pitoiset2018-07-052-2/+15
| | | | | | | | If the given image doesn't enable CMASK, FMASK or DCC that's useless to flush CB metadata. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: only flush DB meta in pipeline image barriers when neededSamuel Pitoiset2018-07-051-7/+15
| | | | | | | | If the given image doesn't have HTILE, that's useless to flush DB metadata. Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv: fix "error: initializer element is not constant" build errorSamuel Pitoiset2018-07-051-2/+2
| | | | | | | | GCC 4.8 fails to compile with "static const", while GCC 8.1 fails to compile with only "static". Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* util: u_queue: fix android build errorLionel Landwerlin2018-07-051-1/+1
| | | | | | | | | mesa/src/util/u_queue.c:242:15: error: address of array 'queue->name' will always evaluate to 'true' [-Werror,-Wpointer-bool-conversion] Fixes: b238e33bc9d48b814370 "kutil/queue: add a process name into a thread name" Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* Util: fix msvc buildBenedikt Schemmer2018-07-051-1/+1
| | | | | | | | The MSVC preprocessor doesnt understand #warning Fixes: 2e1e6511f76 ("util: extract get_process_name from xmlconfig.c") Reviewed-by: Roland Scheidegger <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* python: Specify the JSON separatorsMathieu Bridon2018-07-052-2/+2
| | | | | | | | | | | | | | | On Python 2, the default JSON separators are ', ' for items and ': ' for dicts. On Python 3, the default is the same when no indent is specified, but if one is (and we do specify one) then the default items separator becomes ',' (the dict separator remains unchanged). This change explicitly specifies the Python 3 default, which helps ensuring that the output is identical, whether it was generated by Python 2 or 3. Reviewed-by: Eric Engestrom <[email protected]>
* python: Stabilize some script outputsMathieu Bridon2018-07-055-5/+8
| | | | | | | | | | | | In Python, dictionaries and sets are unordered, and as a result their is no guarantee that running this script twice will produce the same output. Using ordered dicts and explicitly sorting items makes the build more reproducible, and will make it possible to verify that we're not breaking anything when we move the build scripts to Python 3. Reviewed-by: Eric Engestrom <[email protected]>
* intel: tools: remove drm-uapi definesLionel Landwerlin2018-07-051-29/+1
| | | | | | | We already embed the headers, no need to redefine defines/structs. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: intel_dump_gpu: use simulator id in capturesLionel Landwerlin2018-07-052-2/+2
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: devinfo: add simulator idLionel Landwerlin2018-07-052-4/+48
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: dump-gpu: dump 48-bit addressesScott D Phillips2018-07-052-167/+151
| | | | | | | | | | | | | For gen8+, write out PPGTT tables in aub files so that full 48-bit addresses can be serialized. v2: Fix handling of `end` index in map_ppgtt v3: Correctly mark GGTT entry as present (Rafael) Signed-off-by: Scott D Phillips <[email protected]> Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: import intel_aubdumpLionel Landwerlin2018-07-054-0/+1440
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]>
* intel: tools: update intel_aub.hLionel Landwerlin2018-07-051-0/+26
| | | | | | | Scott added new stuff in IGT. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: batch-decoder: add missing return lineLionel Landwerlin2018-07-051-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: batch-decoder: don't asks for constant BO until decodingLionel Landwerlin2018-07-051-6/+11
| | | | | | | | | | With PPGTT mappings, our aubinator implementation can be quite slow if we request a buffer that doesn't exist. Instead of doing a PPGTT walk for invalid addresses (0 lengths), wait until we're sure we want to decode the data. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/batch-decoder: handle non-contiguous binding table / surface stateScott D Phillips2018-07-051-4/+14
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools/aubinator: aubinate ppgtt aubsScott D Phillips2018-07-051-1/+72
| | | | | | | | | | | | v2: by Lionel Fix memfd_create compilation issue Fix pml4 address stored on 32 instead of 64bits Return no buffer if first ppgtt page is not mapped v3: Drop additional memfd_create() (Rafael) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: handle GGTT mappingsLionel Landwerlin2018-07-051-13/+244
| | | | | | | | | | We use memfd to store physical pages as they get read/written to and the GGTT entries translating virtual address to physical pages. Based on a commit by Scott Phillips. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* util: rb-tree: A simple, invasive, red-black treeJason Ekstrand2018-07-054-0/+694
| | | | | | | | | | | | | | | | | | | | | | | | This is a simple, invasive, liberally licensed red-black tree implementation. It's an invasive data structure similar to the Linux kernel linked-list where the intention is that you embed a rb_node struct the data structure you intend to put into the tree. The implementation is mostly based on the one in "Introduction to Algorithms", third edition, by Cormen, Leiserson, Rivest, and Stein. There were a few other key design points: * It's an invasive data structure similar to the [Linux kernel linked list]. * It uses NULL for leaves instead of a sentinel. This means a few algorithms differ a small bit from the ones in "Introduction to Algorithms". * All search operations are inlined so that the compiler can optimize away the function pointer call. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: aubinator: drop the 1Tb GTT mappingLionel Landwerlin2018-07-051-55/+75
| | | | | | | | | | | | | | | Now that we're softpinning the address of our BOs in anv & i965, the addresses selected start at the top of the addressing space. This is a problem for the current implementation of aubinator which uses only a 40bit mmapped address space. This change keeps track of all the memory writes from the aub file and fetch them on request by the batch decoder. As a result we can get rid of the 1<<40 mmapped address space and only rely on the mmap aub file \o/ Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: rework register writes handlingLionel Landwerlin2018-07-051-28/+54
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator: remove standard input processing optionLionel Landwerlin2018-07-051-90/+12
| | | | | | | | | | | On a follow up commit in this series, we stop copying the data from the mmap'ed file into our big gtt mmap, and start referencing data in it directly. So reallocating the read buffer and adding more data from stdin wouldn't work. For that reason, let's stop supporting stdin process. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>