summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* intel/nir: Lowering image loads and stores trashes all metadataJason Ekstrand2018-08-301-2/+2
| | | | | | | | | | | This fixes the GL_ARB_fragment_shader_interlock piglit test on gen8 platforms where the lack of metadata dirtying was causing another pass to accidentally delete a much needed loop. https://bugs.freedesktop.org/show_bug.cgi?id=107745 Fixes: 37f7983bcca1 "intel/compiler: Do image load/store lowering..." Jason Ekstrand <[email protected]> writes: Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Remove surface_idx from brw_image_paramJason Ekstrand2018-08-294-17/+6
| | | | | | | Now that the drivers are lowering to surface indices themselves, we no longer need to push the surface index into the shader. Reviewed-by: Kenneth Graunke <[email protected]>
* intel: Use TXS for image_size when we have a typed surfaceJason Ekstrand2018-08-295-4/+74
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* anv,i965: Lower away image derefs in the driverJason Ekstrand2018-08-297-166/+237
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Previously, the back-end compiler turn image access into magic uniform reads and there was a complex contract between back-end compiler and driver about setting up and filling out those params. As of this commit, both drivers now lower image_deref_load_param_intel intrinsics to load_uniform intrinsics controlled by the driver and lower the other image_deref_* intrinsics to image_* intrinsics which take an actual binding table index. There are still "magic" uniforms but they are now added and controlled entirely by the driver and that contract no longer spans components. This also has the side-effect of making most image use compile-time binding table indices. Previously, all image access pulled the binding table index from a uniform. Part of the reason for this was that the magic uniforms made it difficult to decouple binding table indices from the uniforms and, since they are indexed completely differently (especially in Vulkan), it was hard to pull them apart. Now that the driver is handling both, it's trivial to decouple the two and provide actual binding table indices. Shader-db results on Kaby Lake: total instructions in shared programs: 15166872 -> 15164293 (-0.02%) instructions in affected programs: 115834 -> 113255 (-2.23%) helped: 191 HURT: 0 total cycles in shared programs: 571311495 -> 571196465 (-0.02%) cycles in affected programs: 4757115 -> 4642085 (-2.42%) helped: 73 HURT: 67 total spills in shared programs: 10951 -> 10926 (-0.23%) spills in affected programs: 742 -> 717 (-3.37%) helped: 7 HURT: 0 total fills in shared programs: 22226 -> 22201 (-0.11%) fills in affected programs: 1146 -> 1121 (-2.18%) helped: 7 HURT: 0 Reviewed-by: Kenneth Graunke <[email protected]>
* nir: Use a bitfield for image access qualifiersJason Ekstrand2018-08-292-2/+3
| | | | | | | | | | This commit expands the current memory access enum to contain the extra two bits provided for images. We choose to follow the SPIR-V convention of NonReadable and NonWriteable because readonly implies that you *can* read so readonly + writeonly doesn't make as much sense as NonReadable + NonWriteable. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Use two components for 1D array image sizesJason Ekstrand2018-08-292-29/+20
| | | | | | | | | | Having the array length component stored in .z was a small convenience for the ISL image param filling code and an annoyance in the NIR lowering code. The only convenience of treating 1D arrays like 2D arrays in the lowering code is in the address calculation code so let's put all the complexity there as well. Reviewed-by: Kenneth Graunke <[email protected]>
* isl: Use the view array length for the image sizeJason Ekstrand2018-08-291-2/+5
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* intel/compiler: Do image load/store lowering to NIRJason Ekstrand2018-08-298-1120/+885
| | | | | | | | | | | | | | | | | | | | | | | This commit moves our storage image format conversion codegen into NIR instead of doing it in the back-end. This has the advantage of letting us run it through NIR's optimizer which is pretty effective at shrinking things down. In the common case of rgba8, the number of instructions emitted after NIR is done with it is half of what it was with the lowering happening in the back-end. On the downside, the back-end's lowering is able to directly use predicates and the NIR lowering has to use IFs. Shader-db results on Kaby Lake: total instructions in shared programs: 15166910 -> 15166872 (<.01%) instructions in affected programs: 5895 -> 5857 (-0.64%) helped: 15 HURT: 0 Clearly, we don't have that much image_load_store happening in the shaders in shader-db.... Reviewed-by: Kenneth Graunke <[email protected]>
* anv/pipeline: Remove dead image loads in lower_input_attacnmentsJason Ekstrand2018-08-291-2/+2
| | | | | | | Dead code will get rid of them eventually but it's better if they're just gone so we guarantee they won't trip up later passes. Reviewed-by: Kenneth Graunke <[email protected]>
* nir/format_convert: Rename nir_format_bitcast_uint_vecJason Ekstrand2018-08-291-1/+1
| | | | | | | | We have a name for that, it's called a uvec. This just makes the function name a bit shorter. While we're here, we also add an assert for one of the assumptions this function makes. Reviewed-by: Kenneth Graunke <[email protected]>
* intel/tools: new i965_disasm toolSagar Ghuge2018-08-293-0/+207
| | | | | | | | | | | | | | Adds a new i965 instruction disassemble tool v2: 1) fix a few nits (Matt Turner) 2) Remove i965_disasm header (Matt Turner) v3: 1) Redirect output to correct file descriptors (Matt Turner) 2) Refactor code (Matt Turner) 3) Use better formatting style (Matt Turner) Reviewed-by: Matt Turner <[email protected]> Signed-off-by: Sagar Ghuge <[email protected]>
* anv: Free the app and engine nameJason Ekstrand2018-08-291-0/+3
| | | | | Fixes: 8c048af5890d4 "anv: Copy the appliation info into the instance" Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: blorp: support multiple aspect blitsLionel Landwerlin2018-08-291-70/+75
| | | | | | | | | | Newer blit tests are enabling depth&stencils blits. We currently don't support it but can do by iterating over the aspects masks (copy some logic from the CopyImage function). Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 9f44745eca0e41 ("anv: Use blorp to implement VkBlitImage") Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1Ian Romanick2018-08-281-3/+16
| | | | | | | No shader-db changes on any Intel platform. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for imageAtomicAdd of +1 or -1Ian Romanick2018-08-281-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | v2: Refactor selection of atomic opcode to a separate function. Suggested by Jason. No changes on any other Intel platforms. Skylake total instructions in shared programs: 14304261 -> 14304241 (<.01%) instructions in affected programs: 1625 -> 1605 (-1.23%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5 helped stats (rel) min: 1.01% max: 14.29% x̄: 5.86% x̃: 4.07% 95% mean confidence interval for instructions value: -10.66 0.66 95% mean confidence interval for instructions %-change: -15.91% 4.19% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 527531226 -> 527531194 (<.01%) cycles in affected programs: 92204 -> 92172 (-0.03%) helped: 2 HURT: 0 Haswell and Broadwell had similar results. (Broadwell shown) total instructions in shared programs: 14615730 -> 14615710 (<.01%) instructions in affected programs: 1838 -> 1818 (-1.09%) helped: 4 HURT: 0 helped stats (abs) min: 1 max: 8 x̄: 5.00 x̃: 5 helped stats (rel) min: 0.89% max: 13.04% x̄: 5.37% x̃: 3.78% 95% mean confidence interval for instructions value: -10.66 0.66 95% mean confidence interval for instructions %-change: -14.59% 3.85% Inconclusive result (value mean confidence interval includes 0). Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/fs: Refactor image atomics to be a bit more like other atomicsIan Romanick2018-08-281-40/+44
| | | | | | | This greatly simplifies the next patch. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* i965/fs: Emit BRW_AOP_INC or BRW_AOP_DEC for atomicAdd of +1 or -1Ian Romanick2018-08-281-4/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Funny story... a single shader was hurt for instructions, spills, fills. That same shader was also the most helped for cycles. #GPUsAreWeird No changes on any other Intel platform. v2: Refactor selection of atomic opcode to a separate function. Suggested by Jason. Haswell, Broadwell, and Skylake had similar results. (Skylake shown) total instructions in shared programs: 14304116 -> 14304261 (<.01%) instructions in affected programs: 12776 -> 12921 (1.13%) helped: 19 HURT: 1 helped stats (abs) min: 1 max: 16 x̄: 2.32 x̃: 1 helped stats (rel) min: 0.05% max: 7.27% x̄: 0.92% x̃: 0.55% HURT stats (abs) min: 189 max: 189 x̄: 189.00 x̃: 189 HURT stats (rel) min: 4.87% max: 4.87% x̄: 4.87% x̃: 4.87% 95% mean confidence interval for instructions value: -12.83 27.33 95% mean confidence interval for instructions %-change: -1.57% 0.31% Inconclusive result (value mean confidence interval includes 0). total cycles in shared programs: 527552861 -> 527531226 (<.01%) cycles in affected programs: 1459195 -> 1437560 (-1.48%) helped: 16 HURT: 2 helped stats (abs) min: 2 max: 21328 x̄: 1353.69 x̃: 6 helped stats (rel) min: 0.01% max: 5.29% x̄: 0.36% x̃: 0.03% HURT stats (abs) min: 12 max: 12 x̄: 12.00 x̃: 12 HURT stats (rel) min: 0.03% max: 0.03% x̄: 0.03% x̃: 0.03% 95% mean confidence interval for cycles value: -3699.81 1295.92 95% mean confidence interval for cycles %-change: -0.94% 0.30% Inconclusive result (value mean confidence interval includes 0). total spills in shared programs: 8025 -> 8033 (0.10%) spills in affected programs: 208 -> 216 (3.85%) helped: 1 HURT: 1 total fills in shared programs: 10989 -> 11040 (0.46%) fills in affected programs: 444 -> 495 (11.49%) helped: 1 HURT: 1 Ivy Bridge total instructions in shared programs: 11709181 -> 11709153 (<.01%) instructions in affected programs: 3505 -> 3477 (-0.80%) helped: 3 HURT: 0 helped stats (abs) min: 1 max: 23 x̄: 9.33 x̃: 4 helped stats (rel) min: 0.11% max: 1.16% x̄: 0.63% x̃: 0.61% total cycles in shared programs: 254741126 -> 254738801 (<.01%) cycles in affected programs: 919067 -> 916742 (-0.25%) helped: 3 HURT: 0 helped stats (abs) min: 21 max: 2144 x̄: 775.00 x̃: 160 helped stats (rel) min: 0.03% max: 0.90% x̄: 0.32% x̃: 0.03% total spills in shared programs: 4536 -> 4533 (-0.07%) spills in affected programs: 40 -> 37 (-7.50%) helped: 1 HURT: 0 total fills in shared programs: 4819 -> 4813 (-0.12%) fills in affected programs: 94 -> 88 (-6.38%) helped: 1 HURT: 0 Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]> [v1] Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: Silence unused parameter warnings in brw_eu.hIan Romanick2018-08-281-1/+1
| | | | | | | | | | | | | | | | | All of the other brw_*_desc functions take a devinfo parameter, and all of the others at least have an assert that uses it. Keep the parameter, but mark it as unused. Silences 37 warnings like: In file included from src/intel/common/gen_disasm.c:27:0: src/intel/compiler/brw_eu.h: In function ‘brw_pixel_interp_desc’: src/intel/compiler/brw_eu.h:377:53: warning: unused parameter ‘devinfo’ [-Wunused-parameter] brw_pixel_interp_desc(const struct gen_device_info *devinfo, ^~~~~~~ Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* anv: Claim to support depthBounds for ID gamesJason Ekstrand2018-08-281-0/+9
| | | | | Cc: "18.2" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Copy the appliation info into the instanceJason Ekstrand2018-08-282-8/+30
| | | | | Cc: "18.2" <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Add INTEL_fragment_shader_ordering support.Kevin Rogovin2018-08-281-0/+1
| | | | | | | | | | Adds suppport for INTEL_fragment_shader_ordering. We achieve the fragment ordering by using the same instruction as for beginInvocationInterlockARB() which is by issuing a memory fence via sendc. Signed-off-by: Kevin Rogovin <[email protected]> Reviewed-by: Plamena Manolova <[email protected]>
* intel/eu: print bytes instead of 32 bit hex valueSagar Ghuge2018-08-271-17/+30
| | | | | | | | | | | | | | | INTEL_DEBUG=hex prints 32 bit hex value and due to endianness of CPU byte order is reversed. In order to disassemble binary files, print each byte instead of 32 bit hex value. v2: Print blank spaces in order to vertically align output of compacted instructions hex value with uncompacted instructions hex value. (Matt Turner) v3: Fix line wrap at correct length Signed-off-by: Sagar Ghuge <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: decoder: handle 0 sized structsLionel Landwerlin2018-08-271-4/+8
| | | | | | | | | | Gen7.5 has a BLEND_STATE of size 0 which includes a variable length group. We did not deal with that very well, leading to an endless loop. Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Add support for protected memory properties on ↵Samuel Iglesias Gonsálvez2018-08-271-0/+7
| | | | | | | | | | | anv_GetPhysicalDeviceProperties2() VkPhysicalDeviceProtectedMemoryProperties structure is new on Vulkan 1.1. Fixes Vulkan CTS CL#2849. Signed-off-by: Samuel Iglesias Gonsálvez <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools: Add 0x in front of a couple of hex valuesJason Ekstrand2018-08-251-2/+2
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* anv: Fill holes in the VF VUE to zeroJason Ekstrand2018-08-251-1/+28
| | | | | | | | This fixes a GPU hang in DOOM 2016 running under wine. Cc: [email protected] Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104809 Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: tools: Fix aubinator_error's fprintf call (format-security)Kai Wasserbäch2018-08-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | | The recent commit 4616639b49b4bbc91e503c1c27632dccc1c2b5be introduced the new function aubinator_error() which is a trivial wrapper around fprintf() to STDERR. The call to fprintf() however is passed the message msg directly: fprintf(stderr, msg); This is a format-security violation and leads to an FTBFS with -Werror=format-security (GCC 8): ../../../src/intel/tools/aubinator.c: In function 'aubinator_error': ../../../src/intel/tools/aubinator.c:74:4: error: format not a string literal and no format arguments [-Werror=format-security] fprintf(stderr, msg); ^~~~~~~ This patch fixes this trivially by introducing a catch-all "%s" format argument. Fixes: 4616639b49b ("intel: tools: split aub parsing from aubinator") Cc: Lionel Landwerlin <[email protected]> Signed-off-by: Kai Wasserbäch <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch_decoder: Print blend states properlyJason Ekstrand2018-08-251-1/+16
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/batch_decoder: Fix dynamic state printingJason Ekstrand2018-08-251-2/+2
| | | | | | | | | Instead of printing addresses like everyone else, we were accidentally printing the offset from state base address. Also, state_map is a void pointer so we were incrementing in bytes instead of dwords and every state other than the first was wrong. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Print ISL formats for vertex elementsJason Ekstrand2018-08-251-1/+2
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Clean up field iteration and fix sub-dword fieldsJason Ekstrand2018-08-251-16/+16
| | | | | | | | | | | | | First of all, setting iter->name in advance_field is unnecessary because it gets set by gen_decode_field which gets called immediately after gen_decode_field in the one call-site. Second, we weren't properly initializing start_bit and end_bit in the initial condition of gen_field_iterator_next so the first field of a struct would get printed wrong if it doesn't start on the first bit. This is fixed by adding a iter_start_field helper which sets the field and also sets up the other bits we need. This fixes decoding of 3DSTATE_SBE_SWIZ. Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: decoder: unify MI_BB_START field namingLionel Landwerlin2018-08-243-7/+7
| | | | | | | | | | | The batch decoder looks for a field with a particular name to decide whether an MI_BB_START leads into a second batch buffer level. Because the names are different between Gen7.5/8 and the newer generation we fail that test and keep on reading (invalid) instructions. Signed-off-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107544 Reviewed-by: Jason Ekstrand <[email protected]>
* Revert "configure: allow building with python3"Emil Velikov2018-08-243-7/+7
| | | | | | | | | | | | | | This reverts commit ae7898dfdbe5c8dab7d11c71862353f1ae43feb0. Turns out the python scripts are _not_ fully python 3 compatible. As Ilia reported using get_xmlpool.py with LANG=C produces some weird output - see the link for details. Even though the issue was spotted with the autoconf build, it exposes a genuine problem with the script (and lack of lang handling of the meson build.) https://lists.freedesktop.org/archives/mesa-dev/2018-August/203508.html
* intel/nir: Enable nir_opt_find_array_copiesJason Ekstrand2018-08-232-13/+28
| | | | | | | | | | | | | | | | | | | | | | | We have to be a bit careful with this one because we want it to run in the optimization loop but only in the first brw_nir_optimize call. Later calls assume that we've lowered away copy_deref instructions and we don't want to introduce any more. Shader-db results on Kaby Lake: total instructions in shared programs: 15176942 -> 15176942 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 In spite of the lack of any shader-db improvement, this patch completely eliminates spilling in the Batman: Arkham City tessellation shaders. This is because we are now able to detect that the temporary array created by DXVK for storing TCS inputs is a copy of the input arrays and use indirect URB reads instead of making a copy of 4.5 KiB of input data and then indirecting on it with if-ladders. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/nir: Use nir_shrink_vec_array_varsJason Ekstrand2018-08-231-0/+1
| | | | | | | | | | | | | | | | | | | | | | | | | Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15176765 (<.01%) instructions in affected programs: 4259 -> 3419 (-19.72%) helped: 1 HURT: 0 total spills in shared programs: 10954 -> 10855 (-0.90%) spills in affected programs: 295 -> 196 (-33.56%) helped: 1 HURT: 0 total fills in shared programs: 22222 -> 22117 (-0.47%) fills in affected programs: 417 -> 312 (-25.18%) helped: 1 HURT: 0 The helped shader is from the OglCSDof synmark test. On my Kaby Lake laptop, the actual framerate of the benchmark didn't appear to improve beyond the noise. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/nir: Use the new structure and array splitting passesJason Ekstrand2018-08-231-0/+2
| | | | | | | | | | | | | | | | | | | | | | We call structure splitting once because it is guaranteed to split all the structures in the entire shader in one go. We call array splitting in the loop in case future optimizations turn indirects into direct dereferences and we can split more arrays. Shader-db results on Kaby Lake: total instructions in shared programs: 15177605 -> 15177605 (0.00%) instructions in affected programs: 0 -> 0 helped: 0 HURT: 0 This is unsurprising because nir_lower_vars_to_ssa already effectively does structure and array splitting internally. It doesn't actually split the variables but it's ability to reason about aliasing in the presence of arrays and structures and pick out scalars or vectors to be lowered to SSA values is fairly advanced. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/decoder: Decode SFIXED values.Kenneth Graunke2018-08-231-3/+7
| | | | | | This lets us example SAMPLER_STATE's LOD Bias field, among other things. Reviewed-by: Lionel Landwerlin <[email protected]>
* configure: allow building with python3Emil Velikov2018-08-233-7/+7
| | | | | | | | | | | | Pretty much all of the scripts are python2+3 compatible. Check and allow using python3, while adjusting the PYTHON2 refs. Note: - python3.4 is used as it's the earliest supported version - python3 chosen prior to python2 Signed-off-by: Emil Velikov <[email protected]> Acked-by: Eric Engestrom <[email protected]>
* intel/compiler: Implement untyped atomic float min, max, and compare-swap ↵Ian Romanick2018-08-2214-1/+261
| | | | | | | | | | dataport messages v2: Split changes to the message type field to another patch. Suggested by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Expand untyped atomic message type field by a bitIan Romanick2018-08-223-4/+9
| | | | | | | | | | | This is necessary for a new Gen9 message type that will be added in the next patch. There are also Gen8 message types that need the extra bit (mostly for bindless). v2: Split off from the next patch. Suggested by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/compiler: Silence unused parameter warningsIan Romanick2018-08-225-7/+5
| | | | | | | | | | | | | | | | | | | | | | | | src/intel/compiler/brw_disasm_info.c: In function ‘nir_print_instr’: src/intel/compiler/brw_disasm_info.c:30:61: warning: unused parameter ‘instr’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~~~~ src/intel/compiler/brw_disasm_info.c:30:74: warning: unused parameter ‘fp’ [-Wunused-parameter] __attribute__((weak)) void nir_print_instr(const nir_instr *instr, FILE *fp) {} ^~ src/intel/compiler/brw_disasm.c: In function ‘src_ia1’: src/intel/compiler/brw_disasm.c:850:18: warning: unused parameter ‘_reg_file’ [-Wunused-parameter] unsigned _reg_file, ^~~~~~~~~ src/intel/compiler/brw_fs_surface_builder.cpp: In function ‘void brw::surface_access::emit_byte_scattered_write(const brw::fs_builder&, const fs_reg&, const fs_reg&, const fs_reg&, unsigned int, unsigned int, unsigned int, brw_predicate)’: src/intel/compiler/brw_fs_surface_builder.cpp:193:57: warning: unused parameter ‘size’ [-Wunused-parameter] unsigned dims, unsigned size, ^~~~ v2: Update commit message. brw_fs_generator.cpp warnings were already fixed by another patch. Noticed by Caio. Signed-off-by: Ian Romanick <[email protected]> Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/isl: Avoid tiling some 16K-wide render targetsNanley Chery2018-08-221-0/+23
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Fix rendering issues on BDW and SKL. Fixes: 0288fe8d0417730bdd5b3477130dd1dc32bdbcd3 ("i965/miptree: Use the correct BLT pitch") Fixes the following regressions seen exclusively on SKL: * KHR-GL46.texture_barrier_ARB.disjoint-texels * KHR-GL46.texture_barrier_ARB.overlapping-texels * KHR-GL46.texture_barrier.disjoint-texels * KHR-GL46.texture_barrier.overlapping-texels and both on BDW and SKL: * GTF-GL46.gtf21.GL2FixedTests.buffer_corners.buffer_corners * GTF-GL46.gtf21.GL2FixedTests.stencil_plane_corners.stencil_plane_corners v2: Note the fixed tests (Andres). Don't cause failures with multisampled buffers (Andres). Don't hamper SKL GT4 (Ken). v3: Fix the Fixes tag (Dylan). Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107359 Cc: <[email protected]> Tested-by: Andres Gomez <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/tools/aubwrite: Always use physical addresses for traces.Rafael Antognolli2018-08-222-11/+13
| | | | | | | | | | | | | | | It looks like we can't rely on the simulator to always translate virtual addresses to physical ones correctly. So let's use physical everywhere. Since our current GGTT maps virtual to physical addresses in a 1:1 way, no further changes are required. Additionally, we have other address spaces not in use right now. So let's make it easier to switch which one we are using but putting the default one into the aub_file struct. Cc: Lionel Landwerlin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/tools/aubwrite: Rename "legacy" to "Trace Block".Rafael Antognolli2018-08-221-1/+1
| | | | | | | Hopefully it's a little more descriptive, and more accurate. Cc: Lionel Landwerlin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: aubinator_viewer: add urb viewLionel Landwerlin2018-08-223-0/+172
| | | | | | | | | | This is available through a "Show URB" button on the 3DPRIMITIVE instructions. v2: Fix urb allocation end value in tooltip (Rafael) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: aubinator_viewer: store urb state during decodingLionel Landwerlin2018-08-222-23/+153
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: add aubinator viewerLionel Landwerlin2018-08-226-0/+2788
| | | | | | | | | | | | | | | | | A graphical user interface version of aubinator. Allows you to : - simultaneously look at multiple points in the aub file (using all the goodness of the existing decoding in aubinator) - edit an aub file v2: Switch from GLFW to GTK+3 v3: Fix warning when exiting Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]> (v1)
* intel: tools: import ImGuiLionel Landwerlin2018-08-2219-2/+31693
| | | | | | | | | | | | | | | | | | | | | | | | | | | | We want to add a new UI tool to decode aub files. This will use the Dear ImGui library to render its interface. The build of this UI toolkit is conditional to -Dwith_tools=intel-ui which superseeds -Dwith_tools=intel. The main way to use ImGui is to embed its source code at a particular revision. Most embedding projects have to do a bit of integration which is really specific to one's project. In our case the only modification is to include libepoxy. We also choose to use Gtk+3 for the window system integration. As oppose to the previous previous version of this patch using GLFW, Gtk+ is able to handle X11/Wayland session as well as property DPI scaling on retina monitors. The import was done at this commit (https://github.com/ocornut/imgui) : commit 6211f40f3d903dd9df961256e044029c49793aa3 Author: omar <[email protected]> Date: Fri Jul 27 12:29:33 2018 +0200 Internals: Drag and Drop: default drop preview use a narrower clipping rectangle (no effect here, but other branches uses a narrow clipping rectangle that was too small so this is a fix for it) + Comments v2: Switch from GLFW to GTK+ (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Rafael Antognolli <[email protected]>
* intel: tools: aub_mem: reuse already mapped ppgtt buffersLionel Landwerlin2018-08-221-5/+11
| | | | | | | | | | | | | | | | | | | | When we map a PPGTT buffer into a continous address space of aubinator to be able to inspect it, we currently add it to the list of BOs to unmap once we're finished. An optimization we can apply it to look up that list before trying to remap PPGTT buffers again (we already do this for GGTT buffers). We need to take some care before doing this because the list also contains GGTT BOs. As GGTT & PPGTT are 2 different address spaces, we can have matching addresses in both that point to different physical locations. This changes adds a flag on the elements of the list of mapped BOs to differenciate between GGTT & PPGTT, which allows use to reuse that list when looking up both address spaces. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel: tools: aubmem: map gtt data to aub fileLionel Landwerlin2018-08-222-0/+35
| | | | | | | | This will allow the aubinator viewer tool to modify the aub data that was loaded at a particular gtt address. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>