summaryrefslogtreecommitdiffstats
path: root/src/mesa/drivers/dri
Commit message (Collapse)AuthorAgeFilesLines
* i965: perf: update render basic configs for big core gen9/gen10Lionel Landwerlin2019-04-018-23/+24
| | | | | | | | | This updates allows an MI_LRI to trigger a OA report write in the global OA buffer. This isn't really useful for us, we just keep close to the internal public configs. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add ring busyness metric for cfl gt2Lionel Landwerlin2019-04-011-1/+165
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: enable Icelake metricsLionel Landwerlin2019-03-313-3/+11
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add Icelake metricsLionel Landwerlin2019-03-311-0/+11899
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: drop programming of an unused NOA registerLionel Landwerlin2019-03-311-11/+6
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: hsw: drop register programming not needed on HSWLionel Landwerlin2019-03-311-2/+1
| | | | | | | This register is flagged as IVB only in the documentation. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: chv: fixup counters namesLionel Landwerlin2019-03-311-25/+25
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: add PMA stall metricsLionel Landwerlin2019-03-3110-10/+1140
| | | | | | | | These are new metrics for Gen8/9 to measure the effect of the PMA stall workaround fix. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update memory write configLionel Landwerlin2019-03-311-7/+49
| | | | | | | | This rework the programming between older pre-production steppings & new ones. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update compute metrics configLionel Landwerlin2019-03-311-8/+2
| | | | | | | | This unifies some of the programming between pre-production stepping and production ones. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: perf: sklgt2: update a priority for register programmingLionel Landwerlin2019-03-311-2/+2
| | | | | | | This makes no difference in term of programming, it's just a cleanup. Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965,iris/blorp: do not blit 0-sizesSergii Romantsov2019-03-301-1/+9
| | | | | | | | | | | | | | | | | Seems there is no sense in blitting 0-sized sources or destinations. Additionaly it may cause segfaults for i965. v2: Function call replaced with inline check v3: Added check to avoid devision by zero (L. Landwerlin) v4: Added simillar check for Iris (L. Landwerlin) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110239 Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965/blorp: Remove unused parameter from blorp_surf_for_miptree.Rafael Antognolli2019-03-281-24/+12
| | | | | | It seems pretty useless nowadays. Reviewed-by: Jason Ekstrand <[email protected]>
* i965,iris,anv: Make alpha to coverage work with sample maskDanylo Piliaiev2019-03-251-6/+12
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | From "Alpha Coverage" section of SKL PRM Volume 7: "If Pixel Shader outputs oMask, AlphaToCoverage is disabled in hardware, regardless of the state setting for this feature." From OpenGL spec 4.6, "15.2 Shader Execution": "The built-in integer array gl_SampleMask can be used to change the sample coverage for a fragment from within the shader." From OpenGL spec 4.6, "17.3.1 Alpha To Coverage": "If SAMPLE_ALPHA_TO_COVERAGE is enabled, a temporary coverage value is generated where each bit is determined by the alpha value at the corresponding sample location. The temporary coverage value is then ANDed with the fragment coverage value to generate a new fragment coverage value." Similar wording could be found in Vulkan spec 1.1.100 "25.6. Multisample Coverage" Thus we need to compute alpha to coverage dithering manually in shader and replace sample mask store with the bitwise-AND of sample mask and alpha to coverage dithering. The following formula is used to compute final sample mask: m = int(16.0 * clamp(src0_alpha, 0.0, 1.0)) dither_mask = 0x1111 * ((0xfea80 >> (m & ~3)) & 0xf) | 0x0808 * (m & 2) | 0x0100 * (m & 1) sample_mask = sample_mask & dither_mask Credits to Francisco Jerez <[email protected]> for creating it. It gives a number of ones proportional to the alpha for 2, 4, 8 or 16 least significant bits of the result. GEN6 hardware does not have issue with simultaneous usage of sample mask and alpha to coverage however due to the wrong sending order of oMask and src0_alpha it is still affected by it. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109743 Signed-off-by: Danylo Piliaiev <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* android: static link with libexpat with Android O+Kishore Kadiyala2019-03-251-1/+9
| | | | | | | | | | | | | In Android O, MESA needs to statically link libexpat so that it's in same VNDK namespace. v2: apply change also to anv driver (Tapani) v3: use += in anv change (Eric Engestrom) Change-Id: I82b0be5c817c21e734dfdf5bfb6a9aa1d414ab33 Signed-off-by: Kishore Kadiyala <[email protected]> Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* i965/icl: Add WA_2204188704 to disable pixel shader panic dispatchAnuj Phogat2019-03-192-0/+10
| | | | | | Signed-off-by: Anuj Phogat <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Stop setting LowerBuferInterfaceBlocksJason Ekstrand2019-03-151-0/+4
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Instead, we do UBO and SSBO deref lowering in NIR after we've given it a chance to optimize SSBO access: Shader-db results on Kaby Lake: total instructions in shared programs: 15235775 -> 15235484 (<.01%) instructions in affected programs: 14992 -> 14701 (-1.94%) helped: 19 HURT: 20 total cycles in shared programs: 339220331 -> 339027307 (-0.06%) cycles in affected programs: 79831981 -> 79638957 (-0.24%) helped: 540 HURT: 602 total loops in shared programs: 4402 -> 4348 (-1.23%) loops in affected programs: 186 -> 132 (-29.03%) helped: 27 HURT: 0 total spills in shared programs: 23261 -> 23234 (-0.12%) spills in affected programs: 38 -> 11 (-71.05%) helped: 1 HURT: 0 total fills in shared programs: 31442 -> 31371 (-0.23%) fills in affected programs: 98 -> 27 (-72.45%) helped: 1 HURT: 0 LOST: 12 GAINED: 12 Most of the help and hurt in instruction counts was just churn caused by re-ordering of optimizations and the fact that the NIR deref lowering code is emitting slightly different instructions. Nothing was hurt by more than three instructions and most things weren't helped by more than four. The primary exception to this is one Car Chase shader: shaders/non-free/gfxbench4/carchase/341.shader_test CS SIMD32: 1144 -> 821 (-28.23%) There is also one compute shader in Manhattan 3.1 and a fragment shader in the UE4 Shooter Game demo that now get a loop partially unrolled. Those showed up in the results as hurt instructions but were manually removed to get the results above. The lost/gained was a dozen Car Chase shaders that went from SIMD8 to SIMD16 thanks to improved register pressure: shaders/non-free/gfxbench4/carchase/366.shader_test CS shaders/non-free/gfxbench4/carchase/368.shader_test CS shaders/non-free/gfxbench4/carchase/370.shader_test CS shaders/non-free/gfxbench4/carchase/372.shader_test CS shaders/non-free/gfxbench4/carchase/376.shader_test CS shaders/non-free/gfxbench4/carchase/378.shader_test CS shaders/non-free/gfxbench4/carchase/380.shader_test CS shaders/non-free/gfxbench4/carchase/382.shader_test CS shaders/non-free/gfxbench4/carchase/384.shader_test CS shaders/non-free/gfxbench4/carchase/388.shader_test CS shaders/non-free/gfxbench4/carchase/4.shader_test CS shaders/non-free/gfxbench4/carchase/6.shader_test CS Given how much it appeared to be improved, I ran Car Chase on my laptop. Unfortunately, I wasn't able to see any measurable improvement. It might be helped by 1-2% but it's in the noise. It does render correctly as far as I can tell so the improvement is legitimate. All of the loops that got delete were in dolphin uber shaders. I've had no opportunity to test them for correctness or performance. Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* mesa: rename logging functions to reflect that they format stringsMark Janes2019-03-146-43/+43
| | | | | | | In preparation for the definition of a function to log a formatted string. Reviewed-by: Erik Faye-Lund <[email protected]>
* i965: Disable ARB_fragment_shader_interlock for platforms prior to GEN9Plamena Manolova2019-03-141-1/+24
| | | | | | | | | | | ARB_fragment_shader_interlock depends on memory fences to ensure fragment ordering and this ordering guarantee is only supported from GEN9 onwards. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109980 Fixes: 939312702e35 "i965: Add ARB_fragment_shader_interlock support." Signed-off-by: Plamena Manolova <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: remove scaling factors from P010, P012Tapani Pälli2019-03-141-2/+2
| | | | | | | | | | | | | | | | | | Patch removes scaling factors introduced in 2a2e69f975b but leaves option to use scaling in place as it could be useful with other upcoming YUV formats. We did this scaling because ffmpeg was shifting channel bits down, however it seems this is not the right place as compositor wants to flip same buffers directly to display as well and therefore bitshifting needs to be done by the client when receiving frame from ffmpeg. Now P0x formats are treated the same, e.g. P010 is same as P016 but with lower 6 bits set to zeros. Fixes: 2a2e69f975b "i965: add P0x formats and propagate required scaling factors" Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Reimplement all the PIPE_CONTROL rules.Kenneth Graunke2019-03-111-136/+403
| | | | | | | | | | | | | | | | | | | | | | | | | | This implements virtually all documented PIPE_CONTROL restrictions in a centralized helper. You now simply ask for the operations you want, and the pipe control "brain" will figure out exactly what pipe controls to emit to make that happen without tanking your system. The hope is that this will fix some intermittent flushing issues as well as GPU hangs. However, it also has a high risk of causing GPU hangs and other regressions, as this is a particularly sensitive area and poking the bear isn't always advisable. Mark Janes noted that this patch helps with some GPU hangs on Icelake. This does re-enable the VF Invalidate => Write Immediate workaround on Gen8, which had been disabled (bug 103787) due to GPU hangs. The old code did this workaround after another which would have added CS stall bits, so it missed a workaround. The new code orders them properly and appears to work. v4: Don't pass "bo, offset, imm" to a recursive CS stall (caught by Topi Pohjolainen), drop Gen10 workarounds that are unnecessary for production hardware. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Use genxml for emitting PIPE_CONTROL.Kenneth Graunke2019-03-117-230/+362
| | | | | | | | | | | While this does add a bunch of boilerplate, it also protects us against the hardware moving bits, or changing their meaning. For something as finnicky as PIPE_CONTROL, the extra safety seems worth it. We turn PIPE_CONTROL_* into an bitfield of arbitrary flags, and then pack them appropriately. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Rename ISP_DIS to INDIRECT_STATE_POINTERS_DISABLE.Kenneth Graunke2019-03-112-2/+2
| | | | | | Clearer name. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move some genX infrastructure to genX_boilerplate.h.Kenneth Graunke2019-03-114-128/+174
| | | | | | | This will let us make multiple genX_*.c files, without copy and pasting all this boilerplate. Reviewed-by: Topi Pohjolainen <[email protected]>
* isl: Add a swizzle parameter to isl_buffer_fill_state()Kenneth Graunke2019-03-071-0/+1
| | | | | | | This is necessary for legacy texture buffer object formats, where we'll need to use a swizzle to fake e.g. luminance. Reviewed-by: Jason Ekstrand <[email protected]>
* intel/decoders: handle decoding MI_BBS from ringLionel Landwerlin2019-03-071-1/+1
| | | | | | | | | An MI_BATCH_BUFFER_START in the ring buffer acts as a second level batchbuffer (aka jump back to ring buffer when running into a MI_BATCH_BUFFER_END). Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* intel/decoders: add address space indicator to get BOsLionel Landwerlin2019-03-071-1/+1
| | | | | | | Some commands like MI_BATCH_BUFFER_START have this indicator. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965: stop calling nir_lower_returns()Timothy Arceri2019-03-061-3/+1
| | | | | | We now call this for all drivers in glsl_to_nir() instead. Reviewed-by: Eric Anholt <[email protected]>
* glsl: use NIR function inlining for drivers that use glsl_to_nir()Timothy Arceri2019-03-061-1/+1
| | | | | | | | glsl_to_nir() is still missing support for converting certain functions to NIR, so for those we use the GLSL IR optimisations to remove the functions. Reviewed-by: Eric Anholt <[email protected]>
* nir/lower_doubles: Inline functions directly in lower_doublesJason Ekstrand2019-03-061-4/+4
| | | | | | | | | | | | Instead of trusting the caller to already have created a softfp64 function shader and added all its functions to our shader, we simply take the softfp64 shader as an argument and do the function inlining ouselves. This means that there's no more nasty functions lying around that the caller needs to worry about cleaning up. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* glsl/nir: Add a shared helper for building float64 shadersJason Ekstrand2019-03-062-50/+3
| | | | | | Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Compile the fp64 program based on nir optionsJason Ekstrand2019-03-061-1/+2
| | | | | | | | | | Instead of looking the devinfo directly, look at the lowering options we provided to NIR. This is more accurate as it's now checking for "do we need full software lowering" rather than a hardware bit. Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* nir: rename glsl_type_is_struct() -> glsl_type_is_struct_or_ifc()Timothy Arceri2019-03-061-2/+2
| | | | | | | | | | Replace done using: find ./src -type f -exec sed -i -- \ 's/glsl_type_is_struct(/glsl_type_is_struct_or_ifc(/g' {} \; Acked-by: Karol Herbst <[email protected]> Acked-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* i965: Implement threaded GL support.Kenneth Graunke2019-03-053-0/+51
| | | | | | | | | | | | | | | | | | | Now i965 supports mesa_glthread=true like Gallium drivers do. According to Markus (degasus), the Citra emulator now runs ~30% faster. Emmanuel (linkmauve) also reported that the Dolphin emulator improved by 2.8x on one game. (Both of those still need to be added to drirc.) An Intel Mesa CI run with mesa_glthread=true appears to be happy. Bioshock Infinite's benchmark mode seems to be around 15-20% faster on my Skylake GT4 at 1920x1080. Tested-by: Markus Wick <[email protected]> Tested-by: Emmanuel Gil Peyrot <[email protected]> Tested-by: Kenneth Graunke <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* i965: Fix allow_higher_compat_version workaround limited by OpenGL 3.0Yevhenii Kolesnikov2019-02-281-6/+12
| | | | | | | | | | | | Added check for higher compat profile being allowed before assigning certain extensions. Fixes: 272fe9494232 (mesa: enable ARB_texture_buffer_* extensions in the Compatibility profile) Signed-off-by: Danylo Piliaiev <[email protected]> Signed-off-by: Yevhenii Kolesnikov <[email protected]> Reviewed-by: Timothy Arceri <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=107052
* i965: fixed clamping in set_scissor_bits when the y is flippedEleni Maria Stea2019-02-261-1/+1
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Calculating the scissor rectangle fields with the y flipped (0 on top) can generate negative values that will cause assertion failure later on as the scissor fields are all unsigned. We must clamp the bbox values again to make sure they don't exceed the fb_height. Also fixed a calculation error. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=108999 https://bugs.freedesktop.org/show_bug.cgi?id=109594 v2: - I initially clamped the values inside the if (Y is flipped) case and I made a mistake in the calculation: the clamp of the bbox[2] should be a check if (bbox[2] >= fbheight) bbox[2] = fbheight - 1 instead and I shouldn't have changed the ScissorRectangleYMax calculation. As the fixed code is equivalent with using CLAMP instead of MAX2 at the top of the function when bbox[2] and bbox[3] are calculated, and the 2nd is more clear, I replaced it. (Nanley Chery) v3: - Reversed the CLAMP change in bbox[3] as the API guarantees that the viewport height is positive. (Nanley Chery) v4: - Added nomination for the mesa-stable branch and the link to the second bugzilla bug (Nanley Chery) CC: <[email protected]> Tested-by: Paul Chelombitko <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* i965: Add support for sampling from XYUV imagesKasireddy, Vivek2019-02-262-0/+9
| | | | | | | | | Add support to the i965 DRI driver to sample from XYUV8888 buffers. Signed-off-by: Vivek Kasireddy <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* dri: meson: do not prefix user provided dri-drivers-pathSergii Romantsov2019-02-251-1/+1
| | | | | | | | | | | | | | | | | | | | | | The user can select the location where there dri drivers are installed by the dri-drivers-path meson option. By default path will be $prefix/$libdir/dri. Currently we add $prefix to the user provided path. Resulting in an incorrect or even missing path. v2: fixed dri_search_path by default, rebased to master v3: new commit-message (Emil Velikov), cc mesa-stable Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109698 CC: Rafael Antognolli <[email protected]> CC: Dylan Baker <[email protected]> Cc: 18.3 19.0 <[email protected]> Fixes: 306914db92e1 (meson: Add dridriverdir variable to dri.pc.) Signed-off-by: Sergii Romantsov <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: re-emit index buffer state on a reset option change.Andrii Simiklit2019-02-203-1/+13
| | | | | | | | | | | | | | Seems like we forget to update the index buffer (ib) status and IndexedDrawCutIndexEnable or CutIndexEnable flag is left unchanged it leads to ignoring of glEnable/glDisable functions for GL_PRIMITIVE_RESTART in some cases. The index buffer (ib) status should be re-emmited after the reset option change to avoid some unexpected behavior. Reviewed-by: Lionel Landwerlin <[email protected]> Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=109451 Cc: <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]> Signed-off-by: Andrii Simiklit <[email protected]>
* i965: always enable EXT_float_blendIlia Mirkin2019-02-181-0/+1
| | | | | | | | | From the table in isl_format.c, it appears that all generations support blending on 32-bit float surfaces. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: scale factor changes should trigger recompileLionel Landwerlin2019-02-182-1/+16
| | | | | | | | Found by inspection. Signed-off-by: Lionel Landwerlin <[email protected]> Fixes: 3da858a6b990c5 ("intel/compiler: add scale_factors to sampler_prog_key_data") Reviewed-by: Tapani Pälli <[email protected]>
* i965: Assert the execobject handles match for this deviceChris Wilson2019-02-161-0/+2
| | | | | | | Object handles are local to the device fd, so double check we are not mixing together objects from multiple screens on execbuf submission. Reviewed-by: Kenneth Graunke <[email protected]>
* i965: Removed the field etc_format from the struct intel_mipmap_treeEleni Maria Stea2019-02-153-18/+1
| | | | | | | | | After the previous changes to emulate the ETC/EAC formats using the secondary shadow miptree, the etc_format field of the intel_mipmap_tree struct became redundant and the remaining check that used it has been replaced. (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>
* i965: Enabled the OES_copy_image extension on Gen 7 GPUsEleni Maria Stea2019-02-151-4/+12
| | | | | | | | | | | OES_copy_image extension was disabled on Gen7 due to the lack of support for ETC2 images. Enabled it back. (Kenneth Graunke) v2: - Removed the blank lines in the comments above OES_copy_image and OES_texture_view extensions in intel_extensions.c (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>
* i965: Fixed the CopyImageSubData for ETC2 on Gen < 8Eleni Maria Stea2019-02-153-18/+6
| | | | | | | | | | | | | | | | | | | | | | For CopyImageSubData to copy the data during the 1st draw call, we need to update the shadow tree right before the rendering. v2: - Added assertion that the miptree doesn't need update at the time we update the texture surface. (Nanley Chery) v3: - As we now update the tree before the rendering we don't need to copy the data during the unmap anymore. Removed the unnecessary update from the intel_miptree_unmap in intel_mipmap_tree.c (Nanley Chery) v4: - Fixed unrelated empty line removal (Nanley Chery) - As now the intel_upate_etc_shadow of intel_mipmap_tree.c is only called inside its following function, we don't need to declare it at the top of the file anymore. (Nanley Chery) Reviewed-by: Nanley Chery <[email protected]>
* i965: Faking the ETC2 compression on Gen < 8 GPUs using two miptrees.Eleni Maria Stea2019-02-153-69/+134
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | GPUs Gen < 8 cannot sample ETC2 formats. So far, they converted the compressed EAC/ETC2 images to non-compressed RGBA images. When GetCompressed* functions were called, the pixels were returned in this RGBA format and not the compressed format that was expected. Trying to fix this problem, we use a secondary shadow miptree to store the decompressed data for the rendering and the main miptree to store the compressed for the Get functions to work. Each time that the main miptree is written with compressed data, we decompress them to RGB and update the shadow. Then we use the shadow for rendering. v2: - Fixes in the commit message (Nanley Chery) - Reversed the changes in brw_get_texture_swizzle and swapped the b, g values at the time that we decompress the data in the function: intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Simplified the format checks in the miptree_create function of the intel_mipmap_tree.c and reserved the call of the intel_lower_compressed_format for the case that we are faking the ETC support (Nanley Chery) - Removed the check for the auxiliary usage for the shadow miptree at creation (miptree_create of intel_mipmap_tree.c) as we won't use auxiliary buffers with these types of trees (Nanley Chery) - Set the etc_format of the non-ETC miptrees to MESA_FORMAT_NONE and removed the unecessary checks (Nanley Chery) - Fixed an unrelated indentation change (Nanley Chery) - Modified the function intel_miptree_finish_write to set the mt->shadow_needs_update to true to catch all the cases when we need to update the miptree (Nanley Chery) - In order to update the shadow miptree during the unmap of the main and always map the main (Nanley Chery) the following change was necessary: Splitted the previous update function that was updating all the mipmap levels and use two functions instead: one that updates one level and one that updates all of them. Used the first during unmap and the second before the rendering. - Removed the BRW_MAP_ETC_BIT flag and the mechanism to decide which miptree should be mapped each time and reversed all the changes in the higher level texture functions that upload data to textures as they aren't needed anymore. - Replaced the boolean needs_fake_etc with an inline function that checks when we need to fake the ETC compression (Nanley Chery) - Removed the initialization of the strides in the update function as the values will be overwritten by the intel_miptree_map call (Nanley Chery) - Used minify instead of division in the new update function intel_miptree_update_etc_shadow_levels in intel_mipmap_tree.c (Nanley Chery) - Removed the depth from the calculation of the number of slices in the new update function (intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c) as we don't need to support 3D ETC images. (Nanley Chery) v3: - Renamed the rgba_fmt in function miptree_create (intel_mipmap_tree.c) to decomp_format as the format is not always in rgba order. (Nanley Chery) - Documented the new usage for the shadow miptree in the comment above the field in the intel_miptree struct in intel_mipmap_tree.h (Nanley Chery) - Removed the redundant flags from the mapping of the miptrees in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) - Fixed the switch from surface's logical level to physical level in the intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery) - Excluded the Baytrail GPUs from the check for the ETC emulation as they support the ETC formats natively. (Nanley Chery) - Simplified the check if the format is BGRA in intel_miptree_update_etc_shadow of intel_mipmap_tree.c (Nanley Chery) v4: - Removed the functions intel_miptree_(map|unmap)_etc and the check if we need to call them as with the new changes, they became unreachable. (Nanley Chery) - We'd rather calculate the level width and height using the shadow miptree instead of the main in intel_miptree_update_etc_shadow_levels of intel_mipmap_tree.c (Nanley Chery) - Fixed the format in the mt_surface_usage, set at the miptree creation, in miptree_create of intel_mipmap_tree.c (Nanley Chery) v5: - Fixed the levels calculations in intel_mipmap_tree.c (Nanley Chery) - Update the flag shadow_needs_update outside the function intel_miptree_update_etc_shadow (Nanley Chery) - Fixed indentation error (Nanley Chery) v6: - Fixed typo in commit message (Nanley Chery) - Simplified the assignment of the mt_fmt in the miptree_create of the intel_mipmap_tree.c (Nanley Chery) - Combined declarations and assignments where it was possible in the intel_miptree_update_etc_shadow and intel_miptree_update_etc_shadow_levels of the intel_mipmap_tree.c (Nanley Chery) Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=81843 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=104272 Reviewed-by: Nanley Chery <[email protected]>
* i965: Rename intel_mipmap_tree::r8stencil_* -> ::shadow_*Nanley Chery2019-02-153-19/+19
| | | | | | | Use more generic field names. We'll reuse these fields for a workaround with ASTC miptrees. Reviewed-by: Eleni Maria Stea <[email protected]>
* drirc/i965: add option to disable 565 configs and visualsTapani Pälli2019-02-151-0/+13
| | | | | | | | | | | We have cases where we would not like to expose these. v2: call the option allow_rgb565_configs for consistency with existing allow_rgb10_configs (Eric, Jason) Signed-off-by: Tapani Pälli <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* drm-uapi: use local files, not system libdrmEric Engestrom2019-02-1417-20/+20
| | | | | | | | | There was an issue recently caused by the system header being included by mistake, so let's just get rid of this include path and always explicitly #include "drm-uapi/FOO.h" Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Kristian H. Kristensen <[email protected]>
* i965: add P0x formats and propagate required scaling factorsTapani Pälli2019-02-123-0/+17
| | | | | | Signed-off-by: Tapani Pälli <[email protected]> Signed-off-by: Lin Johnson <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>