path: root/src/intel
Commit message (Author, Age, Files, Lines)
* anv/icl: Add gen11 mocs defines (Anuj Phogat, 2018-02-16, 1 file, -0/+11)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel/common/icl: Add has_sample_with_hiz flag in gen_device_info (Anuj Phogat, 2018-02-15, 2 files, -1/+8)
| | | | | | | | | | | Sampling from hiz is enabled in i965 for GEN9+ but this feature has been removed from gen11. So, this new flag will be useful to turn the feature on/off for different gen h/w. It will be used later in a patch adding device info for gen11. Suggested-by: Kenneth Graunke <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/icl: Add assertions to check dispatch mode is SIMD8 (Anuj Phogat, 2018-02-15, 1 file, -0/+2)
| SIMD4x2 dispatch mode has been removed in GEN11. We're not using it in Mesa anyway, so add a few asserts to make that explicit.
|
| Use GEN_GEN macro in place of devinfo->gen (Ken)
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
* i965/icl: Update the comment for maximum number of threads per PSD (Anuj Phogat, 2018-02-15, 1 file, -4/+5)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/icl: Do StateCacheInvalidation for indirect clear color (Anuj Phogat, 2018-02-15, 1 file, -1/+1)
| | | | | | | | | | StateCacheInvalidation is required on all gen7+ platforms. We don't need to update this check for every new gen h/w unless this requirement is changed. So, dropping the check for latest gen h/w. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/isl/icl: Build and use gen11 surface state emit functions (Anuj Phogat, 2018-02-15, 6 files, -1/+35)
| | | | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Dylan Baker <[email protected]>
* intel/isl/icl: Add the maximum surface size limit (Anuj Phogat, 2018-02-15, 1 file, -1/+5)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/genxml/icl: Update genx_bits header (Anuj Phogat, 2018-02-15, 1 file, -0/+1)
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel/genxml/icl: Generate packing headers (Anuj Phogat, 2018-02-15, 5 files, -2/+15)
| Move build system changes into one patch (Ken, Emil)
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Reviewed-by: Emil Velikov <[email protected]>
| Reviewed-by: Dylan Baker <[email protected]>
* intel/genxml/icl: Add gen11.xml (Anuj Phogat, 2018-02-15, 1 file, -0/+3765)
| | | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* meson: add convenience variable for anv_extensions.py dependency (Dylan Baker, 2018-02-15, 1 file, -4/+6)
| | | | | Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* meson: use depend_files for adding extra file dependencies (Dylan Baker, 2018-02-15, 1 file, -2/+2)
| | | | | | | cc: Jason Ekstrand <[email protected]> Fixes: dd088d4bec74f37ffe4 ("anv/extensions: Generate a header file with extension tables") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* meson: use depend_files to track extra file dependencies (Dylan Baker, 2018-02-15, 1 file, -2/+2)
| | | | | | | cc: Jason Ekstrand <[email protected]> Fixes: f93994080993bda ("anv: Split anv_extensions.py into two files") Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* Revert "anv/meson: Make anv_entrypoints_gen.py depend on anv_extensions.py" (Dylan Baker, 2018-02-15, 1 file, -2/+1)
| | | | | | | | | | | | | This reverts commit 10d1b0be8e9c463dbc35cd66968299f33c76672c. This is unnecessary, the depend_files argument is for adding dependencies on files that are not part of the input, which is already done. cc: Jason Ekstrand <[email protected]> Fixes: 10d1b0be8e9c463dbc35cd66968299f33c76672c Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* isl: Don't use surface format R32_FLOAT for typed atomic integer operations (Anuj Phogat, 2018-02-14, 1 file, -1/+8)
| From Skylake PRM Surface Formats section:
|
| "The surface format for the typed atomic integer operations must be R32_UINT or R32_SINT."
|
| Fixes an error and a piglit GPU hang in simulation environment.
| Piglit test: gl45-imageAtomicExchange-float.shader_test
|
| Suggested-by: Francisco Jerez <[email protected]>
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Kenneth Graunke <[email protected]>
| Cc: "18.0 17.3" <[email protected]>
* intel/aubinator: Correctly decode INTERFACE_DESCRIPTOR_DATA (Jason Ekstrand, 2018-02-14, 1 file, -1/+1)
| | | | Reviewed-by: Kenneth Graunke <[email protected]>
* anv/gen10: Remove warning message. (Rafael Antognolli, 2018-02-14, 1 file, -5/+2)
| | | | | | | | | | | Gen10 seems pretty stable so far, remove "alpha support" message. Signed-off-by: Rafael Antognolli <[email protected]> Cc: Jason Ekstrand <[email protected]> Cc: "18.0" [email protected] Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/compiler: clean up nir_intrinsic_load_input for vertex shaders (Iago Toral Quiroga, 2018-02-14, 1 file, -11/+2)
| This code to re-set the type of the source and destination is not necessary since we never manipulate the types. It looks like a leftover from a time when we had to retype to float temporarily to handle 64-bit inputs.
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/compiler: fix first_component for 64-bit types on vertex inputs (Iago Toral Quiroga, 2018-02-14, 1 file, -0/+3)
| | | | | | | | | | Divide it by two as we do for other stages. This is because the component layout qualifier is always in 32-bit units. Fixes issues in a new CTS test (still WIP): KHR-GL45.enhanced_layouts.varying_double_components Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
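The "divide it by two" rule above can be illustrated with a minimal sketch. The helper name `first_component_in_type_units` is hypothetical and is not taken from the Mesa sources; it only shows the arithmetic implied by a component qualifier that always counts 32-bit units.

```c
#include <assert.h>

/* Hypothetical helper (not the Mesa function): convert a `component`
 * layout qualifier, which always counts 32-bit units, into an index
 * counted in units of the variable's own bit size. */
static unsigned
first_component_in_type_units(unsigned component_32bit, unsigned bit_size)
{
   assert(bit_size == 32 || bit_size == 64);

   /* A 64-bit value occupies two 32-bit components, so halve the index. */
   return bit_size == 64 ? component_32bit / 2 : component_32bit;
}
```

Under this assumption, a `double` declared with `component = 2` starts at the second 64-bit slot (index 1), while a 32-bit value keeps index 2.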
* anv: Move setting current_pipeline to cmd_state_init (Jason Ekstrand, 2018-02-12, 1 file, -1/+1)
| We were setting current_pipeline to UINT32_MAX and then calling anv_cmd_state_reset, which memsets the entire state struct to 0 and so implicitly resets current_pipeline to 3D. I have no idea how this hasn't caused everything to explode.
|
| Fixes: cd3feea74582 "anv/cmd_buffer: Rework anv_cmd_state_reset"
| cc: [email protected]
| Reviewed-by: Lionel Landwerlin <[email protected]>
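The ordering problem described in this commit can be reproduced in a few lines. This is a sketch only: `struct cmd_state` and both init functions are hypothetical stand-ins, not the anv structures, and the assumption that the 3D pipeline is encoded as 0 comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the driver's command-buffer state. */
struct cmd_state {
   uint32_t current_pipeline;   /* UINT32_MAX means "no pipeline selected" */
   uint32_t other_state;
};

/* Buggy order: the sentinel is written first and then wiped by memset. */
static void
cmd_state_init_buggy(struct cmd_state *state)
{
   assert(state != NULL);
   state->current_pipeline = UINT32_MAX;
   memset(state, 0, sizeof(*state));
}

/* Fixed order: zero the struct first, then set the sentinel. */
static void
cmd_state_init_fixed(struct cmd_state *state)
{
   assert(state != NULL);
   memset(state, 0, sizeof(*state));
   state->current_pipeline = UINT32_MAX;
}
```

With the buggy order the field ends up as 0, which would be read as "3D pipeline already selected" rather than "unknown".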
* anv: Don't resolve or ambiguate non-existent layers (Jason Ekstrand, 2018-02-12, 1 file, -2/+10)
| | | | | | | | | | | The previous code was trying to avoid non-existent layers by taking a MAX with anv_image_aux_layers. Unfortunately, it wasn't taking into account that layer_count starts at base_layer which may not be zero. Instead, we need to subtract base_layer from anv_image_aux_layers with a guard against roll-over. Fixes: de3be6180169f9 "anv/cmd_buffer: Rework aux tracking" Reviewed-by: Nanley Chery <[email protected]>
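A minimal sketch of the clamp described above, assuming unsigned 32-bit layer counts. The function name and parameters are hypothetical; only the idea (subtract base_layer from the number of aux layers, guarding against wrap-around) comes from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: given that only layers [0, aux_layers) exist,
 * return how many layers of [base_layer, base_layer + layer_count)
 * can actually be touched. */
static uint32_t
clamp_layer_count(uint32_t aux_layers, uint32_t base_layer,
                  uint32_t layer_count)
{
   /* Guard: aux_layers - base_layer would wrap around for unsigned types. */
   if (base_layer >= aux_layers)
      return 0;

   uint32_t available = aux_layers - base_layer;
   uint32_t result = layer_count < available ? layer_count : available;

   assert(result <= layer_count);
   return result;
}
```

Clamping layer_count against aux_layers alone, without subtracting base_layer, would accept a range such as base_layer = 2, layer_count = 4 on a 4-layer aux surface even though only two of those layers exist.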
* anv: Drop I915_EXEC_CONSTANTS_REL_GENERAL from execbuf. (Kenneth Graunke, 2018-02-12, 1 file, -2/+1)
| | | | | | | | | | | | | The kernel used to have execbuf parameters to program the INSTPM bit for whether 3DSTATE_CONSTANT_* should be relative to dynamic state base address or an absolute address. However, they never worked in the presence of hardware contexts, so I deleted them a while back. It doesn't make sense to set this flag, as it doesn't exist anymore. It also never did anything anyway - the flag is zero, so |'ing it in did nothing. The default is relative anyway. Reviewed-by: Jason Ekstrand <[email protected]>
* intel/compiler: fix 64bit value prints on 32bit (Grazvydas Ignotas, 2018-02-10, 2 files, -3/+3)
| Fix the following:
|
| warning: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 3 has type ‘uint64_t {aka long long unsigned int}’.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
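The usual portable way to avoid this warning is the `PRIx64` macro from `<inttypes.h>`, which expands to the right length modifier on both 32-bit and 64-bit targets. The sketch below is illustrative; `format_imm64` is a hypothetical name, and whether the actual patch used `PRIx64` is an assumption.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdio.h>

/* Hypothetical helper: format a 64-bit immediate as zero-padded hex.
 * "%lx" is only correct where long is 64 bits wide; PRIx64 is correct
 * everywhere. */
static int
format_imm64(char *buf, size_t len, uint64_t value)
{
   assert(buf != NULL && len > 0);
   return snprintf(buf, len, "0x%016" PRIx64, value);
}
```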
* intel/blorp: Use isl_aux_op instead of blorp_hiz_op (Jason Ekstrand, 2018-02-08, 6 files, -53/+26)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Use isl_aux_op instead of blorp_fast_clear_op (Jason Ekstrand, 2018-02-08, 5 files, -35/+16)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Allow fast-clearing the first slice of a multi-slice image (Jason Ekstrand, 2018-02-08, 2 files, -12/+23)
| | | | | | | | Now that we're tracking aux properly per-slice, we can enable this for applications which actually care. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Rework aux tracking (Jason Ekstrand, 2018-02-08, 4 files, -171/+360)
| This commit completely reworks aux tracking. This includes a number of somewhat distinct changes:
|
| 1) Since we are no longer fast-clearing multiple slices, we only need to track one fast clear color and one fast clear type.
|
| 2) We store two bits for fast clear instead of one to let us distinguish between zero and non-zero fast clear colors. This is needed so that we can do full resolves when transitioning to PRESENT_SRC_KHR with gen9 CCS images where we allow zero clear values in all sorts of places we wouldn't normally.
|
| 3) We now track compression state as a boolean separate from fast clear type and this is tracked on a per-slice granularity.
|
| The previous scheme had some issues when it came to individual slices of a multi-LOD image. In particular, we only tracked "needs resolve" per-LOD but you could do a vkCmdPipelineBarrier that would only resolve a portion of the image and would set "needs resolve" to false anyway. Also, any transition from an undefined layout would reset the clear color for the entire LOD regardless of whether or not there was some clear color on some other slice.
|
| As far as full/partial resolves go, the assumptions of the previous scheme held because the one case where we do need a full resolve when CCS_E is enabled is for window-system images. Since we only ever allowed X-tiled window-system images, CCS was entirely disabled on gen9+ and we never got CCS_E. With the advent of Y-tiled window-system buffers, we now need to properly support doing a full resolve of images marked CCS_E.
|
| v2 (Jason Ekstrand):
| - Fix a bug in the compressed flag offset calculation
| - Treat 3D images as multi-slice for the purposes of resolve tracking
|
| v3 (Jason Ekstrand):
| - Set the compressed flag whenever we fast-clear
| - Simplify the resolve predicate computation logic
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
| Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Move the mi_alu helper higher up (Jason Ekstrand, 2018-02-08, 1 file, -17/+19)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/image: Simplify some verbose comments (Jason Ekstrand, 2018-02-08, 1 file, -10/+3)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Use blorp_ccs_ambiguate instead of fast-clears (Jason Ekstrand, 2018-02-08, 2 files, -50/+40)
| | | | | | | | | | | | | | | | | | Even though the blorp pass looks a bit on the sketchy side, the end result in the Vulkan driver is very nice. Instead of having this weird case where you do a fast clear and then maybe have to resolve, we just do the ambiguate and are done with it. The ambiguate does exactly what we want of setting all the CCS values to 0 which puts it into the pass-through state. This should also improve performance a bit in certain cases. For instance, if we did a transition from UNDEFINED to GENERAL for a surface that doesn't have CCS enabled all the time, we would end up doing a fast-clear and then a full resolve which ends up touching every byte in the main surface as well as the CCS. With the ambiguate pass, that transition only touches the CCS. Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Re-arrange the logic around UNDEFINED fast-clears (Jason Ekstrand, 2018-02-08, 1 file, -17/+14)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Pull the undefined layout condition into the if (Jason Ekstrand, 2018-02-08, 1 file, -9/+4)
| | | | | | | | | Now that this isn't a multi-case if and it's just the one case, it's a bit clearer if the condition is just part of the if instead of being pulled out into a boolean variable. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/blorp: Add a CCS ambiguation pass (Jason Ekstrand, 2018-02-08, 2 files, -0/+158)
| | | | | | | | | | | | This pass performs an "ambiguate" operation on a CCS-compressed surface by manually writing zeros into the CCS. On gen8+, ISL gives us a fairly detailed notion of how the CCS is laid out so this is fairly simple to do. On gen7, the CCS tiling is quite crazy but that isn't an issue because we can only do CCS on single-slice images so we can just blast over the entire CCS buffer if we want to. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv: Only fast clear single-slice images (Jason Ekstrand, 2018-02-08, 1 file, -17/+17)
| The current strategy we use for managing resolves has an issue where we track clear colors and the need for resolves per-LOD but we still allow resolves of only a subset of the slices in any given LOD and doing so sets the "needs resolve" flag for that LOD to false while leaving the remaining layers unresolved. This patch is only the first step and does not, by itself, fix anything. However, it's fairly self-contained and splitting it out means any performance regressions should bisect to this nice obvious commit rather than to the giant "rework aux tracking" commit.
|
| Nanley and I did some testing and none of the applications we tested even tried to fast-clear anything other than the first slice of an image. The test was done by adding a printf right before we call blorp_fast_clear if we were ever going to touch any slice other than the first with a fast-clear. Due to the way the original code was structured, this would not have included applications which only cleared a subset of layers. The applications tested were:
|
| * All Sascha Willems demos
| * Aztec Ruins
| * Dota 2
| * The Talos Principle
| * Mad Max
| * Warhammer 40,000: Dawn of War III
| * Serious Sam Fusion 2017: BFE
|
| While not the full list of shipping applications, it's a pretty good spread and covers most of the engines we've seen running on our driver. If this is ever shown to be a performance problem in the future, we can reconsider our strategy.
|
| Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Add a mark_image_written helper (Jason Ekstrand, 2018-02-08, 5 files, -0/+119)
| | | | | | | | | Currently, this helper does nothing but we call it every place where an image is written through the render pipeline. This will allow us to properly mark the aux state so that we can handle resolves correctly. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/blorp: Add src/dst_level helper variables in CmdCopyImage (Jason Ekstrand, 2018-02-08, 1 file, -8/+6)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Add an anv_genX_call macro (Jason Ekstrand, 2018-02-08, 1 file, -15/+25)
| | | | | | | This is copied and pasted from the similar macro we added to ISL. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Generalize transition_color_buffer (Jason Ekstrand, 2018-02-08, 1 file, -12/+47)
| This moves it to being based on layout_to_aux_usage instead of being hard-coded based on bits of a priori knowledge of how transitions interact with layouts. This conceptually simplifies things because we're now using layout_to_aux_usage and layout_supports_fast_clear to make resolve decisions so changes to those functions will do what one expects.
|
| There is a potential bug with window system integration on gen9+ where we wouldn't do a resolve when transitioning to the PRESENT_SRC layout because we just assume that everything that handles CCS_E can handle it all the time. When handing a CCS_E image off to the window system, we may need to do a full resolve if the window system does not support the CCS_E modifier. The only reason why this hasn't been a problem yet is because we don't support modifiers in Vulkan WSI and so we always get X tiling which implies no CCS on gen9+. This patch doesn't actually fix that bug yet but it takes us the first step in that direction by making us actually pick the correct resolve op. In order to handle all of the cases, we need more detailed aux tracking.
|
| v2 (Jason Ekstrand):
| - Make a few more things const
| - Use the anv_fast_clear_support enum
|
| v3 (Jason Ekstrand):
| - Move an assert and add a better comment
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
| Reviewed-by: Nanley Chery <[email protected]>
* anv/cmd_buffer: Recurse in transition_color_buffer instead of falling through (Jason Ekstrand, 2018-02-08, 1 file, -9/+9)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/image: Support color aspects in layout_to_aux_usage (Jason Ekstrand, 2018-02-08, 1 file, -19/+29)
| | | | Reviewed-by: Nanley Chery <[email protected]>
* anv/image: Add a helper for determining when fast clears are supported (Jason Ekstrand, 2018-02-08, 2 files, -0/+83)
| v2 (Jason Ekstrand):
| - Return an enum instead of a boolean
|
| v3 (Jason Ekstrand):
| - Return ANV_FAST_CLEAR_NONE instead of false (Topi)
| - Rename ANV_FAST_CLEAR_ANY to ANV_FAST_CLEAR_DEFAULT_VALUE
| - Add documentation for the enum values
|
| v4 (Jason Ekstrand):
| - Remove a dead comment
|
| Reviewed-by: Topi Pohjolainen <[email protected]>
| Reviewed-by: Nanley Chery <[email protected]>
* anv/image: Update a comment (Jason Ekstrand, 2018-02-08, 1 file, -1/+1)
| | | | | | | This got lost in all of the aspect vs. plane rebasing of YCBCR. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/blorp: Rework HiZ ops to look like MCS and CCS (Jason Ekstrand, 2018-02-08, 3 files, -26/+34)
| | | | | Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/blorp: Support ISL_AUX_USAGE_HIZ in surf_for_anv_image (Jason Ekstrand, 2018-02-08, 1 file, -16/+6)
| | | | | | | | | If the function gets passed ANV_AUX_USAGE_DEFAULT, it still has the old behavior of setting ISL_AUX_USAGE_NONE for depth/stencil which is what we want for blits/copies. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* anv/blorp: Rework image clear/resolve helpers (Jason Ekstrand, 2018-02-08, 3 files, -104/+166)
| | | | | | | | | | This replaces image_fast_clear and ccs_resolve with two new helpers that simply perform an isl_aux_op whatever that may be on CCS or MCS. This is a bit cleaner as it separates performing the aux operation from which blorp helper we have to call to do it. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* intel/isl: Codify AUX operations in an enum (Jason Ekstrand, 2018-02-08, 1 file, -25/+49)
| | | | | | | | | Right now, we have different entrypoints and enums in blorp for these different operations. This provides us a central enum which we can begin to transition to. Reviewed-by: Topi Pohjolainen <[email protected]> Reviewed-by: Nanley Chery <[email protected]>
* meson: Add build option for tools (Scott D Phillips, 2018-02-08, 1 file, -2/+4)
| Add a build option to control building some of the misc tools we have. Also set the executables to install, presumably you want that if you're asking for the build.
|
| v2:
| - set 'install:' to the with_tools value, not true (Jordan)
| - handle 'all' in the comma list (Dylan)
| - Add freedreno's tools (Dylan)
|
| Reviewed-by: Eric Anholt <[email protected]>
| Reviewed-by: Jordan Justen <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Dylan Baker <[email protected]>
* i965: remove unused brw_nir_lower_cs_shared() (Timothy Arceri, 2018-02-07, 2 files, -9/+0)
| | | | | | This has been unused since 8761a04d0d93. Reviewed-by: Elie Tournier <[email protected]>
* anv/device: initialize the list of enabled extensions properly (Iago Toral Quiroga, 2018-02-06, 1 file, -1/+1)
| | | | | | | | | | | | | | | The loop goes through the list of enabled extensions marking them as enabled in the list, but this relies on every other extension being initialized to false by default. This bug would make us, for example, advertise certain device extension entry points as available even when the corresponding extensions had not been enabled. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]> Fixes: abc62282b5c "anv: Add a per-device table of enabled extensions" Cc: "18.0" <[email protected]>
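The failure mode described here is easy to reproduce: a loop that only ever writes `true` leaves every other entry at whatever the table held before. The sketch below uses a hypothetical table type and extension indices, not the anv definitions, and shows the "clear first, then mark" pattern.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define EXT_COUNT 4   /* hypothetical number of known extensions */

/* Hypothetical per-device table of enabled extensions. */
struct enabled_extensions {
   bool enabled[EXT_COUNT];
};

static void
mark_enabled(struct enabled_extensions *table,
             const unsigned *requested, unsigned count)
{
   /* Clear first: the loop below only ever writes `true`, so any entry
    * it does not visit must already be false. */
   memset(table, 0, sizeof(*table));

   for (unsigned i = 0; i < count; i++) {
      assert(requested[i] < EXT_COUNT);
      table->enabled[requested[i]] = true;
   }
}
```

Without the initial clear, entry points belonging to extensions that were never requested could appear enabled, which matches the symptom reported in the commit message.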
* i965/nir: do int64 lowering before optimization (Iago Toral Quiroga, 2018-02-06, 1 file, -4/+12)
| Otherwise loop unrolling will fail to see the actual cost of the unrolling operations when the loop body contains 64-bit integer instructions, especially when the divmod64 lowering applies, since that lowering is quite expensive.
|
| Without this change, some in-development CTS tests for int64 get stuck forever trying to register allocate a shader with over 50K SSA values. The large number of SSA values is the result of NIR first unrolling multiple seemingly simple loops that involve int64 instructions, only to then lower these instructions to produce a massive pile of code (due to the divmod64 lowering in the unrolled instructions).
|
| With this change, loop unrolling will see the loops with the int64 code already lowered and will realize that it is too expensive to unroll.
|
| v2: Run nir_algebraic first so we can hopefully get rid of some of the int64 instructions before we even attempt to lower them.
|
| Reviewed-by: Matt Turner <[email protected]>