path: root/src/intel
Commit message | Author | Age | Files | Lines
...
* intel/perf: add TGL support | Lionel Landwerlin | 2019-10-31 | 4 | -0/+8611
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Acked-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/compiler: Report the number of non-spill/fill SEND messages on vec4 too | Ian Romanick | 2019-10-30 | 1 | -5/+35
|
| This makes shader-db's report.py work on Haswell and earlier platforms. The problem is that the script would detect the "sends" output for scalar shaders and expect it in vec4 shaders too. When it didn't find it, the script would fail with:
|
|     Traceback (most recent call last):
|       File "./report.py", line 351, in <module>
|         main()
|       File "./report.py", line 182, in main
|         before_count = before[p][m]
|     KeyError: 'sends'
|
| Fixes: f192741ddd8 ("intel/compiler: Report the number of non-spill/fill SEND messages")
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/dev: set default num_eu_per_subslice on gen12 | Lionel Landwerlin | 2019-10-30 | 1 | -1/+2
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Fixes: 8125d7960b ("intel/dev: Add preliminary device info for Tigerlake")
| Acked-by: Jason Ekstrand <jason@jlekstrand.net>
* intel/eu/validate/gen12: Add TGL to eu_validate tests. | Jordan Justen | 2019-10-30 | 1 | -0/+9
|
| These reworks were combined into this patch:
|
|  * Matt Turner: i965: Disable NoDDChk/NoDDClr test on Gen12+
|  * Francisco Jerez: intel/eu/validate/gen12: Disable qword_low_power_no_depctrl eu_validate test.
|
| Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
| Reviewed-by: Francisco Jerez <currojerez@riseup.net>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/dev: Add preliminary device info for Tigerlake | Jordan Justen | 2019-10-30 | 1 | -0/+49
|
| Reworks:
|  * adjust 64-bit support, hiz (Jason Ekstrand)
|  * sim-id (Lionel Landwerlin)
|  * adjust threads, urb size (Rafael Antognolli)
|  * adjust urb size (Kenneth Graunke)
|
| Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/dump_gpu: handle context create extended ioctl | Lionel Landwerlin | 2019-10-30 | 1 | -0/+15
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Reviewed-by: Rafael Antognolli <rafael.antognolli@intel.com>
* anv: Add Tile Cache Flush for Unified Cache. | Rafael Antognolli | 2019-10-30 | 3 | -1/+45
|
* blorp: Add Tile Cache Flush for Unified Cache. | Rafael Antognolli | 2019-10-30 | 1 | -0/+3
|
* intel/genxml: Add gen12 tile cache flush bit | Jordan Justen | 2019-10-30 | 1 | -0/+1
|
| Signed-off-by: Jordan Justen <jordan.l.justen@intel.com>
* anv: Align fast clear color state buffer to a page. | Rafael Antognolli | 2019-10-30 | 1 | -0/+9
|
| On gen11 and older, compressed images are tiled and aligned to 4K. On gen12 this 4K alignment restriction was removed. However, aligning the fast clear color buffer to only 64B (a cacheline, as stated in the documentation) causes some bugs where the fast clear color is not converted during the fast clear operation. Aligning things to 4K seems to fix it.
|
| v2: Assert that image->planes[plane].offset is 4K aligned (Nanley)
|
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
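The alignment rule in this commit message can be illustrated with a minimal C sketch. The helper names `example_align_u64` and `example_is_aligned` and the two constants are hypothetical and are not the anv functions; only the 4K (page) and 64B (cacheline) figures come from the commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants taken from the figures in the commit message. */
#define EXAMPLE_PAGE_SIZE      4096u /* 4K, the alignment the commit settles on */
#define EXAMPLE_CACHELINE_SIZE 64u   /* 64B, the documented minimum */

/* Hypothetical helper: round value up to a power-of-two alignment. */
static inline uint64_t
example_align_u64(uint64_t value, uint64_t alignment)
{
   assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
   return (value + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical helper: check that value is a multiple of a power-of-two
 * alignment, in the spirit of the v2 assertion on the plane offset. */
static inline bool
example_is_aligned(uint64_t value, uint64_t alignment)
{
   return (value & (alignment - 1)) == 0;
}
```

An offset that is only cacheline aligned, such as 64, passes `example_is_aligned(64, EXAMPLE_CACHELINE_SIZE)` but fails the page check until it is rounded up with `example_align_u64`.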
* intel/compiler: Add instruction compaction support on Gen12 | Matt Turner | 2019-10-30 | 2 | -184/+868
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
* intel/compiler: Make separate src0/src1 index tables | Matt Turner | 2019-10-30 | 1 | -11/+18
|
| TGL uses different data (and even a different format!) for each source.
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
* intel/compiler: Inline get_src_index() | Matt Turner | 2019-10-30 | 1 | -26/+15
|
| TGL will have separate tables for src0 and src1, so the shared function will no longer make sense.
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
* intel/compiler: Restructure instruction compaction in preparation for Gen12 | Matt Turner | 2019-10-30 | 1 | -20/+28
|
| Reviewed-by: Caio Marcelo de Oliveira Filho <caio.oliveira@intel.com>
* intel/compiler: Remove unreachable() from brw_reg_type.c | Matt Turner | 2019-10-30 | 1 | -3/+3
|
| The EU compaction unit test fuzzes the compaction code by flipping bits. We use a simple skip_bits() function with a list of reserved bits to ignore, but for more complex cases like invalid combinations of register file:type, we need either machinery to check validity or for these functions to simply inform us whether a combination was valid.
|
| enum brw_reg_type is a 4-bit field in brw_reg, so rather than expanding it with an "INVALID" value, just return -1 and let the caller check for that.
|
| Scott suggested redefining unreachable() within the unit test to longjmp() which would allow driver code like this to still use it and allow the test to handle expected failures like this. If that plan works out, I plan to revert this.
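The "return -1 and let the caller check" approach described here can be sketched in C as follows. This is only an illustration under stated assumptions: the enum, the encodings in the switch, and the function name are hypothetical and do not reproduce the tables in brw_reg_type.c.

```c
/* Hypothetical register types; not the real enum brw_reg_type values. */
enum example_reg_type {
   EXAMPLE_TYPE_UD,
   EXAMPLE_TYPE_D,
   EXAMPLE_TYPE_F,
};

/* Sentinel returned for encodings that map to no valid type. */
#define EXAMPLE_INVALID_TYPE (-1)

/* Hypothetical decoder. The encodings below are made up for the example.
 * Instead of aborting on an unknown encoding, the function reports it to
 * the caller, so a fuzzing test that flips bits can skip the case. */
static int
example_hw_to_reg_type(unsigned hw_encoding)
{
   switch (hw_encoding) {
   case 0x0: return EXAMPLE_TYPE_UD;
   case 0x1: return EXAMPLE_TYPE_D;
   case 0x7: return EXAMPLE_TYPE_F;
   default:  return EXAMPLE_INVALID_TYPE;
   }
}
```

A caller compares the result against `EXAMPLE_INVALID_TYPE` before casting it to the enum, which keeps the enum itself free of an extra "INVALID" member.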
* anv: Avoid emitting UBO surface states that won't be used | Jason Ekstrand | 2019-10-30 | 1 | -1/+12
|
| This shaves around 4-5% off of a CPU-limited example running with the Dawn WebGPU implementation.
|
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/vec4: Set brw_stage_prog_data::has_ubo_pull | Jason Ekstrand | 2019-10-30 | 1 | -0/+2
|
| In 0e4a75f917, Ken added a flag to brw_stage_prog_data which indicates whether any UBO pulls ever occur. Unfortunately, he neglected to set the bit in the vec4 back-end. This was fine at the time because the optimization was intended for iris, which does not support Gen7, and using the vec4 back-end on Gen8+ requires an environment variable. We want to use this in Vulkan, which does support Gen7, so we want the information from the vec4 back-end as well as scalar.
|
| Fixes: 0e4a75f917 "intel/compiler: Record whether any pull constant..."
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/isl: Allow stencil buffer to support compression on Gen12+ | Sagar Ghuge | 2019-10-29 | 1 | -2/+3
|
| v2: (Nanley Chery)
|  - Fix commit title
|  - Fix comment
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/blorp: Set stencil resolve enable bit | Sagar Ghuge | 2019-10-29 | 1 | -4/+17
|
| When set, the stencil buffer is filled with the true stencil values and we have to disable the stencil buffer clear enable bit.
|
| v2:
|  1) Refactor code a little bit (Nanley Chery)
|  2) Fix assertion (Nanley Chery)
|
| v3:
|  1) Remove unnecessary assignment (Nanley Chery)
|  2) Fix GEN_GEN check (Nanley Chery)
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel: Track stencil aux usage on Gen12+ | Sagar Ghuge | 2019-10-29 | 3 | -0/+9
|
| Set the stencil compression enable and control surface enable bits if stencil buffer lossless compression is enabled.
|
| v2: Remove unnecessary GEN_GEN check (Nanley Chery)
|
| v3: (Nanley Chery)
|  - Change commit subject tag from intel/isl to intel
|  - Keep assignment order correct
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/blorp: Add helper function for stencil buffer resolve | Sagar Ghuge | 2019-10-29 | 2 | -0/+34
|
| On Gen12+, Stencil buffer's lossless compression should be resolved with WM_HZ_OP packet.
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/blorp: Assign correct view while clearing depth stencil | Sagar Ghuge | 2019-10-29 | 1 | -1/+1
|
| We never saw any failures regarding this typo but it's good to assign correct stencil view while constructing blorp_params.
|
| Fixes: 0cabf93b80d0 "intel/blorp: Add an entrypoint for clearing depth and stencil"
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* genxml/gen12: Add Stencil Buffer Resolve Enable bit | Sagar Ghuge | 2019-10-29 | 1 | -0/+1
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* anv: Reduce the minimum number of relocations | Jason Ekstrand | 2019-10-29 | 1 | -1/+1
|
| The original value of 256 assumed the list belongs to a batch buffer, which is likely to have a large number of relocations. However, pipeline objects on Gen7 will have at most 6 relocations (one per shader stage and one for the workaround BO), so this is a lot of per-pipeline wasted space.
|
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
* anv: Delay allocation of relocation lists | Jason Ekstrand | 2019-10-29 | 1 | -67/+71
|
| The old relocation list code always allocated 256 relocations and a hash set up-front without knowing whether or not we really need them. In particular, in the softpin case, this is two fairly large allocations that we don't need to be making. Also, for pipeline objects on haswell where we don't have softpin, we don't need relocations unless scratch is used so this is extra data per-pipeline. Instead, we should do it on-demand.
|
| This shaves 3.5% off of a cpu-limited example running with the Dawn WebGPU implementation.
|
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
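The on-demand allocation idea in this commit message can be sketched as a list that owns no memory until the first entry is added. The struct, function names, and the initial capacity of 8 are assumptions made for the example; they are not the anv_reloc_list implementation, and the hash set mentioned in the message is left out.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed starting capacity for the sketch; the value anv uses is not
 * stated in the commit message. */
#define EXAMPLE_INITIAL_RELOCS 8u

/* Hypothetical list: zero-initialized means "empty, nothing allocated". */
struct example_reloc_list {
   uint32_t count;
   uint32_t capacity;
   uint64_t *entries; /* NULL until the first add */
};

/* Append one entry, allocating or growing the array only when needed.
 * Returns 0 on success and -1 if the allocation fails. */
static int
example_reloc_list_add(struct example_reloc_list *list, uint64_t entry)
{
   if (list->count == list->capacity) {
      uint32_t new_capacity =
         list->capacity ? list->capacity * 2 : EXAMPLE_INITIAL_RELOCS;
      uint64_t *grown =
         realloc(list->entries, (size_t)new_capacity * sizeof(*grown));
      if (grown == NULL)
         return -1;
      list->entries = grown;
      list->capacity = new_capacity;
   }
   list->entries[list->count++] = entry;
   return 0;
}

/* Release the storage, if any was ever allocated. */
static void
example_reloc_list_finish(struct example_reloc_list *list)
{
   free(list->entries);
   list->entries = NULL;
   list->count = 0;
   list->capacity = 0;
}
```

An object that never needs a relocation, as in the softpin case the message describes, pays only for the three struct fields.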
* anv: Implement new way for setting streamout buffers. | Plamena Manolova | 2019-10-29 | 3 | -0/+19
|
| For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER.
|
| Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
| Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
* genxml: Add 3DSTATE_SO_BUFFER_INDEX_* instructions | Plamena Manolova | 2019-10-29 | 1 | -0/+47
|
| For gen12 we set the streamout buffers using 4 separate commands instead of 3DSTATE_SO_BUFFER.
|
| Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
| Acked-by: Jason Ekstrand <jason@jlekstrand.net>
* anv: Set depthBounds to true in anv_GetPhysicalDeviceFeatures. | Plamena Manolova | 2019-10-29 | 1 | -1/+1
|
| Add depth bounds testing to the list of supported physical device features.
|
| Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
| Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
| Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* genxml: Change 3DSTATE_DEPTH_BOUNDS bias. | Plamena Manolova | 2019-10-29 | 1 | -1/+1
|
| The bias for the 3DSTATE_DEPTH_BOUNDS instruction should be 2 not 1.
|
| Signed-off-by: Plamena Manolova <plamena.manolova@intel.com>
| Reviewed-by: Jason Ekstrand <jason@jlekstrand.net>
| Reviewed-by: Jordan Justen <jordan.l.justen@intel.com>
* intel/perf: update ICL configurations | Lionel Landwerlin | 2019-10-29 | 1 | -59/+28
|
| A few equations/programming changes for ICL.
|
| v2: Fix a couple of issues in naming and floating/integer operations (Ken)
|
| Signed-off-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
| Acked-by: Kenneth Graunke <kenneth@whitecape.org>
* anv: Fix output of INTEL_DEBUG=bat for chained batches | Caio Marcelo de Oliveira Filho | 2019-10-28 | 1 | -1/+1
|
| The anv_batch_bo contents are linked one to another, and when printing we have to start with the first of those. Since in `u_vector` new elements are added to the head, to get the first element we need the vector's tail.
|
| Fixes: 32ffd90002b ("anv: add support for INTEL_DEBUG=bat")
| Reviewed-by: Lionel Landwerlin <lionel.g.landwerlin@intel.com>
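The head/tail convention this message relies on can be shown with a small ring buffer in C. This is a simplified stand-in written for the example, not Mesa's `u_vector`: the struct, the fixed capacity, and the function names are all hypothetical. It only mirrors the behavior described above, where new elements go in at the head, so the oldest element sits at the tail.

```c
/* Fixed capacity for the sketch; must be a power of two so that the
 * wrapping index arithmetic below stays consistent. */
#define EXAMPLE_CAPACITY 8u

/* Hypothetical ring buffer with monotonically increasing counters. */
struct example_vector {
   unsigned head; /* index one past the newest element */
   unsigned tail; /* index of the oldest element */
   int data[EXAMPLE_CAPACITY];
};

/* Add a new element at the head. Returns -1 if the buffer is full. */
static int
example_vector_add(struct example_vector *v, int value)
{
   if (v->head - v->tail == EXAMPLE_CAPACITY)
      return -1;
   v->data[v->head % EXAMPLE_CAPACITY] = value;
   v->head++;
   return 0;
}

/* Newest element. The vector must not be empty. */
static int *
example_vector_head(struct example_vector *v)
{
   return &v->data[(v->head - 1) % EXAMPLE_CAPACITY];
}

/* Oldest element, i.e. the first one that was added. The vector must
 * not be empty. */
static int *
example_vector_tail(struct example_vector *v)
{
   return &v->data[v->tail % EXAMPLE_CAPACITY];
}
```

After adding three values in order, `example_vector_tail()` points at the first value added and `example_vector_head()` at the last, which is why a printer that wants the first batch in a chain has to start from the tail.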
* loader: default to iris for all future PCI IDs | Eric Engestrom | 2019-10-28 | 2 | -0/+3
|
| The existing "fallback" code didn't actually do anything, so this removes it, and instead we just always fall back to `iris` for future PCI IDs.
|
| Suggested-by: Kenneth Graunke <kenneth@whitecape.org>
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* anv: add a couple printflike() annotations | Eric Engestrom | 2019-10-28 | 1 | -2/+4
|
| Signed-off-by: Eric Engestrom <eric.engestrom@intel.com>
| Reviewed-by: Kristian H. Kristensen <hoegsberg@google.com>
* intel/isl: Support lossless compression with multisamples | Sagar Ghuge | 2019-10-28 | 1 | -5/+1
|
| GEN12 adds the ability to losslessly compress each sample plane in a multisampled buffer that uses MCS compression.
|
| v2: Remove unnecessary assertion (Nanley Chery)
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/blorp: Use isl_aux_usage_has_mcs instead of comparing | Sagar Ghuge | 2019-10-28 | 1 | -5/+7
|
| Depending on MCS_CCS or MCS we can emit blorp blit shaders. As we support both MCS_CCS and MCS, it makes sense to use the isl_aux_usage_has_mcs function.
|
| v2: Fix commit message (Nanley Chery)
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/isl: Don't reconfigure aux surfaces for MCS | Sagar Ghuge | 2019-10-28 | 1 | -0/+3
|
| If aux for MCS is already configured, don't configure again.
|
| v2: Fix missing period in commit message (Nanley Chery)
|
| Signed-off-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Nanley Chery <nanley.g.chery@intel.com>
* intel/blorp: Satisfy clear color rules for HIZ_CCS | Nanley Chery | 2019-10-28 | 1 | -1/+35
|
| Store the converted depth value into two dwords. Avoids regressing the piglit test "fbo-depth-array depth-clear", when HIZ_CCS sampling is enabled in a later commit.
|
| Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Fix and use HIZ_CCS write through mode | Nanley Chery | 2019-10-28 | 2 | -0/+7
|
| Write through to the CCS if the surface is used as a texture and can be sampled by the HW with CCS.
|
| Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/blorp: Satisfy HIZ_CCS fast-clear alignments | Nanley Chery | 2019-10-28 | 1 | -0/+47
|
| Prevent the piglit test, amd_vertex_shader_layer-layered-depth-texture-render, from regressing in a future commit.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Refactor blorp_can_hiz_clear_depth() | Nanley Chery | 2019-10-28 | 3 | -16/+19
|
| Prepare this function to be used in iris and to handle new Gen12 behavior.
|
| Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* isl: Add isl_surf_supports_hiz_ccs_wt() | Nanley Chery | 2019-10-28 | 2 | -0/+18
|
| Add a helper to determine if an ISL surface supports the write-through mode of HIZ_CCS.
|
| Reviewed-by: Sagar Ghuge <sagar.ghuge@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/blorp: Treat HIZ_CCS like HiZ | Nanley Chery | 2019-10-28 | 1 | -2/+2
|
| Allow it in depth buffer instructions but disable it for blits.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/blorp: Assert against HiZ in surface states | Nanley Chery | 2019-10-28 | 1 | -2/+1
|
| Avoid unexpected behavior if the caller happens to pass in a HiZ aux usage.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Support HIZ_CCS in isl_surf_get_ccs_surf | Nanley Chery | 2019-10-28 | 3 | -7/+38
|
| Add an extra aux parameter which will be filled out with CCS if the first two isl_surf parameters fit the requirements for HiZ_CCS.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* isl: Reduce assertions during aux surf creation | Nanley Chery | 2019-10-28 | 1 | -5/+15
|
| Return false more often to reduce the burden on the caller.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Enable CCS_E for R24_UNORM_X8_TYPELESS on TGL+ | Nanley Chery | 2019-10-28 | 2 | -1/+2
|
| While this format isn't listed in BSpec: 53911, other documentation and empirical evidence suggest that it's fine to remap it to R32_FLOAT. I've filed a bug for the BSpec page.
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Use 3DSTATE_DEPTH_BUFFER::ControlSurfaceEnable | Nanley Chery | 2019-10-28 | 2 | -1/+2
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel/isl: Support HIZ_CCS in emit_depth_stencil_hiz | Jason Ekstrand | 2019-10-28 | 1 | -2/+10
|
| v2. Remove undocumented CCS_E-only mode for depth. (Nanley)
|
| Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Use RENDER_SURFACE_STATE::DepthStencilResource | Nanley Chery | 2019-10-28 | 2 | -0/+6
|
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>
* intel: Update alignment restrictions for HiZ surfaces. | Jordan Justen | 2019-10-28 | 1 | -1/+7
|
| v2 (Nanley):
|  * Maintain a chronological ordering for HiZ alignments. Suggested by Ken.
|
| Co-authored-by: Nanley Chery <nanley.g.chery@intel.com>
| Reviewed-by: Kenneth Graunke <kenneth@whitecape.org>