summaryrefslogtreecommitdiffstats
path: root/src/intel
Commit message (Collapse)AuthorAgeFilesLines
* intel: Fix broxton 2x6 way size computationAnuj Phogat2017-06-061-0/+4
| | | | | | | | | | | | | | | | | This patch is undoing the changes to way size computation in broxton 2x6, made by below commit: Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138 Author: Anuj Phogat <[email protected]> i965: Simplify l3 way size computations By making use of l3_banks field in gen_device_info struct l3_way_size for gen7+ = 2 * l3_banks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306 Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Mark Janes <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* tree-wide: remove trailing backslashEric Engestrom2017-06-071-1/+1
| | | | | | | | | Simple search for a backslash followed by two newlines. If one of the newlines were to be removed, this would cause issues, so let's just remove these trailing backslashes. Signed-off-by: Eric Engestrom <[email protected]> Reviewed-by: Ian Romanick <[email protected]>
* anv: Set better descriptor set limitsAlex Smith2017-06-061-3/+6
| | | | | | | | | | | | | Based on discussions with Jason, Ivy Bridge and Bay Trail only actually support 16 samplers, while newer hardware can support more than the current limit of 64. Therefore set the lower limit where needed, and bump up to 128 for everything else. There is also a limit on the total number of other resources of around 250. This allows Dawn of War III to render correctly on ANV. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Set driver version to Mesa versionAlex Smith2017-06-061-1/+1
| | | | | | | | | | | | As already done by RADV. v2: Move version calculation function to src/vulkan/util to share with RADV. Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* util/vulkan: Move Vulkan utilities to src/vulkan/utilAlex Smith2017-06-068-8/+8
| | | | | | | | | | | | | We have Vulkan utilities in both src/util and src/vulkan/util. The latter seems a more appropriate place for Vulkan-specific things, so move them there. v2: Android build system changes (from Tapani Pälli) Signed-off-by: Alex Smith <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* intel: gen-decoder: rework how we handle groupsLionel Landwerlin2017-06-063-104/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of handling groups doesn't seem to be able to handle MI_LOAD_REGISTER_* with more than one register. This change reworks the way we handle groups by building a traversal list on loading the GENXML files. Let's say you have Instruction { Field0 Field1 Field2 Group0 (count=2) { Field0-0 Field0-1 } Group1 (count=4) { Field1-0 Field1-1 } } We build of linked on load that goes : Instruction -> Group0 -> Group1 All of those are gen_group structures, making the traversal trivial. We just need to iterate groups for the right number of timers (count field in genxml). The more fancy case is when you have only a single group of unknown size (count=0). In that case we keep on reading that group for as long as we're within the DWordLength of that instruction. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.Kenneth Graunke2017-06-053-3/+2
| | | | | | | | | We moved to INTEL_SCALAR_* when we added more than a single stage, but never went back and converted the VS to work that way. Be consistent. Also update the documentation to actually mention these debug variables. Acked-by: Jason Ekstrand <[email protected]>
* i965: Simplify l3 way size computationsAnuj Phogat2017-06-021-10/+2
| | | | | | | | | | | By making use of l3_banks field in gen_device_info struct l3_way_size for gen7+ = 2 * l3_banks. V2: Keep the get_l3_way_size() function. Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Add and initialize l3_banks field for gen7+Anuj Phogat2017-06-022-3/+27
| | | | | | | | | | | This new field helps simplify l3 way size computations in next patch. V2: Initialize the l3_banks to 0 in macros. Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* intel/blorp: Handle gen6 stencil/HiZ offsets in the back-endJason Ekstrand2017-06-011-2/+30
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add a helper for getting the byte/tile offset of a subimageJason Ekstrand2017-06-013-9/+64
| | | | | | | | | Frequently, get_image_offset_sa is combined with get_intratile_offset_sa so it makes sense to have a single helper to do both. If the caller doesn't want the intratile offsets, it can simply pass NULL and ISL will assert that they are 0. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Make get_intratile_offset_el take the element size in bitsJason Ekstrand2017-06-012-8/+5
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add a new layout for HiZ and stencil on Sandy BridgeJason Ekstrand2017-06-012-5/+197
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Generate phys_total_el from isl_calc_phys_extentJason Ekstrand2017-06-011-68/+97
| | | | | | | | | | The only surface layout for which slice0 makes any sense is GEN4_2D. Move all of the slice0 stuff into isl_calc_phys_total_extent_el_gen4_2d and make the others trivially return the total size in surface elements. As a side-effect, array_pitch_el_rows is now returned from these helpers as well. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Don't check array pitch for gen4 3D texturesJason Ekstrand2017-06-011-1/+0
| | | | | | Array pitch doesn't matter in this layout. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Refactor to use a phys_total_el extent.Jason Ekstrand2017-06-011-19/+19
| | | | | | | We've already implicitly been using a physical total size in surface elements. This just centralizes things a bit. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Add an isl_assert_div helperJason Ekstrand2017-06-011-0/+7
| | | | | | | This is a fairly common operation and it's nice to be able to just call the one little function. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Refactor isl_calc_array_pitch_el_rowsJason Ekstrand2017-06-011-47/+46
| | | | | | | | | | | Over 90% of the function only applies to ISL_DIM_LAYOUT_GEN4_2D anyway so we can just handle the other two as special cases at the top. The two "generic" cases below the switch only apply on gen9 and above and only to 3D or CCS surfaces. This implies that they only apply to surfaces with ISL_DIM_LAYOUT_GEN4_2D. Making them look generic is a lie. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Move isl_calc_array_pitch_el_rows higher upJason Ekstrand2017-06-011-117/+117
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/isl: Remove the device parameter from isl_tiling_get_infoJason Ekstrand2017-06-014-30/+16
| | | | | | | | | We were only using it for validating that we don't use Ys/Yf on gen8 and earlier. Removing it from isl_tiling_get_info lets us remove it from a bunch of other things that had no business needing a hardware generation. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Drop duplicate shadow variable.Kenneth Graunke2017-06-011-1/+0
| | | | | | We already initialized this at the top of the function. Trivial.
* genxml: Make 3DSTATE_CONSTANT_BODY on Gen7+ use arrays.Kenneth Graunke2017-06-015-36/+28
| | | | | | This will let us initialize the constant buffers with loops. Reviewed-by: Lionel Landwerlin <[email protected]>
* genxml: Fix decoder to print the array element on field members.Kenneth Graunke2017-06-011-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Previously we'd print things like: 0xfffbb568: 0x00010000 : Dword 1 ReadLength: 0 ReadLength: 1 0xfffbb568: 0x00000001 : Dword 1 ReadLength: 1 ReadLength: 0 instead of the more obvious: 0xfffbb568: 0x00010000 : Dword 1 ReadLength[0]: 0 ReadLength[1]: 1 0xfffbb568: 0x00000001 : Dword 1 ReadLength[2]: 1 ReadLength[3]: 0 (Yes, the ralloc context here is bogus - the decoder leaks just about everything. We need to use proper ralloc contexts someday...) Acked-by: Lionel Landwerlin <[email protected]>
* genxml: Fix decoding of array groups.Kenneth Graunke2017-06-011-1/+1
| | | | | | | | | | | | | | | | | | | If you had a group as the first element of a struct, i.e. <struct name="3DSTATE_CONSTANT_BODY" length="10"> <group count="4" start="0" size="16"> <field name="ReadLength" start="0" end="15" type="uint"/> </group> ... </struct> we would get a group_offset of 0, causing create_field() to think the field wasn't in a group, and fail to offset forward for successive array elements. So we'd mark all the array elements as offset 0. Using ctx->group->elem_size is a better check for "are we in a group?". Acked-by: Lionel Landwerlin <[email protected]>
* genxml: Fix decoder for groups with multiple fields.Kenneth Graunke2017-06-011-4/+2
| | | | | | | | | | | | | | | | | If you have something like: <group count="0" start="96" size="32"> <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/> <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/> </group> We would reset ctx->group_count to 0 after processing the first field, so the second would not have a group count. This is largely untested, as the only groups with multiple fields are packets we don't emit in Mesa. Found by inspection. Acked-by: Lionel Landwerlin <[email protected]>
* genxml: Fix parsing of address fields in groups.Kenneth Graunke2017-06-011-2/+2
| | | | | | | | | | | | | | | | | | | | | | | | | | | | For example, <group count="4" start="64" size="64"> <field name="Pointer" start="5" end="63" type="address"/> </group> used to generate: const uint64_t v2_address = __gen_combine_address(data, &dw[2], values->Pointer, 0); ... const uint64_t v4_address = __gen_combine_address(data, &dw[4], values->Pointer, 0); ... but now generates code with proper subscripts: const uint64_t v2_address = __gen_combine_address(data, &dw[2], values->Pointer[0], 0); ... const uint64_t v4_address = __gen_combine_address(data, &dw[4], values->Pointer[1], 0); ... Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Move SOL PSIZ hacks from draw time to link time.Kenneth Graunke2017-06-011-12/+1
| | | | | | | | | We can just update the gl_transform_feedback_info fields at link time to make the VUE header fields have the right location and component. Then we don't need to handle them specially at draw time, which is expensive. Reviewed-by: Rafael Antognolli <[email protected]>
* anv: Port over CACHE_MODE_1 optimization fix enables from brw.Kenneth Graunke2017-05-301-0/+13
| | | | | | | | | Ben and I haven't observed these to help anything, but they enable hardware optimizations for particular cases. It's probably best to enable them ahead of time, before we run into such a case. Reviewed-by: Plamena Manolova <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* genxml: Add Gen9 CACHE_MODE_1 definitons.Kenneth Graunke2017-05-301-0/+30
| | | | | | | | | These were already in gen8.xml but not gen9.xml. There are a few new fields and a couple that have changed. These are all documented in the Skylake PRM, Volume 2c Command Reference: Registers, Part 1. Reviewed-by: Plamena Manolova <[email protected]> Acked-by: Jason Ekstrand <[email protected]>
* genxml: Make a SCISSOR_RECT structure on Gen4-5.Kenneth Graunke2017-05-293-12/+24
| | | | | | | | | | | | | Gen6+ support multiple scissor rectangles, and define a SCISSOR_RECT structure containing their dimensions. On Gen4-5, those same fields exist in SF_VIEWPORT. This patch extracts the SF_VIEWPORT fields into a SCISSOR_RECT structure. Although not a named concept on Gen4-5, it works just as well, and gives us a consistent SCISSOR_RECT structure across all generations, making it easier to reuse code. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Ignore INTEL_SCALAR_* debug variables on Gen10+.Kenneth Graunke2017-05-291-10/+16
| | | | | | | | | | | Scalar mode has been default since Broadwell, and vector mode is getting increasingly unmaintained. There are a few things that don't even fully work in vector mode on Skylake, but we've never cared because nobody uses it. There's no point in porting it forward to new platforms. So, just ignore the debug options to force it on. Reviewed-by: Jason Ekstrand <[email protected]>
* anv: automake: list shared libraries after the static onesEmil Velikov2017-05-291-16/+15
| | | | | | | | | | The compiler can discard the shared ones from the link chain, since there is no user (the static libraries) before it on the command line. Cc: [email protected] Reported-by: Laurent Carlier <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eduardo Lima Mitev <[email protected]>
* i965: Use BLORP for color clears on gen4-5Jason Ekstrand2017-05-261-0/+4
| | | | | | | | We don't support replicated data clears yet. Those take a bit more work and enabling replicated data clears in its own commit is probably better for bisectibility anyway. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Assert that no one tries to blit combined depth stencilJason Ekstrand2017-05-261-0/+6
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Add blorp support for gen4-5Jason Ekstrand2017-05-264-12/+95
| | | | | | | | | | Due to complications with things such as URB setup on gen4-5, it's easier to keep gen4 support in blorp completely internal to i965. This makes things a bit awkward because that means there's a file in i965 that includes blorp_priv.h but it's either that or have a file in blorp that includes brw_context.h. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Set additional brw_wm_prog_key fields on gen4-5Jason Ekstrand2017-05-262-2/+9
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Add support for gen4-5 SF programsJason Ekstrand2017-05-266-3/+85
| | | | | | | | As part of enabling support for SF programs, we plumb the SF URB size through to emit_urb_config. For now, it's always zero but, on gen4, it may be something larger. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Make convert_to_single_slice available outside blorp_blitJason Ekstrand2017-05-262-8/+11
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Use designated initializers to set up VERTEX_ELEMENTSJason Ekstrand2017-05-261-32/+43
| | | | | | | | We also add a slot variable and use it as an iterator. This will make it much easier to conditionally put something between the header and the vertex position. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Rename emit_viewport_state to emit_cc_viewportJason Ekstrand2017-05-261-3/+3
| | | | | | | The real point of this packet is that it sets up CC_VIEWPORT so that name is a bit better. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Make the common genX_blorp_exec code gen4-safeJason Ekstrand2017-05-261-8/+36
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Re-arrange blorp_genX_exec.hJason Ekstrand2017-05-261-229/+229
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Don't use ffma directlyJason Ekstrand2017-05-261-1/+1
| | | | | | | It isn't supported prior to gen6 and, on gen6+, NIR will fuse the fmul and fadd into an ffma automatically for us anyway. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Delete isl_to_gen_ds_surfypeJason Ekstrand2017-05-261-19/+0
| | | | | | It's no longer used. Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp: Pull the pipeline bits of blorp_exec into a helperJason Ekstrand2017-05-261-25/+31
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/blorp/blit: Add support for normalized coordinatesJason Ekstrand2017-05-262-5/+28
| | | | | | | Gen5 and earlier can't do non-normalized coordinates so we need to compensate in the shader. Fortunately, it's pretty easy plumb through. Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move clip program compilation to the compilerJason Ekstrand2017-05-269-0/+2347
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* i965: Move SF compilation to the compilerJason Ekstrand2017-05-264-0/+932
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/compiler: Make brw_disasm take const assemblyJason Ekstrand2017-05-263-15/+15
| | | | Reviewed-by: Topi Pohjolainen <[email protected]>
* intel/decoder: Handle the BLT ring in gen_group_get_lengthJason Ekstrand2017-05-261-0/+4
| | | | Reviewed-by: Jordan Justen <[email protected]>