summaryrefslogtreecommitdiffstats
path: root/src/intel/common
Commit message (Collapse)AuthorAgeFilesLines
* intel: decoder: enable decoding a single fieldLionel Landwerlin2017-11-012-0/+52
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: expose missing find_enum()Lionel Landwerlin2017-11-011-0/+2
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: extract field value computationLionel Landwerlin2017-11-011-30/+37
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: rename field() to field_value()Lionel Landwerlin2017-11-011-18/+18
| | | | | | | We would like to avoid collisions with variables named field. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: rename internal function to free nameLionel Landwerlin2017-11-011-3/+3
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: simplify field_is_header()Lionel Landwerlin2017-11-012-4/+6
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: common: make intel utils available from C++Lionel Landwerlin2017-11-012-0/+17
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: remove unused platform fieldLionel Landwerlin2017-11-011-2/+0
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: extract instruction/structs lengthLionel Landwerlin2017-11-012-0/+8
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: pack iterator variable declarationsLionel Landwerlin2017-11-011-11/+8
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: simplify creation of struct when 0-allocatedLionel Landwerlin2017-11-011-4/+0
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: add destructor for gen_specLionel Landwerlin2017-11-012-102/+91
| | | | | | | | This makes use of ralloc to simplify the destruction. We can also store instructions in hash tables. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: expose helper to test header fieldsLionel Landwerlin2017-11-012-3/+4
| | | | | | | | These fields are of little importance as they're used to recognize instructions. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: don't read qword outside instruction/struct limitLionel Landwerlin2017-11-012-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | We used to print invalid data when the last field was being clamped to 32bits due to Dword Length of the whole instruction. Here is an example where the decoder read part of the next instruction instead of stopping at the 32bit limit: 0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM 0x000ce0b4: 0x10000002 : Dword 0 DWord Length: 2 Store Qword: 0 Use Global GTT: false 0x000ce0b8: 0x00045010 : Dword 1 Core Mode Enable: 0 Address: 0x00045010 0x000ce0bc: 0x00000000 : Dword 2 0x000ce0c0: 0x00000000 : Dword 3 Immediate Data: 8791026489807077376 With this change we have the proper value : 0x000ce0b4: 0x10000002: MI_STORE_DATA_IMM (4 Dwords) 0x000ce0b4: 0x10000002 : Dword 0 DWord Length: 2 Store Qword: 0 Use Global GTT: false 0x000ce0b8: 0x00045010 : Dword 1 Core Mode Enable: 0 Address: 0x00045010 0x000ce0bc: 0x00000000 : Dword 2 0x000ce0c0: 0x00000000 : Dword 3 Immediate Data: 0 Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: split out getting the next field and decoding itLionel Landwerlin2017-11-011-10/+21
| | | | | | | | | Due to the new way we handle fields, we need *not* to forget the first field when decoding instructions. The issue was that the advance function was called first and skipped the first field. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: move field name copyLionel Landwerlin2017-11-011-2/+7
| | | | | | | This should be inside the function that actually decodes fields. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: reorder iterator init functionLionel Landwerlin2017-11-011-14/+14
| | | | | | | Making the next change more readable. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: common: print out all dword with field spanning multiple dwordsLionel Landwerlin2017-11-011-4/+6
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | For example, we were skipping Dword 3 in this PIPE_CONTROL : 0x000ce130: 0x7a000004: PIPE_CONTROL DWord Length: 4 0x000ce134: 0x00000010 : Dword 1 Flush LLC: false Destination Address Type: 0 (PPGTT) LRI Post Sync Operation: 0 (No LRI Operation) Store Data Index: 0 Command Streamer Stall Enable: false Global Snapshot Count Reset: false TLB Invalidate: false Generic Media State Clear: false Post Sync Operation: 0 (No Write) Depth Stall Enable: false Render Target Cache Flush Enable: false Instruction Cache Invalidate Enable: false Texture Cache Invalidation Enable: false Indirect State Pointers Disable: false Notify Enable: false Pipe Control Flush Enable: false DC Flush Enable: false VF Cache Invalidation Enable: true Constant Cache Invalidation Enable: false State Cache Invalidation Enable: false Stall At Pixel Scoreboard: false Depth Cache Flush Enable: false 0x000ce138: 0x00000000 : Dword 2 Address: 0x00000000 0x000ce140: 0x00000000 : Dword 4 Immediate Data: 0 Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: decoder: build sorted linked lists of fieldsLionel Landwerlin2017-11-012-25/+34
| | | | | | | | | | The xml files don't always have fields in order. This might confuse our parsing of the commands. Let's have the fields in order. To do this, the easiest way it to use a linked list. It also helps a bit with the iterator. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel: common: expose gen_spec fieldsLionel Landwerlin2017-11-012-13/+13
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Scott D Phillips <[email protected]>
* intel/genxml: Fix decoding of groups with fields smaller than a DWord.Kenneth Graunke2017-10-302-10/+16
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | Groups containing fields smaller than a DWord were not being decoded correctly. For example: <group count="32" start="32" size="4"> <field name="Vertex Element Enables" start="0" end="3" type="uint"/> </group> gen_field_iterator_next would properly walk over each element of the array, incrementing group_iter, and calling iter_group_offset_bits() to advance to the proper DWord. However, the code to print the actual values only considered iter->field->start/end, which are 0 and 3 in the above example. So it would always fetch bits 3:0 of the current DWord when printing values, instead of advancing to each element of the array, printing bits 0-3, 4-7, 8-11, and so on. To fix this, we add new iter->start/end tracking, which properly advances for each instance of a group's field. Caught by Matt Turner while working on 3DSTATE_VF_COMPONENT_PACKING, with a patch to convert it to use an array of bitfields (the example above). This also fixes the decoding of 3DSTATE_SBE's "Attribute Active Component Format" fields. Reviewed-by: Jordan Justen <[email protected]>
* intel: common: silence compiler warningLionel Landwerlin2017-10-301-1/+1
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* meson: move expat dependency where it's neededEric Engestrom2017-10-181-1/+1
| | | | | | Suggested-by: Lionel Landwerlin <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]> Signed-off-by: Eric Engestrom <[email protected]>
* intel: Add simple logging façade for Android (v2)Chad Versace2017-10-173-0/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm bringing up Vulkan in the Android container of Chrome OS (ARC++). On Android, stdio goes to /dev/null. On Android, remote gdb is even more painful than the usual remote gdb. On Android, nothing works like you expect and debugging is hell. I need logging. This patch introduces a small, simple logging API that can easily wrap Android's API. On non-Android platforms, this logger does nothing fancy. It follows the time-honored Unix tradition of spewing everything to stderr with minimal fuss. My goal here is not perfection. My goal is to make a minimal, clean API, that people hate merely a little instead of a lot, and that's good enough to let me bring up Android Vulkan. And it needs to be fast, which means it must be small. No one wants to their game to miss frames while aiming a flaming bow into the jaws of an angry robot t-rex, and thus become t-rex breakfast, because some fool had too much fun desiging a bloated, ideal logging API. If people like it, perhaps we should quickly promote it to src/util. The API looks like this: #define INTEL_LOG_TAG "intel-vulkan" #define DEBUG intel_logd("try hard thing with foo=%d", foo); n = try_foo(...); if (n < 0) { intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__); return VK_ERROR_DEVICE_LOST; } And produces this on non-Android: intel-vulkan: debug: try hard thing with foo=93 intel-vulkan: error: anv_device.c:182: foo failed bigtime v2: Fix meson build. [for dcbaker] Reviewed-by: Jason Ekstrand <[email protected]>
* intel/common: Improve the comments for sample positionsJason Ekstrand2017-10-161-0/+65
| | | | | | | These are pulled directly from brw_multisample_state.h Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Add parentheses around usage of macro argumentsMatt Turner2017-10-041-1/+1
| | | | | | Otherwise I cannot use this macro in test_eu_validate.cpp Reviewed-by: Iago Toral Quiroga <[email protected]>
* meson: Add build Intel "anv" vulkan driverDylan Baker2017-09-271-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows building and installing the Intel "anv" Vulkan driver using meson and ninja, the driver has been tested against the CTS and has seems to pass the same series of tests (they both segfault when the CTS tries to run wayland wsi tests). There are still a mess of TODO, XXX, and FIXME comments in here. Those are mostly for meson bugs I'm trying to fix, or for additional things to implement for other drivers/features. I have configured all intermediate libraries and optional tools to not build by default, meaning they will only be built if they're pulled in as a dependency of a target that will actually be installed) this allows us to avoid massive if chains, while ensuring that only the bits that need to be built are. v2: - enable anv, x11, and wayland by default - add configure option to disable valgrind v3: - fix typo in meson_options (Nicholas) v4: - Remove dead code (Eric) - Remove change to generator that was from v0 (Eric) - replace if chain with loop (Eric) - Fix typos (Eric) - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric) v5: - rebase on util string buffer implementation Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (v4)
* Revert "intel: Remove unused device info for KBL GT1.5"Anuj Phogat2017-09-211-0/+11
| | | | | | | | This reverts commit 4c4c28ca70b2267a2563047e35498b1c9252664f. GT1.5 device info is required for few reserved pci-id's. Signed-off-by: Anuj Phogat <[email protected]>
* i965: Add an INTEL_DEBUG=reemit option.Kenneth Graunke2017-09-152-0/+2
| | | | | | | | | Jason and I use this for debugging all the time. Recompiling the driver to enable it is kind of annoying. It's a great thing to try along with always_flush_batch=true and always_flush_cache=true to detect a class of problems - namely, atoms listening to an insufficient set of dirty bits. Reviewed-by: Matt Turner <[email protected]>
* i965: Add an INTEL_DEBUG=submit option for printing batch statistics.Kenneth Graunke2017-09-132-1/+2
| | | | | | | | | | | | | | | | | | | When a batch is submitted, INTEL_DEBUG=bat prints a message indicating which part of the code triggered the flush, and some statistics about the batch/state buffer utilization. It also decodes the batchbuffer in debug builds...which is so much output that it drowns out the utilization messages, if that's all you care about. INTEL_DEBUG=submit now just does the utilization messages. INTEL_DEBUG=bat continues to do both (as the message is a good indicator that we're starting decode of a new batch). v2: Rename from "flush" to "submit" (suggested by Chris) because we might want "flush" for PIPE_CONTROL debugging someday. Reviewed-by: Chris Wilson <[email protected]>
* intel: Remove unused device info for KBL GT1.5Anuj Phogat2017-09-061-11/+0
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/decoder: Reuse the gen_make_gen() helper.Eric Anholt2017-07-251-3/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Reuse the MAX2 macro instead of defining another one.Eric Anholt2017-07-251-3/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: add number of subslices to device infoLionel Landwerlin2017-07-112-8/+54
| | | | | | | | | | | We could have used a single integer to store that value, but Cannonlake has different number of subslices per slice depending on the GT. v2: Add CFL subslice numbers (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* intel: Fix clflushing on modern (Baytrail+) Atom CPUs.Kenneth Graunke2017-07-101-0/+12
| | | | | | | | Thanks to Chris Wilson for pointing this out. Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Lionel Landwerlin <[email protected]>
* intel: Move clflush helpers from anv to common/gen_clflush.h.Kenneth Graunke2017-07-101-0/+56
| | | | | | | | | I want to use these in the OpenGL driver as well. v2: Add to COMMON_FILES in Makefile.sources (caught by Emil) Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/CFL: Add PCI Ids for Coffee Lake.Anusha Srivatsa2017-06-222-0/+27
| | | | | | | | | | | | | | Coffee Lake has a gen9 graphics following KBL. From 3D perspective, CFL is a clone of KBL/SKL features. v2: Change commit message, correct alignment <Anuj Phogat> v3: Update IDs. v4: Initialize l3_banks, correct nomenclature <Anuj> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Anusha Srivatsa <[email protected]> Acked-by: Benjamin Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel: compiler/i965: fix is_broxton checksLionel Landwerlin2017-06-201-0/+3
| | | | | | | | | | In 5f2fe9302c is_geminilake was introduced for the differenciate broxton from geminilake. Unfortunately I failed as verifying that is_broxton is throughout the code base to mean Gen9lp. Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cnl: Add l3 configuration for CannonlakeBen Widawsky2017-06-201-1/+20
| | | | | | | | | | | | | | | | | | V2 (Anuj): Squash the changes in one patch rebase on master. Address the review comments made by Francisco Jerez. Do the URB allocation per slice (not per bank). V3 (Anuj): Update the comment. Format the table as other l3 config tables. Signed-off-by: Ben Widawsky <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> --- V1 was sent out with the heading: "i965/cnl: Properly handle l3 configuration"
* i965: Add a variable for way size per bank in get_l3_way_size()Anuj Phogat2017-06-201-5/+4
| | | | | | | | | | Adding this variable better explains the computation of L3 way size in the function. V2: Use const variable for way_size_per_bank. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Fix broxton 2x6 l3 configAnuj Phogat2017-06-201-0/+16
| | | | | | | | | | | The new table added in this patch matches with the table in gfxspecs. We were programming the wrong values earlier. V2: Update the comment. Cc: "17.1" <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* intel: common: add number of thread per euLionel Landwerlin2017-06-192-2/+28
| | | | | | | This will be used by to normalize OA counters. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: common: express timestamps units in frequencyLionel Landwerlin2017-06-192-11/+13
| | | | | | | | | | | | | | Rather than storing the period as a double that looses some precision. Also fixes the Gen9LP timestamp frequency which is no 19200123 but 19200000 as pointed by Ville : https://lists.freedesktop.org/archives/intel-gfx/2017-April/125126.html Finally add the Cannonlake timestamp frequency. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: common: add flag to identify platforms by nameLionel Landwerlin2017-06-192-6/+24
| | | | | | | | The perf infrastructure needs to identify specific platforms, not just generations. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cnl: Add a preliminary device for CannonlakeBen Widawsky2017-06-091-0/+46
| | | | | | | | | | | | | | | | v2 (Anuj): Rebased on master and updated pci ids Remove redundant initialization of max_wm_threads to 64 * 12. For gen9+ max_wm_threads are initialized in gen_get_device_info(). v3 (Anuj): Move the patch to end of series. Remove unused gt1, gt2, gt3 functions. Remove l3_banks variable. Variable is now available on master. Signed-off-by: Anuj Phogat <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/cnl: Handle gen10 in switch cases across the driverAnuj Phogat2017-06-091-0/+1
| | | | | | | | | V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec() gen10_init_atoms() (Jason) Remove Vulkan changes. Do them later in a separate patch. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make feature macros gen8 basedBen Widawsky2017-06-091-8/+5
| | | | | | | | | | All the "features" of the hardware are similar starting with GEN8, so remove as much of the GEN9 uniqueness as possible. This makes implementing future gen platforms a bit easier. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel: Fix broxton 2x6 way size computationAnuj Phogat2017-06-061-0/+4
| | | | | | | | | | | | | | | | | This patch is undoing the changes to way size computation in broxton 2x6, made by below commit: Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138 Author: Anuj Phogat <[email protected]> i965: Simplify l3 way size computations By making use of l3_banks field in gen_device_info struct l3_way_size for gen7+ = 2 * l3_banks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306 Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Mark Janes <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* intel: gen-decoder: rework how we handle groupsLionel Landwerlin2017-06-062-86/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of handling groups doesn't seem to be able to handle MI_LOAD_REGISTER_* with more than one register. This change reworks the way we handle groups by building a traversal list on loading the GENXML files. Let's say you have Instruction { Field0 Field1 Field2 Group0 (count=2) { Field0-0 Field0-1 } Group1 (count=4) { Field1-0 Field1-1 } } We build of linked on load that goes : Instruction -> Group0 -> Group1 All of those are gen_group structures, making the traversal trivial. We just need to iterate groups for the right number of timers (count field in genxml). The more fancy case is when you have only a single group of unknown size (count=0). In that case we keep on reading that group for as long as we're within the DWordLength of that instruction. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.Kenneth Graunke2017-06-052-2/+1
| | | | | | | | | We moved to INTEL_SCALAR_* when we added more than a single stage, but never went back and converted the VS to work that way. Be consistent. Also update the documentation to actually mention these debug variables. Acked-by: Jason Ekstrand <[email protected]>