summaryrefslogtreecommitdiffstats
path: root/src/intel/common
Commit message (Collapse)AuthorAgeFilesLines
...
* intel: Add simple logging façade for Android (v2)Chad Versace2017-10-173-0/+170
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | I'm bringing up Vulkan in the Android container of Chrome OS (ARC++). On Android, stdio goes to /dev/null. On Android, remote gdb is even more painful than the usual remote gdb. On Android, nothing works like you expect and debugging is hell. I need logging. This patch introduces a small, simple logging API that can easily wrap Android's API. On non-Android platforms, this logger does nothing fancy. It follows the time-honored Unix tradition of spewing everything to stderr with minimal fuss. My goal here is not perfection. My goal is to make a minimal, clean API, that people hate merely a little instead of a lot, and that's good enough to let me bring up Android Vulkan. And it needs to be fast, which means it must be small. No one wants to their game to miss frames while aiming a flaming bow into the jaws of an angry robot t-rex, and thus become t-rex breakfast, because some fool had too much fun desiging a bloated, ideal logging API. If people like it, perhaps we should quickly promote it to src/util. The API looks like this: #define INTEL_LOG_TAG "intel-vulkan" #define DEBUG intel_logd("try hard thing with foo=%d", foo); n = try_foo(...); if (n < 0) { intel_loge("%s:%d: foo failed bigtime", __FILE__, __LINE__); return VK_ERROR_DEVICE_LOST; } And produces this on non-Android: intel-vulkan: debug: try hard thing with foo=93 intel-vulkan: error: anv_device.c:182: foo failed bigtime v2: Fix meson build. [for dcbaker] Reviewed-by: Jason Ekstrand <[email protected]>
* intel/common: Improve the comments for sample positionsJason Ekstrand2017-10-161-0/+65
| | | | | | | These are pulled directly from brw_multisample_state.h Reviewed-by: Kenneth Graunke <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* i965: Add parentheses around usage of macro argumentsMatt Turner2017-10-041-1/+1
| | | | | | Otherwise I cannot use this macro in test_eu_validate.cpp Reviewed-by: Iago Toral Quiroga <[email protected]>
* meson: Add build Intel "anv" vulkan driverDylan Baker2017-09-271-0/+44
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This allows building and installing the Intel "anv" Vulkan driver using meson and ninja, the driver has been tested against the CTS and has seems to pass the same series of tests (they both segfault when the CTS tries to run wayland wsi tests). There are still a mess of TODO, XXX, and FIXME comments in here. Those are mostly for meson bugs I'm trying to fix, or for additional things to implement for other drivers/features. I have configured all intermediate libraries and optional tools to not build by default, meaning they will only be built if they're pulled in as a dependency of a target that will actually be installed) this allows us to avoid massive if chains, while ensuring that only the bits that need to be built are. v2: - enable anv, x11, and wayland by default - add configure option to disable valgrind v3: - fix typo in meson_options (Nicholas) v4: - Remove dead code (Eric) - Remove change to generator that was from v0 (Eric) - replace if chain with loop (Eric) - Fix typos (Eric) - define HAVE_DLOPEN for both libdl and builtin dl cases (Eric) v5: - rebase on util string buffer implementation Signed-off-by: Dylan Baker <[email protected]> Reviewed-by: Eric Anholt <[email protected]> (v4)
* Revert "intel: Remove unused device info for KBL GT1.5"Anuj Phogat2017-09-211-0/+11
| | | | | | | | This reverts commit 4c4c28ca70b2267a2563047e35498b1c9252664f. GT1.5 device info is required for few reserved pci-id's. Signed-off-by: Anuj Phogat <[email protected]>
* i965: Add an INTEL_DEBUG=reemit option.Kenneth Graunke2017-09-152-0/+2
| | | | | | | | | Jason and I use this for debugging all the time. Recompiling the driver to enable it is kind of annoying. It's a great thing to try along with always_flush_batch=true and always_flush_cache=true to detect a class of problems - namely, atoms listening to an insufficient set of dirty bits. Reviewed-by: Matt Turner <[email protected]>
* i965: Add an INTEL_DEBUG=submit option for printing batch statistics.Kenneth Graunke2017-09-132-1/+2
| | | | | | | | | | | | | | | | | | | When a batch is submitted, INTEL_DEBUG=bat prints a message indicating which part of the code triggered the flush, and some statistics about the batch/state buffer utilization. It also decodes the batchbuffer in debug builds...which is so much output that it drowns out the utilization messages, if that's all you care about. INTEL_DEBUG=submit now just does the utilization messages. INTEL_DEBUG=bat continues to do both (as the message is a good indicator that we're starting decode of a new batch). v2: Rename from "flush" to "submit" (suggested by Chris) because we might want "flush" for PIPE_CONTROL debugging someday. Reviewed-by: Chris Wilson <[email protected]>
* intel: Remove unused device info for KBL GT1.5Anuj Phogat2017-09-061-11/+0
| | | | | Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/decoder: Reuse the gen_make_gen() helper.Eric Anholt2017-07-251-3/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Reuse the MAX2 macro instead of defining another one.Eric Anholt2017-07-251-3/+1
| | | | Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: add number of subslices to device infoLionel Landwerlin2017-07-112-8/+54
| | | | | | | | | | | We could have used a single integer to store that value, but Cannonlake has different number of subslices per slice depending on the GT. v2: Add CFL subslice numbers (Lionel) Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Ben Widawsky <[email protected]>
* intel: Fix clflushing on modern (Baytrail+) Atom CPUs.Kenneth Graunke2017-07-101-0/+12
| | | | | | | | Thanks to Chris Wilson for pointing this out. Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]> Acked-by: Lionel Landwerlin <[email protected]>
* intel: Move clflush helpers from anv to common/gen_clflush.h.Kenneth Graunke2017-07-101-0/+56
| | | | | | | | | I want to use these in the OpenGL driver as well. v2: Add to COMMON_FILES in Makefile.sources (caught by Emil) Reviewed-by: Daniel Vetter <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* i965/CFL: Add PCI Ids for Coffee Lake.Anusha Srivatsa2017-06-222-0/+27
| | | | | | | | | | | | | | Coffee Lake has a gen9 graphics following KBL. From 3D perspective, CFL is a clone of KBL/SKL features. v2: Change commit message, correct alignment <Anuj Phogat> v3: Update IDs. v4: Initialize l3_banks, correct nomenclature <Anuj> Cc: Rodrigo Vivi <[email protected]> Signed-off-by: Anusha Srivatsa <[email protected]> Acked-by: Benjamin Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* intel: compiler/i965: fix is_broxton checksLionel Landwerlin2017-06-201-0/+3
| | | | | | | | | | In 5f2fe9302c is_geminilake was introduced for the differenciate broxton from geminilake. Unfortunately I failed as verifying that is_broxton is throughout the code base to mean Gen9lp. Fixes: 5f2fe9302c ("intel: common: add flag to identify platforms by name") Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cnl: Add l3 configuration for CannonlakeBen Widawsky2017-06-201-1/+20
| | | | | | | | | | | | | | | | | | V2 (Anuj): Squash the changes in one patch rebase on master. Address the review comments made by Francisco Jerez. Do the URB allocation per slice (not per bank). V3 (Anuj): Update the comment. Format the table as other l3 config tables. Signed-off-by: Ben Widawsky <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]> --- V1 was sent out with the heading: "i965/cnl: Properly handle l3 configuration"
* i965: Add a variable for way size per bank in get_l3_way_size()Anuj Phogat2017-06-201-5/+4
| | | | | | | | | | Adding this variable better explains the computation of L3 way size in the function. V2: Use const variable for way_size_per_bank. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Fix broxton 2x6 l3 configAnuj Phogat2017-06-201-0/+16
| | | | | | | | | | | The new table added in this patch matches with the table in gfxspecs. We were programming the wrong values earlier. V2: Update the comment. Cc: "17.1" <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* intel: common: add number of thread per euLionel Landwerlin2017-06-192-2/+28
| | | | | | | This will be used by to normalize OA counters. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: common: express timestamps units in frequencyLionel Landwerlin2017-06-192-11/+13
| | | | | | | | | | | | | | Rather than storing the period as a double that looses some precision. Also fixes the Gen9LP timestamp frequency which is no 19200123 but 19200000 as pointed by Ville : https://lists.freedesktop.org/archives/intel-gfx/2017-April/125126.html Finally add the Cannonlake timestamp frequency. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* intel: common: add flag to identify platforms by nameLionel Landwerlin2017-06-192-6/+24
| | | | | | | | The perf infrastructure needs to identify specific platforms, not just generations. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>
* i965/cnl: Add a preliminary device for CannonlakeBen Widawsky2017-06-091-0/+46
| | | | | | | | | | | | | | | | v2 (Anuj): Rebased on master and updated pci ids Remove redundant initialization of max_wm_threads to 64 * 12. For gen9+ max_wm_threads are initialized in gen_get_device_info(). v3 (Anuj): Move the patch to end of series. Remove unused gt1, gt2, gt3 functions. Remove l3_banks variable. Variable is now available on master. Signed-off-by: Anuj Phogat <[email protected]> Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]>
* i965/cnl: Handle gen10 in switch cases across the driverAnuj Phogat2017-06-091-0/+1
| | | | | | | | | V2: Start using gen10 functions isl_gen10*(), gen10_blorp_exec() gen10_init_atoms() (Jason) Remove Vulkan changes. Do them later in a separate patch. Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965: Make feature macros gen8 basedBen Widawsky2017-06-091-8/+5
| | | | | | | | | | All the "features" of the hardware are similar starting with GEN8, so remove as much of the GEN9 uniqueness as possible. This makes implementing future gen platforms a bit easier. Signed-off-by: Ben Widawsky <[email protected]> Reviewed-by: Anuj Phogat <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* intel: Fix broxton 2x6 way size computationAnuj Phogat2017-06-061-0/+4
| | | | | | | | | | | | | | | | | This patch is undoing the changes to way size computation in broxton 2x6, made by below commit: Commit: 0d576fbfbe912cf3fb9ab594bb31eb58bccf2138 Author: Anuj Phogat <[email protected]> i965: Simplify l3 way size computations By making use of l3_banks field in gen_device_info struct l3_way_size for gen7+ = 2 * l3_banks. Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=101306 Signed-off-by: Anuj Phogat <[email protected]> Tested-by: Mark Janes <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* intel: gen-decoder: rework how we handle groupsLionel Landwerlin2017-06-062-86/+161
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | The current way of handling groups doesn't seem to be able to handle MI_LOAD_REGISTER_* with more than one register. This change reworks the way we handle groups by building a traversal list on loading the GENXML files. Let's say you have Instruction { Field0 Field1 Field2 Group0 (count=2) { Field0-0 Field0-1 } Group1 (count=4) { Field1-0 Field1-1 } } We build of linked on load that goes : Instruction -> Group0 -> Group1 All of those are gen_group structures, making the traversal trivial. We just need to iterate groups for the right number of timers (count field in genxml). The more fancy case is when you have only a single group of unknown size (count=0). In that case we keep on reading that group for as long as we're within the DWordLength of that instruction. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Rafael Antognolli <[email protected]>
* i965: Change INTEL_DEBUG=vec4 to INTEL_SCALAR_VS for consistency.Kenneth Graunke2017-06-052-2/+1
| | | | | | | | | We moved to INTEL_SCALAR_* when we added more than a single stage, but never went back and converted the VS to work that way. Be consistent. Also update the documentation to actually mention these debug variables. Acked-by: Jason Ekstrand <[email protected]>
* i965: Simplify l3 way size computationsAnuj Phogat2017-06-021-10/+2
| | | | | | | | | | | By making use of l3_banks field in gen_device_info struct l3_way_size for gen7+ = 2 * l3_banks. V2: Keep the get_l3_way_size() function. Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* i965: Add and initialize l3_banks field for gen7+Anuj Phogat2017-06-022-3/+27
| | | | | | | | | | | This new field helps simplify l3 way size computations in next patch. V2: Initialize the l3_banks to 0 in macros. Suggested-by: Francisco Jerez <[email protected]> Signed-off-by: Anuj Phogat <[email protected]> Reviewed-by: Francisco Jerez <[email protected]>
* genxml: Fix decoder to print the array element on field members.Kenneth Graunke2017-06-011-3/+9
| | | | | | | | | | | | | | | | | | | | | | | | | Previously we'd print things like: 0xfffbb568: 0x00010000 : Dword 1 ReadLength: 0 ReadLength: 1 0xfffbb568: 0x00000001 : Dword 1 ReadLength: 1 ReadLength: 0 instead of the more obvious: 0xfffbb568: 0x00010000 : Dword 1 ReadLength[0]: 0 ReadLength[1]: 1 0xfffbb568: 0x00000001 : Dword 1 ReadLength[2]: 1 ReadLength[3]: 0 (Yes, the ralloc context here is bogus - the decoder leaks just about everything. We need to use proper ralloc contexts someday...) Acked-by: Lionel Landwerlin <[email protected]>
* genxml: Fix decoding of array groups.Kenneth Graunke2017-06-011-1/+1
| | | | | | | | | | | | | | | | | | | If you had a group as the first element of a struct, i.e. <struct name="3DSTATE_CONSTANT_BODY" length="10"> <group count="4" start="0" size="16"> <field name="ReadLength" start="0" end="15" type="uint"/> </group> ... </struct> we would get a group_offset of 0, causing create_field() to think the field wasn't in a group, and fail to offset forward for successive array elements. So we'd mark all the array elements as offset 0. Using ctx->group->elem_size is a better check for "are we in a group?". Acked-by: Lionel Landwerlin <[email protected]>
* genxml: Fix decoder for groups with multiple fields.Kenneth Graunke2017-06-011-4/+2
| | | | | | | | | | | | | | | | | If you have something like: <group count="0" start="96" size="32"> <field name="Entry_0" start="0" end="15" type="GATHER_CONSTANT_ENTRY"/> <field name="Entry_1" start="16" end="31" type="GATHER_CONSTANT_ENTRY"/> </group> We would reset ctx->group_count to 0 after processing the first field, so the second would not have a group count. This is largely untested, as the only groups with multiple fields are packets we don't emit in Mesa. Found by inspection. Acked-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Handle the BLT ring in gen_group_get_lengthJason Ekstrand2017-05-261-0/+4
| | | | Reviewed-by: Jordan Justen <[email protected]>
* intel/decoder: Handle gen4 VF_STATISTICS and PIPELINE_SELECTJason Ekstrand2017-05-261-2/+7
| | | | | | | These need special handling because they have no "DWord Length" parameter and they have an unusual bias of 1. Reviewed-by: Jordan Justen <[email protected]>
* intel/decoder: Fix indentationMatt Turner2017-05-151-4/+4
| | | | Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel: gen-decoder: fix xml parser leakLionel Landwerlin2017-05-151-6/+7
| | | | | | | | In the unlikely case the parsing of genxml files fails, we were leaking an xml parser object. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Tapani Pälli <[email protected]>
* i965: Drop INTEL_DEBUG=stats.Kenneth Graunke2017-05-102-2/+1
| | | | | | | | | | | | | | | | For whatever reason, we had an INTEL_DEBUG=stats option that enabled various statistics counters on Gen4-5 systems. It's been around forever, though I can't think of a single time that it's been useful. On Gen6+, we enable statistics all the time because they're necessary to support various query object targets. Turning them off would break those queries. Gen4-5 don't support those queries, so the statistics counters generally aren't useful; we disabled them by default. This patch disables them altogether. Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel: gen decoder: don't check for size_t negative valuesLionel Landwerlin2017-05-091-1/+1
| | | | | | | | | We should get either 0 or 1 here. CID: 1373562 (Control flow issues) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Matt Turner <[email protected]>
* intel/aubinator: Correctly read variable length structs.Rafael Antognolli2017-04-242-6/+35
| | | | | | | | | | | | | | | | Before this commit, when a group with count="0" is found, only one field is added to the struct representing the instruction. This causes only one entry to be printed by aubinator, for variable length groups. With this commit we "detect" that there's a variable length group (count="0") and store the offset of the last entry added to the struct when reading the xml. When finally reading the aubdump file, we check the size of the group and whether we have variable number of elements, and in that case, reuse the last field to add the remaining elements. Signed-off-by: Rafael Antognolli <[email protected]> Tested-by: Jason Ekstrand <[email protected]> Acked-by: Kenneth Graunke <[email protected]>
* intel/decoder: Fix is_header_field starting condition.Kenneth Graunke2017-04-161-1/+1
| | | | | | | | | | | Starting positions >= 32 are not part of the header, rather than >. Caught by Coverity, which found that "bits <<= field->start" may shift by 32, which has undefined behavior. CID: 1404968 Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/gen_decoder: return -1 for unknown command formatsJordan Justen2017-04-061-7/+15
| | | | | | | | | | | | | | | | | Decoding with aubinator encountered a command of 0xffffffff. With the previous code, it caused aubinator to jump 255 + 2 dwords to start decoding again. Instead we can attempt to detect the known instruction formats. If the format is not recognized, then we can advance just 1 dword. v2: * Update aubinator_error_decode * Actually convert the length variable returned into a *signed* integer in aubinator.c, intel_batchbuffer.c and aubinator_error_decode.c. Signed-off-by: Jordan Justen <[email protected]> Acked-by: Lionel Landwerlin <[email protected]>
* intel/gen_decoder: Fix length for Media State/Object commandsJordan Justen2017-04-061-2/+10
| | | | | | | | From BDW PRM, Volume 6: Command Stream Programming, 'Render Command Header Format'. Signed-off-by: Jordan Justen <[email protected]> Reviewed-by: Lionel Landwerlin <[email protected]>
* intel: tools: add aubinator_error_decode toolLionel Landwerlin2017-04-042-0/+12
| | | | | | | | | | | | | | | | This is pretty much the same tool as what i-g-t has, only with a more fancy decoding of the instructions/registers. It also doesn't support anything before gen4. v2 (from Matt): Drop authors Remove undefined automake variable v3: Fix incorrect offsets for dword > 1 (Jordan) v4: Fix decompression error with large blobs (Jordan) Signed-off-by: Lionel Landwerlin <[email protected]> Acked-by: Matt Turner <[email protected]>
* aubinator/gen_decoder/i965: decode instructions from dword 0Lionel Landwerlin2017-04-032-5/+18
| | | | | | | | | Some packets like 3DSTATE_VF_STATISTICS, 3DSTATE_DRAWING_RECTANGLE, 3DPRIMITIVE, PIPELINE_SELECT, etc... have configurable fields in dword0, we probably want to print those. Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: gen_decoder: store pointer to current decoded field in iteratorLionel Landwerlin2017-04-032-25/+26
| | | | | Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel: genxml: compress all gen files into oneLionel Landwerlin2017-03-311-40/+22
| | | | | | | | | | | | | Combining all the files into a single string didn't make any difference in the size of the aubinator binary. With this change we now also embed gen4/4.5/5 descriptions, which increases the aubinator size by ~16Kb. v2 (Lionel): rebase makefiles Signed-off-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]>
* intel/common: consistently use ifndef guards over pragma onceEmil Velikov2017-03-221-1/+5
| | | | | | | | Signed-off-by: Emil Velikov <[email protected]> Acked-by: Lionel Landwerlin <[email protected]> Acked-by: Vedran Miletić <[email protected]> Acked-by: Juha-Pekka Heikkila <[email protected]> Reviewed-by: Edward O'Callaghan <[email protected]>
* intel: Move tools/decoder.[ch] to common/gen_decoder.[ch].Kenneth Graunke2017-03-212-0/+1000
| | | | | | | This way they become part of libintel_common.la so I can use them in the i965 driver. Reviewed-by: Emil Velikov <[email protected]>
* intel: Add a INTEL_DEBUG=color option.Kenneth Graunke2017-03-212-0/+2
| | | | | | | This will be used for color output in debug messages. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* i965: Allow a per gen timebase scale factorRobert Bragg2017-03-172-2/+35
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | Prior to Skylake the Gen HW timestamps were driven by a 12.5MHz clock with the convenient property of being able to scale by an integer (80) to nanosecond units. For Skylake the frequency is 12MHz or a scale factor of 83.333333 This updates gen_device_info to track a floating point timebase_scale factor and makes corresponding _queryobj.c changes to no longer assume a scale factor of 80 works across all gens. Although the gen6_ code could have been been left alone, the changes keep the code more comparable, and it now shares a few utility functions for scaling raw timestamps and calculating deltas. The utility for calculating deltas takes into account 32 or 36bit overflow depending on the current kernel version. Note: this leaves the timestamp handling of ARB_query_buffer_object untouched, which continues to use an incorrect scale of 80 on Skylake for now. This is more awkward to solve since the scaling is currently done using a very limited uint64 ALU available to the command parser that doesn't support multiply or divide where it's already taking a large number of instructions just to effectively multiple by 80. This fixes piglit arb_timer_query-timestamp-get on Skylake v2: (Ken) Update timebase_scale for platforms past Skylake/Broxton too. Signed-off-by: Robert Bragg <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Kenneth Graunke <[email protected]>