path: root/src/intel
Commit message | Author | Age | Files | Lines
* intel/compiler: Use nir's info when checking uses_streams. | Kenneth Graunke | 2018-11-28 | 1 | -1/+1
| Vulkan and Gallium don't use Mesa's gl_program data structure, so they can't poke at 'prog'. But we can simply use the copy of the shader info stored with the NIR shader, which is guaranteed to exist.
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* intel/compiler: fix register allocation in opt_peephole_sel | Iago Toral Quiroga | 2018-11-28 | 1 | -2/+1
| This wasn't handling 64-bit cases properly. Found by inspection.
|
| Reviewed-by: Ian Romanick <[email protected]>
* intel/compiler: fix indentation style in opt_algebraic() | Iago Toral Quiroga | 2018-11-27 | 1 | -10/+10
* anv/icl: Set use full ways in L3CNTLREG | Anuj Phogat | 2018-11-26 | 2 | -0/+2
| L3 allocation table in h/w specification recommends using 4 KB granularity for programming allocation fields in L3CNTLREG.
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Francisco Jerez <[email protected]>
* intel/icl: Set way_size_per_bank to 4 | Anuj Phogat | 2018-11-26 | 1 | -1/+2
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Francisco Jerez <[email protected]>
* i965/icl: Fix L3 configurations | Anuj Phogat | 2018-11-26 | 1 | -6/+6
| Use the L3 configuration specified in the h/w specification.
|
| V2: Drop configs which under-allocate the L3 cache. Bump up the comment above the table.
|
| Signed-off-by: Anuj Phogat <[email protected]>
| Reviewed-by: Francisco Jerez <[email protected]>
* anv: correctly use vulkan 1.0 by default | Eric Engestrom | 2018-11-26 | 1 | -1/+1
| Per chapter 3.2 "Instances":
| > Providing a NULL VkInstanceCreateInfo::pApplicationInfo or providing
| > an apiVersion of 0 is equivalent to providing an apiVersion of
| > VK_MAKE_VERSION(1,0,0).
|
| Reported-by: Niklas Haas <[email protected]>
| Fixes: 8c048af5890d43578ca4 "anv: Copy the appliation info into the instance"
| Signed-off-by: Eric Engestrom <[email protected]>
| Reviewed-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Bas Nieuwenhuizen <[email protected]>
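The rule quoted from the Vulkan specification can be sketched in a few lines of C. This is a minimal illustration, not the driver's code: `sketch_app_info`, `SKETCH_MAKE_VERSION`, and `sketch_effective_api_version` are hypothetical names, and the macro only mirrors the bit layout of `VK_MAKE_VERSION` (major in bits 22 and up, minor in bits 12 to 21, patch in bits 0 to 11).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the VK_MAKE_VERSION bit layout; the name is hypothetical. */
#define SKETCH_MAKE_VERSION(major, minor, patch) \
   ((((uint32_t)(major)) << 22) | (((uint32_t)(minor)) << 12) | ((uint32_t)(patch)))

/* Hypothetical stand-in for VkApplicationInfo; only the field used here. */
struct sketch_app_info {
   uint32_t api_version;
};

/* Returns the API version an instance should assume: a NULL application
 * info or an apiVersion of 0 both mean Vulkan 1.0.0. */
static uint32_t
sketch_effective_api_version(const struct sketch_app_info *app_info)
{
   if (app_info == NULL || app_info->api_version == 0)
      return SKETCH_MAKE_VERSION(1, 0, 0);

   return app_info->api_version;
}
```

Passing `NULL` and passing a structure whose `api_version` is 0 both yield `SKETCH_MAKE_VERSION(1, 0, 0)`; any non-zero version is returned unchanged.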
* anv: allow exporting an imported SYNC_FD semaphore type | Tapani Pälli | 2018-11-23 | 1 | -1/+2
| Fixes issues with following SkQP tests:
|
|    unitTest_VulkanHardwareBuffer_Vulkan_EGL_Syncs
|    unitTest_VulkanHardwareBuffer_Vulkan_Vulkan_Syncs
|
| Signed-off-by: Tapani Pälli <[email protected]>
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv/nir: Rework arguments to apply_pipeline_layout | Jason Ekstrand | 2018-11-22 | 3 | -4/+8
| Instead of taking a whole pipeline (which could be anything!), just take a physical device and robust_buffer_access boolean. This makes it easier to verify that only the things in the hash actually affect pipeline compilation.
|
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Put robust buffer access in the pipeline hash | Jason Ekstrand | 2018-11-22 | 1 | -0/+6
| It affects apply_pipeline_layout. Shaders compiled with the wrong value will work but they may not be robust as requested by the app.
|
| Cc: [email protected]
| Reviewed-by: Iago Toral Quiroga <[email protected]>
* anv: Expose VK_EXT_scalar_block_layout | Jason Ekstrand | 2018-11-22 | 2 | -0/+8
| Our compiler already splits UBO loads into scalars and the untyped surface read messages we use for SSBO reads and writes only require dword alignment.
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Do NIR shader cloning in the caller. | Kenneth Graunke | 2018-11-20 | 6 | -15/+10
| This moves nir_shader_clone() to the driver-specific compile function, rather than the shared src/intel/compiler code. This allows i965 to do key-specific passes before calling brw_compile_*. Vulkan should not need this cloning as it doesn't compile multiple variants.
|
| We do need to continue cloning in the compute shader code because we lower various things in NIR based on the SIMD width.
|
| Reviewed-by: Alejandro Piñeiro <[email protected]>
* meson: Add tests to suites | Dylan Baker | 2018-11-20 | 3 | -3/+6
| Meson test has a concept of suites, which allow tests to be grouped together. This allows for a subset of tests to be run only (say only the tests for nir). A test can be added to more than one suite, but for the most part I've only added a test to a single suite, though I've added a compiler group that includes nir, glsl, and glcpp tests.
|
| To use this you'll need to invoke meson test directly, instead of ninja test (which always runs all targets). It can be invoked as:
| `meson test -C builddir --suite $suitename` (meson test has additional options that are pretty useful).
|
| Tested-By: Gert Wollny <[email protected]>
| Acked-by: Eric Engestrom <[email protected]>
* i965: Allow only one slot of clip distances to be set on Gen4-5. | Kenneth Graunke | 2018-11-19 | 1 | -1/+3
| The existing backend code assumed that if VARYING_SLOT_CLIP_DIST0 was written, then VARYING_SLOT_CLIP_DIST1 would be as well. That's true with the current lowering, but not necessary if there are 4 or fewer clip distances. Separate out the checks to allow this.
|
| The new NIR-based lowering will trigger this case, which would have caused backend validation errors (src is null) without this patch.
|
| Reviewed-by: Eric Anholt <[email protected]>
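The reason a single slot can suffice is that each clip-distance varying slot is a four-component vector, so four or fewer distances fit entirely in the first slot. A minimal sketch of that relationship follows; the bit names and the helper are hypothetical and are not the backend's actual check.

```c
#include <assert.h>

/* Hypothetical bit flags standing in for the two clip-distance slots. */
enum {
   SKETCH_CLIP_DIST0_BIT = 1u << 0, /* distances 0..3 */
   SKETCH_CLIP_DIST1_BIT = 1u << 1, /* distances 4..7 */
};

/* Returns which clip-distance slots a shader writing num_clip_distances
 * values needs, assuming each slot holds four floats. */
static unsigned
sketch_clip_dist_slots_written(unsigned num_clip_distances)
{
   assert(num_clip_distances <= 8);

   unsigned slots = 0;
   if (num_clip_distances > 0)
      slots |= SKETCH_CLIP_DIST0_BIT;
   if (num_clip_distances > 4)
      slots |= SKETCH_CLIP_DIST1_BIT;
   return slots;
}
```

With this model, a consumer has to test each slot independently instead of inferring the second slot from the first, which is the situation the commit message describes.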
* intel/fs,vec4: Fix a compiler warning | Jason Ekstrand | 2018-11-19 | 2 | -3/+3
| ../src/intel/compiler/brw_fs_nir.cpp:3534:46: warning: comparison of integer expressions of different signedness: ‘unsigned int’ and ‘int’ [-Wsign-compare]
|    assert(nir_intrinsic_write_mask(instr) ==
|           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
|           (1 << instr->num_components) - 1);
|           ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
|
| This was caused by 6339aba775ecdc which added these completely valid checks. However clang likes to complain about signedness mismatches.
|
| Fixes: 6339aba775ecdc "intel/compiler: Lower SSBO and shared..."
| Reviewed-by: Alejandro Piñeiro <[email protected]>
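The warning arises because `1 << n` has type `int` while the write mask is `unsigned`. One common way to avoid this class of warning is to build the mask from an unsigned literal so both operands of `==` share a signedness. The sketch below shows that idea only; the helper name is hypothetical and the actual patch may have chosen a different fix.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true when write_mask enables exactly the low
 * num_components channels. */
static bool
sketch_is_full_write_mask(unsigned write_mask, unsigned num_components)
{
   assert(num_components < 32);

   /* 1u keeps the shift and subtraction unsigned, so the comparison
    * below does not mix signed and unsigned operands. */
   return write_mask == (1u << num_components) - 1;
}
```

For example, a four-component store is fully enabled when the mask is `0xf`, and `sketch_is_full_write_mask(0x7, 4)` reports a partial write.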
* intel,nir: Move gl_LocalInvocationID lowering to nir_lower_system_values | Jason Ekstrand | 2018-11-19 | 2 | -33/+1
| It's not at all intel-specific; the formula is dictated by OpenGL and Vulkan. The only intel-specific thing is that we need the lowering. As a nice side-effect, the new version is variable-group-size ready.
|
| Reviewed-by: Plamena Manolova <[email protected]>
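The formula in question relates the flat local invocation index to the three-dimensional local invocation ID: index = z * size.x * size.y + y * size.x + x. A minimal sketch of both directions in plain C is given below; the struct and function names are hypothetical, and the real lowering operates on NIR instructions, not on C integers.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical three-component unsigned vector. */
struct sketch_uvec3 {
   uint32_t x, y, z;
};

/* Recovers the 3D local invocation ID from the flat index, given the
 * workgroup size. Inverse of sketch_local_index_from_id(). */
static struct sketch_uvec3
sketch_local_id_from_index(uint32_t index, struct sketch_uvec3 size)
{
   assert(size.x > 0 && size.y > 0 && size.z > 0);

   struct sketch_uvec3 id;
   id.x = index % size.x;
   id.y = (index / size.x) % size.y;
   id.z = (index / (size.x * size.y)) % size.z;
   return id;
}

/* Flat index as defined by the APIs: z * sx * sy + y * sx + x. */
static uint32_t
sketch_local_index_from_id(struct sketch_uvec3 id, struct sketch_uvec3 size)
{
   return id.z * size.x * size.y + id.y * size.x + id.x;
}
```

Because the workgroup size is an ordinary argument here, the same arithmetic works whether the size is a compile-time constant or a value only known at dispatch time.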
* i965: Correct L8_UNORM_SRGB table entry | Gert Wollny | 2018-11-19 | 1 | -1/+1
| As the name says, the format is an sRGB format.
|
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* intel/aub_viewer: Print blend states properly | Lionel Landwerlin | 2018-11-16 | 1 | -2/+16
| Identical fix to:
|
|    commit 70de31d0c106f58d6b7e6d5b79b8d90c1c112a3b
|    Author: Jason Ekstrand <[email protected]>
|    Date: Fri Aug 24 16:05:08 2018 -0500
|
|        intel/batch_decoder: Print blend states properly
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Toni Lönnberg <[email protected]>
* intel/aub_viewer: fix dynamic state printing | Lionel Landwerlin | 2018-11-16 | 1 | -2/+2
| Identical fix to:
|
|    commit cbd4bc1346f7397242e157bb66099b950a8c5643
|    Author: Jason Ekstrand <[email protected]>
|    Date: Fri Aug 24 16:04:03 2018 -0500
|
|        intel/batch_decoder: Fix dynamic state printing
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Toni Lönnberg <[email protected]>
* intel/aubinator: fix ring buffer pointer | Lionel Landwerlin | 2018-11-16 | 2 | -4/+4
| We can only start parsing commands from the head pointer. This was working fine up to now because we only dealt with a "made up" ring buffer (generated by aub_write) which always had its head at 0.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Toni Lönnberg <[email protected]>
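To illustrate why the starting point matters, here is a minimal sketch of walking a ring buffer from its head to its tail with wrap-around. It is not the decoder's code: the function names, the callback type, and the use of dword indices (rather than whatever offsets the hardware registers hold) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical callback invoked for every dword that is visited. */
typedef void (*sketch_dword_cb)(uint32_t dword, void *user);

/* Example callback: accumulates the visited dwords into a uint32_t. */
static void
sketch_sum_cb(uint32_t dword, void *user)
{
   *(uint32_t *)user += dword;
}

/* Visits ring[head], ring[head + 1], ... up to but excluding ring[tail],
 * wrapping at ring_dwords. Returns the number of dwords visited. */
static size_t
sketch_walk_ring(const uint32_t *ring, size_t ring_dwords,
                 size_t head, size_t tail,
                 sketch_dword_cb cb, void *user)
{
   assert(ring_dwords > 0 && head < ring_dwords && tail < ring_dwords);

   size_t visited = 0;
   for (size_t i = head; i != tail; i = (i + 1) % ring_dwords) {
      cb(ring[i], user);
      visited++;
   }
   return visited;
}
```

A walker that always started at index 0 would give the same result only when `head` happens to be 0, which matches the commit's remark about the generated ring buffers.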
* intel/decoders: read ring buffer length | Lionel Landwerlin | 2018-11-16 | 2 | -2/+5
| Use this value to limit reading the ring buffer.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Toni Lönnberg <[email protected]>
* intel/compiler: Lower SSBO and shared loads/stores in NIR | Jason Ekstrand | 2018-11-15 | 7 | -405/+421
| We have a bunch of code to do this in the back-end compiler but it's fairly specific to typed surface messages and the way we emit them. This breaks it out into NIR where it's easier to do things a bit more generally. It also means we can easily share the code between the vec4 and FS back-ends if we wish.
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* intel/compiler: Disassemble GEN6_SFID_DATAPORT_SAMPLER_CACHE as dp_sampler | Sagar Ghuge | 2018-11-15 | 1 | -1/+1
| Both BRW_SFID_SAMPLER and GEN6_SFID_DATAPORT_SAMPLER_CACHE are getting disassembled as "sampler", which is misleading for the assembler tool.
|
| Reviewed-by: Matt Turner <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
| Signed-off-by: Sagar Ghuge <[email protected]>
* intel/tools: avoid 'unused variable' warnings | Andrii Simiklit | 2018-11-14 | 2 | -5/+8
| 1. tools/aub_read.c:271:31: warning: unused variable ‘end’
|       const uint32_t *p = data, *end = data + data_len, *next;
| 2. tools/aub_mem.c:292:13: warning: unused variable ‘res’
|       void *res = mmap((uint8_t *)bo.map + map_offset, 4096, PROT_READ,
|    tools/aub_mem.c:357:13: warning: unused variable ‘res’
|       void *res = mmap((uint8_t *)bo.map + (page - bo.addr), 4096, PROT_READ,
|
| v2: The i965_disasm.c changes were moved into a separate patch.
|     The 'end' variable is declared separately with MAYBE_UNUSED so that the attribute does not affect the other variables. (Eric Engestrom <[email protected]>)
|
| Signed-off-by: Andrii Simiklit <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* nir: Allow to skip integer ops in nir_lower_to_source_mods | Gert Wollny | 2018-11-14 | 1 | -1/+1
| Some hardware supports source mods only for float operations. Make it possible to skip lowering to source mods in these cases.
|
| v2: use option flags instead of a boolean (Jason Ekstrand)
|
| Signed-off-by: Gert Wollny <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* nir: replace nir_load_system_value calls with appropiate builder functions | Karol Herbst | 2018-11-14 | 1 | -2/+1
| this helps reduce the overall code changes when a bit_size parameter is added to nir_load_system_value
|
| Reviewed-by: Jason Ekstrand <[email protected]>
| Reviewed-by: Eric Anholt <[email protected]>
| Signed-off-by: Karol Herbst <[email protected]>
* anv: move helper function internally | Lionel Landwerlin | 2018-11-13 | 2 | -22/+22
| It's only used in anv_image.c
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv: use image aspects rather than computed ones | Lionel Landwerlin | 2018-11-13 | 1 | -1/+1
| This shouldn't make any difference but I feel uneasy using the expanded aspects that do not represent the image in its entirety. If we ever change the implementation of the anv_image_aspect_to_plane() helper, this is safer.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv: associate vulkan formats with aspects | Lionel Landwerlin | 2018-11-13 | 2 | -41/+69
| This will make it easier to associate an aspect with a plane number.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv/lower_ycbcr: make sure to set 0s on all components | Lionel Landwerlin | 2018-11-13 | 1 | -5/+5
| To play around with debugging, we might want to disable one or the other component. Having 0s as default values makes this work. Otherwise we might have NULL components, leading to crashes.
|
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: remove unused parameter | Lionel Landwerlin | 2018-11-13 | 1 | -3/+2
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* anv: simplify internal address offset | Lionel Landwerlin | 2018-11-13 | 1 | -2/+1
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
| Reviewed-by: Jason Ekstrand <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen11) | Toni Lönnberg | 2018-11-13 | 1 | -116/+116
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
| v4: Added missing engine definition to MI_TOPOLOGY_FILTER.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen10) | Toni Lönnberg | 2018-11-13 | 1 | -113/+113
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
| v4: Added missing engine definition to MI_TOPOLOGY_FILTER.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen9) | Toni Lönnberg | 2018-11-13 | 1 | -117/+117
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
| v4: Added more missing engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen8) | Toni Lönnberg | 2018-11-13 | 1 | -116/+116
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
| v4: Added missing engine tag for MI_TOPOLOGY_FILTER and MI_LOAD_URB_MEM.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen75) | Toni Lönnberg | 2018-11-13 | 1 | -107/+107
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen7) | Toni Lönnberg | 2018-11-13 | 1 | -83/+83
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen6) | Toni Lönnberg | 2018-11-13 | 1 | -54/+54
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions
| v4: Added missing engine to MEDIA_GATEWAY_STATE
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen5) | Toni Lönnberg | 2018-11-13 | 1 | -30/+30
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen45) | Toni Lönnberg | 2018-11-13 | 1 | -27/+27
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/genxml: Add engine definition to render engine instructions (gen4) | Toni Lönnberg | 2018-11-13 | 1 | -25/+25
| Instructions meant for the render engine now have a definition specifying that, so that we can differentiate instructions meant for different engines due to shared opcodes.
|
| v2: Divided into individual patches for each gen
| v3: Added additional engine definitions.
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: tools: Use engine for decoding batch instructions | Toni Lönnberg | 2018-11-13 | 8 | -53/+69
| The engine to which the batch was sent is now set in the decoder context when decoding the batch. This is needed so that we can distinguish between instructions as the render and video pipe share some of the instruction opcodes.
|
| v2: The engine is now in the decoder context and the batch decoder uses a local function for finding the instruction for an engine.
| v3: Spec uses engine_mask now instead of engine, replaced engine class enums with the definitions from UAPI.
| v4: Fix up aubinator_viewer (Lionel)
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: tools: gen_engine to drm_i915_gem_engine_class | Toni Lönnberg | 2018-11-13 | 4 | -25/+19
| Removed the gen_engine enum and changed the involved functions to use the drm_i915_gem_engine_class enum from UAPI instead.
|
| v3: Wrong engine was being used for blocks in video ring
| v4: Fixed aubinator_viewer.cpp
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
* intel/decoder: Engine parameter for instructions | Toni Lönnberg | 2018-11-13 | 2 | -0/+31
| Preliminary work for adding handling of different pipes to gen_decoder. Each instruction needs to have a definition describing which engine it is meant for. If left undefined, by default, the instruction is defined for all engines.
|
| v2: Changed to use the engine class definitions from UAPI
| v3: Changed I915_ENGINE_CLASS_TO_MASK to use BITSET_BIT, change engine to engine_mask, added check for incorrect engine and added the possibility to define an instruction to multiple engines using the "|" as a delimiter in the engine attribute.
| v4: Fixed the memory leak.
| v5: Removed an unnecessary ralloc_free().
|
| Reviewed-by: Lionel Landwerlin <[email protected]>
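The behaviour described here (a "|"-delimited engine attribute, an all-engines default when the attribute is absent, and a check for unknown engine names) can be sketched as a small parser. Everything below is an assumption made for illustration: the enum, the attribute spellings "render", "blitter" and "video", the all-engines constant, and the choice to return 0 for an unknown name are not taken from the decoder source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical engine classes; the real code uses the UAPI enum. */
enum sketch_engine_class {
   SKETCH_ENGINE_RENDER = 0,
   SKETCH_ENGINE_COPY = 1,
   SKETCH_ENGINE_VIDEO = 2,
   SKETCH_ENGINE_COUNT
};

#define SKETCH_ENGINE_BIT(cls) (1u << (cls))
#define SKETCH_ALL_ENGINES ((1u << SKETCH_ENGINE_COUNT) - 1)

/* Parses an attribute such as "render|video" into a bitmask.
 * A missing or empty attribute selects every engine.
 * Returns 0 when any name is not recognised. */
static uint32_t
sketch_parse_engine_mask(const char *attr)
{
   /* Assumed attribute spellings, for illustration only. */
   static const struct {
      const char *name;
      enum sketch_engine_class cls;
   } names[] = {
      { "render", SKETCH_ENGINE_RENDER },
      { "blitter", SKETCH_ENGINE_COPY },
      { "video", SKETCH_ENGINE_VIDEO },
   };

   if (attr == NULL || attr[0] == '\0')
      return SKETCH_ALL_ENGINES;

   uint32_t mask = 0;
   const char *p = attr;
   while (*p != '\0') {
      size_t len = strcspn(p, "|"); /* length of the current token */
      int found = 0;

      for (size_t i = 0; i < sizeof(names) / sizeof(names[0]); i++) {
         if (strlen(names[i].name) == len &&
             strncmp(p, names[i].name, len) == 0) {
            mask |= SKETCH_ENGINE_BIT(names[i].cls);
            found = 1;
         }
      }
      if (!found)
         return 0;

      p += len;
      if (*p == '|')
         p++; /* skip the delimiter */
   }
   return mask;
}
```

A decoder could then keep an instruction only when `(instruction_mask & SKETCH_ENGINE_BIT(current_engine)) != 0`, which is how shared opcodes would be told apart under this model.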
* anv/i965: make use of nir_link_constant_varyings() | Timothy Arceri | 2018-11-13 | 1 | -0/+3
| shader-db results for SKL:
|
| total instructions in shared programs: 13106498 -> 13091573 (-0.11%)
| instructions in affected programs: 1186244 -> 1171319 (-1.26%)
| helped: 6186
| HURT: 0
|
| total cycles in shared programs: 332062633 -> 331961653 (-0.03%)
| cycles in affected programs: 8537165 -> 8436185 (-1.18%)
| helped: 5371
| HURT: 862
|
| LOST: 6
| GAINED: 14
|
| Reviewed-by: Jason Ekstrand <[email protected]>
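The percentages in these results follow from the usual relative-change formula, (after - before) / before * 100. A minimal sketch for checking them is shown below; the function name is hypothetical and this is not part of shader-db itself.

```c
#include <assert.h>
#include <stdint.h>

/* Relative change from before to after, in percent. Negative values
 * mean the count went down. */
static double
sketch_percent_change(uint64_t before, uint64_t after)
{
   assert(before != 0);

   return ((double)after - (double)before) / (double)before * 100.0;
}
```

Applied to the instruction totals, 13106498 -> 13091573 is a drop of 14925 instructions, about -0.11% overall; the same 14925 measured against the 1186244 instructions in affected programs is about -1.26%.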
* i965: add support for sampling from AYUV | Lionel Landwerlin | 2018-11-12 | 2 | -0/+2
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Tapani Pälli <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* intel/fs: Prevent emission of IR instructions not aligned to their own execution size. | Francisco Jerez | 2018-11-09 | 1 | -3/+17
| This can occur during payload setup of SIMD-split send message instructions, which can lead to the emission of header setup instructions with a non-zero channel group and fixed SIMD width. Such instructions could end up using undefined channel enable signals except they don't care since they're always marked force_writemask_all.
|
| Not known to affect correctness of any workload at this point, but it would be trivial to back-port to stable if something comes up.
|
| Reported-by: Sagar Ghuge <[email protected]>
| Reviewed-by: Matt Turner <[email protected]>
| Tested-by: Sagar Ghuge <[email protected]>
* intel/aub_read: remove useless breaks | Lionel Landwerlin | 2018-11-09 | 1 | -6/+0
| Signed-off-by: Lionel Landwerlin <[email protected]>
| Reviewed-by: Eric Engestrom <[email protected]>
* intel/compiler: fix node interference of simd16 instructions | Iago Toral Quiroga | 2018-11-09 | 1 | -19/+17
| SIMD16 instructions need to have additional interferences to prevent source / destination hazards when the source and destination registers are off by one register.
|
| While we already have code to handle this, it was only running for SIMD16 dispatches. However, we can have SIMD16 instructions in a SIMD8 dispatch. An example of this are pull constant loads since commit b56fa830c6095, but there are more cases.
|
| This fixes a number of CTS test failures found in work-in-progress tests that were hitting this situation for 16-wide pull constants in a SIMD8 program.
|
| Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>