Commit message | Author | Age | Files | Lines
* intel/fs: Fix Gen7 compressed source region alignment restriction for SIMD32 | Francisco Jerez | 2018-06-28 | 1 | -1/+7
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Implement 32-wide FS payload setup on Gen6+ | Francisco Jerez | 2018-06-28 | 1 | -67/+57
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Extend thread payload layout to SIMD32 | Francisco Jerez | 2018-06-28 | 3 | -22/+45
  And handle 32-wide payload register reads in fetch_payload_reg().

  v2 (Jason Ekstrand):
   - Fix some whitespace and brace placement

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Wrap FS payload register look-up in a helper function. | Francisco Jerez | 2018-06-28 | 3 | -12/+23
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Use fs_regs instead of brw_regs in the unlit centroid workaround | Francisco Jerez | 2018-06-28 | 1 | -12/+12
  While we're here, we change to using horiz_offset() instead of abusing half().

  v2 (Jason Ekstrand):
   - Use horiz_offset() instead of half()

  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Simplify fs_visitor::emit_samplepos_setup | Francisco Jerez | 2018-06-28 | 1 | -21/+7
  The original code manually handled splitting the MOVs to 8-wide to handle various regioning restrictions. Now that we have a SIMD width splitting pass that handles these things, we can just emit everything at the full width and let the SIMD splitting pass handle it.

  We also now have a useful "subscript" helper which is designed exactly for the case where you want to take a W type and read it as a vector of Bs so we may as well use that too.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* i965: Add plumbing for shader time in 32-wide FS dispatch mode. | Francisco Jerez | 2018-06-28 | 7 | -5/+15
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Disable opt_sampler_eot() in 32-wide dispatch. | Francisco Jerez | 2018-06-28 | 2 | -1/+6
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Emit LINE+MAC for LINTERP with unaligned coordinates | Jason Ekstrand | 2018-06-28 | 2 | -10/+56
  On g4x through Sandy Bridge, src1 (the coordinates) of the PLN instruction is required to be an even register number. When it's odd (which can happen with SIMD32), we have to emit a LINE+MAC combination instead. Unfortunately, we can't just fall through to the gen4 case because the input registers are still set up for PLN which lays out the four src1 registers differently in SIMD16 than LINE.

  v2 (Jason Ekstrand):
   - Take advantage of both accumulators and emit LINE LINE MAC MAC (Based on a patch from Francisco Jerez)
   - Unify the gen4 and gen4x-6 cases using a loop

  v3 (Jason Ekstrand):
   - Don't unify gen4 with gen4x-6 as this turns out to be more fragile than first thought without reworking the gen4 barycentric coordinate layout.

  Reviewed-by: Matt Turner <[email protected]>
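  Note: the arithmetic equivalence behind the LINE+MAC fallback can be sketched as below. This is a simplified scalar model, not the driver's code: the function names and the `(p, q, r)` tuple layout are assumptions made for illustration, and register numbering, regioning and the SIMD16 src1 layout mentioned above are not modelled.

```python
# Simplified model of plane-equation interpolation (illustrative only).
# plane = (p, q, r) are the plane coefficients; u and v are per-channel
# coordinates. Names and data layout are hypothetical.

def pln(plane, u, v):
    """Single-instruction form: p*u + q*v + r for every channel."""
    p, q, r = plane
    return [p * x + q * y + r for x, y in zip(u, v)]


def line_mac(plane, u, v):
    """Two-step form: LINE writes the accumulator, MAC consumes it."""
    p, q, r = plane
    acc = [p * x + r for x in u]              # LINE: acc = p*u + r
    return [a + q * y for a, y in zip(acc, v)]  # MAC: acc + q*v


plane = (2.0, 3.0, 1.0)
u = [0.0, 0.5, 1.0, 1.5]
v = [0.25, 0.5, 0.75, 1.0]
assert pln(plane, u, v) == line_mac(plane, u, v)
```

  The intermediate `acc` list stands in for the accumulator, which is why the two-step form clobbers state that the single-instruction form does not.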
* intel/fs: Mark LINTERP opcode as writing accumulator on platforms without PLN | Jason Ekstrand | 2018-06-28 | 1 | -1/+2
  When we don't have PLN (gen4 and gen11+), we implement LINTERP as either LINE+MAC or a pair of MADs. In both cases, the accumulator is written by the first of the two instructions and read by the second. Even though the accumulator value isn't actually ever used from a logical instruction perspective, it is trashed so we need to make the scheduler aware. Otherwise, the scheduler could end up re-ordering instructions and putting a LINTERP between an instruction which writes the accumulator and another which tries to use that result.

  Cc: [email protected]
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Rework INTERPOLATE_AT_PER_SLOT_OFFSET | Francisco Jerez | 2018-06-28 | 3 | -19/+9
  This reworks INTERPOLATE_AT_PER_SLOT_OFFSET to work more like an ALU operation and less like a send. This is less code over-all and, as a side-effect, it now properly handles execution groups and lowering so SIMD32 support just falls out.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Add the group to the flag subreg number on SNB and older | Jason Ekstrand | 2018-06-28 | 1 | -1/+7
  We want consistent behavior in the meaning of the flag_subreg field between SNB and IVB+.

  v2 (Jason Ekstrand):
   - Add some extra commentary

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix FB read header setup for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -4/+13
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix logical FB write lowering for SIMD32 | Francisco Jerez | 2018-06-28 | 1 | -5/+20
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix FB write message control codegen for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -18/+34
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Don't enable dual source blend if no outputs are written | Francisco Jerez | 2018-06-28 | 1 | -1/+2
  This prevents a crash in some arb_enhanced_layouts tests that would be caused by the next commit.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix codegen of FS_OPCODE_SET_SAMPLE_ID for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -11/+13
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Fix pixel interpolator queries for SIMD32. | Francisco Jerez | 2018-06-28 | 1 | -1/+2
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Disable SIMD32 dispatch for fragment shaders with discard. | Francisco Jerez | 2018-06-28 | 1 | -0/+2
  Current discard handling requires dedicating the second flag register to discard. However, control-flow in SIMD32 requires both flag registers so it's incompatible with the current discard handling. Just don't support SIMD32+discard for now.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Disable SIMD32 dispatch on Gen4-6 with control flow | Francisco Jerez | 2018-06-28 | 1 | -0/+8
  The hardware's control flow logic is 16-wide so we're out of luck here. We could, in theory, support SIMD32 if we know the control-flow is uniform but we don't have that information at this point.

  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Split instructions low to high in lower_simd_width | Jason Ekstrand | 2018-06-28 | 1 | -2/+35
  Commit 0d905597f fixed an issue with the placement of the zip and unzip instructions. However, as a side-effect, it reversed the order in which we were emitting the split instructions so that they went from high group to low instead of low to high. This is fine for most things like texture instructions and the like but certain render target writes really want to be emitted low to high. This commit just switches the order back around to be low to high.

  Reviewed-by: Matt Turner <[email protected]>
  Fixes: 0d905597f "intel/fs: Be more explicit about our placement of [un]zip"
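  Note: the emission order described above can be pictured with a small sketch. It only enumerates the channel groups that a wide instruction is split into; `split_groups` and its parameters are hypothetical names, and the real pass also has to place the zip and unzip copies around the split instructions.

```python
# Illustrative only: enumerate the (first_channel, width) groups produced when
# an instruction of `exec_size` channels is split into `lowered_width` pieces.

def split_groups(exec_size, lowered_width, low_to_high=True):
    """Return the channel groups in the order they would be emitted."""
    groups = [(first, lowered_width)
              for first in range(0, exec_size, lowered_width)]
    # The order restored by the commit is low to high; the reversed order is
    # shown only for comparison.
    return groups if low_to_high else groups[::-1]


print(split_groups(32, 8))
print(split_groups(32, 8, low_to_high=False))
```

  With a 32-wide instruction lowered to 8-wide, the low-to-high order starts at channel 0 and ends at channel 24.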
* intel/fs: Rework KSP data to be SIMD width-based | Jason Ekstrand | 2018-06-28 | 3 | -47/+43
  Reviewed-by: Matt Turner <[email protected]>
* intel/compiler: Add and use helpers for working with KSP indices | Jason Ekstrand | 2018-06-28 | 5 | -55/+183
  The pixel shader dispatch table is kind-of a confusing mess. This adds some helpers for dealing with it and for easily extracting the correct data from wm_prog_data.

  Reviewed-by: Matt Turner <[email protected]>
* i965: Re-arrange shader kernel setup in WM state | Jason Ekstrand | 2018-06-28 | 1 | -37/+57
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Remove program key argument from generator. | Francisco Jerez | 2018-06-28 | 7 | -10/+7
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Set up FB write message headers in the visitor | Jason Ekstrand | 2018-06-28 | 2 | -83/+86
  Doing instruction header setup in the generator is awful for a number of reasons. For one, we can't schedule the header setup at all. For another, it means lots of implied writes which the instruction scheduler and other passes can't properly reason about. The second isn't a huge problem for FB writes since they always happen at the end. We made a similar change to sampler handling in ff4726077d86.

  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix implied_mrf_writes() for headerless FB writes. | Francisco Jerez | 2018-06-28 | 1 | -1/+2
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Fix fs_inst::flags_written() for Gen4-5 FB writes. | Francisco Jerez | 2018-06-28 | 1 | -1/+2
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/eu: Return new instruction to caller from brw_fb_WRITE(). | Francisco Jerez | 2018-06-28 | 2 | -21/+23
  Reviewed-by: Jason Ekstrand <[email protected]>
  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Pull FB write implied headers from src[0] | Jason Ekstrand | 2018-06-28 | 1 | -9/+6
  Now that we have the implied header in src[0] for tracking purposes, we may as well use it in the generator. This makes things a tiny bit more general.

  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: Properly track implied header regs read by FB writes | Jason Ekstrand | 2018-06-28 | 1 | -1/+16
  The FB write opcode on gen4-5 does implied copies from g0 and g1 to the message payload. With this commit, we start tracking that as part of the IR by having the FB write read from g0-1.

  Reviewed-by: Matt Turner <[email protected]>
* intel/fs: FS_OPCODE_REP_FB_WRITE has side effects | Jason Ekstrand | 2018-06-28 | 1 | -0/+1
  It doesn't matter since we don't ever run replicated write shaders through the optimizer but it's good to be complete.

  Reviewed-by: Matt Turner <[email protected]>
* docs: Add news item for mesa 18.1.2 | Dylan Baker | 2018-06-28 | 1 | -0/+6
  Which I forgot to do when 18.1.2 came out.

  Signed-off-by: Dylan Baker <[email protected]>
* nvc0: remove magic values in nve4_set_tex_handles() | Rhys Perry | 2018-06-28 | 1 | -1/+1
  With this commit, things no longer break if NVC0_CB_AUX_TEX_INFO is changed to anything other than 0x20.

  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Signed-off-by: Karol Herbst <[email protected]>
* nvc0/ir: fix TargetNVC0::insnCanLoadOffset() | Rhys Perry | 2018-06-28 | 1 | -0/+1
  Previously, TargetNVC0::insnCanLoadOffset() returned whether the offset could be set to a specific value. The IndirectPropagation pass expected it to return whether the offset could be increased by a specific value, which is what TargetNV50::insnCanLoadOffset() does.

  Fixes: 37b67db6ae34fb6586d640a7a1b6232f091dd812 ("nvc0/ir: be careful about propagating very large offsets into const load")
  Signed-off-by: Rhys Perry <[email protected]>
  Reviewed-by: Karol Herbst <[email protected]>
  Signed-off-by: Karol Herbst <[email protected]>
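  Note: the difference between the two interpretations can be sketched as follows. The limit `MAX_OFFSET` and both function names are hypothetical and chosen only for illustration; the real check lives in the C++ target code and depends on the instruction being examined.

```python
# Illustrative only: "can the offset be set to X" versus
# "can the current offset be increased by X".

MAX_OFFSET = 0xFFFF  # hypothetical encodable limit, not the hardware's


def can_set_offset(value):
    """Old interpretation: is `value` itself encodable?"""
    return 0 <= value <= MAX_OFFSET


def can_add_offset(current, delta):
    """Interpretation the propagation pass expects: is current + delta encodable?"""
    return 0 <= current + delta <= MAX_OFFSET


current, delta = 0xFFF0, 0x20
print(can_set_offset(delta))           # the delta alone fits
print(can_add_offset(current, delta))  # the resulting offset does not
```

  A caller that asks about `delta` alone can be told yes even though the combined offset no longer fits, which is the mismatch the commit message describes.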
* swr/rast: Updating code style based on current clang-format rules | Alok Hota | 2018-06-28 | 4 | -253/+260
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Fix addPassesToEmitFile usage with llvm-7.0. | Vinson Lee | 2018-06-28 | 1 | -0/+4
  Fix build error after llvm-7.0svn r332881 ("CodeGen: Add a dwo output file argument to addPassesToEmitFile and hook it up to dwo output.").

    CXX rasterizer/jitter/libmesaswr_la-JitManager.lo
  rasterizer/jitter/JitManager.cpp:368:93: error: too few arguments to function call, expected at least 4, have 3
    pTarget->addPassesToEmitFile(*pMPasses, filestream, TargetMachine::CGFT_AssemblyFile);
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~ ^

  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Handling removed LLVM intrinsics in trunk | Alok Hota | 2018-06-28 | 1 | -0/+40
  - Functionality replaced with emulated intrinsics
  - Fixes Bug 106558

  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Adding SCATTERPS functionality to BuilderGfxMem | Alok Hota | 2018-06-28 | 2 | -0/+19
  Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Adding Read/Write specifier to TranslateGfxAddress stack | Alok Hota | 2018-06-28 | 2 | -27/+28
  - Removing unused generic translate function
  - Requiring read/write specifier in builder_gfx_mem

  Reviewed-by: Bruce Cherniak <[email protected]>
* gallium: Fix automake for Android (v2) | Chad Versace | 2018-06-27 | 3 | -0/+13
  Chromium OS uses Autotools and pkg-config when building Mesa for Android. The gallium drivers were failing to find the headers and libraries for zlib and Android's libbacktrace.

  v2:
   - Don't add a check for zlib.pc. configure.ac already checks for zlib.pc elsewhere. [for tfiga]
   - Check for backtrace.pc separately from the other Android libs. [for tfiga]

  Reviewed-by: Tomasz Figa <[email protected]>
  Reviewed-by: Emil Velikov <[email protected]>
* glsl: skip comparison opt when adding vars of different size | Timothy Arceri | 2018-06-28 | 1 | -0/+6
  The spec allows adding scalars with a vector or matrix. In this case the opt was losing swizzle and size information.

  This fixes a bug with Doom (2016) shaders.

  Fixes: 34ec1a24d61f ("glsl: Optimize (x + y cmp 0) into (x cmp -y).")
  Reviewed-by: Ian Romanick <[email protected]>
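  Note: the guard described above can be sketched on a toy expression model. The `Operand` class and `rewrite_add_cmp_zero` function are hypothetical stand-ins for the GLSL IR, which tracks full types and swizzles rather than a bare component count.

```python
# Illustrative only: rewrite (x + y cmp 0) into (x cmp -y), but skip the
# rewrite when the two addends have different sizes (e.g. scalar + vec4).
from dataclasses import dataclass


@dataclass
class Operand:
    name: str
    components: int  # 1 for a scalar, 4 for a vec4, and so on


def rewrite_add_cmp_zero(x, y, cmp):
    """Return the rewritten comparison as a tuple, or None if it is skipped."""
    if x.components != y.components:
        # Mixed sizes rely on implicit scalar expansion in the add; moving y
        # across the comparison would lose that size information.
        return None
    return (cmp, x.name, "-" + y.name)


print(rewrite_add_cmp_zero(Operand("a", 4), Operand("b", 4), "<"))
print(rewrite_add_cmp_zero(Operand("s", 1), Operand("v", 4), "<"))
```

  Skipping the rewrite leaves the original `(x + y cmp 0)` expression untouched, which is always safe.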
* Revert "anv: Print the actual enum for ignored structure types" | Jason Ekstrand | 2018-06-27 | 1 | -3/+1
  This reverts commit fda7014c35e5f5dfa26f078ad0512d13ead8b717.

  It was hitting an unreachable when the sType was unknown.
* anv: Print the actual enum for ignored structure types | Jason Ekstrand | 2018-06-27 | 1 | -1/+3
  Reviewed-by: Tapani Pälli <[email protected]>
* i965/bufmgr: Use the correct argument order for bo_alloc_internal | Jason Ekstrand | 2018-06-27 | 1 | -2/+2
  The memzone and flags parameters were accidentally flipped in the call from brw_bo_alloc_tiled_2d.

  Reviewed-by: Kenneth Graunke <[email protected]>
* vulkan/wsi_common_display: Return SURFACE_LOST for fatal DRM errors | Keith Packard | 2018-06-27 | 1 | -7/+7
  Instead of encouraging the client to re-create the swapchain and keep going with an OUT_OF_DATE error, tell the client that further use of the current surface will not succeed as the associated kernel objects are no longer valid.

  In particular, when a DRM lease is revoked, then the client needs to get another lease and create a new surface for that.

  Signed-off-by: Keith Packard <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* glsl: Make sure that packed varyings reflect always_active_io properly. | Eric Anholt | 2018-06-27 | 1 | -2/+7
  The always_active_io flag was only set according to the first variable that got packed in, so NIR io compaction would end up compacting XFB varyings that shouldn't move at that point.

  Reviewed-by: Timothy Arceri <[email protected]>
* v3d: Fix Z clipping when viewport.scale[2] is negative. | Eric Anholt | 2018-06-27 | 1 | -4/+6
  Fixes:
  dEQP-GLES3.functional.shaders.builtin_variable.depth_range_fragment
  dEQP-GLES3.functional.shaders.builtin_variable.depth_range_vertex
* v3d: Convert a bunch of our "minus one" fields over to the new XML attr. | Eric Anholt | 2018-06-27 | 6 | -33/+35
  This fixes up their formatting for CLIF files and makes the code more legible.
* v3d: Add pack/unpack/decode support for fields with a "- 1" modifier. | Eric Anholt | 2018-06-27 | 3 | -17/+46
  Right now, we name these fields as "field name minus one" so that your C code obviously states what the value should be. However, it's easy enough to handle at the codegen level with another little XML attribute, meaning less C code and easier-to-read values in CLIF dumping and gdb as well. (The actual CLIF format for simulator and FPGA replay takes in pre-minus-one values, so we need it there too).
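  Note: the idea of handling the "- 1" adjustment in generated pack/unpack code can be sketched as below. The function names, the `minus_one` flag and the bit positions are hypothetical; the real generator reads the attribute from the packet XML and emits C.

```python
# Illustrative only: a bitfield whose hardware encoding stores (value - 1),
# while callers and dumps work with the natural value.

def pack_field(value, start, end, minus_one=False):
    """Encode `value` into bits start..end (inclusive) of a word."""
    if minus_one:
        value -= 1  # hardware stores the value minus one
    width = end - start + 1
    if not 0 <= value < (1 << width):
        raise ValueError("value does not fit in the field")
    return value << start


def unpack_field(word, start, end, minus_one=False):
    """Decode bits start..end (inclusive) of `word` back to the natural value."""
    width = end - start + 1
    value = (word >> start) & ((1 << width) - 1)
    return value + 1 if minus_one else value


word = pack_field(4, 8, 9, minus_one=True)  # a 2-bit field can hold 1..4
print(hex(word), unpack_field(word, 8, 9, minus_one=True))
```

  Keeping the adjustment inside the pack and unpack helpers means callers never subtract one by hand, and decoded dumps show the natural value.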