Commit message (author, date, files changed, lines removed/added)
* nir/int64: Properly handle imod/irem (Jason Ekstrand, 2017-03-03, 1 file, -3/+21)
  The previous implementation was fine for GLSL, which doesn't really have a signed modulus/remainder. GLSL just leaves the behavior undefined whenever either source is negative. However, in SPIR-V, there is a defined behavior for negative arguments. This commit beefs up the pass so that it handles both correctly.

  Tested using a hacked up version of the Vulkan CTS test to get 64-bit support.

  Reviewed-by: Matt Turner <[email protected]>
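The difference between the two operations can be sketched in plain C. This is an illustration only, not the NIR lowering code from the commit: `irem64` and `imod64` are hypothetical helper names, and the sketch assumes the conventions that SPIR-V documents for `OpSRem` (a non-zero result takes the sign of the dividend) and `OpSMod` (a non-zero result takes the sign of the divisor).

```c
#include <assert.h>
#include <stdint.h>

/* Remainder: a non-zero result has the sign of the dividend n.
 * C99 '%' truncates toward zero, so it already behaves this way. */
static int64_t irem64(int64_t n, int64_t d)
{
    assert(d != 0);
    if (d == -1)
        return 0; /* avoids the INT64_MIN % -1 overflow, which is undefined in C */
    return n % d;
}

/* Modulus: a non-zero result has the sign of the divisor d. */
static int64_t imod64(int64_t n, int64_t d)
{
    int64_t r = irem64(n, d);

    /* If the truncated remainder has the wrong sign, shift it by one divisor. */
    if (r != 0 && ((r < 0) != (d < 0)))
        r += d;
    return r;
}
```

With these helpers, `irem64(-7, 3)` gives -1 while `imod64(-7, 3)` gives 2; the two only disagree when the operands have opposite signs and the division is not exact.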
* nir/builder: Add an int64 immediate helper (Jason Ekstrand, 2017-03-03, 1 file, -0/+11)
  Reviewed-by: Matt Turner <[email protected]>
* genxml: Fill out Gen4 and G45 XML. (Kenneth Graunke, 2017-03-03, 2 files, -1/+2232)
  This is a work in progress - some things may still need fixing. But it should be in pretty decent shape.

  Signed-off-by: Kenneth Graunke <[email protected]>
  Signed-off-by: Jason Ekstrand <[email protected]>
* ac: normalize build helper names (Marek Olšák, 2017-03-03, 6 files, -318/+317)
  s/emit/build/

  Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.vs.load.input with amdgcn.buffer.load.format (Marek Olšák, 2017-03-03, 1 file, -0/+20)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move SI.vs.load.input building into amd/common (Marek Olšák, 2017-03-03, 3 files, -15/+33)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: detect and mark loads/stores from read-only/write-only memory (Marek Olšák, 2017-03-03, 1 file, -10/+105)
* ac: replace llvm.SI.tbuffer.store with llvm.amdgcn.buffer.store if ADD_TID=0 (Marek Olšák, 2017-03-03, 4 files, -15/+73)
  ADD_TID doesn't work. Needs more investigation.

  v2: remove leftover dead code

  Reviewed-by: Dave Airlie <[email protected]> (v1)
* radeonsi: use the writeonly LLVM attribute (Marek Olšák, 2017-03-03, 1 file, -3/+6)
* ac: remove offen parameter from ac_build_buffer_store_dword (Marek Olšák, 2017-03-03, 4 files, -23/+20)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: enable TC L2 for tessellation offchip stores (Marek Olšák, 2017-03-03, 1 file, -8/+8)
  Vulkan does the same thing.
* radeonsi: merge and simplify tbuffer_store functions (Marek Olšák, 2017-03-03, 4 files, -114/+77)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set noalias on input shader pointers (Marek Olšák, 2017-03-03, 1 file, -0/+1)
* radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfe (Marek Olšák, 2017-03-03, 3 files, -7/+33)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move kill intrinsic building into amd/common (Marek Olšák, 2017-03-03, 4 files, -14/+29)
  just a cleanup

  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set readnone on reads from read-only memory (Marek Olšák, 2017-03-03, 3 files, -13/+21)
* radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.load (Marek Olšák, 2017-03-03, 1 file, -45/+19)
* radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtz (Marek Olšák, 2017-03-03, 3 files, -5/+21)
* ac: replace old image intrinsics with new ones (Marek Olšák, 2017-03-03, 1 file, -0/+80)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: remove last use of llvm.SI.resinfo (Marek Olšák, 2017-03-03, 1 file, -48/+49)
  and move one function up to reuse the code.
* radeonsi: move image intrinsic building to amd/common (Marek Olšák, 2017-03-03, 3 files, -92/+159)
  Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.export with amdgcn.exp.* (Marek Olšák, 2017-03-03, 2 files, -3/+36)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move llvm.SI.export building to amd/common (Marek Olšák, 2017-03-03, 3 files, -162/+170)
  Reviewed-by: Dave Airlie <[email protected]>
* ac: unify build_type_name_for_intr functions (Marek Olšák, 2017-03-03, 4 files, -82/+47)
  Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as well (Marek Olšák, 2017-03-03, 1 file, -1/+2)
  It was harmless, because we also set unorm in the sampler state.

  Reviewed-by: Dave Airlie <[email protected]>
* gallivm, ac: add writeonly and inaccessiblememonly attributes (Marek Olšák, 2017-03-03, 4 files, -0/+8)
  Reviewed-by: Dave Airlie <[email protected]>
* tgsi/scan: record load/store/atomic image usage (Marek Olšák, 2017-03-03, 3 files, -11/+16)
  Reviewed-by: Dave Airlie <[email protected]>
* glapi: Fix a comment typo (Eric Anholt, 2017-03-03, 1 file, -1/+1)
  Reviewed-by: Nicolai Hähnle <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* mesa/main: *TextureSubImage* generates INVALID_OPERATION on wrong target (Alejandro Piñeiro, 2017-03-03, 1 file, -3/+3)
  The equivalent *TexSubImage* methods generate INVALID_ENUM.

  From the OpenGL 4.5 spec, section 8.6 Alternate Texture Image Specification Commands:

    "An INVALID_ENUM error is generated by *TexSubImage* if target does not match the command, as shown in table 8.15."

  And:

    "An INVALID_OPERATION error is generated by *TextureSubImage* if the effective target of texture does not match the command, as shown in table 8.15."

  Fixes: GL45-CTS.direct_state_access.textures_copy_errors

  v2: slightly change commit summary (Samuel)

  Reviewed-by: Samuel Pitoiset <[email protected]>
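The rule quoted from the spec can be sketched as a small selection function. This is a hedged illustration, not the Mesa code touched by the commit: `target_mismatch_error` and its parameters are hypothetical, and the error constants are defined locally with their standard OpenGL values so the snippet does not need GL headers.

```c
#include <stdbool.h>

/* Standard OpenGL error codes, defined locally for a self-contained sketch. */
#define SKETCH_GL_NO_ERROR          0x0000u
#define SKETCH_GL_INVALID_ENUM      0x0500u
#define SKETCH_GL_INVALID_OPERATION 0x0502u

/* Hypothetical helper: choose the error for a target that does not match
 * the command. 'is_dsa' is true for the *TextureSubImage* entry points,
 * which take a texture object instead of a target. */
static unsigned target_mismatch_error(bool is_dsa, bool target_matches)
{
    if (target_matches)
        return SKETCH_GL_NO_ERROR;
    return is_dsa ? SKETCH_GL_INVALID_OPERATION : SKETCH_GL_INVALID_ENUM;
}
```

The only input that changes the result is which family of entry points was called, which is consistent with the fix being a three-line change.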
* i965: Add Kaby Lake brandstrings (Ben Widawsky, 2017-03-02, 1 file, -10/+10)
  While here, use the spacing defined in Ark.

  https://ark.intel.com/products/codename/82879/Kaby-Lake

  Reviewed-by: Jason Ekstrand <[email protected]>
  Signed-off-by: Ben Widawsky <[email protected]>
* tgsi/ureg: return correct token count in ureg_get_tokens (Grazvydas Ignotas, 2017-03-03, 1 file, -1/+1)
  Valgrind reports that the shader cache writes uninitialized data to disk. Turns out ureg_get_tokens() is returning the count of allocated tokens instead of how many are actually used, so the cache writes out unused space at the end.

  Use the real count instead. This change should not cause regressions elsewhere because the only ureg_get_tokens() user that cares about token count is the shader cache.

  Signed-off-by: Grazvydas Ignotas <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
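The bug class described here, reporting capacity instead of length for a growable buffer, can be sketched as follows. The names `token_buf`, `token_buf_push` and `token_buf_get` are hypothetical and do not mirror the actual ureg structures; the sketch only assumes a buffer that over-allocates and tracks how many entries are really in use.

```c
#include <stdlib.h>

/* Hypothetical growable token buffer: 'size' is what was allocated,
 * 'count' is how many tokens have actually been written. */
struct token_buf {
    unsigned *tokens;
    unsigned size;
    unsigned count;
};

/* Append one token, doubling the allocation when full.
 * Returns 0 on success, -1 if the allocation fails. */
static int token_buf_push(struct token_buf *b, unsigned tok)
{
    if (b->count == b->size) {
        unsigned new_size = b->size ? b->size * 2 : 8;
        unsigned *p = realloc(b->tokens, new_size * sizeof(*p));
        if (!p)
            return -1;
        b->tokens = p;
        b->size = new_size;
    }
    b->tokens[b->count++] = tok;
    return 0;
}

/* Hand the tokens to the caller. Reporting b->size here instead would
 * expose the uninitialized tail of the allocation to the caller. */
static unsigned *token_buf_get(struct token_buf *b, unsigned *nr_tokens)
{
    if (nr_tokens)
        *nr_tokens = b->count;
    return b->tokens;
}
```

A caller that serializes `nr_tokens` entries then writes only initialized data, which is the property the shader cache depends on.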
* radeonsi: add support for an on-disk shader cache (Timothy Arceri, 2017-03-03, 1 file, -7/+60)
  V2:
  - when loading from disk cache also binary insert into memory cache.
  - check that the binary loaded from disk is the correct size. If not delete the cache item and skip loading from cache.

  V3:
  - remove unrequired variable

  Reviewed-by: Grigori Goronzy <[email protected]>
  Reviewed-by: Marek Olšák <[email protected]>
* util/disk_cache: compress individual cache entries (Timothy Arceri, 2017-03-03, 4 files, -24/+162)
  This reduces the cache size for Deus Ex from ~160M to ~30M for radeonsi (these numbers differ from Grigori's results below, probably due to different graphics quality settings).

  I'm also seeing the following improvements in minimum fps in the Shadow of Mordor benchmark on an i5-6400 [email protected], with a HDD:

    no-cache:                    ~10fps
    with-cache-no-compression:   ~15fps
    with-cache-and-compression:  ~20fps

  Note: The with-cache results are from the second run after closing and opening the game to avoid the in-memory cache.

  Since we mainly care about decompression I went with Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson, who has benchmarked decompression speeds.

  Grigori Goronzy provided the following stats for Deus Ex: Mankind Divided start-up times on an Athlon X4 860k with a SSD:

    No Cache                           215 sec
    Cold Cache zlib BEST_COMPRESSION   285 sec
    Warm Cache zlib BEST_COMPRESSION    33 sec
    Cold Cache zlib BEST_SPEED         264 sec
    Warm Cache zlib BEST_SPEED          33 sec
    Cold Cache no compression          266 sec
    Warm Cache no compression           34 sec

  The total cache size for that game is 48 MiB with BEST_COMPRESSION, 56 MiB with BEST_SPEED and 170 MiB with no compression.

  These numbers suggest that it may be ok to go with Z_BEST_SPEED but we should gather some actual decompression times before doing so. Another option might be to do the compression in a separate thread, which might allow us to use a higher-compression algorithm such as LZMA.

  Reviewed-by: Grigori Goronzy <[email protected]>
  Acked-by: Marek Olšák <[email protected]>
* util/disk_cache: add support for detecting corrupt cache entries (Timothy Arceri, 2017-03-03, 1 file, -3/+34)
  V2: fix pointer increments for writing/reading crc

  Acked-by: Marek Olšák <[email protected]>
  Reviewed-by: Grigori Goronzy <[email protected]>
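A minimal sketch of CRC-based corruption detection, under stated assumptions: the commit message only says that a crc is written and read, so the placement of the checksum after the payload, the helper names `entry_pack` and `entry_is_valid`, and the hand-written bitwise CRC-32 (the reflected polynomial 0xEDB88320, the same checksum zlib's `crc32()` computes) are all illustrative choices rather than the layout Mesa uses.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320). Slow but dependency-free. */
static uint32_t crc32_bitwise(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
    }
    return ~crc;
}

/* Write the payload followed by its CRC. 'out' must hold len + 4 bytes.
 * Returns the total entry size. */
static size_t entry_pack(uint8_t *out, const uint8_t *payload, size_t len)
{
    uint32_t crc = crc32_bitwise(payload, len);
    memcpy(out, payload, len);
    memcpy(out + len, &crc, sizeof(crc));
    return len + sizeof(crc);
}

/* Recompute the CRC over the payload and compare it with the stored one. */
static bool entry_is_valid(const uint8_t *entry, size_t entry_len)
{
    uint32_t stored;
    if (entry_len < sizeof(stored))
        return false;
    size_t len = entry_len - sizeof(stored);
    memcpy(&stored, entry + len, sizeof(stored));
    return stored == crc32_bitwise(entry, len);
}
```

A reader that finds `entry_is_valid()` false can treat the entry as a cache miss and fall back to recompiling, so a damaged file costs time rather than correctness.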
* glsl: fix subroutine mismatch between declarations/definitions (Samuel Pitoiset, 2017-03-03, 5 files, -8/+18)
  Previously, when q.subroutine was set to 1, a new subroutine declaration was added to the AST, while 0 meant a subroutine definition has been detected by the parser.

  Thus, setting the q.subroutine flag in both situations is obviously wrong because a new type identifier is added instead of trying to match the declaration.

  To fix it up, introduce ast_type_qualifier::is_subroutine_decl() to differentiate declarations and definitions easily.

  This fixes a regression with: arb_shader_subroutine/compiler/direct-call.vert

  Cc: Mark Janes <[email protected]>
  Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def")
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026
  Signed-off-by: Samuel Pitoiset <[email protected]>
  Reviewed-by: Timothy Arceri <[email protected]>
* genxml: Depend on Makefile.am for generated sources. (Matt Turner, 2017-03-02, 1 file, -1/+1)
  Depending on the generated Makefile means that all generated sources are recreated after ./configure.

  Reviewed-by: Lionel Landwerlin <[email protected]>
* clover: Work around build failure with AltiVec. (Matt Turner, 2017-03-02, 2 files, -0/+17)
  Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210
  Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504
  Acked-by: Francisco Jerez <[email protected]>
* anv/image: Allow HiZ on input attachment-capable depth/stencil images (Nanley Chery, 2017-03-02, 1 file, -14/+0)
  While an input attachment may only take on one of those two layouts, other depth/stencil attachments that use the same image may have HiZ-enabled layouts.

  Improves the average frame rate on a release candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my SKL GT4.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Centralize automatic layout transitions (Nanley Chery, 2017-03-02, 1 file, -42/+12)
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Add attachment transitioning functions (Nanley Chery, 2017-03-02, 1 file, -0/+85)
  This is needed to transition input attachments.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Encapsulate subpass id querying (Nanley Chery, 2017-03-02, 2 files, -6/+17)
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Enable render pass awareness (Nanley Chery, 2017-03-02, 2 files, -0/+10)
  v2: Update cmd_state_reset (Jason Ekstrand)

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pass: Store subpass attachment reference list (Nanley Chery, 2017-03-02, 2 files, -2/+13)
  We'll loop through this array when performing automatic layout transitions.

  v2: Adjust formatting of an assignment (Jason Ekstrand)

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pass: Fix size of anv_render_pass:subpass_attachments (Nanley Chery, 2017-03-02, 1 file, -2/+1)
  Don't allocate space for resolve attachments if the subpass has none.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Store the user's VkAttachmentReference (Nanley Chery, 2017-03-02, 8 files, -52/+47)
  We will be using the image layout. Store the full struct directly from the user.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Remove extra resolve for certain depth buffers (Nanley Chery, 2017-03-02, 1 file, -42/+29)
  Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer when reading from a depth image subresource that is in the general layout. Remove this unneeded resolve.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Conditionally choose the sampled image surface state (Nanley Chery, 2017-03-02, 1 file, -7/+8)
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/descriptor_set: Store aux usage of sampled image descriptors (Nanley Chery, 2017-03-02, 3 files, -18/+23)
  v2: Rebase onto latest changes
  v3: Account for NULL image_view in aux_usage assignment

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Create an additional surface state for sampling (Nanley Chery, 2017-03-02, 2 files, -1/+27)
  This will be used to sample a depth input attachment without having to pass through the HiZ buffer.

  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Simplify setup of HiZ sampler surface state (Nanley Chery, 2017-03-02, 1 file, -18/+12)
  Signed-off-by: Nanley Chery <[email protected]>
  Reviewed-by: Jason Ekstrand <[email protected]>