summaryrefslogtreecommitdiffstats
path: root/src
Commit message (Collapse)AuthorAgeFilesLines
* radeonsi: use the writeonly LLVM attributeMarek Olšák2017-03-031-3/+6
|
* ac: remove offen parameter from ac_build_buffer_store_dwordMarek Olšák2017-03-034-23/+20
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: enable TC L2 for tessellation offchip storesMarek Olšák2017-03-031-8/+8
| | | | Vulkan does the same thing.
* radeonsi: merge and simplify tbuffer_store functionsMarek Olšák2017-03-034-114/+77
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set noalias on input shader pointersMarek Olšák2017-03-031-0/+1
|
* radeonsi: replace AMDGPU.bfe.* with amdgcn.*bfeMarek Olšák2017-03-033-7/+33
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move kill intrinsic building into amd/commonMarek Olšák2017-03-034-14/+29
| | | | | | just a cleanup Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set readnone on reads from read-only memoryMarek Olšák2017-03-033-13/+21
|
* radeonsi: replace SI.buffer.load.dword with amdgcn.buffer.loadMarek Olšák2017-03-031-45/+19
|
* radeonsi: replace SI.packf16 with amdgcn.cvt.pkrtzMarek Olšák2017-03-033-5/+21
|
* ac: replace old image intrinsics with new onesMarek Olšák2017-03-031-0/+80
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: remove last use of llvm.SI.resinfoMarek Olšák2017-03-031-48/+49
| | | | and move one function up to reuse the code.
* radeonsi: move image intrinsic building to amd/commonMarek Olšák2017-03-033-92/+159
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: replace SI.export with amdgcn.exp.*Marek Olšák2017-03-032-3/+36
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: move llvm.SI.export building to amd/commonMarek Olšák2017-03-033-162/+170
| | | | Reviewed-by: Dave Airlie <[email protected]>
* ac: unify build_type_name_for_intr functionsMarek Olšák2017-03-034-82/+47
| | | | Reviewed-by: Dave Airlie <[email protected]>
* radeonsi: set unorm=1 for TGSI_TEXTURE_SHADOWRECT as wellMarek Olšák2017-03-031-1/+2
| | | | | | It was harmless, because we also set unorm in the sampler state. Reviewed-by: Dave Airlie <[email protected]>
* gallivm, ac: add writeonly and inaccessiblememonly attributesMarek Olšák2017-03-034-0/+8
| | | | Reviewed-by: Dave Airlie <[email protected]>
* tgsi/scan: record load/store/atomic image usageMarek Olšák2017-03-033-11/+16
| | | | Reviewed-by: Dave Airlie <[email protected]>
* glapi: Fix a comment typoEric Anholt2017-03-031-1/+1
| | | | | Reviewed-by: Nicolai Hähnle <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* mesa/main: *TextureSubImage* generates INVALID_OPERATION on wrong targetAlejandro Piñeiro2017-03-031-3/+3
| | | | | | | | | | | | | | | | | | | | | | | Equivalent *TexSubImage* methods generates INVALID_ENUM. From OpenGL 4.5 spec, section 8.6 Alternate Texture Image Specification Commands: "An INVALID_ENUM error is generated by *TexSubImage* if target does not match the command, as shown in table 8.15." And: "An INVALID_OPERATION error is generated by *TextureSubImage* if the effective target of texture does not match the command, as shown in table 8.15." Fixes: GL45-CTS.direct_state_access.textures_copy_errors v2: slightly change commit summary (Samuel) Reviewed-by: Samuel Pitoiset <[email protected]>
* tgsi/ureg: return correct token count in ureg_get_tokensGrazvydas Ignotas2017-03-031-1/+1
| | | | | | | | | | | | | Valgrind reports that the shader cache writes uninitialized data to disk. Turns out ureg_get_tokens() is returning the count of allocated tokens instead of how many are actually used, so the cache writes out unused space at the end. Use the real count instead. This change should not cause regressions elsewhere because the only ureg_get_tokens() user that cares about token count is the shader cache. Signed-off-by: Grazvydas Ignotas <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* radeonsi: add support for an on-disk shader cacheTimothy Arceri2017-03-031-7/+60
| | | | | | | | | | | | | V2: - when loading from disk cache also binary insert into memory cache. - check that the binary loaded from disk is the correct size. If not delete the cache item and skip loading from cache. V3: - remove unrequired variable Reviewed-by: Grigori Goronzy <[email protected]> Reviewed-by: Marek Olšák <[email protected]>
* util/disk_cache: compress individual cache entriesTimothy Arceri2017-03-033-24/+158
| | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | | This reduces the cache size for Deus Ex from ~160M to ~30M for radeonsi (these numbers differ from Grigori's results below probably due to different graphics quality settings). I'm also seeing the following improvements in minimum fps in the Shadow of Mordor benchmark on an i5-6400 [email protected], with a HDD: no-cache: ~10fps with-cache-no-compression: ~15fps with-cache-and-compression: ~20fps Note: The with cache results are from the second run after closing and opening the game to avoid the in-memory cache. Since we mainly care about decompression I went with Z_BEST_COMPRESSION as suggested on irc by Steinar H. Gunderson who has benchmarked decompression speeds. Grigori Goronzy provided the following stats for Deus Ex: Mankind Divided start-up times on a Athlon X4 860k with a SSD: No Cache 215 sec Cold Cache zlib BEST_COMPRESSION 285 sec Warm Cache zlib BEST_COMPRESSION 33 sec Cold Cache zlib BEST_SPEED 264 sec Warm Cache zlib BEST_SPEED 33 sec Cold Cache no compression 266 sec Warm Cache no compression 34 sec The total cache size for that game is 48 MiB with BEST_COMPRESSION, 56 MiB with BEST_SPEED and 170 MiB with no compression. These numbers suggest that it may be ok to go with Z_BEST_SPEED but we should gather some actual decompression times before doing so. Other options might be to do the compression in a separate thread, this might allow us to use a higher compression algorithim such as LZMA. Reviewed-by: Grigori Goronzy <[email protected]> Acked-by: Marek Olšák <[email protected]>
* util/disk_cache: add support for detecting corrupt cache entriesTimothy Arceri2017-03-031-3/+34
| | | | | | | V2: fix pointer increments for writing/reading crc Acked-by: Marek Olšák <[email protected]> Reviewed-by: Grigori Goronzy <[email protected]>
* glsl: fix subroutine mismatch between declarations/definitionsSamuel Pitoiset2017-03-035-8/+18
| | | | | | | | | | | | | | | | | | | | | Previously, when q.subroutine was set to 1, a new subroutine declaration was added to the AST, while 0 meant a subroutine definition has been detected by the parser. Thus, setting the q.subroutine flag in both situations is obviously wrong because a new type identifier is added instead of trying to match the declaration. To fix it up, introduce ast_type_qualifier::is_subroutine_decl() to differentiate declarations and definitions easily. This fixes a regression with: arb_shader_subroutine/compiler/direct-call.vert Cc: Mark Janes <[email protected]> Fixes: be8aa76afd ("glsl: remove unecessary flags.q.subroutine_def") Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=100026 Signed-off-by: Samuel Pitoiset <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* genxml: Depend on Makefile.am for generated sources.Matt Turner2017-03-021-1/+1
| | | | | | | Depending on the generated Makefile means that all generated sources are recreated after ./configure. Reviewed-by: Lionel Landwerlin <[email protected]>
* clover: Work around build failure with AltiVec.Matt Turner2017-03-021-0/+3
| | | | | | Bugzilla: https://bugs.gentoo.org/show_bug.cgi?id=587210 Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=68504 Acked-by: Francisco Jerez <[email protected]>
* anv/image: Allow HiZ on input attachment-capable depth/stencil imagesNanley Chery2017-03-021-14/+0
| | | | | | | | | | | While an input attachment may only take on one of those two layouts, other depth/stencil attachments that use the same image may have HiZ-enabled layouts. Improves the average frame rate on a release candidate of a proprietary Vulkan benchmark by 9.94% over 3 runs on my SKL GT4. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Centralize automatic layout transitionsNanley Chery2017-03-021-42/+12
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Add attachment transitioning functionsNanley Chery2017-03-021-0/+85
| | | | | | | This is needed to transition input attachments. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/blorp: Encapsulate subpass id queryingNanley Chery2017-03-022-6/+17
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Enable render pass awarenessNanley Chery2017-03-022-0/+10
| | | | | | | v2: Update cmd_state_reset (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pass: Store subpass attachment reference listNanley Chery2017-03-022-2/+13
| | | | | | | | | | We'll loop through this array when performing automatic layout transitions. v2: Adjust formatting of an assignment (Jason Ekstrand) Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pass: Fix size of anv_render_pass:subpass_attachmentsNanley Chery2017-03-021-2/+1
| | | | | | | Don't allocate space for resolve attachments if the subpass has none. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Store the user's VkAttachmentReferenceNanley Chery2017-03-028-52/+47
| | | | | | | | We will be using the image layout. Store the full struct directly from the user. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Remove extra resolve for certain depth buffersNanley Chery2017-03-021-42/+29
| | | | | | | | | Due to recent commits, the sampler now bypasses the auxiliary HiZ buffer when reading from a depth image subresource that is in the general layout. Remove this unneeded resolve. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Conditionally choose the sampled image surface stateNanley Chery2017-03-021-7/+8
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/descriptor_set: Store aux usage of sampled image descriptorsNanley Chery2017-03-023-18/+23
| | | | | | | | v2: Rebase onto latest changes v3: Account for NULL image_view in aux_usage assignment Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Create an additional surface state for samplingNanley Chery2017-03-022-1/+27
| | | | | | | | This will be used to sample a depth input attachment without having to pass through the HiZ buffer. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Simplify setup of HiZ sampler surface stateNanley Chery2017-03-021-18/+12
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Remove extra dependency on HiZ-specific variableNanley Chery2017-03-021-2/+7
| | | | | | | | | | | surf_usage is only useful to image views that may use HiZ buffers. Storage image views don't use HiZ buffers. v2: Update commit message and add an assertion. Fixes: 055ff2ec521 ("anv: Replace anv_image_has_hiz() with ISL_AUX_USAGE_HIZ") Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Update the HiZ sampling helperNanley Chery2017-03-024-8/+14
| | | | | | | | | | | | | | Validate the inputs, verify that this image has a depth buffer, use gen_device_info instead of v2: - Add parenthesis (Jason Ekstrand) - Make parameters const - Use gen_device_info instead of gen - Pass aspect to missed function in transition_depth_buffer Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/cmd_buffer: Replace layout_to_hiz_usage()Nanley Chery2017-03-021-44/+28
| | | | | Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/image: Add anv_layout_to_aux_usage()Nanley Chery2017-03-022-0/+139
| | | | | | | | | | | | This function supersedes layout_to_hiz_usage(). v2: - Don't find the optimal buffer for layout transitions (Jason Ekstrand). - Pass the devinfo instead of the gen (Jason Ekstrand) - Update the function documentation. Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv/pass: Avoid accessing attachment array out of boundsNanley Chery2017-03-021-9/+13
| | | | | | Cc: <[email protected]> Signed-off-by: Nanley Chery <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* ralloc: Make sure ralloc() allocations match malloc()'s alignment.Jonas Pfeil2017-03-021-1/+12
| | | | | | | | | | | | | | | | | | | The header of ralloc needs to be aligned, because the compiler assumes that malloc returns will be aligned to 8/16 bytes depending on the platform, leading to degraded performance or alignment faults with ralloc. Fixes SIGBUS on Raspberry Pi at high optimization levels. This patch is not perfect for MSVC, as maybe in the future the alignment for the most demanding data type might change to more than 8. v2: Commit message reword/typo fix, and add a bigger explanation in the code (by anholt) Signed-off-by: Jonas Pfeil <[email protected]> Reviewed-by: Matt Turner <[email protected]> Reviewed-by: Marek Olšák <[email protected]> Cc: [email protected]
* swr: fix crash in swr_update_derived following st/mesa state changesBruce Cherniak2017-03-022-3/+46
| | | | | | | | | | | | | | | | | | | | Recent change to st/mesa state update logic caused major regressions to swr validation code. swr uses the same validation logic (swr_update_derived) for both draw and Clear calls. New st/mesa state update logic results in certain state objects not being set/bound during Clear. This was causing null ptr exceptions. Creation of static dummy state objects allows setting these pointers during Clear validation, without interfering with relevant state validation. Once fixed, new logic also highlighted an error in dirty bit checking for fragment shader and clip validation. (The alternative is to have a simplified validation routine for Clear. Which may do that at some point.) Reviewed-by: Tim Rowley <[email protected]>
* swr: enable clear_texture with util_clear_textureBruce Cherniak2017-03-022-1/+2
| | | | | | Passes corresponding piglit tests. Reviewed-by: Edward O'Callaghan <[email protected]>
* automake: i965: list correct header in Makefile.sourceEmil Velikov2017-03-021-1/+1
| | | | | Fixes: 7ac47b1af767 ("i965: Add a header for brw_vec4_vs_visitor") Signed-off-by: Emil Velikov <[email protected]>