summaryrefslogtreecommitdiffstats
Commit message (Collapse)AuthorAgeFilesLines
* anv/cmd_buffer: Handle running out of binding tables in compute shadersJason Ekstrand2016-11-221-5/+15
| | | | | | | | | | | If we try to allocate a binding table and fail, we have to get a new binding table block, re-emit STATE_BASE_ADDRESS, and then try again. We already handle this correctly for 3D and blorp but it never got handled for CS. This fixes the new stress.lots-of-surface-state.cs.static crucible test. Reviewed-by: Lionel Landwerlin <[email protected]> Reviewed-by: Jordan Justen <[email protected]> Cc: "13.0" <[email protected]>
* i965/compiler: Disable trig workarounds on KBL+Jason Ekstrand2016-11-222-4/+8
| | | | | | | | | | The precision of our trig instructions appears to have been fixed on Kaby Lake. Neither Ben nor I can find any documentation for this. However, the dEQP precision tests now pass with INTEL_PRECISE_TRIG=0 where they fail on Sky Lake. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* intel/common: Add an is_kabylake field to gen_device_infoJason Ekstrand2016-11-222-0/+6
| | | | | | | | Most of the 3-D engine Kaby Lake is identical to Sky Lake. However, there are a few small differences that we need to be able to detect. Signed-off-by: Jason Ekstrand <[email protected]> Reviewed-by: Matt Turner <[email protected]>
* anv: Fix unintentional integer overflow in anv_CreateDmaBufImageINTELGwan-gyeong Mun2016-11-221-1/+1
| | | | | | | | | | | | | | | Since both pCreateInfo->strideInBytes and pCreateInfo->extent.height are of uint32_t type 32-bit arithmetic will be used. Fix unintentional integer overflow by casting to uint64_t before multifying. CID 1394321 Cc: "13.0" <[email protected]> Signed-off-by: Mun Gwan-gyeong <[email protected]> [Emil Velikov: cast only of the arguments] Reviewed-by: Emil Velikov <[email protected]>
* util/disk_cache: close a previously opened handle in disk_cache_put (v2)Gwan-gyeong Mun2016-11-221-6/+5
| | | | | | | | | | | | | | We're missing the close() to the matching open(). CID 1373407 v2: Fixes from Emil Velikov's review Update the teardown in reverse order of the setup/init. Cc: "13.0" <[email protected]> Signed-off-by: Mun Gwan-gyeong <[email protected]> Reviewed-by: Emil Velikov <[email protected]> Reviewed-by: Iago Toral Quiroga <[email protected]> (v1)
* docs: get rid of duplicated description from sourcetree.htmlGwan-gyeong Mun2016-11-221-1/+0
| | | | | | Fixes: 438086efb17 (docs: sourcetree.html misc updates) Signed-off-by: Mun Gwan-gyeong <[email protected]> Reviewed-by: Emil Velikov <[email protected]>
* docs/submitting patches: mention get_reviewers.plEmil Velikov2016-11-221-0/+18
| | | | | | | | Mention the script - why/how to use alongside a useful trick to make it work interactively (thanks Rob!). Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Elie Tournier <[email protected]>
* docs/submitting patches: add git tipsTimothy Arceri2016-11-221-0/+20
| | | | | | | | | | | | | | v2: [Emil Velikov] - Add the shorthand git send-email -vX - Move to submittingpatches.html - Add to the TOC. v3: [Emil Velikov] - Use @~8 instead of HEAD~8 (Nicolai) Cc: Timothy Arceri <[email protected]> Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Brian Paul <[email protected]> (v1)
* auxiliary/vl/dri: call get_xcb_screen() only onceEmil Velikov2016-11-221-2/+6
| | | | | Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]>
* egl/x11: store xcb_screen_t *screen instead of int screenEmil Velikov2016-11-223-66/+18
| | | | | | | | | | Just fetch and store it once, rather than doing the xcb_setup_roots_iterator + get_xcb_screen dance five times. v2: Call xcb_disconnect() on error (Eric) Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1)
* egl/x11: factor out dri2_get_xcb_connection()Emil Velikov2016-11-221-36/+27
| | | | | | | | | | | | | | | | | Identical throughout dri2, dri3 and drisw. Next patch will add more common code, so rather than duplicating it factor out the function. Note: this also sets eglError on failure. Something that's quite inconsistent throughout the codebase. v2: Call xcb_disconnect() on error (Eric) Note: use xcb_disconnect() even in the xcb_connection_has_error() case as per the manual: ... memory will not be freed until xcb_disconnect... Signed-off-by: Emil Velikov <[email protected]> Reviewed-by: Eric Engestrom <[email protected]> (v1)
* mesa/glsl: remove unused uses_builtin_functions fieldTimothy Arceri2016-11-233-3/+0
| | | | | | This has been unused since 943b69cddd Reviewed-by: Samuel Iglesias Gonsálvez <[email protected]>
* i965: Use NIR-based clip/cull lowering for OpenGL as well.Kenneth Graunke2016-11-222-1/+2
| | | | | | | | | | | | The old approach works fine, and this approach isn't necessarily better. But it at least has the advantage that Vulkan and GL use the same approach. I originally wrote it to gain additional testing for the new paths. shader-db statistics show 0 instruction count changes. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Enable clip and cull distance support.Kenneth Graunke2016-11-222-6/+5
| | | | | | | Everything is now in place, and we appear to pass the tests on Gen7+. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* i965/vec4: Handle component qualifiers on non-generic varyings.Kenneth Graunke2016-11-225-73/+53
| | | | | | | | | | | | | ARB_enhanced_layouts only requires component qualifier support for generic varyings, so this is all the vec4 backend knew how to handle. This patch extends the backend to handle it for all varyings, so we can use store_output intrinsics with a component set for things like clip/cull distances. We may want to use that for other VUE header fields in the future as well. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Timothy Arceri <[email protected]>
* i965/fs: Handle compact outputs.Kenneth Graunke2016-11-221-1/+3
| | | | | | | We need to calculate the number of vec4 slots correctly. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* spirv: Silence unsupported capability warnings for Clip/CullDistance.Kenneth Graunke2016-11-221-2/+2
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Set clip/cull distances fields in packets.Kenneth Graunke2016-11-221-6/+26
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* anv: Combine ClipDistance and CullDistance arrays.Kenneth Graunke2016-11-221-0/+3
| | | | | Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: add a pass to compact clip/cull distances.Kenneth Graunke2016-11-223-0/+190
| | | | | | | v2: Use nir_is_per_vertex_io() rather than is_arrays_of_arrays(). Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* nir: Add a "compact array" flag and IO lowering code.Kenneth Graunke2016-11-227-18/+67
| | | | | | | | | | | | | | | | | | | | Certain built-in arrays, such as gl_ClipDistance[], gl_CullDistance[], gl_TessLevelInner[], and gl_TessLevelOuter[] are specified as scalar arrays. Normal scalar arrays are sparse - each array element usually occupies a whole vec4 slot. However, most hardware assumes these built-in arrays are tightly packed. The new var->data.compact flag indicates that a scalar array should be tightly packed, so a float[4] array would take up a single vec4 slot, and a float[8] array would take up two slots. They are still arrays, not vec4s, however. nir_lower_io will generate intrinsics using ARB_enhanced_layouts style component qualifiers. v2: Add nir_validate code to enforce type restrictions. Signed-off-by: Kenneth Graunke <[email protected]> Reviewed-by: Jason Ekstrand <[email protected]>
* radv: add support for shader stats dumpDave Airlie2016-11-223-0/+87
| | | | | | | | | | | I've started working on a shader-db alike for Vulkan, it's based on vktrace and it records pipelines, this adds support to dump the shader stats exactly like radeonsi does, so I can reuse the shader-db scripts it uses. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: fix sample id loadingDave Airlie2016-11-221-1/+18
| | | | | | | | The sample id is packed into bits 8-12, so adjust things properly. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: add implementation of load_sample_pos intrinsic.Dave Airlie2016-11-221-0/+12
| | | | | | | This fixes a bunch of crashes in CTS tests looking for this. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/ac: cleanup ddxy emissionDave Airlie2016-11-221-93/+43
| | | | | | | | | | This cleans up the ddxy emission along the same lines as radeonsi. It also means we don't use LDS on VI chips we use the dspermute interface, it also removes some duplicated code. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv/meta: cleanup resolve vertex state emissionDave Airlie2016-11-221-47/+2
| | | | | | | | For the hw resolve there is no need to emit any sort of texture coordinates, so drop them all in the meta path. Reviewed-by: Bas Nieuwenhuizen <[email protected]> Signed-off-by: Dave Airlie <[email protected]>
* radv: Incorporate GPU family into cache UUID.Bas Nieuwenhuizen2016-11-221-3/+5
| | | | | | Invalidates the cache when someone switches cards. Signed-off-by: Bas Nieuwenhuizen <[email protected]>
* radv: Use library mtime for cache UUID.Bas Nieuwenhuizen2016-11-221-4/+32
| | | | | | | | | | | | We want to also invalidate the cache when LLVM gets changed. As the specific LLVM revision is not fixed at build time, we will need to check at runtime. Computing a checksum for LLVM is going to be very expensive, so just use the mtime. Tested on my computer that the returned DSO for the LLVM symbol is actually the LLVM DSO. Signed-off-by: Bas Nieuwenhuizen <[email protected]>
* radv: Store UUID in physical device.Bas Nieuwenhuizen2016-11-223-14/+16
| | | | | | | No sense in repeatedly determining it. Also, it might be dependent on the device as shaders get compiled differently for SI/CIK/VI etc. Signed-off-by: Bas Nieuwenhuizen <[email protected]>
* glsl: fix NULL checkTimothy Arceri2016-11-221-1/+1
| | | | Fixes copy and paste error in 9d96d3803ab
* swr: calculate viewport width/height based on the scaleIlia Mirkin2016-11-211-6/+12
| | | | | | | | The former calculations were for min/max y. The width/height don't take translate into account. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: don't claim to allow setting layer/viewport from VSIlia Mirkin2016-11-211-1/+1
| | | | | | | | | | | | This may ultimately be possible to support, but for now it's not hooked up and the swr core only supports this output from GS. This normally wouldn't matter, but we lie about supporting GL 3.2, and also the blitter and st/mesa will make use of this functionality if claimed. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: allocate all scratch space in one go for vertex buffersIlia Mirkin2016-11-212-5/+31
| | | | | | | | | | | Multiple buffers may reference client arrays. When this happens, we might reach for scratch space multiple times, which could cause later arrays to invalidate the pointers allocated for the earlier ones. This fixes copyteximage 2D_ARRAY. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: call swr_update_derived unconditionally when drawing/clearingIlia Mirkin2016-11-212-4/+2
| | | | | | | | | | | | | | | | | | Currently a sequence like draw/map/draw/map will cause the second map to not wait for the second draw. This is because the first map will clear the resource business bit, and the second draw won't reset it since no state has changed. swr_update_derived does a tiny bit of extra work, including updating the SWR_BACKEND_STATE as well as waiting for prending fences. If that's a problem, we could call swr_update_resource_status directly from draw/clear handlers. Fixes clearbuffer-stencil, clearbuffer-depth, clearbuffer-depth-stencil, and clearbuffer-display-lists. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Bruce Cherniak <[email protected]>
* swr: [rasterizer memory] minify texture width before alignmentIlia Mirkin2016-11-211-2/+2
| | | | | | | | | The minification should happen before alignment, not after. See similar logic on ComputeLODOffsetY. The current logic requires unnecessarily large textures when there's an initial NPOT size. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* swr: [rasterizer memory] minify original sizes for block formatsIlia Mirkin2016-11-211-11/+25
| | | | | | | | | There's no guarantee that mip width/height will be a multiple of the compressed block size. Doing a divide by the block size first yields different results than GL expects, so we do the divide at the end. Signed-off-by: Ilia Mirkin <[email protected]> Reviewed-by: Tim Rowley <[email protected]>
* radeonsi: remove all varyings for depth-only rendering or rasterization offMarek Olšák2016-11-213-1/+21
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: eliminate VS outputs that aren't used by PS at runtimeMarek Olšák2016-11-213-9/+61
| | | | | | | | | | | | | | | | | | A past commit added the ability to compile "optimized" shader variants asynchronously (not stalling the app). This commit builds upon that and adds what is basically a runtime shader linker. If a VS output isn't used by the currently-bound PS, a new VS compilation is started without that output. The new shader variant is used when it's ready. All apps using separate shader objects I've seen had unused VS outputs. Eliminating unused/useless VS outputs also eliminates the corresponding vertex attribute loads. Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: record information about all written and read varyingsMarek Olšák2016-11-213-3/+98
| | | | | | | It's just tgsi_shader_info with DEFAULT_VAL varyings removed. Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: make si_shader_io_get_unique_index stricterMarek Olšák2016-11-212-11/+14
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't export ClipVertex and ClipDistance[] if clipping is disabledMarek Olšák2016-11-214-5/+37
| | | | | | | | | This is the first user of optimized monolithic shader variants. Cull distances can't be disabled by states. Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: add infrastr. for compiling optimized shader variants asynchronouslyMarek Olšák2016-11-212-34/+109
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: don't set vs.epilog.export_prim_id if TES is boundMarek Olšák2016-11-211-4/+4
| | | | | | | there is no VS epilog in this case Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: simplify checking for monolithic compilationMarek Olšák2016-11-214-8/+9
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: print all flags in si_dump_shader_keyMarek Olšák2016-11-211-0/+5
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: split the shader key into 3 logical partsMarek Olšák2016-11-215-194/+203
| | | | | | | | | key->part.*: prolog and epilog flags only key->as_{ls,es}: special flags key->mono.*: flags for monolithic compilation only Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: fix culling if clip & cull distances are used at the same timeMarek Olšák2016-11-211-2/+3
| | | | | | | | | Fixed piglits: - arb_cull_distance/clip-cull-3 - arb_cull_distance/clip-cull-4 Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: clean up si_emit_clip_regsMarek Olšák2016-11-211-4/+5
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* radeonsi: assume that a VS without POSITION is LSMarek Olšák2016-11-211-0/+7
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>
* tgsi/scan: record if a shader writes the position outputMarek Olšák2016-11-212-0/+3
| | | | | Tested-by: Edmondo Tommasina <[email protected]> Reviewed-by: Nicolai Hähnle <[email protected]>