Commit message | Author | Age | Files | Lines
* pan/midgard: Share mir_nontrivial_outmod | Alyssa Rosenzweig | 2019-07-26 | 3 | -16/+17
    To be used with redundant move elimination.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Implement texture RA | Alyssa Rosenzweig | 2019-07-26 | 5 | -143/+271
    total instructions in shared programs: 3916 -> 3665 (-6.41%)
    instructions in affected programs: 1405 -> 1154 (-17.86%)
    helped: 35
    HURT: 0
    helped stats (abs) min: 1 max: 21 x̄: 7.17 x̃: 3
    helped stats (rel) min: 3.00% max: 28.57% x̄: 20.11% x̃: 21.74%
    95% mean confidence interval for instructions value: -9.35 -4.99
    95% mean confidence interval for instructions %-change: -22.75% -17.46%
    Instructions are helped.

    total bundles in shared programs: 2472 -> 2256 (-8.74%)
    bundles in affected programs: 906 -> 690 (-23.84%)
    helped: 32
    HURT: 0
    helped stats (abs) min: 1 max: 18 x̄: 6.75 x̃: 3
    helped stats (rel) min: 5.56% max: 32.26% x̄: 20.83% x̃: 16.67%
    95% mean confidence interval for bundles value: -9.09 -4.41
    95% mean confidence interval for bundles %-change: -23.77% -17.89%
    Bundles are helped.

    total quadwords in shared programs: 3965 -> 3689 (-6.96%)
    quadwords in affected programs: 1568 -> 1292 (-17.60%)
    helped: 35
    HURT: 0
    helped stats (abs) min: 1 max: 21 x̄: 7.89 x̃: 3
    helped stats (rel) min: 2.08% max: 28.57% x̄: 19.87% x̃: 20.00%
    95% mean confidence interval for quadwords value: -10.38 -5.39
    95% mean confidence interval for quadwords %-change: -22.57% -17.17%
    Quadwords are helped.

    total registers in shared programs: 411 -> 392 (-4.62%)
    registers in affected programs: 76 -> 57 (-25.00%)
    helped: 15
    HURT: 0
    helped stats (abs) min: 1 max: 2 x̄: 1.27 x̃: 1
    helped stats (rel) min: 9.09% max: 50.00% x̄: 30.97% x̃: 33.33%
    95% mean confidence interval for registers value: -1.52 -1.01
    95% mean confidence interval for registers %-change: -39.12% -22.82%
    Registers are helped.

    total threads in shared programs: 426 -> 432 (1.41%)
    threads in affected programs: 6 -> 12 (100.00%)
    helped: 3
    HURT: 0
    helped stats (abs) min: 2 max: 2 x̄: 2.00 x̃: 2
    helped stats (rel) min: 100.00% max: 100.00% x̄: 100.00% x̃: 100.00%

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
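For readers checking the figures, the percentages in the report above appear to be plain relative changes between the before and after totals. The following is a minimal sketch of that arithmetic; the helper name `percent_change` is hypothetical and is not part of shader-db.

```python
def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent (hypothetical helper)."""
    return (after - before) / before * 100.0


# Totals copied from the report above.
print(f"instructions (shared):   {percent_change(3916, 3665):.2f}%")
print(f"instructions (affected): {percent_change(1405, 1154):.2f}%")
print(f"threads (shared):        {percent_change(426, 432):.2f}%")
```

A negative value means the count dropped; for threads, a higher count is the improvement, which is why that line is positive.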
* pan/midgard: Fix backwards blend color load | Alyssa Rosenzweig | 2019-07-26 | 1 | -1/+1
    The source and destination were incorrectly flipped in the move, but
    some details of our internal regalloc made this function anyway. Now
    that we're changing the regalloc, we need to fix this to avoid
    regressing blend shaders.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix scheduling mishap | Alyssa Rosenzweig | 2019-07-26 | 1 | -1/+1
    We shouldn't try to schedule onto a vmul if the last unit was a smul;
    that would force a break ("traveling back in time").

    total bundles in shared programs: 2519 -> 2472 (-1.87%)
    bundles in affected programs: 791 -> 744 (-5.94%)
    helped: 20
    HURT: 0
    helped stats (abs) min: 1 max: 9 x̄: 2.35 x̃: 1
    helped stats (rel) min: 1.52% max: 11.76% x̄: 7.94% x̃: 7.69%
    95% mean confidence interval for bundles value: -3.47 -1.23
    95% mean confidence interval for bundles %-change: -9.36% -6.51%
    Bundles are helped.

    total quadwords in shared programs: 4028 -> 3965 (-1.56%)
    quadwords in affected programs: 1223 -> 1160 (-5.15%)
    helped: 17
    HURT: 0
    helped stats (abs) min: 1 max: 17 x̄: 3.71 x̃: 2
    helped stats (rel) min: 2.97% max: 10.64% x̄: 6.97% x̃: 7.14%
    95% mean confidence interval for quadwords value: -5.71 -1.70
    95% mean confidence interval for quadwords %-change: -8.03% -5.91%
    Quadwords are helped.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix vector->scalar swizzles | Alyssa Rosenzweig | 2019-07-26 | 1 | -5/+8
    The swizzle should be taken on the masked component, rather than
    unconditionally X.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
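The idea in the message above can be illustrated with a small sketch. It is not the Midgard compiler's code: the function name, the 4-bit write-mask encoding, and the list-of-indices swizzle are all assumptions made for illustration.

```python
def scalar_source_component(swizzle, mask):
    """Pick the source component for a vector op being turned into a scalar op.

    `swizzle` is a hypothetical list of 4 source component indices (0=X..3=W)
    and `mask` is a hypothetical 4-bit write mask. The component is read from
    the swizzle entry of the first written channel, not always from entry 0 (X).
    """
    for channel in range(4):
        if mask & (1 << channel):
            return swizzle[channel]
    raise ValueError("empty write mask")


# Only channel Z (bit 2) is written, so the swizzle entry for Z is used.
print(scalar_source_component([3, 2, 1, 0], 0b0100))
```

Taking `swizzle[0]` unconditionally would give the wrong source whenever the single written channel is not X.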
* pan/midgard: Add dead move elimination pass | Alyssa Rosenzweig | 2019-07-26 | 2 | -0/+44
    This is a special case of DCE designed to run after the out-of-ssa
    pass to clean up special register lowering.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Move DCE into its own file | Alyssa Rosenzweig | 2019-07-26 | 4 | -22/+48
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Add mir_rewrite_dst_tag helper | Alyssa Rosenzweig | 2019-07-26 | 2 | -0/+15
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Fix flipped register bias fields | Alyssa Rosenzweig | 2019-07-26 | 3 | -23/+6
    We mixed up component_lo and full, which made it appear that we had
    less freedom in RA than we actually do. Fix this to fix some
    disassemblies as well as prepare for RA with the bias field.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* pan/midgard: Update RA for cubemap coords | Alyssa Rosenzweig | 2019-07-26 | 3 | -10/+8
    Following the RA work, we apply the same technique to eliminate the
    move to r27 when loading cubemaps.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* anv+tu+radv: delete unusable dev_icd.json | Eric Engestrom | 2019-07-26 | 3 | -39/+0
    As per previous commit, Meson doesn't support using uninstalled libs;
    they're simply not ready until `ninja install` is run, so delete them.

    Suggested-by: Jason Ekstrand <[email protected]>
    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]> # for anv
    Reviewed-by: Eric Anholt <[email protected]> # for tu
    Reviewed-by: Bas Nieuwenhuizen <[email protected]> # for radv
* docs: fix intel_icd.json path | Eric Engestrom | 2019-07-26 | 1 | -1/+1
    Meson doesn't support using uninstalled libs; they're simply not ready
    until `ninja install` is run, at which point one might as well use the
    proper icd.json file in the install folder.

    Signed-off-by: Eric Engestrom <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
* vulkan/wsi/x11: Increase the effective min. images for mailbox. | Bas Nieuwenhuizen | 2019-07-26 | 1 | -2/+5
    We need 5 images:

    1) CPU work
    2) GPU work
    3) idle
    4) queued for flip
    5) presenting

    Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi/x11: Wait for GPU work before present with mailbox. | Bas Nieuwenhuizen | 2019-07-26 | 1 | -1/+12
    Otherwise the wait only happens at flip time, which messes with
    keeping idle buffers around if the GPU work makes the image miss the
    next flip.

    I decided not to use the wait fences as those are still xshm fences,
    so that means we'd still have to wait in the application. Just doing
    it before presenting makes things simpler.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi/x11: Allow using thread present-only. | Bas Nieuwenhuizen | 2019-07-26 | 1 | -34/+51
    This allows doing a potentially long blocking operation before present.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* vulkan/wsi: Use one fence per image. | Bas Nieuwenhuizen | 2019-07-26 | 2 | -20/+26
    Much easier to work with if we want to use them in the WS-specific
    WSI implementation.

    Reviewed-by: Lionel Landwerlin <[email protected]>
* spirv: propagate access qualifiers through ssa & pointer | Lionel Landwerlin | 2019-07-26 | 3 | -4/+62
    Not only variables can be flagged as NonUniformEXT but also
    expressions. We're currently ignoring it in an expression such as:

        imageLoad(data[nonuniformEXT(rIndex)], 0)

    The associated SPIRV:

        OpDecorate %69 NonUniformEXT
        ...
        %69 = OpLoad %61 %68

    This change propagates access qualifiers through ssa & pointers so
    that when it hits OpLoad/OpStore style instructions, qualifiers are
    not forgotten.

    Fixes failures in the following tests:

        dEQP-VK.descriptor_indexing.*

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Fixes: 8ed583fe523703 ("spirv: Handle the NonUniformEXT decoration")
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* spirv: wrap push ssa/pointer values | Lionel Landwerlin | 2019-07-26 | 4 | -69/+89
    This refactor allows for common code to apply decoration on all
    ssa/pointer values. In particular this will allow us to propagate
    access qualifiers.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Suggested-by: Caio Marcelo de Oliveira Filho <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* nir: add access to image_deref intrinsics | Lionel Landwerlin | 2019-07-26 | 1 | -0/+3
    SPIRV added the ability to access variables and have expressions that
    are not dynamically uniform, and because spirv_to_nir generates deref
    instructions, we'll need to have that access there.

    Signed-off-by: Lionel Landwerlin <[email protected]>
    Cc: <[email protected]>
    Reviewed-by: Jason Ekstrand <[email protected]>
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* main: unreference ATIFragmentShader program before creating new one | Yevhenii Kolesnikov | 2019-07-26 | 1 | -1/+4
    The old program was overwritten without its memory being released.

    Signed-off-by: Yevhenii Kolesnikov <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* state_tracker: Add destroying routine for feedback and select stages | Yevhenii Kolesnikov | 2019-07-26 | 1 | -2/+2
    Fixes leaking memory on iris.

    Signed-off-by: Yevhenii Kolesnikov <[email protected]>
    Reviewed-by: Lionel Landwerlin <[email protected]>
* v3d: fix glDrawTransformFeedback{Instanced}() | Iago Toral Quiroga | 2019-07-26 | 2 | -2/+18
    This needs to take the vertex count from the provided transform
    feedback buffer.

    v2:
     - don't take the vertex count from the underlying buffer, instead,
       take it from a v3d subclass of pipe_stream_output_target (Eric).

    Fixes piglit tests:
    spec/ext_transform_feedback2/draw-auto
    spec/ext_transform_feedback2/draw-auto instanced

    Reviewed-by: Eric Anholt <[email protected]>
* v3d: subclass pipe_streamout_output_target to record TF vertices written | Iago Toral Quiroga | 2019-07-26 | 3 | -8/+25
    Reviewed-by: Eric Anholt <[email protected]>
* v3d: refactor v3d_tf_statistics_record slightly | Iago Toral Quiroga | 2019-07-26 | 1 | -7/+7
    Reviewed-by: Eric Anholt <[email protected]>
* Revert "panfrost: Don't DIY point size/coord fields" | Alyssa Rosenzweig | 2019-07-25 | 2 | -2/+9
    This reverts commit 4508f43eed5a4528f0e8ca9d1cfcdc78857043e0, which
    broke a bunch of dEQP tests (e.g. in
    dEQP-GLES2.functional.draw.draw_arrays.*)

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* anv: Disable transform feedback on gen7 | Jason Ekstrand | 2019-07-25 | 1 | -1/+1
    It's totally implementable, it's just that the plumbing is a bit
    different and we never hooked it up. Don't advertise a broken feature.

    Fixes: 36ee2fd61c8 "anv: Implement the basic form of VK_EXT_transform_feedback"
* mesa: Fix GetTextureImage error reporting, again | Pierre-Eric Pelloux-Prayer | 2019-07-25 | 1 | -4/+20
    Iago Toral Quiroga fixed this in commit 94f740e3fce0cb26e4d90cb9de75b,
    but it recently regressed in 0d8826f723cd8868b5271f17f18a1ab4548a1199.

    Quoting Iago's original commit message for the fix:

      GetTex*Image should return INVALID_ENUM if target is not valid,
      however, GetTextureImage does not receive a target, and instead
      should return INVALID_OPERATION if the effective target is not valid.

      From the OpenGL 4.6 core profile spec, section 8.11 Texture Queries:

        "An INVALID_OPERATION error is generated by GetTextureImage if the
         effective target is not one of TEXTURE_1D, TEXTURE_2D, TEXTURE_3D,
         TEXTURE_1D_ARRAY, TEXTURE_2D_ARRAY, TEXTURE_CUBE_MAP_ARRAY,
         TEXTURE_RECTANGLE, or TEXTURE_CUBE_MAP (for GetTextureImage only)."

    Note that this differs from the original ARB_direct_state_access spec.
    However, the EXT_direct_state_access version does take a target
    parameter, so it should continue reporting INVALID_ENUM.

    Fixes KHR-GL45.direct_state_access.textures_image_query_errors.

    Fixes: 0d8826f723c ("mesa: refactor get_texture_image to remove duplicate code")
    Reviewed-by: Kenneth Graunke <[email protected]>
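The rule described in the message above can be sketched as follows. This is an illustration only, not Mesa's implementation: the function name and the `caller_passed_target` flag are hypothetical, targets are represented as strings, and the valid set is the one quoted from the spec for GetTextureImage (a real GetTexImage path accepts a different set of targets).

```python
# Standard OpenGL error code values.
GL_INVALID_ENUM = 0x0500
GL_INVALID_OPERATION = 0x0502

# Effective targets listed in the quoted spec text for GetTextureImage.
VALID_EFFECTIVE_TARGETS = {
    "TEXTURE_1D", "TEXTURE_2D", "TEXTURE_3D",
    "TEXTURE_1D_ARRAY", "TEXTURE_2D_ARRAY", "TEXTURE_CUBE_MAP_ARRAY",
    "TEXTURE_RECTANGLE", "TEXTURE_CUBE_MAP",
}


def image_query_error(effective_target, caller_passed_target):
    """Return the GL error for a bad target, or None if the target is valid.

    Hypothetical helper: entry points that take an explicit target report
    INVALID_ENUM; GetTextureImage, which takes only a texture name, reports
    INVALID_OPERATION.
    """
    if effective_target in VALID_EFFECTIVE_TARGETS:
        return None
    return GL_INVALID_ENUM if caller_passed_target else GL_INVALID_OPERATION


print(hex(image_query_error("TEXTURE_BUFFER", caller_passed_target=False)))
```

Keeping the choice of error code in one place makes it harder for a later refactor to collapse the two cases again.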
* iris: Use gen_mi_builder to handle CS ALU operations. | Kenneth Graunke | 2019-07-25 | 6 | -474/+151
    In a few cases, we switch to MI_MATH instead of MI_PREDICATE, just
    because we were already doing math and it's easier to chain together.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/mi: Add a unit test for gen_mi_store_if(). | Kenneth Graunke | 2019-07-25 | 1 | -0/+43
    This tests that predicated stores work.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/mi: Add a new gen_mi_store_if() helper. | Kenneth Graunke | 2019-07-25 | 1 | -0/+53
    This performs predicated MI_STORE_REGISTER_MEM commands, assuming that
    the condition is already loaded into MI_PREDICATE_DATA.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/mi: Add gen_mi_nz() and gen_mi_z() helpers. | Kenneth Graunke | 2019-07-25 | 1 | -0/+20
    These provide comparisons against zero.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/mi: Add a gen_mi_ior() to go with gen_mi_iand() | Kenneth Graunke | 2019-07-25 | 1 | -0/+8
    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* intel/mi: Optimize away LOAD_REGISTER_REG from a register to itself | Kenneth Graunke | 2019-07-25 | 1 | -3/+5
    We might want to resolve something to be in a particular register, so
    we can access it outside of the gen_mi framework...but it may already
    be in that register, at which point there's no work to do.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
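As a rough illustration of the optimization described above (not the gen_mi_builder code itself), a copy helper can simply skip emission when source and destination are the same register. The command tuple, the list-based batch, and the register names below are hypothetical stand-ins.

```python
def emit_load_register_reg(batch, dst, src):
    """Append a register-to-register copy to `batch`, unless it would be a no-op.

    `batch` is a plain list standing in for a command buffer, and registers
    are identified by hypothetical string names.
    """
    if dst == src:
        # The value is already where the caller wants it; emit nothing.
        return False
    batch.append(("MI_LOAD_REGISTER_REG", dst, src))
    return True


batch = []
emit_load_register_reg(batch, "GPR0", "GPR0")  # skipped
emit_load_register_reg(batch, "GPR1", "GPR0")  # emitted
print(batch)
```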
* iris: Make iris_query.c a genxml-compiled file. | Kenneth Graunke | 2019-07-25 | 6 | -65/+48
    This will let us use Jason's new MI-builder shortly.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* iris: Move iris_resolve_conditional_render to the vtable. | Kenneth Graunke | 2019-07-25 | 3 | -5/+8
    It's going to be in genxml code shortly.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* iris: Refactor genxml macros and inlines into iris_genx_macros.h. | Kenneth Graunke | 2019-07-25 | 4 | -73/+125
    This will let us put the genxml boilerplate in one place, before we
    expand genxml to more files shortly. Like i965/genX_boilerplate.h.

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* iris: Make an iris_genx_protos.h header for prototypes. | Kenneth Graunke | 2019-07-25 | 4 | -28/+74
    This lets us specify the prototypes once, instead of cut and pasting
    them per generation. isl uses a similar approach (isl_genX_priv.h).

    Reviewed-by: Caio Marcelo de Oliveira Filho <[email protected]>
* radeonsi: fix DAL hang due to incorrect DCC offset on Raven | Marek Olšák | 2019-07-25 | 1 | -1/+22
    Set the correct relative offset.

    Fixes: f8b6c5a "radeonsi: rewrite si_get_opaque_metadata, also for gfx10 support"
* anv: Disable subgroup arithmetic on gen7 | Jason Ekstrand | 2019-07-25 | 1 | -3/+10
    Reviewed-by: Lionel Landwerlin <[email protected]>
* gitlab-ci: Add a shader-db run using v3d on drm-shim. | Eric Anholt | 2019-07-25 | 4 | -2/+36
    This provides significant compiler coverage during CI at a fairly low
    cost in CPU time (~17s per thread for 4 threads on
    gst-gitlab-htz-runner3).

    I'm leaving wget in the docker image, as once this is in master I'm
    planning on having an automatic shader-db comparison between master
    and the branch included in the artifacts.

    I also haven't done freedreno yet, because it has some races when run
    in multithreaded mode that I'm still tracking down.

    Reviewed-by: Eric Engestrom <[email protected]>
* gitlab-ci: Only keep the build logs as artifacts. | Eric Anholt | 2019-07-25 | 1 | -2/+5
    On a build failure, we were tarring up the whole ccache directory,
    build.ninja, build products, etc. This was over 400MB compressed on a
    recent early meson-main build failure, which fd.o then has to hang on
    to for 4 weeks.

    The build logs are probably the interesting part, are potentially
    useful regardless ("how did CI's build flags differ from mine?"), and
    are <500k uncompressed on my personal meson build.

    Reviewed-by: Michel Dänzer <[email protected]>
* gitlab-ci: Always set libdir to lib/ | Eric Anholt | 2019-07-25 | 2 | -1/+1
    I introduced libdir for cross-builds so we could point at the
    resulting drivers without per-arch dependencies, but I'd rather not
    have to type x86_64-linux-whatever for non-cross-builds either.

    Reviewed-by: Eric Engestrom <[email protected]>
* freedreno: Add support for drm-shim. | Eric Anholt | 2019-07-25 | 6 | -0/+229
    I'm using this for shader-db analysis on x86_64 systems.

    Reviewed-by: Rob Clark <[email protected]>
* v3d: Introduce a DRM shim for calling out to the simulator. | Eric Anholt | 2019-07-25 | 15 | -2/+1815
    The goal is to enable testing of parts of drivers without depending on
    any particular kernel version or hardware being present.

    Simply set LD_PRELOAD=$PREFIX/lib/libv3d_drm_shim.so in your
    environment, and we'll fake a /dev/dri/renderD128 (or whatever the
    next available node is) using v3dv3. That node can then be used with
    the surfaceless or gbm EGL platforms.

    Acked-by: Iago Toral Quiroga <[email protected]>
* glsl: report no function instead of empty candidate list | Erik Faye-Lund | 2019-07-25 | 1 | -2/+17
    When generating the error message for a missing function error where
    all available overloads were missing due to a too low GLSL version, we
    used to report something like this:

    ---8<---
    0:224(14): error: no matching function for call to `textureCubeLod(samplerCube, vec3, float)'; candidates are:
    0:224(14): error: type mismatch
    ---8<---

    This is a pretty confusing error message, and can throw people off
    when debugging. So let's instead check if any overload is available
    before we decide what to print. This allows us to report something
    like this instead:

    ---8<---
    0:224(14): error: no function with name 'textureCubeLod'
    0:224(14): error: type mismatch
    ---8<---

    This is arguably easier to understand for programmers, and doesn't
    send you on a wild goose chase to figure out what argument is wrong
    just because you stopped reading the message prematurely.

    I'm of course referring to a friend, not me. For sure. I would never
    do that.

    Signed-off-by: Erik Faye-Lund <[email protected]>
    Reviewed-by: Matt Turner <[email protected]>
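The check described above, deciding what to print based on whether any overload is actually available, might look roughly like this. It is a sketch of the idea, not the GLSL compiler's code; the function name and the list-of-strings candidate representation are assumptions.

```python
def missing_function_message(name, call_signature, available_overloads):
    """Build the diagnostic for a call that matched no function.

    `available_overloads` is a hypothetical list of signature strings that
    are visible at the shader's GLSL version; it may be empty.
    """
    if not available_overloads:
        # Nothing to list, so don't print an empty "candidates are:" header.
        return f"no function with name '{name}'"
    lines = [f"no matching function for call to `{call_signature}'; candidates are:"]
    lines.extend(f"   {overload}" for overload in available_overloads)
    return "\n".join(lines)


print(missing_function_message(
    "textureCubeLod", "textureCubeLod(samplerCube, vec3, float)", []))
```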
* radv: Set correct metadata size for GFX9+. | Bas Nieuwenhuizen | 2019-07-25 | 1 | -1/+2
    Without correct size, radeonsi assumes the metadata is incorrect,
    which can and will cause issues.

    Since the metadata is really incorrect without the size, let us fix
    that.

    Fixes: e43cc3e3afc "radv/gfx9: handle GFX9 opaque metadata"
    Reviewed-by: Samuel Pitoiset <[email protected]>
* anv: report HOST_ALLOCATION as supported for images | Arcady Goldmints-Orlov | 2019-07-25 | 1 | -0/+4
    Report VK_EXTERNAL_MEMORY_HANDLE_TYPE_HOST_ALLOCATION_BIT_EXT as
    supported for images. It was being shown supported for buffers, but
    not images.

    Fixes: 69cc6272fbc1 ("anv: Implement VK_EXT_external_memory_host")
    Reviewed-by: Lionel Landwerlin <[email protected]>
* radv/gfx10: fix intensity formats by setting ALPHA_IS_ON_MSB | Samuel Pitoiset | 2019-07-25 | 1 | -6/+11
    This fixes dEQP-VK.rasterization.primitive_size.points.point_size_*

    This also fixes some black squares with the Sascha SSAO demo.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* radv/gfx10: use L2 for DMA copy/fill operations | Samuel Pitoiset | 2019-07-25 | 1 | -0/+16
    It's coherent and faster. GFX7-GFX9 should also support this, but for
    now only use L2 on GFX10 because it's untested on previous gens.

    This fixes dEQP-VK.memory.pipeline_barrier.transfer_*

    This also fixes some missing geometry in Dawn Of War III because VBOs
    weren't updated correctly.

    Signed-off-by: Samuel Pitoiset <[email protected]>
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* pan/midgard: Optimize varying projection | Alyssa Rosenzweig | 2019-07-25 | 5 | -14/+114
    We add a new opt pass fusing perspective projection with varyings.
    Minor win..? We don't combine non-varying projections, since if we're
    too aggressive, the extra load/store traffic will hurt us so it's not
    really a win in practice.

    total instructions in shared programs: 3915 -> 3913 (-0.05%)
    instructions in affected programs: 76 -> 74 (-2.63%)
    helped: 1
    HURT: 0

    total bundles in shared programs: 2520 -> 2519 (-0.04%)
    bundles in affected programs: 46 -> 45 (-2.17%)
    helped: 1
    HURT: 0

    total quadwords in shared programs: 4027 -> 4025 (-0.05%)
    quadwords in affected programs: 80 -> 78 (-2.50%)
    helped: 1
    HURT: 0

    Signed-off-by: Alyssa Rosenzweig <[email protected]>