path: root/src/gallium/drivers
Commit message (author, age, files changed, lines -removed/+added)
* i915: support NULL-resources (Erik Faye-Lund, 2019-04-29, 1 file, -2/+5)
    It's legal for a buffer-object to have a NULL-resource, but let's just skip over it, as there's nothing to do.

    Signed-off-by: Erik Faye-Lund <[email protected]>
* lima/ppir: fix pointer referenced after a free (Patrick Lerda, 2019-04-29, 1 file, -1/+2)
    Issue detected by valgrind.

    Fixes: 92d7ca4b1cd ("gallium: add lima driver")
    Signed-off-by: Patrick Lerda <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* lima/ppir: Add gl_FragCoord handling (Andreas Baierl, 2019-04-29, 7 files, -2/+33)
    Treat gl_FragCoord variable as a system value and lower the w component with a nir pass. Add the necessary bits for correct codegen.

    Signed-off-by: Andreas Baierl <[email protected]>
    Reviewed-by: Qiang Yu <[email protected]>
* panfrost: Workaround -bshadow regression (Alyssa Rosenzweig, 2019-04-28, 1 file, -1/+8)
    I have *no* idea what's happening here, but let's not regress an app that used to work in the meantime while we're figuring it out.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Safety check immediate precision degradations (Alyssa Rosenzweig, 2019-04-28, 1 file, -1/+14)
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost: Use fp32 (not fp16) varyings (Alyssa Rosenzweig, 2019-04-28, 1 file, -4/+4)
    In a perfect world, we'd use fp16 varyings for mediump and fp32 for highp, allowing us to get a performance win without sacrificing conformance. Unfortunately, we're not there (yet), so it's better we assume always fp32 than always fp16 to avoid artefacts / breaking a lot of deqp.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: imov workaround (Alyssa Rosenzweig, 2019-04-28, 1 file, -6/+27)
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix tex propogation (Alyssa Rosenzweig, 2019-04-28, 1 file, -7/+22)
    Unbreaks mpv.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Fix regressions in -bjellyfish (Alyssa Rosenzweig, 2019-04-28, 1 file, -3/+7)
    Two fixes here. One is that we tried to copyprop non-strictly-SSA values, which was bound to blow up in our face. The other was peeling back the imov workaround; it turns out we still need that. More research is needed, but let's not regress real apps.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Only copyprop without an outmod (Alyssa Rosenzweig, 2019-04-28, 1 file, -0/+1)
    With an outmod, we would need to propagate that through, which is for future work.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* Revert "panfrost/midgard: Extend copy propagation pass" (Alyssa Rosenzweig, 2019-04-28, 1 file, -48/+8)
    Fixes: commit b53b4573c3f0571253672e44ce7d6310d9f987bf.

    Optimization gone wrong. In the future, we should try this again (it's a net win if implemented right), but at the moment this just regresses.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* iris: Fix zeroing of transform feedback offsets in strange cases. (Kenneth Graunke, 2019-04-27, 2 files, -4/+18)
    Some of the dEQP.functional.transform_feedback tests end up doing the following sequence of operations:

    1. BeginTransformFeedback
    2. PauseTransformFeedback
    3. Draw
    4. ResumeTransformFeedback

    At step 1, we'd pack 3DSTATE_SO_BUFFER commands saying to zero the SO_WRITE_OFFSET registers. At step 2, we disable streamout, so step 3 doesn't bother emitting those commands. Then, step 4 re-packs new 3DSTATE_SO_BUFFER commands with offset = 0xFFFFFFFF, saying to continue appending at the existing offset. This loads the value from the BO as the offsets - but we never actually zeroed it.

    So, just maintain a flag saying "we actually emitted the commands", and stomp offset back to zero until we emit some.
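The flag described in the iris commit above can be sketched roughly as follows. This is a minimal illustration, not the iris code: `struct so_target`, `so_pack_offset()` and `so_mark_emitted()` are hypothetical names, and only the 0xFFFFFFFF "append" value comes from the commit message.

```c
#include <stdbool.h>
#include <stdint.h>

/* "Load the write offset from the BO and keep appending" (from the commit text). */
#define SO_OFFSET_APPEND 0xFFFFFFFFu

/* Hypothetical stand-in for one stream-output target; not the iris struct. */
struct so_target {
   bool zeroed; /* true once a zeroing packet was actually emitted */
};

/* Offset to pack when the stream-output packet is (re)built. */
static uint32_t
so_pack_offset(const struct so_target *tgt, bool append)
{
   /* Appending is only safe once the zeroing packet really went out;
    * until then, keep stomping the offset back to zero. */
   if (append && tgt->zeroed)
      return SO_OFFSET_APPEND;
   return 0;
}

/* Call when the packed commands are emitted as part of a draw. */
static void
so_mark_emitted(struct so_target *tgt)
{
   tgt->zeroed = true;
}
```

With this shape, the Begin / Pause / Draw / Resume sequence packs offset 0 again at Resume, because nothing was emitted while streamout was disabled.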
* vc4: Fall back to renderonly if the vc4 driver doesn't have v3d. (Eric Anholt, 2019-04-26, 1 file, -1/+0)
    I have a platform with vc4 display but V3D 4.x. We can fall back on kmsro's probing to bring up the v3d gallium driver.

    Acked-by: Rob Clark <[email protected]>
* radeonsi: don't ignore PIPE_FLUSH_ASYNC (Marek Olšák, 2019-04-26, 1 file, -1/+1)
    Reviewed-by: Michel Dänzer <[email protected]>
* Revert "v3d: Disable PIPE_CAP_BLIT_BASED_TEXTURE_TRANSFER." (Eric Anholt, 2019-04-26, 1 file, -1/+9)
    This reverts commit ccce9409470c1053c40c822d759b9bd417062bc0, leaving a note as to why we had to (corruption in chromium, breaking some GLES3.1 tests).
* v3d: Don't try to update the shadow texture for separate stencil. (Eric Anholt, 2019-04-26, 1 file, -1/+2)
    There are two cases where v3d's sampler view's resource doesn't match the base's: shadow textures for sampling from raster, and pointing at the separate depth texture for z32f_s8x24. We only want to update shadow for the first case.

    Fixes dEQP-GLES31.functional.stencil_texturing.render.depth32f_stencil8_draw when run after the previous testcase.
* vc4: Use _mesa_hash_table_remove_key() where appropriate. (Eric Anholt, 2019-04-26, 1 file, -12/+9)
* v3d: Use _mesa_hash_table_remove_key() where appropriate. (Eric Anholt, 2019-04-26, 1 file, -13/+8)
* v3d: Apply the GFXH-930 workaround to the case where the VS loads attrs. (Eric Anholt, 2019-04-26, 1 file, -0/+15)
    We were emitting a dummy load for when the VS doesn't load any attributes, but we also need to emit a dummy load for when the render VS loads attributes but the binner VS doesn't.

    Fixes simulator assertion failures and GPU hangs on KHR-GLES31.core.texture_gather.*
* v3d: Fill in the ignored segment size fields to appease new simulator. (Eric Anholt, 2019-04-26, 1 file, -2/+4)
    We are assured that the input segment size field is ignored for !separate_segs mode, and now the simulator wants an in-range value set regardless of whether it's functionally ignored or not.
* swr/rast: enforce use of tile offsets (Alok Hota, 2019-04-26, 4 files, -0/+5)
    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: AVX512 support compiled in by default (Alok Hota, 2019-04-26, 12 files, -560/+333)
    - Emulation of AVX512 built into SIMDLIB
    - Remove associated macros
    - Remove knobs controlling AVX512 and let emulation handle it
    - Refactor variable names for SIMD16

    Reviewed-by: Bruce Cherniak <[email protected]>
* swr/rast: Remove deprecated 4x2 backend code (Alok Hota, 2019-04-26, 8 files, -834/+19)
    - Use 8x2 tiling by default
    - Remove associated macros
    - Use SIMDLIB emulation for SIMD16 on SIMD8 hardware
    - Remove code rot in Load/StoreTile

    Reviewed-by: Bruce Cherniak <[email protected]>
* llvmpipe: Always return some fence in flush (v2) (Tomasz Figa, 2019-04-26, 1 file, -0/+2)
    If there is no last fence, due to no rendering happening yet, just create a new signaled fence and return it, to match the expectations of the EGL sync fence API.

    Fixes random "Could not create sync fence 0x3003" assertion failures from Skia on Android, coming from the following code:

    https://android.googlesource.com/platform/frameworks/base/+/master/libs/hwui/pipeline/skia/SkiaOpenGLPipeline.cpp#427

    Reproducible especially with thread count >= 4.

    One could make the driver always keep the reference to the last fence, but:

    - the driver seems to explicitly destroy the fence whenever a rendering pass completes and changing that would require a significant functional change to the code. (Specifically, in lp_scene_end_rasterization().)
    - it still wouldn't solve the problem of an EGL sync fence being created and waited on without any rendering happening at all, which is also likely to happen with Android code pointed to in the commit.

    Therefore, the simple approach of always creating a fence is taken, similarly to other drivers, such as radeonsi.

    Tested with piglit llvmpipe suite with no regressions and following tests fixed:

    egl_khr_fence_sync
      conformance
        eglclientwaitsynckhr_flag_sync_flush
        eglclientwaitsynckhr_nonzero_timeout
        eglclientwaitsynckhr_zero_timeout
        eglcreatesynckhr_default_attributes
        eglgetsyncattribkhr_invalid_attrib
        eglgetsyncattribkhr_sync_status

    v2:
    - remove the useless lp_fence_reference() dance (Nicolai),
    - explain why creating the dummy fence is the right approach.

    Signed-off-by: Tomasz Figa <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
* llvmpipe: correctly handle waiting in llvmpipe_fence_finish (Emil Velikov, 2019-04-26, 1 file, -1/+6)
    Currently if the timeout differs from 0, we'll end up with infinite wait... even if the user is perfectly clear they don't want that.

    Use the new lp_fence_timedwait() helper guarding both waits in an !lp_fence_signalled block like the rest of llvmpipe.

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]>
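The wait selection described in the llvmpipe_fence_finish commit above might look something like the sketch below. It only models the decision, not the waiting itself; `pick_wait_mode()`, `enum wait_mode` and `TIMEOUT_INFINITE` are hypothetical names, and treating a zero timeout as a pure poll is an assumption rather than something the commit message states.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed "wait forever" sentinel; the real driver uses its own constant. */
#define TIMEOUT_INFINITE UINT64_MAX

enum wait_mode {
   WAIT_NONE,    /* already signalled, or caller only wants to poll */
   WAIT_TIMED,   /* bounded wait, as a timed-wait helper would do */
   WAIT_FOREVER, /* unbounded wait */
};

/* Decide how to wait; both real waits are guarded by "not yet signalled". */
static enum wait_mode
pick_wait_mode(bool signalled, uint64_t timeout_ns)
{
   if (signalled || timeout_ns == 0)
      return WAIT_NONE;
   if (timeout_ns == TIMEOUT_INFINITE)
      return WAIT_FOREVER;
   return WAIT_TIMED;
}
```

The behaviour the commit fixes corresponds to every non-zero timeout landing in the unbounded branch.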
* llvmpipe: add lp_fence_timedwait() helper (Emil Velikov, 2019-04-26, 2 files, -0/+32)
    The function is analogous to lp_fence_wait() while taking a timeout (ns) parameter, as needed for EGL fence/sync.

    v2:
    - use absolute UTC time, as per spec (Gustaw)
    - bail out on cnd_timedwait() failure (Gustaw)

    v3:
    - check count/rank under mutex (Gustaw)

    Signed-off-by: Emil Velikov <[email protected]>
    Reviewed-by: Roland Scheidegger <[email protected]> (v1)
    Reviewed-by: Gustaw Smolarczyk <[email protected]>
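The "absolute UTC time" note in v2 refers to C11 cnd_timedwait() taking a TIME_UTC-based deadline rather than a relative duration. A minimal sketch of converting a relative nanosecond timeout into such a deadline is shown below; `deadline_from()` is a hypothetical helper, not the function added by the commit.

```c
#include <stdint.h>
#include <time.h>

#define NSEC_PER_SEC 1000000000ull

/* Turn "now + timeout_ns" into an absolute timespec, normalising tv_nsec
 * so that it stays below one second. */
static struct timespec
deadline_from(struct timespec now, uint64_t timeout_ns)
{
   struct timespec abs = now;

   abs.tv_sec += (time_t)(timeout_ns / NSEC_PER_SEC);
   abs.tv_nsec += (long)(timeout_ns % NSEC_PER_SEC);
   if (abs.tv_nsec >= (long)NSEC_PER_SEC) {
      abs.tv_sec += 1;
      abs.tv_nsec -= (long)NSEC_PER_SEC;
   }
   return abs;
}
```

A caller would typically obtain `now` with timespec_get(&now, TIME_UTC), pass the result to cnd_timedwait() while holding the mutex, and stop waiting when that call does not return thrd_success.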
* iris: Silence unused function warning (Kenneth Graunke, 2019-04-25, 1 file, -1/+1)
* freedreno/a6xx: sample-shading support (Rob Clark, 2019-04-25, 4 files, -21/+67)
    Enables:
      OES_sample_shading
      OES_sample_variables
      OES_shader_multisample_interpolation

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: wire up core sample-shading support (Rob Clark, 2019-04-25, 2 files, -0/+11)
    Signed-off-by: Rob Clark <[email protected]>
* freedreno/a6xx: add VALIDREG/CONDREG helper macros (Rob Clark, 2019-04-25, 1 file, -7/+8)
    There are a few places where we check if a shader stage input reg is used/valid (i.e. not r63.x), and there are about to be a bunch more. So add some helper macros for less open-coding.

    Signed-off-by: Rob Clark <[email protected]>
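Helper macros of the kind this commit describes could look like the sketch below. The register encoding (`REGID`) and the macro bodies are assumptions made for illustration; only the names VALIDREG/CONDREG and the "r63.x means unused" convention come from the commit message.

```c
#include <stdint.h>

/* Assumed encoding: register number in the upper bits, component in the low two. */
#define REGID(num, comp) ((uint32_t)(((num) << 2) | (comp)))

/* r63.x marks "this input is not used by the shader stage". */
#define INVALID_REG REGID(63, 0)

/* True when the register refers to a real input. */
#define VALIDREG(r) ((r) != INVALID_REG)

/* Yield `val` (e.g. an enable bit) only when the register is valid, else 0. */
#define CONDREG(r, val) (VALIDREG(r) ? (val) : 0u)
```

This replaces repeated open-coded comparisons against r63.x with a single named check when building register state.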
* freedreno: update generated headers (Rob Clark, 2019-04-25, 2 files, -7/+7)
    Pull in updates for sample shading.

    Signed-off-by: Rob Clark <[email protected]>
* compiler: rename SYSTEM_VALUE_VARYING_COORD (Rob Clark, 2019-04-25, 4 files, -4/+4)
    And add corresponding enums for different sorts of varying interpolation.

    Signed-off-by: Rob Clark <[email protected]>
* freedreno: add robustness support (Rob Clark, 2019-04-25, 4 files, -0/+57)
    Signed-off-by: Rob Clark <[email protected]>
* panfrost/midgard: Add new bitwise ops (Alyssa Rosenzweig, 2019-04-25, 2 files, -6/+24)
    These fused NOT-ops could maybe help somehow...?

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Identify inand (Alyssa Rosenzweig, 2019-04-25, 3 files, -3/+7)
    This was previously thought to be inot, but it's actually a bit more general than that! :)

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
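One way to read "more general than inot" is that the op is a bitwise NAND, with NOT as the special case where both sources are the same. The sketch below assumes exactly that semantics, ~(a & b); the commit message itself does not spell it out, and the function names are only illustrative.

```c
#include <stdint.h>

/* Assumed semantics of the op: bitwise NAND of the two sources. */
static uint32_t
inand(uint32_t a, uint32_t b)
{
   return ~(a & b);
}

/* Under that assumption, NOT falls out by feeding the same source twice. */
static uint32_t
inot(uint32_t a)
{
   return inand(a, a);
}
```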
* panfrost/midgard: Copy prop for texture registers (Alyssa Rosenzweig, 2019-04-25, 1 file, -2/+35)
    We'll want to unify this with main copy prop (and extend to varyings), but that'll take more care to handle some special cases, so leave it as a stub pass for now.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Optimize csel involving 0 (Alyssa Rosenzweig, 2019-04-25, 2 files, -15/+30)
    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Extend copy propagation pass (Alyssa Rosenzweig, 2019-04-25, 1 file, -8/+48)
    This extends copy propagation to respect output modifiers for ALU instructions, as well as potentially fixing some bugs related to looping (all dEQP loop tests pass).

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
* panfrost/midgard: Reduce fmax(a, 0.0) to fmov.pos (Alyssa Rosenzweig, 2019-04-25, 1 file, -3/+33)
    This will allow us to copyprop away the move and eliminate the instruction entirely.

    Signed-off-by: Alyssa Rosenzweig <[email protected]>
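The rewrite relies on a move with a "clamp to non-negative" output modifier producing the same value as a max against zero. A scalar model of that equivalence is sketched below; it assumes the .pos modifier clamps negative results to 0.0, uses made-up function names, and does not model NaN handling.

```c
/* Reference: the original instruction, max(a, 0.0). */
static float
fmax_zero(float a)
{
   return a > 0.0f ? a : 0.0f;
}

/* Assumed behaviour of the .pos output modifier: negatives become 0.0. */
static float
outmod_pos(float x)
{
   return x < 0.0f ? 0.0f : x;
}

/* The replacement: a plain move whose result goes through the modifier. */
static float
fmov_pos(float a)
{
   return outmod_pos(a);
}
```

Once the max is expressed as a move plus modifier, a copy propagation pass that understands output modifiers can fold the move into its producer.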
* ac/nir: Add support for planes. (Bas Nieuwenhuizen, 2019-04-25, 1 file, -0/+7)
    Reviewed-by: Samuel Pitoiset <[email protected]>
* iris: make the TFB result visible to others (Andrii Simiklit, 2019-04-25, 1 file, -10/+15)
    OpenGL 4.6 Spec: "5.3.3 Rules ....... Note: “Updates” via rendering or transform feedback are treated consistently with updates via GL commands. Once EndTransformFeedback has been issued, any subsequent command in the same context that uses the results of the transform feedback operation will see the results."

    v2: removed a wrong comment ( Kenneth Graunke <[email protected]> )

    v3:
    - flush+dirty depends on buffers usage history
    - removed an old hack ( Kenneth Graunke <[email protected]> )

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110404
    Signed-off-by: Andrii Simiklit <[email protected]>
    Reviewed-by: Kenneth Graunke <[email protected]>
* iris: Some tidying for preemption support (Kenneth Graunke, 2019-04-25, 4 files, -98/+102)
    Just enable it during init_render_context on Gen10+, and move the Gen9 state tracking into iris_genx_state so it only exists on Gen9.

    Reviewed-by: Mike Blumenkrantz <[email protected]>
    Reviewed-by: Rafael Antognolli <[email protected]>
* radeonsi: remove dirty slot masks from scissor and viewport states (Marek Olšák, 2019-04-25, 6 files, -93/+40)
    All registers in the array need to be updated if any of them is changed.

    Only apps writing gl_ViewportIndex were affected by this bug.
* radeonsi/gfx9: rework the gfx9 scissor bug workaround (v2) (Marek Olšák, 2019-04-25, 8 files, -48/+68)
    Needed to track context rolls caused by streamout and ACQUIRE_MEM. ACQUIRE_MEM can occur outside of draw calls.

    Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=110355

    v2: squashed patches and done more rework

    Cc: 19.0 <[email protected]>
* radeonsi/gfx9: set that window_rectangles always roll the context (Marek Olšák, 2019-04-25, 1 file, -1/+2)
    Cc: 19.0 <[email protected]>
* radeonsi: add radeonsi_sync_compile option (Nicolai Hähnle, 2019-04-25, 2 files, -3/+11)
    Force the driver thread to sync immediately with a compiler thread (but compilation still happens in a separate thread). This can be useful to simplify debugging compiler issues.

    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add radeonsi_aux_debug option for aux context debug dumps (Nicolai Hähnle, 2019-04-25, 3 files, -1/+33)
    Enabling this option will create ddebug-style dumps for the aux context, except that instead of intercepting the pipe_context layer we just dump the IB contents on flush.

    Reviewed-by: Marek Olšák <[email protected]>
* radeonsi: add si_debug_options for convenient adding/removing of options (Nicolai Hähnle, 2019-04-25, 6 files, -16/+37)
    Move the definition of radeonsi_clear_db_cache_before_clear there, as well as radeonsi_enable_nir.

    This removes the AMD_DEBUG=nir option.

    We currently still have two places for options: the driconf machinery and AMD_DEBUG/R600_DEBUG. If we are to have a single place for options, then the driconf machinery should be preferred since it's more flexible.

    The only downside of the driconf machinery was that adding new options was quite inconvenient. With this change, a simple boolean option can be added with a single line of code, same as for AMD_DEBUG.

    One technical limitation of this particular implementation is that while almost all driconf features are available, the translation machinery doesn't pick up the description strings for options added in si_debug_options. In practice, translations haven't been provided anyway, and this is intended for developer options, so I'm not too worried. It could always be added later if anybody really cares.

    v2:
    - use bool instead of uint8_t for options
    - si_debug_options.inc -> si_debug_options.h

    Reviewed-by: Marek Olšák <[email protected]>
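"A single line of code per option" is commonly achieved with an X-macro list that is expanded several times. The sketch below shows that general technique under stated assumptions: the `DEBUG_OPTIONS` list, the struct, the helper functions and the default values are all hypothetical, and the real code hooks into driconf rather than a string lookup. Only the option names are taken from this log.

```c
#include <stdbool.h>
#include <string.h>

/* One line per option: name and (assumed) default value. */
#define DEBUG_OPTIONS(OPT_BOOL)                 \
   OPT_BOOL(clear_db_cache_before_clear, false) \
   OPT_BOOL(enable_nir, false)                  \
   OPT_BOOL(sync_compile, false)

/* Expansion 1: one bool field per option. */
struct debug_options {
#define DECLARE_FIELD(name, dflt) bool name;
   DEBUG_OPTIONS(DECLARE_FIELD)
#undef DECLARE_FIELD
};

/* Expansion 2: load the defaults. */
static void
debug_options_init(struct debug_options *opts)
{
#define SET_DEFAULT(name, dflt) opts->name = (dflt);
   DEBUG_OPTIONS(SET_DEFAULT)
#undef SET_DEFAULT
}

/* Expansion 3: set an option by its "radeonsi_"-prefixed key.
 * Returns false when the key is unknown. */
static bool
debug_options_set(struct debug_options *opts, const char *key, bool value)
{
#define MATCH_KEY(name, dflt)                 \
   if (strcmp(key, "radeonsi_" #name) == 0) { \
      opts->name = value;                     \
      return true;                            \
   }
   DEBUG_OPTIONS(MATCH_KEY)
#undef MATCH_KEY
   return false;
}
```

Adding another boolean option then means appending one `OPT_BOOL(...)` line to the list; the field, its default and its lookup follow automatically.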
* radeonsi: add BOs after need_cs_space (Marek Olšák, 2019-04-24, 2 files, -6/+6)
    need_cs_space may clear the buffer list.

    Fixes: 951d60f8cdc88 "radeonsi: delay adding BOs at the beginning of IBs until the first draw"
    Reviewed-by: Bas Nieuwenhuizen <[email protected]>
* v3d: Disable SSBOs and atomic counters on vertex shaders. (Eric Anholt, 2019-04-24, 1 file, -0/+3)
    The CTS fails on dEQP-GLES31.functional.shaders.opaque_type_indexing.atomic_counter.*vertex when they are enabled, due to the VS being run for both bin and render. I think this behavior is expected to be valid, but I can't find text in atomic counters or SSBO specs saying so (the closest I found was in shader_image_load_store). Just disable it for now, since the closed source driver doesn't expose vertex atomic counters/SSBOs either.