libav: fix mp4 audio sync problem

Initial CTS (composition offset) was essentially getting added twice to
the computed PTS.

Fixes https://github.com/HandBrake/HandBrake/issues/568

Here's a description of how mp4 timestamps work and what is going wrong,
for the curious.

Terminology:
    pts = presentation timestamp, when a frame is displayed
    dts = decode timestamp, when a frame is decoded
    cts = composition offset, pts - dts
    empty edit = defines the pts of the first frame in an mp4 track

mp4 timestamps are computed from 3 primary values that are in the
mp4 stream:
    An "empty edit" in the track edit list
    per frame duration
    per frame cts

Here's where things get messy. How do you compute pts(N) and dts(N) for
some frame N from only the above 3 values in the mp4 file?

    empty edit == pts(0) and is read from the mp4 file (EDTS table)
    duration(N) is read from the mp4 file (STTS table)
    cts(N) is read from the mp4 file (CTTS table)

We know cts(0) = pts(0) - dts(0) by definition of cts.
And cts(0) and pts(0) are known since they can be read from the mp4 file.

This is the step libav gets wrong!
Therefore we can compute dts(0) = pts(0) - cts(0).
libav computes dts(0) = pts(0), which shifts all frames by cts(0).

After that, dts(N) = dts(0) + duration(0) + ... + duration(N-1)
And finally, pts(N) = dts(N) + cts(N)
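The derivation above can be sketched in a few lines of C. This is only an
illustration of the formulas in this message, not libav's code: the Track
struct and the track_dts, track_pts and track_pts_uncorrected names are
hypothetical, and the per-frame arrays stand in for values already read
from the STTS and CTTS tables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-track inputs, named after the terms in the commit
 * message rather than after any libav structure. */
typedef struct {
    int64_t        empty_edit; /* pts(0), from the track edit list */
    const int64_t *duration;   /* duration(i), from the STTS table */
    const int64_t *cts;        /* cts(i), from the CTTS table */
    size_t         count;      /* number of frames */
} Track;

/* dts(n) = dts(0) + duration(0) + ... + duration(n-1),
 * with dts(0) = pts(0) - cts(0). */
static int64_t track_dts(const Track *t, size_t n)
{
    assert(n < t->count);
    int64_t dts = t->empty_edit - t->cts[0];
    for (size_t i = 0; i < n; i++)
        dts += t->duration[i];
    return dts;
}

/* pts(n) = dts(n) + cts(n). */
static int64_t track_pts(const Track *t, size_t n)
{
    return track_dts(t, n) + t->cts[n];
}

/* The behavior described as wrong above: dts(0) is taken to be pts(0),
 * so every pts ends up later by cts(0). */
static int64_t track_pts_uncorrected(const Track *t, size_t n)
{
    assert(n < t->count);
    int64_t dts = t->empty_edit;
    for (size_t i = 0; i < n; i++)
        dts += t->duration[i];
    return dts + t->cts[n];
}
```

For example, with an empty edit of 3000, durations {1000, 1000, 1000} and
cts values {2000, 5000, 2000}, track_dts gives 1000 for frame 0 and
track_pts gives 3000, matching the empty edit. track_pts_uncorrected gives
5000 for the same frame, which is late by exactly cts(0).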
author: John Stebbins <[email protected]> 2017-02-10 11:26:59 -0700
committer: John Stebbins <[email protected]> 2017-02-10 11:26:59 -0700
commit: 88343d5a0ee9969071bb8a263dab0e0a66c4c8ff (patch)
tree: 8bef48518975ec3f44d38a9e5c4b96524db5854b /contrib
parent: e232f46b7dd9614e73c92765dbd506cb0c3f9936 (diff)
1 files changed, 14 insertions, 0 deletions
diff --git a/contrib/ffmpeg/A06-edit-list-offset.patch b/contrib/ffmpeg/A06-edit-list-offset.patch
new file mode 100644
index 000000000..8c1d0a668
--- /dev/null
+++ b/contrib/ffmpeg/A06-edit-list-offset.patch
@@ -0,0 +1,14 @@
+diff --git a/libavformat/mov.c b/libavformat/mov.c
+index 2810960..71c37c2 100644
+--- a/libavformat/mov.c
++++ b/libavformat/mov.c
+@@ -2321,6 +2321,9 @@ static void mov_build_index(MOVContext *mov, AVStream *st)
+         if (sc->time_offset < 0)
+             sc->time_offset = av_rescale(sc->time_offset, sc->time_scale, mov->time_scale);
+         current_dts = -sc->time_offset;
++        if (sc->ctts_data && sc->ctts_count) {
++            current_dts -= sc->ctts_data[0].duration;
++        }
+     }
+ 
+     /* only use old uncompressed audio chunk demuxing when stts specifies it */