Open Bug 1372102 Opened 8 years ago Updated 3 years ago

Investigate how zero duration causes playback stuttering

Categories

(Core :: Audio/Video: Playback, enhancement, P3)

People

(Reporter: jwwang, Unassigned)

References

Details

(Keywords: stale-bug)

Attachments

(1 file)

Bug 1371290 fixes the stuttering by giving each sample a reasonable duration. We need to figure out how duration affects playback in order to remove duration and depend only on the sample start time in our playback pipeline.
Priority: -- → P1
See Also: → 1314458, 1371290
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #1)
> Created attachment 8876635 [details]
> Bug 1372102 - set zero duration to repro playback stuttering.
>
> Review commit: https://reviewboard.mozilla.org/r/147978/diff/#index_header
> See other reviews: https://reviewboard.mozilla.org/r/147978/

The patch deliberately sets zero duration, which lets you reproduce the issue on all platforms.
OK, I see what's going on with zero duration. Suppose we have some frames in the video queue (all with zero duration): [0], [100], [200], [300], [400].

Render loop 1 (t=0): We send all frames to the image container and schedule the next loop at t=100.

Render loop 2 (t=100): [0] and [100] are removed from the queue according to their end times. See (1) for the code. So we send [200], [300], [400] to the image container. Depending on the vsync timing, [100] might not have been displayed to the user yet when we replace the current images. Therefore [100] is dropped and never displayed to the user. This is where the jitter comes from.

Also, |frame->IsSentToCompositor()| doesn't correctly report dropped frames. A frame that is sent to the image container and then replaced before reaching the user is not reported as a dropped frame.

I will add some tweaks to VideoSink to take zero duration into account.

(1) http://searchfox.org/mozilla-central/rev/d840ebd5858a61dbc1622487c1fab74ecf235e03/dom/media/mediasink/VideoSink.cpp#419
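[Editorial note: a minimal sketch of the pruning problem described above, using hypothetical types rather than the actual VideoSink code linked at (1). With zero duration, a frame's end time equals its start time, so the queue drops frames the compositor may not have shown yet:

#include <cstdint>
#include <deque>

// Hypothetical stand-in for VideoData; the real pruning logic lives in
// VideoSink.cpp around the line linked at (1).
struct Frame {
  int64_t mTime;     // start time (microseconds)
  int64_t mDuration; // 0 in the zero-duration repro
  int64_t GetEndTime() const { return mTime + mDuration; }
};

// Remove frames whose end time has passed the clock. With mDuration == 0,
// at clock t=100 both [0] and [100] satisfy GetEndTime() <= 100 and are
// pruned, even though [100] may never have reached the user -- the jitter
// described in this comment.
void PruneExpiredFrames(std::deque<Frame>& aQueue, int64_t aClockTime) {
  while (!aQueue.empty() && aQueue.front().GetEndTime() <= aClockTime) {
    aQueue.pop_front();
  }
}
]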
(In reply to walmartguy from bug 1314458 comment #146)
> I was using this video in 4k 60fps:
> https://www.youtube.com/watch?v=aqz-KE-bpKQ
> In the beginning, the camera is moving down gradually which makes stutter
> easy to spot.

Attaching the video link that reproduces the issue.
Well, after thinking more about it, there is no easy way to determine whether a frame is late and should be dropped without looking at its end time (calculated as mTime + mDuration). Using the start time of the next frame as the end time of the previous frame doesn't work when it comes to frame-skipping. This is preventing us from removing duration from our media playback pipeline.

Hi Jya,
Can you share your thoughts on how we can move forward?
Flags: needinfo?(jyavenard)
Isn't a frame only late if we have a new frame to play? If there's no new frame, it's not late (yet) and should be displayed.

:kentuckyfriedtakahe did raise the issue that if we always display the frames we have, when we have them, things look very weird, like badly broken A/V sync, as the frame being displayed may no longer have any link to what you're listening to.

If we assume that the duration of a frame is always defined as start_next_frame_time - start_previous_frame_time, then we don't need an explicit duration.
Flags: needinfo?(jyavenard)
Will think more about it...
Flags: needinfo?(jyavenard)
(In reply to Jean-Yves Avenard [:jya] from comment #6)
> If we assume that the duration of a frame is always defined as
> start_next_frame_time - start_previous_frame_time, then we don't need an
> explicit duration.

For 2 frames in the video queue, VideoSink doesn't know whether they are adjacent frames or whether some frames in between are missing due to skip-to-next-key-frame. |start_next_frame_time - start_previous_frame_time| could give a frame a wrong duration in that case, and the frame would be wrongly reported as presented instead of dropped.

My idea is that since the demuxer knows the start time of each video frame, it should be able to calculate the duration using |start_next_frame_time - start_previous_frame_time| before sending a demuxed frame to the decoder. That works even in the case of skip-to-next-key-frame. So we still need 'duration', but it will be calculated by the demuxer and we won't have to depend on the mercy of the decoder to provide a valid duration. This is another strong reason to remove the legacy reader, where we have no control over the demuxer.
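[Editorial note: a minimal sketch of this idea, with hypothetical names (the real sample type would be MediaRawData). The duration is filled in on the demuxer side, before the decoder ever sees the samples, so decoder-side frame skipping cannot distort it:

#include <cstdint>
#include <vector>

// Hypothetical demuxed sample.
struct Sample {
  int64_t mTime;     // start time in microseconds, known to the demuxer
  int64_t mDuration; // filled in below
};

// Assign each sample a duration from the start time of the sample after it.
// Both samples come straight from the demuxer, so this works even when the
// decoder later drops frames via skip-to-next-key-frame. The last sample is
// left alone here (see the later comments for how to estimate it).
void AssignDurations(std::vector<Sample>& aSamples) {
  for (size_t i = 0; i + 1 < aSamples.size(); ++i) {
    aSamples[i].mDuration = aSamples[i + 1].mTime - aSamples[i].mTime;
  }
}
]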
I'm not sure I understand how the example you give (skip-to-next-frame got triggered) justifies requiring a duration.

Obviously a frame is only to be displayed if the frame's start time is <= currentTime. Until the new frame is ready to be displayed, it's the previous one that is to be seen, no? So even in the case of skip_to_next_frame, the rule of duration = next_frame - current_frame holds.

And you always have the issue with the last video frame: you can't calculate how long that frame will last. It is to be displayed until either its duration is reached or there's no more audio to play.
Flags: needinfo?(jyavenard)
Suppose we have 3 frames in the encoded file: [100], [200], [300]. Now we have [100] and [300] in the video queue, and [200] was dropped due to skip-to-next-key-frame.

Now with currentTime=150, and both [100] and [300] not displayed yet, should we still display [100] or just drop it? I think we should drop it, because we are supposed to render [200] when currentTime is 150. However, since [200] is missing, there is no way for VideoSink to know how long a frame should last without the duration. Using |duration=next_frame-current_frame| will bring bug 1258870 back.

For the last frame, it is fine to assume the duration is infinity since there is no frame after it.
If skip-to-next-key-frame has kicked in, we must have audio. As such, we have a value for currentTime.

If currentTime is 150, then 100 is to be displayed. 300 can only be displayed once currentTime is >= 300. Why would you render 200 when currentTime is 150? It's too early to display that one.
Sorry. I meant currentTime=250, when it is too late to render [100] but too early for [300].
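[Editorial note: with the corrected numbers, a tiny worked example (hypothetical, just the arithmetic) shows why the two ways of computing [100]'s end time disagree:

#include <cstdint>
#include <cstdio>

int main() {
  const int64_t clock = 250; // currentTime from the corrected example

  // With the real duration (100us), [100] ends at 200 < 250: it is late,
  // so it should be dropped and reported as dropped.
  const int64_t endWithDuration = 100 + 100;
  std::printf("real duration: drop [100]? %s\n",
              endWithDuration <= clock ? "yes" : "no"); // prints "yes"

  // With duration = next_frame - current_frame and [200] missing from the
  // queue, [100] appears to end at 300 > 250: it is kept and wrongly
  // reported as presented -- the regression of bug 1258870.
  const int64_t endFromNextFrame = 300;
  std::printf("next-frame rule: drop [100]? %s\n",
              endFromNextFrame <= clock ? "yes" : "no"); // prints "no"
  return 0;
}
]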
The only way we'd have [100],[300] in the queue is if playback is linear. 100 should be displayed (as we have to display something) since it's the first frame anyway... If 100 isn't the first frame, we have broken A/V sync anyway, so I'm not sure displaying 100 would be a big deal.
Yeah, the A/V sync is broken anyway whether you display [100] or not. However, the point is that we want to report [100] as dropped instead of presented, so the player can detect frame dropping and switch to a lower resolution when the decoder can't keep up. Using |duration=next_frame-current_frame| will give a wrong duration to [100] and won't report dropped frames correctly.
At this stage I would argue that no frames dropped by the compositor are reported anyway (bug 1245400).
I am still not convinced we should report dropped frames based on whether they are rendered by the compositor or not. This is what we used to have in bug 1258870:

Render loop 1: With currentTime=250 and [100] in the queue, [100] is sent to the compositor.
Render loop 2: With currentTime=350 and [100],[200] in the queue, [100] is removed from the queue and [200] is sent to the compositor.
Render loop 3: With currentTime=450 and [200],[300] in the queue, [200] is removed from the queue and [300] is sent to the compositor.

Even though we send every frame to the compositor without any dropped frames, A/V sync is broken. The fix for bug 1258870 ensures those late frames are reported as dropped so the player has a chance to switch to a lower resolution for a better playback experience. Therefore we should report dropped frames according to their end time instead of whether they are sent to or rendered by the compositor.

Hi Chris,
Can you share your opinions here since you fixed bug 1258870?
Flags: needinfo?(cpearce)
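[Editorial note: a minimal sketch of the end-time reporting rule argued for above; the types are hypothetical (FrameStats stands in for the real FrameStatistics). Late frames are counted as dropped by their end time, regardless of whether they were sent to the compositor:

#include <cstdint>

struct Frame {
  int64_t mTime;
  int64_t mDuration;
  int64_t GetEndTime() const { return mTime + mDuration; }
};

// Hypothetical stand-in for FrameStatistics.
struct FrameStats {
  uint32_t mDropped = 0;
  uint32_t mPresented = 0;
};

// A frame whose end time has already passed the clock when we get to it was
// late, whether or not it reached the compositor. This is what lets the
// player detect decoder lag and switch to a lower resolution.
void AccountFrame(const Frame& aFrame, int64_t aClockTime,
                  FrameStats& aStats) {
  if (aFrame.GetEndTime() <= aClockTime) {
    ++aStats.mDropped;
  } else {
    ++aStats.mPresented;
  }
}
]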
If the frames were dropped because they were late, they definitely should be reported, as that indicates the machine is unable to keep up...

> Therefore we should report dropped frames according to their end time
> instead of whether they are sent to or rendered by the compositor.

But I'd like to get rid of that end time; it's causing pain with webrtc (which has no concept of end time) and even webm.
Here is my proposal:
1. Preload some video frames before starting playback. This is already done by MDSM.
2. When VideoSink starts playback, peek all frames in the queue to calculate an average duration.
3. Use the average duration to determine the end of the last frame in the queue.

The end time of each frame is calculated either by:
1. current_end_time = next_start_time, or
2. current_end_time = current_start_time + average_duration.

Then we can remove mDuration and GetEndTime() from VideoData.
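[Editorial note: a minimal sketch of steps 2 and 3 of this proposal, assuming frames carry only a start time; all names are hypothetical:

#include <cstdint>
#include <vector>

struct Frame {
  int64_t mTime; // start time only; no mDuration anymore
};

// Step 2: peek the preloaded queue and derive an average frame duration.
// (last - first) / (n - 1) is the mean of the consecutive start-time gaps.
int64_t AverageDuration(const std::vector<Frame>& aQueue) {
  if (aQueue.size() < 2) {
    return 0; // caller falls back to some default
  }
  return (aQueue.back().mTime - aQueue.front().mTime) /
         static_cast<int64_t>(aQueue.size() - 1);
}

// End time of frame i: the next frame's start time, or, for the last frame
// in the queue, its start time plus the average duration (step 3).
int64_t EndTime(const std::vector<Frame>& aQueue, size_t aIndex,
                int64_t aAverageDuration) {
  if (aIndex + 1 < aQueue.size()) {
    return aQueue[aIndex + 1].mTime;
  }
  return aQueue[aIndex].mTime + aAverageDuration;
}
]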
That sounds like a good plan. This is what the WebMDemuxer does already: it always demuxes one frame ahead to determine the duration of the previous frame. When it gets to the last frame, it checks the average frame duration and uses that instead. It's not 100% accurate, but it's good enough.
(In reply to JW Wang [:jwwang] [:jw_wang] from comment #16)
> Therefore we should report dropped frames according to their end time
> instead of whether they are sent to or rendered by the compositor.
>
> Hi Chris,
> Can you share your opinions here since you fixed bug 1258870?

I'm not keen on regressing bug 1258870. Could we have the MediaFormatReader tag a frame which occurs after a skip-to-next-keyframe as a discontinuity, and special-case that in the dropping logic so we drop up to that frame whenever it's in the queue? Maybe if we just gave discontinuities a timestamp of 0 that would be sufficient?
Flags: needinfo?(cpearce)
(In reply to Chris Pearce (:cpearce) from comment #20)
> Maybe if we just gave discontinuities a timestamp of 0 that would be
> sufficient?

This is a bit vague; what I mean is: drop all frames with start time less than the clock time, provided the start time of the *following* frame is also less than or equal to the clock time.
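[Editorial note: this rule needs no duration at all; a minimal sketch (hypothetical names) of the dropping loop it implies:

#include <cstdint>
#include <deque>

struct Frame {
  int64_t mTime; // start time only; the rule never consults a duration
};

// Drop a frame whose start time is behind the clock, but only if the
// *following* frame's start time is also at or behind the clock --
// otherwise the late-ish frame is still the best one we have to display.
void DropLateFrames(std::deque<Frame>& aQueue, int64_t aClockTime) {
  while (aQueue.size() >= 2 &&
         aQueue[0].mTime < aClockTime &&
         aQueue[1].mTime <= aClockTime) {
    aQueue.pop_front();
  }
}
]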
I don't think the issue is related to skip-to-next-keyframe in any way. In the scenario described above, skip-to-next-keyframe was never triggered.
(In reply to Jean-Yves Avenard [:jya] from comment #22)
> I don't think the issue is related to skip-to-next-keyframe in any way.
> In the scenario described above, skip-to-next-keyframe was never triggered.

IIUC, you are talking about the example in comment 16, where the decoder doesn't fall behind playback enough to trigger skip-to-next-keyframe. However, the A/V sync issue is still noticeable to the user.
(In reply to Jean-Yves Avenard [:jya] from comment #22)
> I don't think the issue is related to skip-to-next-keyframe in any way.
> In the scenario described above, skip-to-next-keyframe was never triggered.

Comment #10? That's the scenario that bug 1258870 fixed.
Depending on how incapable the computer is, it could be the example in comment 10 or the one in comment 16 (where the decoder isn't slow enough to trigger skip-to-next-keyframe).
A/V sync would typically still be good; it just causes visible jitter.
(In reply to Chris Pearce (:cpearce) from comment #24)
> (In reply to Jean-Yves Avenard [:jya] from comment #22)
> > I don't think the issue is related to skip-to-next-keyframe in any way.
> > In the scenario described above, skip-to-next-keyframe was never triggered.
>
> Comment #10? That's the scenario that bug 1258870 fixed.

I think there are too many misunderstandings at hand here. I personally never referred to "stnk" when describing the problem (the ability to remove the concept of frame duration).

We have a decoder that returned 100, 300; we only have a start time. The video queue contains 100,200.

100 arrived a tad too late, 300 arrived early, currentTime is 150.

Which frame do we display now?
(In reply to Jean-Yves Avenard [:jya] from comment #27)
> I think there are too many misunderstandings at hand here. I personally
> never referred to "stnk" when describing the problem (the ability to
> remove the concept of frame duration).

Sure, but STNKF is an important case to consider, as it exacerbates any problems here.

> We have a decoder that returned 100, 300; we only have a start time. The
> video queue contains 100,200.
>
> 100 arrived a tad too late, 300 arrived early, currentTime is 150.
>
> Which frame do we display now?

I'm confused by your example here now; it's not clear whether the queue contains 100,200 or 100,300. In either case, one would think we should display the frame covering 150. But if we're struggling to keep up, chances are that a lot of the time we have only 1 frame in the queue, as I discovered in https://bugzilla.mozilla.org/show_bug.cgi?id=1258870#c6. And so if we don't drop this frame, we'll end up always showing frames which are late, and break A/V sync.

Also, re-reading https://bugzilla.mozilla.org/show_bug.cgi?id=1258870#c6 made me realise that if we often have only 1 frame in the queue when we're struggling, we're unlikely to be able to infer a duration. At least at this level.
This is a P1 bug without an assignee. P1 are bugs which are being worked on for the current release cycle/iteration/sprint. If the bug is not assigned by Monday, 28 August, the bug's priority will be reset to '--'.
Keywords: stale-bug
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
Moving to p3 because no activity for at least 1 year(s). See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Severity: normal → S3