Improve the performance of selecting cues in order to avoid lagging when dealing with too many cues.
Categories
(Core :: Audio/Video: Playback, defect, P3)
People
(Reporter: ke5trel, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: perf:responsiveness)
STR:
- Visit https://www.newsmax.com on Ubuntu 23.04 or Windows 11.
- Make sure the "Newsmax TV Live" video is playing on the right side of the page.
- Leave playing for 10-15 minutes.
Move the cursor over hoverable elements to check whether the page is responsive. After about 4 minutes it starts stalling for 2 seconds every 6 seconds. After 10-15 minutes the stalls increase to 10 seconds and the video itself starts stalling and buffering.
Performance profile:
https://share.firefox.dev/3RRaovu
Most time spent in text tracks:
mozilla::dom::TextTrack::GetCurrentCuesAndOtherCues(RefPtr<mozilla::dom::TextTrackCueList>&, RefPtr<mozilla::dom::TextTrackCueList>&, mozilla::media::Interval<mozilla::media::TimeUnit> const&) const [dom/media/webvtt/TextTrack.cpp]
mozilla::dom::TextTrackManager::TimeMarchesOn() [dom/html/TextTrackManager.cpp]
mozilla::dom::HTMLMediaElement::NotifyCueAdded(mozilla::dom::TextTrackCue&) [dom/html/HTMLMediaElement.h]
mozilla::dom::TextTrack::SetMode(mozilla::dom::TextTrackMode) [dom/media/webvtt/TextTrack.cpp]
mozilla::dom::TextTrack_Binding::set_mode(JSContext*, JS::Handle<JSObject*>, void*, JSJitSetterCallArgs) [dom/bindings/TextTrackBinding.cpp]
Site uses Akamai Adaptive Media Player (AMP v9.1.20+premier).
Reproducible back to 2020-02-01 so not a recent regression.
Does not happen on Chrome.
Comment 1•2 years ago
I can reproduce the issue on Nightly 120.0a1 on Windows 10.
https://share.firefox.dev/3RVoyf8
I took a look at this and was able to repro as well.
The WebVTT log output is very dense, to the point that it crashes my browser running in debug mode after a minute or two if I enable WebVTT:5. From the log entries I was able to collect, I noticed many duplicate entries; for instance, the following lines appear close to 2,000 times:
1892 WebVTT TextTrack=28b71f31580, cue 28b6b3eb4c0 [41.972633:48.006000], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b70899e80 [61.992633:66.039333], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b708a7380 [51.982633:54.012000], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b7091b800 [102.032633:108.066000], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b7091d9c0 [112.042633:114.082000], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b71b4ab00 [72.002633:78.050000], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b72dab940 [82.012633:84.044667], playbackTime=100.357562
1892 WebVTT TextTrack=28b71f31580, cue 28b72dace80 [92.022633:96.055333], playbackTime=100.357562
1898 WebVTT TextTrack=28b6cc8f580, Add cue 28b73cf1440 [100.130578:100.531578] to current cue list
1920 WebVTT TextTrack=28b71f31580, Add cue 28b6b3eb4c0 [41.972633:48.006000] to current cue list
1980 WebVTT TextTrack=28b6fa5c3c0, Add cue 28b6e852580 [76.040000:89.620233] to current cue list
1980 WebVTT TextTrack=28b6fa5c3c0, Add cue 28b6e8526c0 [76.040000:89.620233] to current cue list
1980 WebVTT TextTrack=28b6fa5c3c0, Add cue 28b6e852940 [76.040000:89.620233] to current cue list
alwu, would you have any thoughts here?
Comment 3•2 years ago
The Newsmax TV Live iframe uses the WebVTT API in a very inefficient way. Ideally, the website would prepare its subtitles in advance. If it wants to generate cues dynamically, it should edit the text of the same cue in a reasonable way, rather than generating many cues that differ only slightly.
See the following logs as an example:
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf946a80 [33.931644:34.064644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf946bc0 [34.064644:34.198644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
when)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf946d00 [34.198644:34.498644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
when th)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf946e40 [34.498644:34.598644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
when the)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf946f80 [34.598644:34.765644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
when the power)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf9470c0 [34.765644:35.799644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
when the power goes)
2023-10-13 18:20:04.286000 UTC - [Child 27244: Main Thread]: D/WebVTT TextTrack=22ca2141d60, cue 22caf947200 [35.799644:36.000644], playbackTime=39.076000, (the victim's Justice Group
helpline now. What will you do
The website generates new cues very frequently, and they all share the same start time and end time. A better approach would be to use only one cue and modify its content as necessary, or to create the cue only once the website can determine what its complete text should be.
But I agree that we should definitely improve the logic we use to select cues.
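As a rough illustration of the single-cue suggestion above, here is a minimal sketch of how a page could grow one cue in place instead of adding a new cue for every partial caption. It is not the site's or the player's actual code. `LiveCaptionWriter`, `CueLike`, and `TrackLike` are hypothetical names introduced only for this example. The two interfaces are stand-ins that mirror the `startTime`, `endTime`, and `text` members of `VTTCue` and the `addCue()` method of `TextTrack`, so that the sketch can run outside a browser.

```typescript
// Minimal structural stand-ins for the DOM's VTTCue and TextTrack, so this
// sketch runs outside a browser. These names are assumptions for illustration.
interface CueLike {
  startTime: number;
  endTime: number;
  text: string;
}

interface TrackLike {
  addCue(cue: CueLike): void;
}

type CueFactory = (start: number, end: number, text: string) => CueLike;

// Hypothetical helper: keeps one cue per caption line and edits it in place
// as more words arrive, instead of adding a new cue for every partial update.
class LiveCaptionWriter {
  private current: CueLike | null = null;
  private readonly track: TrackLike;
  private readonly makeCue: CueFactory;

  constructor(track: TrackLike, makeCue: CueFactory) {
    this.track = track;
    this.makeCue = makeCue;
  }

  // Called whenever the caption text grows. `start` and `end` are media
  // times in seconds for the newest fragment.
  update(start: number, end: number, text: string): void {
    if (this.current === null) {
      // First fragment of this caption line: create and add exactly one cue.
      this.current = this.makeCue(start, end, text);
      this.track.addCue(this.current);
      return;
    }
    // Later fragments: reuse the existing cue, replace its text, and extend
    // its end time if needed. No new cue is added to the track.
    this.current.text = text;
    this.current.endTime = Math.max(this.current.endTime, end);
  }

  // Called when the caption line is complete; the next update starts a new cue.
  finish(): void {
    this.current = null;
  }
}
```

In a page, the writer would presumably be constructed with a real track and a factory such as `(s, e, t) => new VTTCue(s, e, t)`. With this pattern, the seven partial cues in the log excerpt above would collapse into a single cue whose text is updated seven times, so the track's cue list grows once per caption line rather than once per word.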