-moz-element leaks memory and hangs whole browser with massive CPU usage at closing browser window with large image Theme after landing patch from bug #1330487
Categories
(Core :: SVG, defect, P1)
Tracking
()
People
(Reporter: Virtual, Assigned: jwatt)
References
(Regression)
Details
(8 keywords)
Attachments
(3 files)
|
50.90 KB,
image/png
|
Details | |
|
53.83 KB,
image/png
|
Details | |
|
47 bytes,
text/x-phabricator-request
|
jcristau
:
approval-mozilla-beta+
|
Details | Review |
STR:
- Open few browser windows
- Start long browsing session
- Close one window
and enjoy browser hang for few seconds and in some cases even many minutes
Regression range:
TODO - started few days ago
I will try to catch profile with Gecko Profiler extension.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 1•6 years ago
|
||
I suspect that it could be related to relanded LSNG patches or maybe some Fission work.
Comment 2•6 years ago
|
||
:Virtual, could you try to find a regression range in using for example mozregression?
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 3•6 years ago
|
||
Sure, working on it right now.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 4•6 years ago
|
||
I'm having trouble to easily reproduce this issue.
In my case sometimes it happens, sometimes it doesn't,
so it will take some time to find regression range.
If someone can reproduce this issue easily, help to find regression range will be nice.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
For what it's worth, I had noted extended shut-down times with one of the Firefox processes taking about 32% of the CPU (the equivalent of about 1.28 cores in my 4-core CPU) for up to a few minutes if I had viewed a number of YouTube videos earlier in the session. No-YouTube = no elongated shut-down time. Note: when the problem exists, if I shut down the YouTube tabs but other tabs are open, no CPU usage is noticed at that time, but rather later when I shut down Firefox. Also, it seems the more I had been running YouTube videos that session, the longer it takes Firefox to shut down. Since the CPU shoots up after the Firefox windows had closed, I don't have opportunity to run the Gecko Profiler on it.
My system is Windows 10 (64-bit), Firefox Nightly 68.0a1 (2019-05-16) (64-bit), running on Intel i5-7400 (4-core), with an Intel(R) HD Graphics 630. Maybe relevant: I have the uBlock Origin extension installed with default block lists.
(In reply to Mark from comment #5)
CPU usage happens in the Firefox main process. (I just downloaded "Process Explorer" from www.sysinternals.com and had it running when I shut down Firefox Nightly after a few YouTube videos and an audio podcast.)
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 7•6 years ago
|
||
I will try to catch profile with Gecko Profiler extension, as looking for regression range is unreliable in my case.
I've had similar problems for the last probably 1 -2 weeks as well. In fact, today I installed the new nightly and it took just over 25 (yes, twenty five) mins for firefox to shutdown. And that was maxing 1 core of the CPU the whole time.
I don't get any indication from firefox that anything hung/crashes (no new reports in about:crashes).
If there is a way to get a profile of the shutdown process in firefox, I'm more than willing to do it and post the results for analysis.
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 9•6 years ago
|
||
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 10•6 years ago
|
||
FYI - Looks like it's not caused by LSNG (Local Storage Next Generation), because when I disable it in about:config by setting "dom.storage.next_gen" to "false", it still happens.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Comment 11•6 years ago
|
||
As I said before I have problems with shutdowns taking a long time, but the problem seems to manifest the same way when I close a private browsing window as well.
So I tried to use Gecko Profiler to get some more info about this, but I couldn't get it to work when you have a private window open.
But I took a screenshot (that I will attach) of the Threads tab in Process Hacker for both the main Firefox process and a child one that was using a lot of CPU. The "firefox.exe!TargetNtUnmapViewOfSection" in the child process screenshot seemed to continuously spawn and disappear.
Don't know if they are of any help.
Comment 12•6 years ago
|
||
Comment 13•6 years ago
|
||
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 14•6 years ago
|
||
My profile taken with Gecko Profiler - https://perfht.ml/2HJC1Rz
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 15•6 years ago
|
||
Next profile - https://perfht.ml/2HKxBtD
STR:
- Open few tabs in one browser window
- Open second browser window and start long browsing session in this window with many tabs
- Close all tabs from #2, except one
- Start recording profile with Gecko Profiler
- Wait about 10s doing nothing and close browser window from #2 & #3
- and enjoy waiting nearly 3 mins for Firefox to close browser window...
- Wait 10s doing nothing after profile capture
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 16•6 years ago
|
||
Looks like 80% of time Firefox is spending at "mozilla::SVGRenderingObserver::StopObserving".
Looking on https://searchfox.org/mozilla-central/search?q=mozilla%3A%3ASVGRenderingObserver%3A%3AStopObserving
latest changes to this function name were done in bug #1525372 which landed 2019-03-19 10:52:47 CET,
but oddly I don't think this issue started before 2019-04-15.
@ Matt Woodrow (:mattwoodrow) & Jeff Muizelaar [:jrmuizel] & Jonathan Watt [:jwatt] - Good day, I'm asking about some help in analyzing my profiles and some hints. Could it be that bug #1525372 caused this issue or it's missed call by me? Thank you very much in advance!
Info for later regression hunting - bugs which touched SVG code from 2019-04-15 to 2019-05-09 are:
bug #1548985
bug #1547209
bug #1544270
bug #1544216
bug #1544209
bug #1543584
bug #1542646
bug #1541350
bug #1539846
bug #1539318
bug #1533491
bug #1525753
bug #1372537
bug #1323962
bug #1247218
bug #972041
bug #937268
bug #874811
bug #608812
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Comment 17•6 years ago
|
||
The stacks in your profile seem a bit unreliable, but most of that is likely due to identical functions being compiled to a single symbol.
I think the issue is this code, leading to massive hash table leaking - https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/layout/svg/SVGObserverUtils.cpp#1378
This changeset adds a new usage of -moz-element, seems like about the right date range - https://hg.mozilla.org/mozilla-central/rev/e330d90e540084087b6a9c155eabe069ec690c49
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 18•6 years ago
|
||
@ Matt Woodrow (:mattwoodrow) - Thank you very much for your detailed information!
@ Kris Maglione [:kmag] - Could you please look on this issue, as it may be that your patch to fix bug my issue in bug #1540984 caused this issue.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
| Assignee | ||
Comment 20•6 years ago
|
||
(In reply to Matt Woodrow (:mattwoodrow) from comment #17)
I think the issue is this code, leading to massive hash table leaking - https://searchfox.org/mozilla-central/rev/952521e6164ddffa3f34bc8cfa5a81afc5b859c4/layout/svg/SVGObserverUtils.cpp#1378
Let's start with this.
Comment 21•6 years ago
|
||
Why is this code using -moz-element? Doesn't background-image work? I'd expect -moz-element to be much more expensive.
| Assignee | ||
Comment 22•6 years ago
|
||
First, nice sleuthing, Matt, identifying in comment 17 the relevant leak and the bug that triggered it. For completeness, it looks like the memory leak was introduced by the Part 6 patch from bug 1330487. See bug 1330487 comment 101 for details.
I could indeed only reproduce the leak after installing a theme with a large banner image, to trigger the -moz-element introduced in bug 1540984 comment 31.
| Assignee | ||
Comment 23•6 years ago
|
||
| Assignee | ||
Comment 24•6 years ago
|
||
I've got some Try builds with a patch that people can test to confirm whether their issues are resolved at:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=663b5d1bd8fd
Comment 25•6 years ago
|
||
When bug 1330487 regressed this, it goes all the way back to 64. Maybe we could include it in a possible security fix for an upcoming 67.0.x release?
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 26•6 years ago
•
|
||
@ Jonathan Watt [:jwatt] - Thank you very much for rapid diagnosis and fix. I'm testing it right now and I will report back later with results.
STR:
- Install Theme with large image (like Virtualfox [ https://addons.mozilla.org/addon/virtualfox/ ] or Firefox Quantum Nightly [ https://addons.mozilla.org/addon/firefox-quantum-nightly/ ] )
- In first browser window open few tabs and leave it untouched
- Open second browser window and start long browsing session in this window with many tabs
- After long browsing session close second browser window from #3
and enjoy waiting from few seconds to many minutes for Firefox to close browser window,
my record from yesterday was nearly 1 hour of hang with 85-93% of RAM usage, but I waited patiently, as I know session won't be restored properly when I will close Firefox with Windows hang information notification that Firefox stopped responding.
Comment 27•6 years ago
|
||
Comment 28•6 years ago
|
||
(In reply to Emilio Cobos Álvarez (:emilio) from comment #21)
Why is this code using
-moz-element? Doesn't background-image work? I'd expect -moz-element to be much more expensive.
It doesn't work, no. At least not with large background images, which flicker every time a window opens or changes focus. See bug 1540984.
I was initially going to use -moz-element with a canvas with a cropped image when the background image was too large, but just using it with an uncropped image element solved the problem on its own, so it didn't seem worth the extra complexity.
Comment 30•6 years ago
|
||
(In reply to Kris Maglione [:kmag] from comment #28)
It doesn't work, no. At least not with large background images, which flicker every time a window opens or changes focus. See bug 1540984.
Which protocol are these images loaded with, moz-extension://? I suspect that's an underlying imagelib bug, we no longer properly cache a bunch of images from those protocols (see https://bugzilla.mozilla.org/show_bug.cgi?id=1406134#c5).
Comment 31•6 years ago
|
||
(In reply to Emilio Cobos Álvarez (:emilio) from comment #30)
(In reply to Kris Maglione [:kmag] from comment #28)
It doesn't work, no. At least not with large background images, which flicker every time a window opens or changes focus. See bug 1540984.
Which protocol are these images loaded with,
moz-extension://? I suspect that's an underlying imagelib bug, we no longer properly cache a bunch of images from those protocols (see https://bugzilla.mozilla.org/show_bug.cgi?id=1406134#c5).
At the moment, yes, but we had this problem in the past even when we used to persist theme images from https: URLs to disk and load them from file: URLs.
Comment 32•6 years ago
|
||
file:// has the same issue actually, I think...
Comment 33•6 years ago
|
||
Well, in any case, I'd love for the -moz-element hack not to be necessary, but I spent more time than I could spare digging through the image loading and caching code trying to figure out why it didn't work correctly for large images before I had to give up and make do with a workaround. If the caching problem gets fixed, I would be more than happy to remove it.
Comment 34•6 years ago
|
||
Kris, can you file a bug on the caching issue? It seems like it shouldn't be too hard to fix.
Comment 35•6 years ago
|
||
(In reply to Jeff Muizelaar [:jrmuizel] from comment #34)
Kris, can you file a bug on the caching issue? It seems like it shouldn't be too hard to fix.
Filed bug 1553923.
Updated•6 years ago
|
Comment 36•6 years ago
|
||
Note also on this profile it's also spending a lot of time deleting RetainedDisplayLists from a hashtable, it appears.
Comment 37•6 years ago
|
||
(In reply to Randell Jesup [:jesup] (needinfo me) from comment #36)
Note also on this profile it's also spending a lot of time deleting RetainedDisplayLists from a hashtable, it appears.
Do you mean the DeleteValue<RetainedDisplayListData> line?
I think that's just a shared symbol, since it doesn't make sense for RetainedDisplayListData deletion to trigger the rest of the code below it.
Comment 38•6 years ago
|
||
I think that's just a shared symbol, since it doesn't make sense for RetainedDisplayListData deletion to trigger the rest of the code below it.
Aha, that makes sense. Darn linkers....
Updated•6 years ago
|
Comment 39•6 years ago
|
||
This regression dates back to 64 (bug 1330487) but became a real problem when bug 1540984 landed in 68. Given that, I think this is not a good candidate for uplifting to 67, so wontfix 67 and we don't take ridealongs in 67.05.
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 40•6 years ago
|
||
@ Jonathan Watt [:jwatt] - I'm reporting back. So far so good. I didn't encounter issue for over 1 day of testing with Mozilla Firefox Nightly 69.0a1 (2019-05-23) (64-bit) [Built from https://hg.mozilla.org/try/rev/663b5d1bd8fdb5c098d3f9bb37eef7f6b986de05] from Comment #24 with STR from Comment #26.
Comment 41•6 years ago
|
||
| bugherder | ||
| Assignee | ||
Comment 42•6 years ago
|
||
(In reply to Virtual_ManPL [:Virtual] - (please needinfo? me - so I will see your comment/reply/question/etc.) from comment #40)
@ Jonathan Watt [:jwatt] - I'm reporting back. So far so good. I didn't encounter issue for over 1 day of testing
Awesome, thanks for testing and reporting back!
| Assignee | ||
Comment 43•6 years ago
|
||
Comment on attachment 9067012 [details]
Bug 1550531. Fix -moz-element memory leak and hangs. r=emilio
Beta/Release Uplift Approval Request
- User impact if declined: Indefinitely long shutdowns for users using a theme with a large banner image (due to memory leak). Possibly other users are affected by other scenarios that hit the same leak.
- Is this code covered by automated tests?: Yes
- Has the fix been verified in Nightly?: Yes
- Needs manual test from QE?: No
- If yes, steps to reproduce: See comment 26.
- List of other uplifts needed: None
- Risk to taking this patch: Low
- Why is the change risky/not risky? (and alternatives if risky): The patch fixes a memory leak, but otherwise should not change behavior. I've followed the code execution through in a debugger both with and without the patch to confirm that it works as expected.
- String changes made/needed: none
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Comment 44•6 years ago
|
||
I'm confirming that bug is fixed, starting in Mozilla Firefox Nightly 68.0a1 (2019-05-25), so I'm marking this bug as VERIFIED.
Thank you very much! \o/
Comment 45•6 years ago
|
||
Comment on attachment 9067012 [details]
Bug 1550531. Fix -moz-element memory leak and hangs. r=emilio
fix for a mem leak, approved for 68.0b5
Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)
Reporter
|
||
Updated•6 years ago
|
Comment 46•6 years ago
|
||
| bugherder uplift | ||
Updated•6 years ago
|
Updated•6 years ago
|
Comment 47•6 years ago
|
||
Managed to verify the fix as well using 68.0b5 on Windows 7x32.
Updated•3 years ago
|
Description
•