Crash in [@ OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | CCGraphBuilder::AddNode]
Categories
(Core :: XPCOM, defect)
Tracking
| | Tracking | Status |
|---|---|---|
| firefox-esr115 | --- | unaffected |
| firefox125 | --- | affected |
| firefox126 | --- | affected |
| firefox127 | --- | affected |
People
(Reporter: RyanVM, Unassigned)
Details
(Keywords: crash)
Saw this come in from a Reddit thread about today's 125.0.1 release. This appears to have started with 125.0b9.
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_125_0b8_RELEASE&tochange=FIREFOX_125_0b9_RELEASE
Jan, is it possible that bug 1888892 caused this?
Crash report: https://crash-stats.mozilla.org/report/index/a7542232-bcb6-42cf-8ab5-9610e0240417
MOZ_CRASH Reason: [unhandlable oom] Failed to allocate new chunk during GC
Top 10 frames:
0 xul.dll MOZ_Crash(char const*, int, char const*) mfbt/Assertions.h:301
0 xul.dll js::AutoEnterOOMUnsafeRegion::crash_impl(char const*) js/src/vm/JSContext.cpp:1310
1 xul.dll mozilla::detail::HashTable<PtrInfo*const, mozilla::HashSet<PtrInfo*, PtrToNod... mfbt/HashTable.h:1839
1 xul.dll mozilla::detail::HashTable<PtrInfo*const, mozilla::HashSet<PtrInfo*, PtrToNod... mfbt/HashTable.h:2153
1 xul.dll mozilla::HashSet<PtrInfo*, PtrToNodeHashPolicy, mozilla::MallocAllocPolicy>::... mfbt/HashTable.h:623
1 xul.dll CCGraphBuilder::AddNode(void*, nsCycleCollectionParticipant*) xpcom/base/nsCycleCollector.cpp:2040
1 xul.dll CCGraphBuilder::NoteChild(void*, nsCycleCollectionParticipant*, nsTString<cha... xpcom/base/nsCycleCollector.cpp:1946
1 xul.dll CCGraphBuilder::NoteJSChild(JS::GCCellPtr) xpcom/base/nsCycleCollector.cpp:2243
1 xul.dll TraversalTracer::onChild(JS::GCCellPtr, char const*) xpcom/base/CycleCollectedJSRuntime.cpp:429
1 xul.dll JS::CallbackTracer::onEdge(JSObject**, char const*) js/public/TracingAPI.h:245
Comment 1•1 year ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #0)
> Jan, is it possible that bug 1888892 caused this?
It's very unlikely.
This is an OOM in the cycle collector code. I don't see any commits in that range that look obviously related, but maybe this is a signature change? Andrew, what do you think?
Comment 2•1 year ago
Previous JS tracing changes have caused the CC graph to explode in size, so it is possible that bug 1888892 is related.
Nothing else in there looks particularly related. There are some macOS video changes (which shouldn't cause Windows crashes) and networking changes (which look too low-level to cause problems for CCed things). There are also translation changes, which I guess could cause memory issues.
Comment 3•1 year ago
Because js::Nursery::collect is in the stack, I think this belongs in bug 1472062. That makes me think this signature has replaced OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload. Looking at crashes whose proto signature contains js::Nursery::collect, and comparing 125.0b9 to the previous b9 versions, we can see the transfer occur:
123.0b9
1 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC 430 48.37 % 1867864 1472062
2 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload 305 34.31 % 1472062
3 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::Nursery::maybeMoveRawBufferOnPromotion 29 3.26 %
124.0b9
1 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC 436 40.60 %
2 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload 279 25.98 %
3 js::gc::AllocSite::incTenuredCount 177 16.48 % 1639157
125.0b9
1 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC 312 38.95 %
2 OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | CCGraphBuilder::AddNode 188 23.47 %
3 js::gc::AllocSite::incTenuredCount 172 21.47 %
If you agree with my analysis, I think we can mark this bug as a duplicate of bug 1472062 and add the signature there. It might be worth double-checking the telemetry mentioned in bug 1472062 comment 53, though.
Edit: FWIW, here is the global volume of crashes where the proto signature contains js::Nursery::collect across b8 and b9 versions. Nothing looks abnormal in this view:
| Beta | 120.0 | 121.0 | 122.0 | 123.0 | 124.0 | 125.0 |
|---|---|---|---|---|---|---|
| b8 | 416 | 361 | 354 | 415 | 333 | 404 |
| b9 | 939 | 1043 | 925 | 889 | 1074 | 801 |
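As a quick sanity check on the figures above, the per-signature percentages can be recomputed from the counts. This is only a sketch, and it assumes that the b9 totals for 123.0, 124.0 and 125.0 (889, 1074 and 801) are the denominators behind the percentages in the per-version rankings; the comment does not state this explicitly. The dictionary names and the `share` helper are hypothetical.

```python
# Totals of crashes whose proto signature contains js::Nursery::collect,
# copied from the b9 row above. Assumption: these are the denominators
# used for the percentages in the per-version rankings.
B9_TOTALS = {"123.0b9": 889, "124.0b9": 1074, "125.0b9": 801}

# Counts for the second-ranked signature in each version, copied from
# the rankings (signature names shortened to their last frame).
SECOND_RANKED = {
    "123.0b9": ("JS::Value::bitsFromTagAndPayload", 305),
    "124.0b9": ("JS::Value::bitsFromTagAndPayload", 279),
    "125.0b9": ("CCGraphBuilder::AddNode", 188),
}


def share(count: int, total: int) -> float:
    """Return count as a percentage of total, rounded to two decimals."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    for version, (signature, count) in SECOND_RANKED.items():
        pct = share(count, B9_TOTALS[version])
        print(f"{version}: {signature} {count} ({pct:.2f} %)")
```

Under that assumption the recomputed shares match the listed ones, which is consistent with the reading that CCGraphBuilder::AddNode took over the second-ranked slot from JS::Value::bitsFromTagAndPayload in 125.0b9 rather than adding new crash volume.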
Comment 4•1 year ago
The Reddit user confirmed that they were simply low on paging file space and that the timing of the update was probably just a coincidence. This is a further reason to mark this as a duplicate.