Closed Bug 1891902 Opened 1 year ago Closed 1 year ago

Crash in [@ OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | CCGraphBuilder::AddNode]

Categories

(Core :: XPCOM, defect)

Unspecified
Windows 10
defect

RESOLVED DUPLICATE of bug 1472062
Tracking Status
firefox-esr115 --- unaffected
firefox125 --- affected
firefox126 --- affected
firefox127 --- affected

People

(Reporter: RyanVM, Unassigned)

Details

(Keywords: crash)

Saw this come in from a Reddit thread about today's 125.0.1 release. This appears to have started with 125.0b9.
https://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=FIREFOX_125_0b8_RELEASE&tochange=FIREFOX_125_0b9_RELEASE

Jan, is it possible that bug 1888892 caused this?

Crash report: https://crash-stats.mozilla.org/report/index/a7542232-bcb6-42cf-8ab5-9610e0240417

MOZ_CRASH Reason: [unhandlable oom] Failed to allocate new chunk during GC

Top 10 frames:

0  xul.dll  MOZ_Crash(char const*, int, char const*)  mfbt/Assertions.h:301
0  xul.dll  js::AutoEnterOOMUnsafeRegion::crash_impl(char const*)  js/src/vm/JSContext.cpp:1310
1  xul.dll  mozilla::detail::HashTable<PtrInfo*const, mozilla::HashSet<PtrInfo*, PtrToNod...  mfbt/HashTable.h:1839
1  xul.dll  mozilla::detail::HashTable<PtrInfo*const, mozilla::HashSet<PtrInfo*, PtrToNod...  mfbt/HashTable.h:2153
1  xul.dll  mozilla::HashSet<PtrInfo*, PtrToNodeHashPolicy, mozilla::MallocAllocPolicy>::...  mfbt/HashTable.h:623
1  xul.dll  CCGraphBuilder::AddNode(void*, nsCycleCollectionParticipant*)  xpcom/base/nsCycleCollector.cpp:2040
1  xul.dll  CCGraphBuilder::NoteChild(void*, nsCycleCollectionParticipant*, nsTString<cha...  xpcom/base/nsCycleCollector.cpp:1946
1  xul.dll  CCGraphBuilder::NoteJSChild(JS::GCCellPtr)  xpcom/base/nsCycleCollector.cpp:2243
1  xul.dll  TraversalTracer::onChild(JS::GCCellPtr, char const*)  xpcom/base/CycleCollectedJSRuntime.cpp:429
1  xul.dll  JS::CallbackTracer::onEdge(JSObject**, char const*)  js/public/TracingAPI.h:245
Flags: needinfo?(jdemooij)

(In reply to Ryan VanderMeulen [:RyanVM] from comment #0)

> Jan, is it possible that bug 1888892 caused this?

It's very unlikely.

This is an OOM in the cycle collector code. I don't see any commits in that range that look obviously related, but maybe this is just a signature change? Andrew, what do you think?

Component: JavaScript: GC → XPCOM
Flags: needinfo?(jdemooij) → needinfo?(continuation)

Previous JS tracing changes have caused the CC graph to explode in size, so it is possible that bug 1888892 is related.

Nothing else in there looks particularly related. There are some macOS video changes (which shouldn't cause Windows crashes) and networking changes (which look too low-level to cause problems for CCed things). There are also translation changes, which I guess could cause memory issues.

Flags: needinfo?(continuation)

Because js::Nursery::collect is in the stack, I think this belongs in bug 1472062, which makes me think that this signature has replaced OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload. Looking at crashes whose proto signature contains js::Nursery::collect and comparing 125.0b9 to the previous b9 versions, we can see the transfer occur:

123.0b9

Rank	Signature	Count	%	Bugs
1	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC	430	48.37 %	1867864 1472062
2	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload	305	34.31 %	1472062
3	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::Nursery::maybeMoveRawBufferOnPromotion	29	3.26 %

124.0b9

Rank	Signature	Count	%	Bugs
1	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC	436	40.60 %
2	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | JS::Value::bitsFromTagAndPayload	279	25.98 %
3	js::gc::AllocSite::incTenuredCount	177	16.48 %	1639157

125.0b9

Rank	Signature	Count	%	Bugs
1	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | js::AutoEnterOOMUnsafeRegion::crash | js::gc::AllocateCellInGC	312	38.95 %
2	OOM | unknown | js::AutoEnterOOMUnsafeRegion::crash_impl | CCGraphBuilder::AddNode	188	23.47 %
3	js::gc::AllocSite::incTenuredCount	172	21.47 %

If you agree with my analysis, I think we can mark this bug as a duplicate of bug 1472062 and add the signature there. It might be worth double-checking the telemetry mentioned in bug 1472062 comment 53, though.
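For anyone who wants to re-check the comparison, here is a minimal sketch of the arithmetic in Python. The counts are copied from the three listings above. The per-version totals are an assumption: they are taken from the b9 volumes for js::Nursery::collect quoted in this same comment, and the percentages in the listings are consistent with them. The `share` helper is a name made up for this illustration, not part of any crash-stats tooling.

```python
# Crash counts per signature, copied from the top-3 listings in this comment.
# Signatures are shortened to their last component for readability.
COUNTS = {
    "123.0b9": {
        "js::gc::AllocateCellInGC": 430,
        "JS::Value::bitsFromTagAndPayload": 305,
        "js::Nursery::maybeMoveRawBufferOnPromotion": 29,
    },
    "124.0b9": {
        "js::gc::AllocateCellInGC": 436,
        "JS::Value::bitsFromTagAndPayload": 279,
        "js::gc::AllocSite::incTenuredCount": 177,
    },
    "125.0b9": {
        "js::gc::AllocateCellInGC": 312,
        "CCGraphBuilder::AddNode": 188,
        "js::gc::AllocSite::incTenuredCount": 172,
    },
}

# Assumption: total crashes whose proto signature contains
# js::Nursery::collect, per b9 build (from the volume table in this comment).
TOTALS = {"123.0b9": 889, "124.0b9": 1074, "125.0b9": 801}


def share(version: str, signature: str) -> float:
    """Percentage of a version's crashes with this signature.

    Returns 0.0 when the signature is not in the top-3 listing, which only
    means "not listed here", not necessarily "never seen".
    """
    count = COUNTS[version].get(signature, 0)
    return round(100 * count / TOTALS[version], 2)


OLD = "JS::Value::bitsFromTagAndPayload"
NEW = "CCGraphBuilder::AddNode"

for version in COUNTS:
    print(f"{version}: old={share(version, OLD):5.2f} %  new={share(version, NEW):5.2f} %")
```

The old signature holds the second slot through 124.0b9 and is absent from the 125.0b9 top 3, where the new signature takes that slot with a similar share. That is what a signature change would look like, as opposed to a new crash.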

Edit: FWIW, here is the global volume of crashes where the proto signature contains js::Nursery::collect across b8 and b9 versions. From this view, we don't seem to be in an abnormal situation:

     120.0  121.0  122.0  123.0  124.0  125.0
b8     416    361    354    415    333    404
b9     939   1043    925    889   1074    801
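As a rough sanity check of the "not abnormal" reading, the sketch below compares the 125.0 b9 volume with the earlier b9 builds, using only the numbers in the table above. The variable names are illustrative, and treating "higher than every earlier b9" as the threshold is an assumption of this sketch, not a criterion used by crash-stats.

```python
from statistics import mean

# Volumes copied from the table above (proto signature contains
# js::Nursery::collect), ordered from 120.0 to 125.0.
VERSIONS = ["120.0", "121.0", "122.0", "123.0", "124.0", "125.0"]
B8 = [416, 361, 354, 415, 333, 404]
B9 = [939, 1043, 925, 889, 1074, 801]

previous_b9 = B9[:-1]          # 120.0 through 124.0
latest_b9 = B9[-1]             # 125.0
baseline = mean(previous_b9)   # average b9 volume before 125.0

# Assumption for this sketch: "abnormal" means the latest b9 volume is
# higher than every earlier b9 volume in the table.
is_above_previous_range = latest_b9 > max(previous_b9)

print(f"previous b9 mean: {baseline}")
print(f"125.0 b9 volume:  {latest_b9}")
print(f"above previous range: {is_above_previous_range}")
```

By this measure the 125.0 b9 volume sits below the earlier b9 builds rather than above them, which fits the reading that the overall crash volume did not spike.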
See Also: → 1891944

The Reddit user confirmed that they were simply running low on paging file space and that the timing of the update was probably just a coincidence. This gives further incentive to mark this as a duplicate.

Status: NEW → RESOLVED
Closed: 1 year ago
Duplicate of bug: 1472062
Resolution: --- → DUPLICATE