Closed Bug 1719657 Opened 4 years ago Closed 2 months ago

Crash in [@ js::GCMarker::scanChildren<T>]

Categories

(Core :: JavaScript: GC, defect, P3)

Tracking

RESOLVED WORKSFORME
Version        Tracking  Status
firefox-esr78  ---       unaffected
firefox89      ---       unaffected
firefox90      ---       unaffected
firefox91      ---       affected

People

(Reporter: aryx, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash)

Crash Data

27 crashes from 24 installations, all with 91.0a1, oldest reported build ID is 20210619083556, >90% content crashes

Could this be from the changes in bug 1715512?

Crash report: https://crash-stats.mozilla.org/report/index/74368bdc-94d3-468e-ad83-59dc20210620

Reason: EXCEPTION_ACCESS_VIOLATION_READ

Top 10 frames of crashing thread:

0 xul.dll js::GCMarker::scanChildren<js::Shape> js/src/gc/Marking.cpp:1064
1 xul.dll js::GCMarker::processMarkStackTop js/src/gc/Marking.cpp:1968
2 xul.dll js::GCMarker::markUntilBudgetExhausted js/src/gc/Marking.cpp:1763
3 xul.dll js::gc::GCRuntime::incrementalSlice js/src/gc/GC.cpp:7126
4 xul.dll js::gc::GCRuntime::gcCycle js/src/gc/GC.cpp:7593
5 xul.dll js::gc::GCRuntime::collect js/src/gc/GC.cpp:7801
6 xul.dll js::gc::GCRuntime::gcSlice js/src/gc/GC.cpp:7897
7 xul.dll static nsJSContext::GarbageCollectNow dom/base/nsJSEnvironment.cpp:1139
8 xul.dll GCRunnerFired dom/base/nsJSEnvironment.cpp:1653
9 xul.dll std::_Func_impl_no_alloc<`lambda at /builds/worker/checkouts/gecko/dom/base/nsJSEnvironment.cpp:1628:17', bool, mozilla::TimeStamp>::_Do_call 
Severity: -- → S2
Flags: needinfo?(jdemooij)

Keep in mind that this could be a signature change due to different inlining.

(In reply to Andrew McCreight [:mccr8] from comment #1)
> Keep in mind that this could be a signature change due to different inlining.

scanChildren<Shape> calls eagerlyMarkChildren(js::Shape*). The latter is now a lot smaller (it used to have a loop walking over all parent shapes) so it makes sense for the compiler to now inline it into scanChildren.
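To illustrate the signature shift described above, here is a minimal toy model. It is only a sketch, not how crash-stats actually generates signatures: it assumes the signature is simply the innermost frame the stack walker can see, and that an inlined call leaves no frame of its own. The `Call`, `physical_frames`, and `signature` names are hypothetical and exist only for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Call:
    function: str
    inlined: bool  # True if the compiler folded this call into its caller


def physical_frames(call_chain):
    """Return the frames a stack walker would see, innermost first.

    Assumption for this sketch: an inlined call has no frame of its own,
    so it is simply absent from the walked stack.
    """
    return [call.function for call in call_chain if not call.inlined]


def signature(call_chain):
    """Toy signature: the innermost visible frame (hypothetical rule)."""
    frames = physical_frames(call_chain)
    return frames[0] if frames else None


# Same logical call chain, innermost first, before and after the callee
# became small enough to be inlined into scanChildren.
before = [
    Call("eagerlyMarkChildren(js::Shape*)", inlined=False),
    Call("js::GCMarker::scanChildren<js::Shape>", inlined=False),
    Call("js::GCMarker::processMarkStackTop", inlined=False),
]
after = [
    Call("eagerlyMarkChildren(js::Shape*)", inlined=True),
    Call("js::GCMarker::scanChildren<js::Shape>", inlined=False),
    Call("js::GCMarker::processMarkStackTop", inlined=False),
]

print(signature(before))
print(signature(after))
```

Under these assumptions the same underlying crash is bucketed under a different top frame once the callee is inlined, which is why a new signature does not necessarily mean a new bug.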

I looked at the crashes and they're all over the place. This looks like typical GC memory corruption, and I didn't spot anything actionable.

Flags: needinfo?(jdemooij) (cleared)
Priority: -- → P3
Keywords: stalled

The bug is linked to a topcrash signature, which matches the following criterion:

  • Top 10 content process crashes on release

For more information, please visit auto_nag documentation.

Keywords: topcrash

This ramped up in version 101, although it could be a signature shift from something else.

There were the following GC changes in that time frame:

  • Bug 1765338 - Allow transplanting nursery objects (not used in the browser until a later release)
  • Bug 1763874 - Tuple elements are not traced properly
  • Bug 1764122 - Skip tracing self hosting stencil when collecting the nursery
  • Bug 1763658 - Tidy GCContext and related refactoring

None of those seem likely to be related.

Steve, do you have any ideas?

Flags: needinfo?(sphink)

Based on the topcrash criteria, the crash signatures linked to this bug are not in the topcrash signatures anymore.

For more information, please visit auto_nag documentation.

Keywords: topcrash (removed)

I looked at this a bit but didn't come up with anything. The crashes really are all over the place. Maybe bad RAM shifted buckets?

I guess for now it dropped out of the topcrashes, though I'm not sure that's going to be permanent. Removing needinfo for now.

Flags: needinfo?(sphink) (cleared)

Since the crash volume is low (less than 15 per week), the severity is downgraded to S3. Feel free to change it back if you think the bug is still critical.

For more information, please visit BugBot documentation.

Severity: S2 → S3

Closing because no crashes have been reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → WORKSFORME

Since the bug is closed, the stalled keyword is now meaningless.
For more information, please visit BugBot documentation.

Keywords: stalled (removed)