Closed Bug 1964966 Opened 5 months ago Closed 5 months ago

13.5 - 6.05% pdfpaint bug1140761.pdf / pdfpaint issue17061.pdf + 10 more (Linux, OSX) regression on Wed April 30 2025

Categories

(Core :: JavaScript: WebAssembly, defect, P3)

Tracking

RESOLVED FIXED
140 Branch
Tracking Status
firefox-esr128 --- unaffected
firefox138 --- unaffected
firefox139 --- unaffected
firefox140 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jseward)

References

(Blocks 1 open bug, Regression)

Details

(4 keywords)

Perfherder has detected a Talos performance regression from push 9618824194c55ecf89f162b1044e637171067a7b. As you are the author of one of the patches included in that push, we need your help to address this regression.

Please acknowledge, and begin investigating this alert within 3 business days, or the patch(es) may be backed out in accordance with our regression policy. Our guide to handling regression bugs has information about how you can proceed with this investigation.

If you have any questions or need any help with the investigation, please reach out to fbilt@mozilla.com. Alternatively, you can find help on Slack by joining #perf-help, and on Matrix you can find help by joining #perftest.

Regressions:

Ratio  Test                     Platform                   Options                          Absolute values (old vs new)
14%    pdfpaint bug1140761.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender-sw  1,987.11 -> 2,255.30
13%    pdfpaint bug1140761.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender     2,031.91 -> 2,303.45
13%    pdfpaint geothermal.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender     1,186.39 -> 1,337.01
12%    pdfpaint geothermal.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender-sw  1,155.80 -> 1,295.44
10%    pdfpaint issue4090.pdf   linux1804-64-shippable-qr  e10s fission stylo webrender     724.76 -> 798.45
10%    pdfpaint bug1072164.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender     1,137.29 -> 1,245.59
9%     pdfpaint bug1072164.pdf  linux1804-64-shippable-qr  e10s fission stylo webrender-sw  1,114.70 -> 1,218.34
8%     pdfpaint issue4090.pdf   linux1804-64-shippable-qr  e10s fission stylo webrender-sw  705.26 -> 764.65
8%     pdfpaint issue12306.pdf  macosx1470-64-shippable    e10s fission stylo webrender-sw  993.08 -> 1,069.48
7%     pdfpaint bug1140761.pdf  macosx1470-64-shippable    e10s fission stylo webrender-sw  2,143.53 -> 2,283.70
6%     pdfpaint issue6364.pdf   linux1804-64-shippable-qr  e10s fission stylo webrender     695.34 -> 739.61
6%     pdfpaint issue17061.pdf  macosx1470-64-shippable    e10s fission stylo webrender     992.06 -> 1,052.12
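
For reference, the Ratio column and the "13.5 - 6.05%" range in the bug title can be reproduced from the absolute values. The following is a small sketch that assumes the ratio is the plain relative change from old to new and that lower pdfpaint values are better (which the Improvements rows, where values go down, suggest). The helper name `percent_change` is made up for this illustration.

```python
def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent (positive means the value went up)."""
    return (new - old) / old * 100.0


# Two rows taken from the regression table above.
worst = percent_change(1987.11, 2255.30)   # bug1140761.pdf, Linux, webrender-sw
mildest = percent_change(992.06, 1052.12)  # issue17061.pdf, macOS, webrender

print(f"largest regression:  {worst:.2f}%")
print(f"smallest regression: {mildest:.2f}%")
```

Rounded, these two values correspond to the 13.5% and 6.05% endpoints quoted in the title.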

Improvements:

Ratio  Test                    Platform                     Options                          Absolute values (old vs new)
10%    pdfpaint issue9129.pdf  windows11-64-24h2-shippable  e10s fission stylo webrender-sw  177.76 -> 159.71
9%     pdfpaint issue9129.pdf  windows11-64-24h2-shippable  e10s fission stylo webrender     176.31 -> 159.75
6%     pdfpaint issue7014.pdf  windows11-64-24h2-shippable  e10s fission stylo webrender     202.85 -> 190.40

Details of the alert can be found in the alert summary, including links to graphs and comparisons for each of the affected tests.

If you need the profiling jobs you can trigger them yourself from treeherder job view or ask fbilt@mozilla.com to do that for you.

You can run all of these tests on try with ./mach try perf --alert 44985

The following documentation link provides more information about this command.

Flags: needinfo?(jseward)

Set release status flags based on info from the regressing bug 1957504

Hereby acknowledged. I'm looking into it.

Flags: needinfo?(jseward)
Depends on: 1965195

Some numbers look to be back to normal.

Other numbers have not improved.

The pdf tests that improved did not regress.

Interesting that the regressions only happen on the pdf tests that ran on Linux and macOS. All the improvements were on Windows.
Is there a difference in Lazy-tiering between different platforms?

(In reply to Mayank Bansal from comment #4)

> Is there a difference in Lazy-tiering between different platforms?

No, or at least I can't think of any difference. It's the same code
generation with the same tier-up heuristics and the same thresholds.

Thanks for finding these numbers!

Blocks: sm-js-perf
Severity: -- → S3
Priority: -- → P3

Lazy tiering aims to improve responsiveness of large wasm apps by deferring
optimised compilation of wasm functions until they are hot enough to justify
it. This, plus the inlining it facilitates, has produced various benchmark
wins; see [1], [2], [3] and [4] (a space win), and also the "Improvements" cases
in comment 0. Despite the regressions listed in comment 0, we consider it an
overall win for users.
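
To make the mechanism concrete, here is a minimal sketch of the idea, assuming a simple per-function call counter with a fixed threshold. The class name, the counter, and the threshold values are all hypothetical; the actual tier-up heuristics and thresholds are not described in this bug.

```python
from dataclasses import dataclass


@dataclass
class LazyFunction:
    """Toy model of one wasm function under lazy tiering (hypothetical)."""

    name: str
    threshold: int          # hypothetical hotness threshold, in calls
    calls: int = 0
    tier: str = "baseline"

    def call(self) -> str:
        """Record one call and return the tier that executed it."""
        running = self.tier
        self.calls += 1
        if self.tier == "baseline" and self.calls >= self.threshold:
            # Stands in for compiling and installing optimised code.
            self.tier = "optimized"
        return running


patient = LazyFunction("f", threshold=3)
eager = LazyFunction("f", threshold=1)   # a lower threshold: more aggressive tiering

patient_tiers = [patient.call() for _ in range(5)]
eager_tiers = [eager.call() for _ in range(5)]
print(patient_tiers)
print(eager_tiers)
```

In this model, a lower threshold moves a function to optimised code sooner, at the cost of compiling functions that may never have become hot. That is the trade-off behind the per-benchmark tuning discussed below.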

For the regressions in comment 0:

  • geothermal.pdf has been fixed by [5], which increases the aggressiveness of
    lazy tiering for small modules, and is in m-c now.

  • for bug1072164.pdf we know that further increasing aggressiveness would fix
    it, although we have not done so because we know this would reduce
    performance on other benchmarks.

  • for bug1140761.pdf we have not definitively identified a cause, although we
    suspect it is a case of execution becoming stuck in baseline code despite
    optimized code being available, due to the lack of OSR in our wasm
    implementation, as described in [6].
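
The suspected failure mode in the last bullet can be illustrated with a toy model. This is only a sketch under stated assumptions: a call picks its tier on entry and cannot switch while running (no OSR), and optimized code becomes available after a fixed amount of work. The function name and all numbers are invented for illustration and do not come from measurements of bug1140761.pdf.

```python
def iterations_per_tier(calls: int, iters_per_call: int, ready_after: int) -> dict:
    """Count loop iterations executed in each tier.

    Hypothetical model: optimized code becomes available once `ready_after`
    iterations have run in total, but a call only picks its tier on entry,
    so a call that is already running stays in the tier it started in.
    """
    done = {"baseline": 0, "optimized": 0}
    total = 0
    for _ in range(calls):
        tier = "optimized" if total >= ready_after else "baseline"
        done[tier] += iters_per_call
        total += iters_per_call
    return done


# Same total work (10,000 iterations), split differently across calls.
many_short = iterations_per_tier(calls=1000, iters_per_call=10, ready_after=100)
one_long = iterations_per_tier(calls=1, iters_per_call=10_000, ready_after=100)
print(many_short)
print(one_long)
```

With many short calls, almost all of the work moves to optimized code. With a single long-running call, all of it stays in baseline code even though optimized code became available early on.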

We continue to study potential responsiveness issues relating to lazy tiering
and small modules, e.g. [7].

[1] https://bugzilla.mozilla.org/show_bug.cgi?id=1957504#c10
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1957504#c11
[3] https://bugzilla.mozilla.org/show_bug.cgi?id=1957504#c12
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=1957504#c13
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=1965195
[6] https://bugzilla.mozilla.org/show_bug.cgi?id=1871158
[7] https://bugzilla.mozilla.org/show_bug.cgi?id=1966645

Thanks for that analysis. It seems like these benchmarks really want aggressive tier-up to Ion for everything (because it's a one-shot rendering of a PDF), and we have data that doing that would remove the benefits we've been seeing on other tests. We could possibly continue tinkering around here to get the exact right heuristics, but that seems unlikely to be worthwhile right now. So I think it makes sense to close this and move on.

Per comment 6 and comment 7, we have partially mitigated this regression, and
on the whole feel that lazy tiering will be a net performance win for users.
Closing for now. If there are further regressions or problems with pdf.js,
please feel free to reopen this.

Assignee: nobody → jseward
Status: NEW → RESOLVED
Closed: 5 months ago
Resolution: --- → FIXED
Target Milestone: --- → 140 Branch
QA Whiteboard: [qa-triage-done-c141/b140]