Closed
Bug 763166
Opened 13 years ago
Closed 13 years ago
crash in mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport
Categories
(Core Graveyard :: Widget: Android, defect)
Tracking
(firefox15+ fixed, firefox16+ verified, firefox17 fixed)
RESOLVED
FIXED
mozilla17
People
(Reporter: scoobidiver, Assigned: kats)
References
Details
(4 keywords, Whiteboard: [native-crash][startupcrash])
Crash Data
Attachments
(2 files, 1 obsolete file)
|
1.73 KB,
patch
|
kats
:
review+
kats
:
checkin+
|
Details | Diff | Splinter Review |
|
2.54 KB,
patch
|
snorp
:
review+
kats
:
checkin+
|
Details | Diff | Splinter Review |
There are only two crashes in 14.0b5 but there's a spike in startup crashes beginning from 16.0a1/20120606. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=a7a905fd70d5&tochange=6338a8988917
It's likely a regression from bug 758361 which was uplifted to Aurora.
Signature _JNIEnv::CallVoidMethod | mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport More Reports Search
UUID 4a54c3c6-09ec-4376-b96b-87aa02120608
Date Processed 2012-06-08 21:40:28
Uptime 28
Last Crash 3.7 weeks before submission
Install Age 9.2 minutes since version was first installed.
Install Time 2012-06-08 21:31:13
Product FennecAndroid
Version 16.0a1
Build ID 20120608030525
Release Channel nightly
OS Linux
OS Version 0.0.0 Linux 3.0.8+ #9 SMP PREEMPT Wed May 2 18:19:29 CEST 2012 armv7l
Build Architecture arm
Build Architecture Info
Crash Reason SIGSEGV
Crash Address 0x0
App Notes
AdapterVendorID: archos, AdapterDeviceID: ARCHOS 80G9.
AdapterDescription: 'Model: 'ARCHOS 80G9', Product: 'G9A80', Manufacturer: 'archos', Hardware: 'archos''.
archos ARCHOS 80G9
archos/G9A80/A80:4.0.3/MR1/20120420.190002:user/release-keys
EMCheckCompatibility True
Frame Module Signature Source
0 libdvm.so libdvm.so@0x57a70
1 libdvm.so libdvm.so@0x57a4f
2 libxul.so _JNIEnv::CallVoidMethod jni.h:631
3 libxul.so mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport widget/android/AndroidJavaWrappers.cpp:664
4 libxul.so mozilla::AndroidBridge::SetFirstPaintViewport widget/android/AndroidBridge.cpp:2047
5 libxul.so mozilla::layers::CompositorParent::SetFirstPaintViewport gfx/layers/ipc/CompositorParent.cpp:429
6 libxul.so mozilla::layers::CompositorParent::TransformShadowTree gfx/layers/ipc/CompositorParent.cpp:374
7 libxul.so mozilla::layers::CompositorParent::Composite gfx/layers/ipc/CompositorParent.cpp:260
8 libxul.so RunnableMethod<mozilla::layers::CompositorParent, void , Tuple0>::Run ipc/chromium/src/base/tuple.h:383
9 libxul.so MessageLoop::RunTask ipc/chromium/src/base/message_loop.cc:318
10 libxul.so MessageLoop::DeferOrRunPendingTask ipc/chromium/src/base/message_loop.cc:326
11 libxul.so MessageLoop::DoWork ipc/chromium/src/base/message_loop.cc:426
12 libxul.so base::MessagePumpDefault::Run ipc/chromium/src/base/message_pump_default.cc:23
13 libxul.so MessageLoop::RunInternal ipc/chromium/src/base/message_loop.cc:208
14 libxul.so MessageLoop::Run ipc/chromium/src/base/message_loop.cc:201
15 libxul.so base::Thread::ThreadMain ipc/chromium/src/base/thread.cc:156
16 libxul.so ThreadFunc ipc/chromium/src/base/platform_thread_posix.cc:31
17 libc.so libc.so@0x12c4e
18 libc.so libc.so@0x127a2
19 libEGL.so libEGL.so@0x23e82
More reports at:
https://crash-stats.mozilla.com/report/list?signature=_JNIEnv%3A%3ACallVoidMethod+|+mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport
https://crash-stats.mozilla.com/report/list?signature=_JNIEnv%3A%3ACallVoidMethod
https://crash-stats.mozilla.com/report/list?signature=JNI_CreateJavaVM+|+_JNIEnv%3A%3ACallVoidMethod+|+mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport
| Reporter | ||
Comment 1•13 years ago
|
||
As 15.0a2 is unaffected, it's not caused by bug 758361.
No longer blocks: 758361
Version: 15 Branch → 16 Branch
| Assignee | ||
Comment 2•13 years ago
|
||
Given that this crash seems like a race condition on startup, I would guess that this change was the one that caused the spike in 16.0:
d945ae198516 Mike Hommey — Bug 760152 - Start library decompression earlier. r=blassey
Just a guess, though.
Setting as top crash : 6th on nightly, does not seem to appear in aurora, or beta
status-firefox14:
--- → unaffected
status-firefox15:
--- → unaffected
status-firefox16:
--- → affected
Keywords: topcrash
| Assignee | ||
Comment 4•13 years ago
|
||
CC'ing :glandium. Does bug 760152 only do library decompression earlier? Does it do any other gecko startup earlier?
Comment 5•13 years ago
|
||
(In reply to Kartikaya Gupta (:kats) from comment #4)
> CC'ing :glandium. Does bug 760152 only do library decompression earlier?
> Does it do any other gecko startup earlier?
Gecko's actual initialization has not moved. Only decompression did.
| Assignee | ||
Comment 6•13 years ago
|
||
This crash is showing up a lot on tbpl; a lot of the android mochitest oranges are due to this. For example:
https://tbpl.mozilla.org/php/getParsedLog.php?id=12739494&tree=Firefox&full=1#error0
https://tbpl.mozilla.org/php/getParsedLog.php?id=12741230&tree=Firefox&full=1#error0
https://tbpl.mozilla.org/php/getParsedLog.php?id=12751669&tree=Firefox&full=1#error0
Comment 7•13 years ago
|
||
Comment 8•13 years ago
|
||
Whiteboard: [native-crash][startupcrash] → [native-crash][startupcrash][orange]
Comment 9•13 years ago
|
||
Comment 10•13 years ago
|
||
Comment 11•13 years ago
|
||
| Reporter | ||
Updated•13 years ago
|
Crash Signature: [@ _JNIEnv::CallVoidMethod | mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport]
[@ _JNIEnv::CallVoidMethod]
[@ JNI_CreateJavaVM | _JNIEnv::CallVoidMethod | mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport] → [@ _JNIEnv::CallVoidMethod | mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport ]
[@ _JNIEnv::CallVoidMethod ]
[@ JNI_CreateJavaVM | _JNIEnv::CallVoidMethod | mozilla::AndroidGeckoLayerClient::SetFirstPaintViewport ]
Comment 12•13 years ago
|
||
Comment 13•13 years ago
|
||
Comment 14•13 years ago
|
||
Comment 15•13 years ago
|
||
Comment 16•13 years ago
|
||
Comment 17•13 years ago
|
||
Comment 18•13 years ago
|
||
Comment 19•13 years ago
|
||
Updated•13 years ago
|
tracking-fennec: --- → ?
Comment 20•13 years ago
|
||
Comment 21•13 years ago
|
||
Comment 22•13 years ago
|
||
Comment 23•13 years ago
|
||
Updated•13 years ago
|
Assignee: nobody → bugmail.mozilla
| Assignee | ||
Comment 24•13 years ago
|
||
This should be sufficient to fix this crash, assuming the problem is in fact the GetJNIForThread call. I can kill the GetJNIForThread() more in a separate bug.
Attachment #641598 -
Flags: review?(blassey.bugs)
Comment 25•13 years ago
|
||
Comment on attachment 641598 [details] [diff] [review]
Add a GetJNIForCompositorThread and make the compositor JNI invocations use it
Review of attachment 641598 [details] [diff] [review]:
-----------------------------------------------------------------
::: widget/android/AndroidBridge.h
@@ +132,5 @@
> + static JNIEnv* GetJNIForCompositorThread() {
> + if (NS_LIKELY(sBridge)) {
> + if (sBridge->mCompositorThread) {
> + if ((void*)pthread_self() != sBridge->mCompositorThread) {
> + __android_log_print(ANDROID_LOG_ERROR, "AndroidBridge", "Non-compositor thread calling GetJNIForCompositorThread!");
this should probably be a NS_ABORT_IF_FALSE()
@@ +145,5 @@
> + MutexAutoLock lock(sBridge->mCompositorJNICreationMutex);
> +
> + if (sBridge->mCompositorThread) {
> + // this means that another thread executed this function between the time we started executing
> + // it and the time we acquired the mutex. fail.
this also means we have two threads that claim to be the compositor thread, which sounds like a disaster. Let's abort here too
Attachment #641598 -
Flags: review?(blassey.bugs) → review+
| Assignee | ||
Comment 26•13 years ago
|
||
Landed with NS_ABORT()s in the relevant places.
https://hg.mozilla.org/integration/mozilla-inbound/rev/083d36bafbc8
| Assignee | ||
Comment 27•13 years ago
|
||
Backed out on suspicion of making all the android talos tests go red. Nothing useful in the log though.
https://hg.mozilla.org/integration/mozilla-inbound/rev/f3215d2dc286
Comment 28•13 years ago
|
||
(In reply to comment #27)
> Backed out on suspicion of making all the android talos tests go red. Nothing
> useful in the log though.
>
> https://hg.mozilla.org/integration/mozilla-inbound/rev/f3215d2dc286
The backout seems to have fixed it.
Comment 29•13 years ago
|
||
Assignee: bugmail.mozilla → blassey.bugs
Attachment #641598 -
Attachment is obsolete: true
Attachment #642940 -
Flags: review?(bugmail.mozilla)
| Assignee | ||
Comment 30•13 years ago
|
||
Comment on attachment 642940 [details] [diff] [review]
patch to actually use tls
Review of attachment 642940 [details] [diff] [review]:
-----------------------------------------------------------------
r+ with changing the if (status < 0) to if (status) as well.
Attachment #642940 -
Flags: review?(bugmail.mozilla) → review+
Comment 31•13 years ago
|
||
Target Milestone: --- → mozilla17
Comment 32•13 years ago
|
||
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
| Reporter | ||
Comment 33•13 years ago
|
||
There are still crashes. See https://crash-stats.mozilla.com/report/list?version=FennecAndroid%3A17.0a1&signature=_JNIEnv%3A%3ACallVoidMethod%20|%20mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport and https://crash-stats.mozilla.com/report/list?version=FennecAndroid%3A17.0a1&signature=JNI_CreateJavaVM%20|%20_JNIEnv%3A%3ACallVoidMethod%20|%20mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
| Reporter | ||
Comment 34•13 years ago
|
||
14.0.1 and 15.0 Beta are affected.
| Reporter | ||
Comment 35•13 years ago
|
||
Let's restrict this bug to top crashers.
It's #3 top crasher in 15.0b2. The Beta regression range is:
http://hg.mozilla.org/releases/mozilla-beta/pushloghtml?fromchange=695042299ec9&tochange=166ba24e4239
Bug 760152 is the only bug that belongs to the two regression windows.
Blocks: 760152
status-firefox14:
affected → ---
Comment 36•13 years ago
|
||
Updated•13 years ago
|
Comment 37•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #5)
> (In reply to Kartikaya Gupta (:kats) from comment #4)
> > CC'ing :glandium. Does bug 760152 only do library decompression earlier?
> > Does it do any other gecko startup earlier?
>
> Gecko's actual initialization has not moved. Only decompression did.
actually, one thing moved: static initialization. Is there any static initialization that calls back into java?
Comment 38•13 years ago
|
||
Comment 39•13 years ago
|
||
| Assignee | ||
Comment 40•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #37)
> actually, one thing moved: static initialization. Is there any static
> initialization that calls back into java?
I'm not aware of any offhand, but it's quite possible that there is. Looking at the code again, I found something that might be the problem here. Patch coming in a sec.
| Assignee | ||
Comment 41•13 years ago
|
||
Pushed this to try at https://tbpl.mozilla.org/?tree=Try&rev=b40d1422b1cb, will request review if it comes back green.
Comment 42•13 years ago
|
||
I think I'd rather we just backed out bug 760152 on branches at this point, since it didn't have any obvious perf gains. Is that riskier?
| Assignee | ||
Comment 43•13 years ago
|
||
Doing the backout is probably the better option. I don't know if my patch will actually fix this or not yet. If it does we can re-land it with the fix.
Comment 44•13 years ago
|
||
(In reply to Kartikaya Gupta (:kats) from comment #43)
> Doing the backout is probably the better option. I don't know if my patch
> will actually fix this or not yet. If it does we can re-land it with the fix.
Thanks, sending over to glandium in that case.
Assignee: blassey.bugs → mh+mozilla
| Assignee | ||
Comment 45•13 years ago
|
||
Comment on attachment 647566 [details] [diff] [review]
Remove race condition from AndroidBridge initialization
New try run is at https://tbpl.mozilla.org/?tree=Try&rev=e3e0e2f7ec4f (old one failed to build because i used a busted inbound changeset as a base). It looks about as green as android usually does, so requesting review.
Attachment #647566 -
Flags: review?(snorp)
Updated•13 years ago
|
Attachment #647566 -
Flags: review?(snorp) → review+
| Assignee | ||
Comment 46•13 years ago
|
||
| Assignee | ||
Updated•13 years ago
|
Attachment #642940 -
Flags: checkin+
| Assignee | ||
Updated•13 years ago
|
Attachment #647566 -
Flags: checkin+
Comment 47•13 years ago
|
||
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
| Reporter | ||
Comment 48•13 years ago
|
||
Fixed in Aurora by the backout of bug 760152:
http://hg.mozilla.org/releases/mozilla-aurora/rev/e670dfc55dc8
| Reporter | ||
Comment 49•13 years ago
|
||
There are still crashes in the trunk.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment 50•13 years ago
|
||
(In reply to Scoobidiver from comment #49)
> There are still crashes in the trunk.
Do you have evidence of that? There are no crashes I can see with a build id from a build after the landing:
https://crash-stats.mozilla.com/report/list?range_value=7&range_unit=days&date=2012-08-02&signature=_JNIEnv%3A%3ACallVoidMethod%20|%20mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport&version=FennecAndroid%3A17.0a1
| Reporter | ||
Comment 51•13 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #50)
> Do you have evidence of that?
Of course: https://crash-stats.mozilla.com/query/query?product=FennecAndroid&version=FennecAndroid%3A17.0a1&range_value=1&range_unit=weeks&query_search=signature&query_type=contains&query=mozilla%3A%3AAndroidGeckoLayerClient%3A%3ASetFirstPaintViewport&reason=&do_query=1
| Reporter | ||
Comment 52•13 years ago
|
||
There are no crashes after 16.0a2/20120801, i.e. the backout of bug 760152.
https://crash-stats.mozilla.com/report/index/1fd05d7b-0f24-4063-9a84-ee7782120806
https://crash-stats.mozilla.com/report/index/e4673839-1a8f-4f07-ab22-956902120806
Note to clarify.. that's only 3 report that shows on/after 8/1 out of 2,838+634+1 Crash Reports. Only 1 report after 8/1.
Should we close this out? Or keep this and mark as a lower priority? People just need to update it seems.
| Reporter | ||
Comment 55•13 years ago
|
||
(In reply to Naoki Hirata :nhirata from comment #54)
> Should we close this out? Or keep this and mark as a lower priority?
No. When 17 Branch will go to Aurora, it will be #3 top crasher in 17.0a2 like it was in 16.0a2.
Comment 56•13 years ago
|
||
Comment 57•13 years ago
|
||
Bug 760152 was backed out of Beta 15 yesterday.
| Reporter | ||
Updated•13 years ago
|
Comment 58•13 years ago
|
||
Comment 59•13 years ago
|
||
Comment 60•13 years ago
|
||
Comment 61•13 years ago
|
||
Comment 62•13 years ago
|
||
Comment 63•13 years ago
|
||
Comment 64•13 years ago
|
||
(In reply to Naoki Hirata :nhirata from comment #54)
> Should we close this out? Or keep this and mark as a lower priority?
> People just need to update it seems.
No. This is still occurring on trunk.
Comment 65•13 years ago
|
||
Comment 66•13 years ago
|
||
Updated•13 years ago
|
status-firefox17:
--- → affected
Comment 67•13 years ago
|
||
Comment 68•13 years ago
|
||
Comment 69•13 years ago
|
||
Comment 70•13 years ago
|
||
Comment 71•13 years ago
|
||
Comment 72•13 years ago
|
||
Comment 73•13 years ago
|
||
Comment 74•13 years ago
|
||
Assigning to kats after speaking to glandium.
Assignee: mh+mozilla → bugmail.mozilla
Comment 75•13 years ago
|
||
Comment 76•13 years ago
|
||
Comment 77•13 years ago
|
||
Comment 78•13 years ago
|
||
It looks like all of the recent logs are from mochitest runs -- why would that be?
Comment 79•13 years ago
|
||
Comment 80•13 years ago
|
||
Comment 81•13 years ago
|
||
Comment 82•13 years ago
|
||
Comment 83•13 years ago
|
||
Comment 84•13 years ago
|
||
Comment 85•13 years ago
|
||
Comment 86•13 years ago
|
||
Comment 87•13 years ago
|
||
Comment 88•13 years ago
|
||
Comment 89•13 years ago
|
||
Comment 90•13 years ago
|
||
Comment 91•13 years ago
|
||
Comment 92•13 years ago
|
||
Comment 93•13 years ago
|
||
Comment 94•13 years ago
|
||
Comment 95•13 years ago
|
||
Comment 96•13 years ago
|
||
Comment 97•13 years ago
|
||
Comment 98•13 years ago
|
||
Comment 99•13 years ago
|
||
Comment 100•13 years ago
|
||
Comment 101•13 years ago
|
||
Kats, do you have any ideas for this?
| Assignee | ||
Comment 102•13 years ago
|
||
Unfortunately, no. I've looked through the code multiple times and don't see any other problems that could cause this.
Comment 103•13 years ago
|
||
Comment 104•13 years ago
|
||
Comment 105•13 years ago
|
||
Comment 106•13 years ago
|
||
Comment 107•13 years ago
|
||
Comment 108•13 years ago
|
||
Comment 109•13 years ago
|
||
| Assignee | ||
Comment 110•13 years ago
|
||
(In reply to Ed Morley [:edmorley] from comment #101)
> Kats, do you have any ideas for this?
Actually, we should back out bug 760152 on 17 as well. Backing that out seems to have fixed the error completely on 15 and 16, so it should help significantly on 17. Note that this error was happening even before bug 760152 landed (albeit at a very low crash volume) so I don't expect it to disappear completely, unless one of the other patches on this bug did actually fix something. (i.e. bug 760152 might have masked the original problem, which was fixed by one of the patches on this bug).
I posted a comment on bug 760152 requesting it be backed out on 17 as well.
Comment 111•13 years ago
|
||
Thank you :-)
Comment 112•13 years ago
|
||
Comment 113•13 years ago
|
||
Comment 114•13 years ago
|
||
Comment 115•13 years ago
|
||
Comment 116•13 years ago
|
||
Comment 117•13 years ago
|
||
Comment 118•13 years ago
|
||
Comment 119•13 years ago
|
||
Comment 120•13 years ago
|
||
| Assignee | ||
Comment 121•13 years ago
|
||
I pushed the backout of bug 760152 to inbound so hopefully this should stop happening as that propagates to the various trees. *fingers crossed*
Comment 122•13 years ago
|
||
Comment 123•13 years ago
|
||
Comment 124•13 years ago
|
||
Comment 125•13 years ago
|
||
| Reporter | ||
Updated•13 years ago
|
Comment 126•13 years ago
|
||
This has stopped after the build from the 16th, so the backout worked. Should we mark this bug fixed?
| Assignee | ||
Comment 127•13 years ago
|
||
Sounds good to me.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
Updated•12 years ago
|
Keywords: intermittent-failure
Updated•12 years ago
|
Whiteboard: [native-crash][startupcrash][orange] → [native-crash][startupcrash]
Updated•11 years ago
|
tracking-fennec: ? → ---
Updated•3 years ago
|
Product: Core → Core Graveyard
You need to log in
before you can comment on or make changes to this bug.
Description
•