Closed Bug 1077292 Opened 11 years ago Closed 10 years ago

[MTBF][B2G] Phone shutdown due to high power consumption and chopped charging.

Categories

(Firefox OS Graveyard :: Stability, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:-, tracking-b2g:+, b2g-v2.1 affected, b2g-v2.2 affected)

RESOLVED WORKSFORME
blocking-b2g -
tracking-b2g +
Tracking Status
b2g-v2.1 --- affected
b2g-v2.2 --- affected

People

(Reporter: wachen, Unassigned)

References

Details

(Whiteboard: [mtbf])

Attachments

(4 files)

In recent v2.1 or v2.0 builds on the v180 or v123 base image, we found that the phone can shut itself down after a few hours of MTBF testing. Also, if you leave the phone on a table with the power cable connected, it might shut down after a day or two.
Blocks: MTBF-meta
Blocks: MTBF-B2G
No longer blocks: MTBF-meta
I didn't get any result from 5 phones set up to just lie on the desk. I will do an MTBF run with adb logcat running continuously.
I was able to reproduce the issue with logcat before its shutdown. Here are the last few lines, and I will attach truncated log later.

D/skia (17368): START /proc/cpuinfo:
D/skia (17368): Processor : ARMv7 Processor rev 3 (v7l)
D/skia (17368): processor : 0
D/skia (17368): BogoMIPS : 38.40
D/skia (17368):
D/skia (17368): processor : 1
D/skia (17368): BogoMIPS : 38.40
D/skia (17368):
D/skia (17368): Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
D/skia (17368): CPU implementer : 0x41
D/skia (17368): CPU architecture: 7
D/skia (17368): CPU variant : 0x0
D/skia (17368): CPU part : 0xc07
D/skia (17368): CPU revision : 3
D/skia (17368):
D/skia (17368): Hardware : Qualcomm MSM8210
D/skia (17368): Revision : 0000
D/skia (17368): Serial : 0000000000000000
D/skia (17368):
D/skia (17368): END /proc/cpuinfo
D/skia (17368): Device supports ARM NEON instructions!
D/QSEECOMAPI: (17650): QSEECom_get_handle sb_length = 0x2000
D/QSEECOMAPI: (17650): App is not loaded in QSEE
E/QSEECOMAPI: (17650): Error::Cannot open the file /vendor/firmware/keymaster/keymaster.mdt
E/QSEECOMAPI: (17650): Error::Loading image failed with ret = -1
I/cat ( 231): <4>[61453.835489] QSEECOM: qseecom_release: data: released = false, type = 1, data = 0xc666bc00
D/QSEECOMAPI: (17650): QSEECom_get_handle sb_length = 0x2000
D/QSEECOMAPI: (17650): App is not loaded in QSEE
E/QSEECOMAPI: (17650): Error::Cannot open the file /firmware/image/keymaste.mdt
E/QSEECOMAPI: (17650): Error::Loading image failed with ret = -1
E/QCOMKeyMaster(17650): Loading keymaster app failied
E/keystore(17650): could not open keymaster device in keystore (Operation not permitted)
E/keystore(17650): keystore keymaster could not be initialized; exiting
I/cat ( 231): <4>[61453.849101] QSEECOM: qseecom_release: data: released = false, type = 1, data = 0xc666bc00
E/GeckoConsole(17368): Content JS LOG at app://system.gaiamobile.org/js/wallpaper_manager.js:346 in debug: [WallpaperManager] new wallpaper size already validated
E/GeckoConsole(17368): Content JS LOG at app://system.gaiamobile.org/js/wallpaper_manager.js:346 in debug: [WallpaperManager] publishing wallpaperchange event
I/GeckoDump(17368): XXX FIXME : Got a mozContentEvent: inputmethod-update-layouts
I/GeckoDump(17368): XXX FIXME : Got a mozContentEvent: inputmethod-update-layouts
I/Gecko (17368): -*- WifiWorker component: 'tethering.wifi.enabled' is now false
I/Gecko (17368): -*- WifiWorker component: No changes for SETTINGS_WIFI_TETHERING_ENABLED flag. Nothing to do.
I/PowerManagerService(17368): Call to virtual nsresult mozilla::dom::power::PowerManagerService::PowerOff(). The JS stack is:
I/PowerManagerService(17368): 0 sm_actualPowerOff(isReboot = false) ["app://system.gaiamobile.org/js/sleep_menu.js":441]
I/PowerManagerService(17368): this = [object Object]
I/PowerManagerService(17368): 1 nextAnimation(e = [object AnimationEvent]) ["app://system.gaiamobile.org/js/sleep_menu.js":422]
I/PowerManagerService(17368): this = [object HTMLDivElement]
I/PowerManagerService(17368):
E/GeckoConsole(17368): [JavaScript Error: "IndexedDB UnknownErr: IDBDatabase.cpp:626"]
I added some logs to the Gaia system app to understand how the power-off is started. I will check the results tomorrow.
Setting needinfo on Alive for more information.
Flags: needinfo?(alive)
According to attachment 8501490 [details], the shutdown is initiated from here: http://lxr.mozilla.org/gaia/source/apps/system/js/battery_manager.js#29 where the battery level is 0 and charging is false. As the device is USB connected, charging should at least be true. I'll take a look at this.
We need to put console.trace() in https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/sleep_menu.js#L446 to see what is triggering that power-off. Currently it can only be triggered by a long press of the power button followed by choosing the power-off option in the device menu.
Flags: needinfo?(alive)
(In reply to Alive Kuo [:alive][NEEDINFO!] from comment #8)
> We need to put console.trace() in
> https://github.com/mozilla-b2g/gaia/blob/master/apps/system/js/sleep_menu.js#L446
> to see who is doing that poweroff.
>
> But now it's only possible to be triggered by long press power button +
> click power off option in device menu.

Hmmm, it looks like "batteryshutdown", as described in comment 7, can also trigger the shutdown, so this might also be a Gecko issue.
GonkHal.cpp reads the battery level and charging state from the following files on Flame: /sys/class/power_supply/battery/capacity, /sys/class/power_supply/battery/status. I will add logs to check the readings, but basically this looks like a Gonk issue, as the values are passed to Gecko/Gaia directly.
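The two readings named above can be sanity-checked off-device once their contents have been captured as text (for example with adb shell cat). The following is a minimal sketch, not the Gecko or Gaia code: BatteryReading, parse_reading and would_trigger_shutdown are hypothetical names, the mapping of the status string to a charging flag is an assumption, and the shutdown condition (level 0 and not charging) is taken only from the description in this bug.

```python
from dataclasses import dataclass


@dataclass
class BatteryReading:
    level: int       # percent, contents of .../battery/capacity
    charging: bool   # derived from contents of .../battery/status


def parse_reading(capacity_text: str, status_text: str) -> BatteryReading:
    """Parse the raw text of the capacity and status files."""
    level = int(capacity_text.strip())
    # Assumption: "Charging" and "Full" both count as charging.
    charging = status_text.strip() in ("Charging", "Full")
    return BatteryReading(level=level, charging=charging)


def would_trigger_shutdown(reading: BatteryReading) -> bool:
    """Condition described in this bug: level 0 and not charging."""
    return reading.level == 0 and not reading.charging
```

Feeding it the values logged just before a shutdown shows whether the readings alone explain the power-off, or whether something above Gonk is involved.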
Attached file Logcat.e479da53
From the battery logs, the level goes down to 0 and, for some reason, the device stops charging even though it is USB connected, which then triggers the power-off. I have talked to Danny, and there are some cases where the device will stop charging, e.g., when the battery overheats, but that is not something under our control.
Flags: needinfo?(kchang)
Tapas, just curious whether you have any input here or have ever seen this issue on your side?
Flags: needinfo?(tkundu)
(In reply to bhavana bajaj [:bajaj] from comment #12)
> Tapas,
>
> Just curious if you had any input here or ever seen this issue on your side?

Could you please give us dmesg and logcat logs, plus the output of |adb shell cat /d/ion/heaps/system|, |adb shell /sys/class/kgsl/kgsl/page_alloc|, |adb shell b2g-info| and |adb shell procrank| when this happens? It would be great if you could also point us to your gaia/gecko SHA1s, branch, device, and the available memory on your device (see |adb shell cat /proc/meminfo|). If you can provide me with a good STR, then I can reproduce this issue on my device and conclude faster. Otherwise I need your help with logs :)
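The diagnostics requested above could be gathered in one pass with a small script. This is only a sketch under stated assumptions: the command table and collect() are hypothetical helpers, "adb" is assumed to be on PATH with a single device attached, the kgsl path is given in the comment without "cat" (adding it here is an assumption), and "adb logcat -d" and "adb shell dmesg" are used as the obvious way to grab the two logs.

```python
import subprocess

# Commands taken from the request above. Adding "cat" in front of the kgsl
# path is an assumption; the original comment lists the bare path.
DIAGNOSTIC_COMMANDS = {
    "dmesg": ["adb", "shell", "dmesg"],
    "logcat": ["adb", "logcat", "-d"],
    "ion_heaps": ["adb", "shell", "cat", "/d/ion/heaps/system"],
    "kgsl_page_alloc": ["adb", "shell", "cat", "/sys/class/kgsl/kgsl/page_alloc"],
    "b2g_info": ["adb", "shell", "b2g-info"],
    "procrank": ["adb", "shell", "procrank"],
    "meminfo": ["adb", "shell", "cat", "/proc/meminfo"],
}


def collect(run=subprocess.run) -> dict:
    """Run every diagnostic command and return its stdout keyed by name."""
    results = {}
    for name, argv in DIAGNOSTIC_COMMANDS.items():
        proc = run(argv, capture_output=True, text=True, timeout=60)
        results[name] = proc.stdout
    return results
```

The run parameter exists only so the function can be exercised without a device; in normal use collect() is called with no arguments and the returned dictionary is written out to files for attachment.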
Flags: needinfo?(tkundu) → needinfo?(bbajaj)
Walter, first, please retest to get the logs. Second, please use another battery to make sure the same situation does not happen. Thanks very much.
Flags: needinfo?(kchang) → needinfo?(wachen)
I will set it up today. However, it can't be a battery issue unless 10~20 batteries are bad. I use around 10~20 devices in every run, and it reproduces on different devices every time.
Attached file logcat20141013.zip
I got some results and am waiting for Ting-Yu's help.
Flags: needinfo?(wachen)
Flags: needinfo?(tchou)
Sorry, I didn't see comment 11; Ting-Yu already fetched the information. Wesly, please also help us with this issue.
Flags: needinfo?(wehuang)
Clear NI per comment 17.
Flags: needinfo?(tchou)
Summary: [MTBF][B2G] Phone shutdown without reason → [MTBF][B2G] Phone shutdown due to high power consumption and chopped charging.
Hi Youlong: The charging behavior we observed (comment #11) looks strange to me. Is that by design? Would you help check and clarify? Thank you.

@Walter: As discussed with Ken yesterday, for future MTBF tests please use at least v184, as this is the first base that includes the QCT CS release. Thank you.
Flags: needinfo?(youlong.jiang)
Flags: needinfo?(wehuang)
Flags: needinfo?(wachen)
See Also: → 1079810
Flags: needinfo?(bbajaj)
(In reply to Wesly Huang from comment #19)
> Hi Youlong:
>
> The charging behavior (comment#11) we observed looks strange to me, is that
> your design? Would you help check and clarify? Thank you.
>
> @Walter:
>
> As discussed with Ken yesterday, for mtbf test in the future please use at
> least v184 as this is the 1st base that includes QCT CS release. Thank you.

Hi all, I am Xiaohui Ma, a T2M engineer working on the power management subsystem. Analysis of the log from comment 11:
1. Battery capacity decreased continuously, so the battery was discharging. A PC USB port usually supplies about 500mA. If the phone consumes more than 500mA, the battery discharges and its capacity can drop to 0. You can read the battery current at runtime from /sys/class/power_supply/battery/current_now; a positive value means discharging and a negative value means charging.
2. The charge state toggles when capacity is below 3%, and the lower the capacity, the more frequently it toggles. The low battery voltage may be making the voltage of the charging IC module unstable.
Thanks.
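The sign convention for current_now described above is easy to get backwards when reading logs, so it may help to encode it once. A minimal sketch, assuming the convention holds exactly as stated in this comment for this device; the function names are hypothetical, and no unit is assumed for the raw value.

```python
def parse_current_now(text: str) -> int:
    """Parse the raw contents of the current_now file into an integer."""
    return int(text.strip())


def classify_battery_current(current_now: int) -> str:
    """Map a current_now reading to a direction.

    Convention as stated in this bug (an assumption for other devices):
    positive means the battery is discharging, negative means charging.
    """
    if current_now > 0:
        return "discharging"
    if current_now < 0:
        return "charging"
    return "idle"
```

A reading that classifies as "discharging" while the status file still says the device is charging would point at the load exceeding the supply rather than at a charger fault.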
Based on comment 20, Walter, I guess the 1A USB connector you mentioned should be able to keep the device from stopping charging.
(In reply to Ting-Yu Chou [:ting] from comment #21)
> Based on comment 20, Walter, I guess the 1A USB connector you mentioned
> should be able to prevent device from stop charging.

Hi Wesly Huang and Ting-Yu Chou: How can this problem be reproduced? Please tell me your test environment, software version and so on. If you hit this problem, can you give me more kernel logs about charging, and a list of battery current values while charging? My explanation 1 was meant to say that the phone consumes energy from the battery and the charger at the same time, which leads to the battery capacity decreasing. Explanation 2 may be related to Qualcomm.
You can follow the steps on https://developer.mozilla.org/en-US/Firefox_OS/Platform/Automated_testing/MTBF_tests for running MTBF on the device. This is the command I use:

$ MTBF_CONF=conf/flame_v210.json MTBF_TIME=1d python mtbf.py --testvars=testvars.json --address=localhost:2828 tests/test_dummy_case.py

The test was done on a Flame 319MB with base image v180 and the latest v2.1.
Depends on: 1079810
See Also: 1079810
Whiteboard: [mtbf]
It is reproducible on v180 and v184 (as of today). It's harder to reproduce on v184, but still reproducible.
Flags: needinfo?(wachen)
Flags: needinfo?(wachen)
Hi Xiaohui: Do you have an update after trying Ting-Yu's procedure? What's the status now? Have you got any comment from QCT?

BTW:
1. A phone consuming more than 500mA is quite a lot, I think.
2. It's also strange that charging becomes unstable when the battery level is <3%.
Flags: needinfo?(xiaohui.ma)
(In reply to Wesly Huang from comment #25)
> Hi Xiaohui:
>
> Do you have update after trying Ting-Yu's procedure? What's the status now?
> Or have you get any comment from QCT?
>
> BTW for
>
> 1. if a phone consume more than 500mA it's quite huge I think
> 2. it's also strange that charging become unstable if battery level is <3%.

I am setting up the MTBF test environment to reproduce this bug, but there are some problems because of the network.
Flags: needinfo?(xiaohui.ma)
Thanks for the update, Xiaohui. Please keep us posted on the progress, thank you.
Flags: needinfo?(xiaohui.ma)
Please let us know if you have any difficulties setting up the MTBF environment.
Hi Wesly Huang: Can you test in your environment to speed up solving this? We have run into some problems setting up the test environment. We need the following information over the whole test process:
battery status: /sys/class/power_supply/battery/status
battery current: /sys/class/power_supply/battery/current_now (positive value: discharging, negative value: charging)
Then we can analyse the log. Thanks.
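One way to capture the two values requested above for the whole test run is a small host-side poller. This is a sketch under assumptions, not part of the MTBF harness: read_sysfs, sample and log_samples are hypothetical helpers, "adb" is assumed to be on PATH with one device attached, and the 10 second default interval is arbitrary.

```python
import subprocess
import time

BATTERY_DIR = "/sys/class/power_supply/battery"


def read_sysfs(name, run=subprocess.run):
    """Read one battery attribute from the device over adb."""
    proc = run(["adb", "shell", "cat", f"{BATTERY_DIR}/{name}"],
               capture_output=True, text=True, timeout=30)
    return proc.stdout.strip()


def sample(run=subprocess.run, now=time.time):
    """Return one (timestamp, status, current_now) record."""
    return (now(), read_sysfs("status", run), read_sysfs("current_now", run))


def log_samples(path, count, interval_s=10, run=subprocess.run,
                sleep=time.sleep):
    """Append `count` samples to a CSV file, one every `interval_s` seconds."""
    with open(path, "a") as out:
        for _ in range(count):
            ts, status, current = sample(run)
            out.write(f"{ts:.0f},{status},{current}\n")
            out.flush()  # keep the file current in case the phone dies
            sleep(interval_s)
```

For a one-day run at the default interval, log_samples("battery.csv", 8640) would cover the test window; each row is flushed immediately so the last readings before a shutdown are not lost.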
Flags: needinfo?(xiaohui.ma)
Hi Walter: Would you either help with the logs, or support Xiaohui with their test environment setup? Thanks.

@Xiaohui: would you tell us more about the problems with your environment setup? Thanks.
Flags: needinfo?(xiaohui.ma)
Hi Wesly: I sent you and Walter an email about the Mozilla test environment. Please help investigate it. Thanks.
Flags: needinfo?(xiaohui.ma)
Hi Wesly and Walter: I have asked my colleague to deal with the test environment problem. Can you supply the logs requested in comment 29? Many thanks.
Dears - about this issue, we can only ensure that sleep power consumption and the charging logic are normal. If it is related to a third-party app or was introduced by the automated test module, we won't follow up, because that is not under our control. tks.
Flags: needinfo?(youlong.jiang)
(In reply to youlong.jiang from comment #33)
> Dears -
>
> about this issue, we just can ensure status of sleep power consumption and
> charging logic is normal. if related to third app or auto test module
> import, we won't follow this situation, in the cause this is not under our
> control.
>
> tks.

We'll arrange a stress test for charging with the battery below 3% and check whether a stopped-charging state occurs, then proceed based on the test result. tks.
We found an interesting result using Mozilla's powertool. If we keep charging the phone, the charging current goes down gradually. We estimate that it drops by about 1mA every 10~20 seconds.
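To put the estimate above in perspective, the rate can be turned into a time window for a given drop. This is arithmetic only, and it assumes the decline stays linear at the observed 10 to 20 seconds per mA, which the measurement does not establish; the function names are hypothetical.

```python
def seconds_to_drop(delta_ma: float, seconds_per_ma: float) -> float:
    """Time for the charging rate to fall by delta_ma at a constant pace."""
    return delta_ma * seconds_per_ma


def drop_window(delta_ma: float, fast_s_per_ma: float = 10,
                slow_s_per_ma: float = 20) -> tuple:
    """Return (fastest, slowest) time in seconds for a drop of delta_ma."""
    return (seconds_to_drop(delta_ma, fast_s_per_ma),
            seconds_to_drop(delta_ma, slow_s_per_ma))
```

Under that linear assumption, drop_window(100) gives 1000 to 2000 seconds, so a 100mA loss would take roughly 17 to 33 minutes.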
Hi, what's the status of this bug now?
Flags: needinfo?(youlong.jiang)
(In reply to Walter Chen[:ypwalter][:wachen] from comment #36)
> Hi, what's the status of this bug now?

hi walter - could you pls share the v2.1 image version with us? We'll try to repro this problem in the MTBF env. tks.
Flags: needinfo?(youlong.jiang)
Flags: needinfo?(wachen)
hi Walter: they have now set up the env. to repro this issue; would you help share a SW build for them to do it? Tks.
Flags: needinfo?(wachen)
Hi Wesly, I think any v2.1 image should be fine. The date of the build is not the main concern. I heard from one QA that partners can build v2.1 images themselves.

Hi youlong, please build or use any current v2.1 build (engineering build, please).
Flags: needinfo?(wachen)
(In reply to Walter Chen[:ypwalter][:wachen] from comment #39)
> Hi, Wesly,
>
> I think any v2.1 image should be fine. The date of the build is not the main
> concern. I heard from one QA that partners can build v2.1 image themselves.
>
> Hi, youlong,
>
> Please build or use any current v2.1 build (engineer build plz)

hi wesly, walter - we don't have the v2.1 code and build branch. Could you pls share the image with us directly, if you have no other concerns? tks.
Flags: needinfo?(wehuang)
Hi Youlong, pls follow the mail we sent you for instructions to get & flash SW, then let us know if any problem/support needed. Thanks.
Flags: needinfo?(wehuang) → needinfo?(youlong.jiang)
(In reply to Wesly Huang from comment #41)
> Hi Youlong, pls follow the mail we sent you for instructions to get & flash
> SW, then let us know if any problem/support needed. Thanks.

hi wesly - as mentioned in the mail, after flashing the related gaia/gecko version and running mtbf in our env, we get a crash error, so pls help to analyse. tks.
Flags: needinfo?(youlong.jiang)
Please redo the test with the previously mentioned settings. Also: in test_dummy_case.py, please reduce time.sleep(120) to time.sleep(15) or time.sleep(20) so that the phone does not get a chance to charge.
There are phones dying every day. From v2.0 to v2.2, we have seen a lot of similar behavior. I am going to keep this bug open all the way. Also, I will try to report the status of our lab daily.
blocking-b2g: --- → 2.2?
9 phones (out of 40) ran out of battery/power over the weekend.
Hi guys,
I'd like to share some experience with this kind of shutdown. There are several possible reasons for a device to shut down:
1. The basic one is charging current lower than power consumption. Some cases:
- a. On Flame, the idle screen-on current is 300mA, so I think it's easy to exceed 500mA during an MTBF test. The device will eventually shut down if it is charged only through a USB cable.
- b. With an AC charger, the charging current can still fall below power consumption. When the battery temperature rises above a threshold, the thermal daemon is enabled and limits the charging current. Beyond that, charging is stopped.
2. Other cases that trigger protection mechanisms:
- a. If the CPU/battery temperature is higher than a threshold, shutdown is triggered.
- b. A sudden large current draw (2A or more) drops the battery voltage and causes the device to shut down.
3. Maybe there is something wrong in the battery algorithm. In this case, we might need more battery logs to analyse.
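Case 1 above is a simple current budget, which can be sketched as follows. The figures used in the usage note (500mA from USB, 300mA idle with the screen on) come from this thread; the battery charge passed in is a caller-supplied assumption, and the function names are hypothetical. Note that the sign here is the opposite of current_now: positive means the battery is gaining charge.

```python
from typing import Optional


def net_charge_ma(supply_ma: float, load_ma: float) -> float:
    """Positive: battery gains charge. Negative: battery drains."""
    return supply_ma - load_ma


def hours_until_empty(charge_mah: float, supply_ma: float,
                      load_ma: float) -> Optional[float]:
    """Hours until the battery is empty, or None if it is not draining."""
    net = net_charge_ma(supply_ma, load_ma)
    if net >= 0:
        return None
    return charge_mah / -net
```

With a 500mA supply and the 300mA idle load, net_charge_ma returns 200, so the battery still charges. If the load rises to 650mA, a battery holding a hypothetical 1800mAh would last hours_until_empty(1800, 500, 650) = 12.0 hours despite being plugged in.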
Hi Danny, thanks. I have some feedback for you.
1. The phone just sat there and died.
2. We have lots of logs that you can go back through to see if any are useful.
I think Mozilla's (our) responsibility is done. I believe that TCL should be taking care of this. BTW, 10 phones were found dead again today without running tests. They just sat there and died after we had run the tests.
[Tracking Requested - why for this release]: [Blocking Requested - why for this release]:
3 phones were found dead today (ran out of battery). STR: leave the phone sitting there without doing anything.
Flags: needinfo?(bbajaj)
Another 5 phones died just sitting there again today.
I've emailed sku/danny, hoping they are able to look..
Flags: needinfo?(bbajaj)
After discussion with Walter, we suspect it might be related to charging/cable detection in the driver. We will measure the power consumption and check the battery status when the issue happens. Moreover, Walter will help set up an MTBF test with a Nexus 4/5 USB cable.
Flags: needinfo?(wachen)
Helping Danny with the power tests. Also, another 10 phones died today.
Flags: needinfo?(wachen)
Testing was blocked by bug 1122119. We will try to restart it in an older build tomorrow. However, We do need some tools for ram dump. Does the partner have such tools?
Flags: needinfo?(youlong.jiang)
Flags: needinfo?(bbajaj)
More information on this test.
During the test we hit a white screen issue, which is likely one of the bugs causing the power-off. With a power monitor, we saw power consumption of ~260mA, and it was stable. I think the device hangs on a white screen with adb disconnected, then consumes 260mA until the battery runs out.

I have two questions:
1. When the system hangs, the watchdog should trigger and try to restart the device, but I didn't see that. Does the partner disable the watchdog by default?
2. In this case we cannot get more information through adb, so we might need to enable the ramdump mechanism to get kernel messages for analysis. Could we enable it?
There were 4 phones dead yesterday, and 9 phones today. There must be some issues...
Wesly, can you please help follow up on this based on Danny's recent comments?
Flags: needinfo?(bbajaj) → needinfo?(wehuang)
Hi Youlong: As discussed on the phone, would you help check this issue again and answer the questions in comment #46, comment #54, and comment #55? After that I might arrange a phone discussion with your team and Danny if needed.
Flags: needinfo?(wehuang)
(In reply to Danny Liang [:dliang] from comment #46)
> Hi Guys,
> I'd like to share some experience on this kind of shutdown case. There are
> some possible reason to let device shutdown, I listed as following:
> 1. The basic one is charging current less than power consumption, there are
> some cases:
> - a. In flame, the idle screen on current is 300mA, so I think it's easy
> to exceed 500mA in MTBF test. Device will be shutdown if you just connect
> USB cable for charging.
> - b. With AC charger, there is chance to cause charging current less than
> power consumption. When battery temperature is higher then a threshold,
> thermal daemon will be enable and try to limit the charging current.
> Further, charging will be stopped.
> 2. Other cases to enable protected mechanism:
> - a. temperature CPU/battery is higher than a threshold, shutdown will be
> triggered.
> - b. With suddenly big current (2A or more), battery voltage will be
> dropped and cause device shutdown.
> 3. Maybe there is something wrong on battery algorithm. In this case, we
> might need more battery log to analyse.

hi Danny -
For your point #1, we've tested in the MTBF env and did not find the charging current lower than power consumption, so we cannot reproduce this issue. I also think we could double-check the current in idle mode.
For point #2, if shutdown were triggered by high CPU/battery temperature, in my opinion the battery should not be empty when you check it, yet there was no power left.
For point #3, I didn't get what you mean by "battery algorithm". How do we check its status, and what result would it lead to? I could loop in our engineer to check this point with you.
tks.
Flags: needinfo?(youlong.jiang)
(In reply to Walter Chen[:ypwalter][:wachen] from comment #54)
> Testing was blocked by bug 1122119. We will try to restart it in an older
> build tomorrow.
>
> However, We do need some tools for ram dump. Does the partner have such
> tools?

hi Walter -
Usually we use the QPST Configuration tool for memory dumps, but that's not feasible for you because of license concerns.
But I think you could try to get the dump info from the SD card. This function is supported in the Qcom base; it just needs some switches enabled. I can check with my guys about this.
tks.
(In reply to Danny Liang [:dliang] from comment #55)
> Update more information of this test.
> In the test, we met white screen issue and it should be one of bugs to cause
> power off.
> With power monitor, we saw the power consumption is ~260mA and it's stable.
> I think it's device hang in white screen without adb connected, then consume
> 260mA till out of battery.
>
> I have two questions as following:
> 1. When system hang, watchdog should be trigger and try to restart device,
> but I didn't see it. Does partner disable watchdog as default?
> 2. In this case, we cannot get more information by adb, so we might need
> enable ramdump mechanism to get kernel message for analysis, could we enable
> it?

hi Danny -
On the msm8x10 platform, if the AP system crashes, the watchdog kicks in and reboots the phone. From your description, I think it may have entered memory dump mode. That leads to point #2, how to get this info; pls refer to #60.
tks.
blocking - need reliable MTBF statistics
blocking-b2g: 2.2? → 2.2+
(In reply to youlong.jiang from comment #60)
> (In reply to Walter Chen[:ypwalter][:wachen] from comment #54)
> > Testing was blocked by bug 1122119. We will try to restart it in an older
> > build tomorrow.
> >
> > However, We do need some tools for ram dump. Does the partner have such
> > tools?
>
> hi Walter -
>
> usually, we'll use QPST Configuration tool for memory dump, but it's not
> feasible for you with license concern.
>
> But I think you could try to get dump info from SD card. this function is
> supported in Qcom base, just need to enable some switches. I can help to
> check with my guys for this problem.
>
> tks.

hi Danny -
Pls take the following steps to dump the info to the SD card:

#adb shell
#cd storage/sdcard1
#mkdir ram_dump
#touch rdcookie.txt

The log is saved in storage/sdcard1/1/.
File list: CODERAM.BIN, DDRCS0.BIN, LPM.BIN, OCIMEM.BIN, PMIC_RTC.BIN, load.cmm, DATARAM.BIN, DDRCS1.BIN, MSGRAM.BIN, PMIC_PON.BIN, RST_STAT.BIN, rawdump.bin

Pls feel free to contact me if you have any problems. tks.
blocking-b2g: 2.2+ → 2.2?
(In reply to youlong.jiang from comment #60)
> (In reply to Walter Chen[:ypwalter][:wachen] from comment #54)
> > Testing was blocked by bug 1122119. We will try to restart it in an older
> > build tomorrow.
> >
> > However, We do need some tools for ram dump. Does the partner have such
> > tools?
>
> hi Walter -
>
> usually, we'll use QPST Configuration tool for memory dump, but it's not
> feasible for you with license concern.
>
> But I think you could try to get dump info from SD card. this function is
> supported in Qcom base, just need to enable some switches. I can help to
> check with my guys for this problem.
>
> tks.

Dears - sorry for the misleading information. QPST should not be limited by license. I think you can download it from the official qcom website. Pls give it a try as well. tks.
Walter, are you still experiencing this? It looks like T2M is not able to repro this and needs more data from our side if we are able to repro. Can you follow up? I am hesitant to block on this until we have more information, as this is not specific to 2.2.
blocking-b2g: 2.2? → -
Flags: needinfo?(wachen)
I see few or no devices with such issues now, but that doesn't mean the problems just went away without any fixes... Also, we are not able to run tests long enough for the phones to hit this issue.
Flags: needinfo?(wachen)
Triage: not reproducible, remove from radar.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → WORKSFORME