前面四章介绍了ANR分类,分析流程,trace信息分析,Event log分析,Logcat分析,这些都是ANR分析的基础知识,以及分析问题的关键信息,熟练掌握以上信息才能让ANR分析尽可能又快又准确。

这章开始分析一些我在开发中遇到的一些ANR实例,通过总结这些实例的分析流程以及一些关键信息,帮助我们对ANR有更多的了解。

这里我们再重复贴一次分析流程:

ANR分类

首先看trace信息:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
----- pid 2758 at 2024-05-30 12:15:49.504759099+0800 -----
Cmd line: com.android.launcher
Build fingerprint: 'xxxxxxxx:user/release-keys'
ABI: 'arm64'
Build type: optimized
suspend all histogram: Sum: 1.340s 99% C.I. 2.663us-6123.885us Avg: 170.696us Max: 23128us
DALVIK THREADS (49):
"main" prio=5 tid=1 Native
| group="main" sCount=1 ucsCount=0 flags=1 obj=0x72f1b068 self=0xb400007397c3bc00
| sysTid=2758 nice=0 cgrp=foreground sched=1073741824/0 handle=0x743b0d74f8
| state=S schedstat=( 6996957292061 2882192477787 10876260 ) utm=558104 stm=141590 core=2 HZ=100
| stack=0x7fd636b000-0x7fd636d000 stackSize=8188KB
| held mutexes=
native: #00 pc 000eed68 /apex/com.android.runtime/lib64/bionic/libc.so (__epoll_pwait+8) (BuildId: a424ed74a6d8a9ba99f74b96671f25ec)
native: #01 pc 00018b5c /system/lib64/libutils.so (android::Looper::pollInner+188) (BuildId: 53f21814a364f4abc16a65a88e752c9b)
native: #02 pc 00018a3c /system/lib64/libutils.so (android::Looper::pollOnce+124) (BuildId: 53f21814a364f4abc16a65a88e752c9b)
native: #03 pc 001869cc /system/lib64/libandroid_runtime.so (android::android_os_MessageQueue_nativePollOnce+44) (BuildId: 97903b567fd59762f825ddea6b65b6ab)
at android.os.MessageQueue.nativePollOnce(Native method)
at android.os.MessageQueue.next(MessageQueue.java:335)
at android.os.Looper.loopOnce(Looper.java:168)
at android.os.Looper.loop(Looper.java:346)
at android.app.ActivityThread.main(ActivityThread.java:8344)
at java.lang.reflect.Method.invoke(Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run(RuntimeInit.java:594)
at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:1067)

第1行是ANR发生的进程、时间,

第2行是发生ANR的应用包名,

第8行状态是Native,

第18行在调用MessageQueue.nativePollOnce方法时发生了ANR,从这个堆栈信息看主线程阻塞,处于等待消息的状态,

根据这些信息还不能查找到ANR的原因,我们需要更多的信息,

然后我们看event log,关键字am_anr,这个在event log中有介绍,这个是anr的类型,

1
05-30 12:15:49.544  1000  1946  8368 I am_anr  : [0,2758,com.android.launcher,751550021,Input dispatching timed out (Application does not have a focused window)]

从这里看到属于Input dispatch ANR类型,从括号看到是该类型下的no focus window类型。

根据流程图,我们需要判断应用(launcher)是否走完onResume,如果没有则是应用问题,如果已经加载完成那么还要判断应用是否绘制完成,

onResume的判断关键字是:wm_on_resume_called(R版本之前为 am_on_resume_called)– event log中查找

是否绘制完成判断关键字:VRI[xxx]: reportDrawFinished – logcat中,例如:VRI[QuickstepLauncher]: reportDrawFinished,Launcher绘制完成,

回到实例,从logcat查找到在ANR之前已经绘制完成

1
05-30 12:15:16.210 10100  2758  2758 D VRI[QuickstepLauncher]: reportDrawFinished

从时间看这个在ANR发生之前,但是如何判定这个之后是否有刷新操作,需要根据log在ANR发生之前已经reportDrawFinished之后是否有Relayout Window的log,如果没有说明执行完了绘制流程,

那么分析到此说明非Launcher问题,我们还有继续找原因,

logcat信息,也就是AnrManager信息,查看ANR发生前的负载情况:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
行 2345605: 05-30 12:15:49.548  1000  1946  8368 I AnrManager: startAnrDump
行 2345606: 05-30 12:15:49.550 1000 1946 8368 I AnrManager: isANRFlowSkipped() AnrFlow = 0
行 2345607: 05-30 12:15:49.550 1000 1946 8368 I AnrManager: enableTraceLog: false
行 2345619: 05-30 12:15:49.571 1000 1946 8368 I AnrManager: dumpAnrDebugInfo begin: AnrDumpRecord{ Input dispatching timed out (Application does not have a focused window) ProcessRecord{d6f31e8 2758:com.android.launcher/u0a100} IsCompleted:false IsCancelled:false }, onlyDumpSelf = false, isSilentANR = true
行 2345620: 05-30 12:15:49.571 1000 1946 8368 I AnrManager: dumpAnrDebugInfoLocked: AnrDumpRecord{ Input dispatching timed out (Application does not have a focused window) ProcessRecord{d6f31e8 2758:com.android.launcher/u0a100} IsCompleted:false IsCancelled:false }, onlyDumpSelf = false, isSilentANR = true
行 2345621: 05-30 12:15:49.573 1000 1946 8370 I AnrManager: /dev/binderfs/binder_logs/timeout_log isn't exist
行 2345622: 05-30 12:15:49.585 1000 1946 8368 I AnrManager: dumpStackTraces begin!
行 2345642: 05-30 12:15:49.861 1000 1946 8368 I AnrManager: dumpStackTraces end!
行 2345643: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ANR in com.android.launcher (com.android.launcher/.uioverrides.QuickstepLauncher), time=501120607
行 2345644: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: Reason: Input dispatching timed out (Application does not have a focused window)
行 2345645: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: Load: 0.0 / 0.0 / 0.0
行 2345646: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- Output from /proc/pressure/memory -----
行 2345647: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: some avg10=0.00 avg60=0.17 avg300=0.30 total=8843096625
行 2345648: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: full avg10=0.00 avg60=0.06 avg300=0.08 total=3657944521
行 2345649: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- End output from /proc/pressure/memory -----
行 2345650: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- Output from /proc/pressure/cpu -----
行 2345651: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: some avg10=1.29 avg60=11.97 avg300=15.55 total=65843288544
行 2345652: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- End output from /proc/pressure/cpu -----
行 2345653: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- Output from /proc/pressure/io -----
行 2345654: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: some avg10=0.00 avg60=0.11 avg300=0.26 total=10711845649
行 2345655: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: full avg10=0.00 avg60=0.00 avg300=0.10 total=4573022827
行 2345656: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: ----- End output from /proc/pressure/io -----
行 2345657: 05-30 12:15:49.866 1000 1946 8368 I AnrManager:
行 2345658: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: Android time :[2024-05-30 12:15:49.86] [501120.938]
行 2345659: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: CPU usage from 46930ms to -1ms ago (2024-05-30 12:15:02.611 to 2024-05-30 12:15:49.542):
行 2345660: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 16% 1946/system_server: 11% user + 5.1% kernel / faults: 29595 minor 243 major
行 2345661: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 13% 1305/surfaceflinger: 8% user + 5.1% kernel / faults: 3356 minor 24 major
行 2345662: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 5.3% 1383/adbd: 1.4% user + 3.9% kernel / faults: 258 minor
行 2345663: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 4.6% 447/logd: 0.7% user + 3.8% kernel / faults: 1050 minor 9 major
行 2345664: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 4.1% 11427/com.github.uiautomator: 3.4% user + 0.7% kernel / faults: 9598 minor 264 major
行 2345665: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 4% 1265/android.hardware.audio.service.mediatek: 2.8% user + 1.1% kernel / faults: 2378 minor 5 major
行 2345666: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 3.5% 2384/com.android.systemui: 2.5% user + 1% kernel / faults: 3583 minor 101 major
行 2345667: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 3.1% 1272/android.hardware.graphics.composer@2.1-service: 1.3% user + 1.7% kernel / faults: 744 minor
行 2345668: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 2.7% 2758/com.android.launcher: 1.8% user + 0.8% kernel / faults: 1542 minor 194 major
行 2345669: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 2.4% 31765/com.google.android.gms.persistent: 1.5% user + 0.9% kernel / faults: 1790 minor 255 major
行 2345670: 05-30 12:15:49.866 1000 1946 8368 I AnrManager: 2.3% 25811/android.process.acore: 1.1% user + 1.1% kernel / faults: 959 minor 547 major
...
行 2345797: 05-30 12:15:49.867 1000 1946 8368 I AnrManager: 15% TOTAL: 8.5% user + 5.6% kernel + 0% iowait + 0.6% irq + 0.1% softirq
行 2345798: 05-30 12:15:49.867 1000 1946 8368 I AnrManager: CPU usage from 501120607ms to 501120607ms ago (1970-01-01 08:00:00.000 to 1970-01-01 08:00:00.000) with 0% awake:
行 2345799: 05-30 12:15:49.867 1000 1946 8368 I AnrManager: 0% TOTAL: 0% user + 0% kernel
行 2345800: 05-30 12:15:49.867 1000 1946 8368 I AnrManager: dumpAnrDebugInfo end: AnrDumpRecord{ Input dispatching timed out (Application does not have a focused window) ProcessRecord{d6f31e8 2758:com.android.launcher/u0a100} IsCompleted:true IsCancelled:false }, onlyDumpSelf = false , isSilentANR = true
行 2345801: 05-30 12:15:49.868 1000 1946 8368 I AnrManager: addErrorToDropBox app = ProcessRecord{d6f31e8 2758:com.android.launcher/u0a100} processName = com.android.launcher activityShortComponentName = com.android.launcher/.uioverrides.QuickstepLauncher parentShortComponentName = com.android.launcher/.uioverrides.QuickstepLauncher parentProcess = ProcessRecord{d6f31e8 2758:com.android.launcher/u0a100} annotation = Input dispatching timed out (Application does not have a focused window) mTracesFile = /data/anr/anr_2024-05-30-12-15-49-586
行 2345802: 05-30 12:15:49.869 1000 1946 8368 I AnrManager: controller = null

由于该log是我从logcat中过滤出来的,因此会带有行字。

我们看到AnrManager信息中

1
行 2345797: 05-30 12:15:49.867  1000  1946  8368 I AnrManager: 15% TOTAL: 8.5% user + 5.6% kernel + 0% iowait + 0.6% irq + 0.1% softirq

整体负载(15%远低于95%)是正常的,因此负载没有问题

我们只能再次回到event log,分析应用焦点切换过程中有什么问题

1
2
3
4
5
05-30 12:15:25.131  1000  1946  1995 I am_proc_start: [0,8341,10150,com.google.android.dialer,next-top-activity,{com.google.android.dialer/com.google.android.dialer.extensions.GoogleDialtactsActivity}]
05-30 12:15:25.138 1000 1946 2126 I input_focus: [Focus leaving 21d0029 com.android.launcher/com.android.launcher.uioverrides.QuickstepLauncher (server),reason=NO_WINDOW]
05-30 12:15:35.143 1000 1946 1994 I am_process_start_timeout: [0,8341,10150,com.google.android.dialer]
05-30 12:15:35.153 1000 1946 1994 I am_kill : [0,8341,com.google.android.dialer,-10000,start timeout]
05-30 12:15:49.544 1000 1946 8368 I am_anr : [0,2758,com.android.launcher,751550021,Input dispatching timed out (Application does not have a focused window)]

第1行:启动dialer应用,

第2行:开始切换焦点,离开Launcher窗口,正常来说应该是切换到启动的dialer应用窗口上面,

第3行:从log中发现,dialer启动超时了,这个时间是12:15:35,比12:15:25晚了10s,也就是10s中dialer也没有起来,我们知道Input Dispatch发生ANR的时间为5s,这个远超过了5s,因此焦点本应该切换到dialer应用窗口上,由于切换失败,导致ANR发生,

第4行,dialer应用被杀,原因是启动超时,

第5行:出现ANR的log,类型Input dispatching time out,原因是没有一个有焦点的窗口。

要解决该问题,就要分析dialer的启动过程以及在启动过程中系统的流程,找到启动超时的原因,这里不再分析。

通过该实例我们结合分析流程以及log信息对一个为Input dispatching time out(Application does not have a focused window)类型的ANR进行了分析,以帮助我们对上面的分析流程进行消化,以及对log信息进行熟悉,只有熟悉这些流程才能比较快的分析出ANR原因。