XO-1.75/Kernel/Issues: Difference between revisions
Revision as of 21:13, 9 January 2012
hang, SET_BLOCK_COUNT
This is a hang with a repeating message:
mmcblk0: error -110 sending SET_BLOCK_COUNT command, response 0x0, card status 0x700
<trac>11137</trac> <trac>11525</trac> <trac>11528</trac>
hang, dpm_resume
This is a hang shortly after the serial console message:
mmp2_pm_finish: Enable audio island
<trac>11542</trac> and others.
The time period may vary.
Only seen with olpc-runin-tests in aggressive mode: a ten second awake time, followed by a ten second rtcwake suspend. Easily reproduced.
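The aggressive cycle described above can be sketched as follows. This is a minimal illustration, not the olpc-runin-tests implementation. The function names are hypothetical, and using `rtcwake -m mem -s 10` to produce the ten second suspend is an assumption. The default is a dry run that only records the commands, since a real run needs root and actually suspends the machine.

```python
import subprocess
import time


def rtcwake_command(suspend_s=10):
    """Build an rtcwake command that suspends to RAM for suspend_s seconds.

    Assumption: the suspend is driven by util-linux rtcwake in "mem" mode.
    """
    return ["rtcwake", "-m", "mem", "-s", str(suspend_s)]


def run_cycles(count, awake_s=10, suspend_s=10, dry_run=True):
    """Alternate awake and suspended periods; return the commands issued.

    With dry_run=True nothing is executed, the commands are only collected.
    """
    issued = []
    for _ in range(count):
        cmd = rtcwake_command(suspend_s)
        issued.append(cmd)
        if not dry_run:
            time.sleep(awake_s)              # stay awake before suspending
            subprocess.run(cmd, check=True)  # returns once the RTC alarm wakes the unit
    return issued
```

Calling `run_cycles(3)` returns three identical command lists without touching the hardware; passing `dry_run=False` would perform the cycles.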
Instances from previous testing:
- http://dev.laptop.org/~quozl/z/1RiHT6.txt (75.904ms)
- http://dev.laptop.org/~greenfeld/temp/175bringup/os23-ff199462/screenlog.3-resumehang2ecmod.bz2 (80.341ms)
- http://dev.laptop.org/~greenfeld/temp/175bringup/os23-26f404e/screenlog.6-resumehang.bz2 (2.683ms)
- http://dev.laptop.org/~greenfeld/temp/175bringup/os23-26f404e/screenlog.3-ecfail1.bz2 (72.616ms)
- http://dev.laptop.org/~quozl/z/1Rhvmt.txt
- http://dev.laptop.org/~quozl/z/1RhvpV.txt
- http://dev.laptop.org/~quozl/z/1RiK9g.txt (proving it does not require the body of the function)
Analysis.
Tracing the point of hang by gradually adding printk has shown the problem occurs in dpm_resume(), before dpm_complete() is called by dpm_resume_end().
A patch adding a 60ms mdelay() per device to the list processing in dpm_resume() has shown these results:
- C1 SKU201 after 27 suspend cycles,
- C1 SKU202 after 9 suspend cycles,
- C1 SKU202 after 123 suspend cycles,
- B1 after 211 suspend cycles,
- B4 after 137 suspend cycles.
- after 450 suspend cycles.
- a B1 that survived 1083 suspend cycles over 12 hours without hitting this.
- B1 that slowed right down in dpm_resume().
Note that the elapsed time shows no correlation with the previous instances of the issue, which suggests that the hang depends on the operations being performed rather than on the time at which they are performed.
James is using the watchdog on six units to capture as many instances of the hang as possible, to help identify the next step in the analysis. Results:
- 2701 suspend tests, 2571 successful resumes, 130 hung resumes, most frequent last messages before hang were mmcblk mmc2:0001: resume (49), input input9: type resume (32), input input8: type resume (4),
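A tally like the one above could be produced from captured console logs along these lines. This is a sketch only: it assumes each hung resume is available as a list of console lines, and the helper names are hypothetical. The only real figures used are the summary counts quoted above.

```python
from collections import Counter


def last_message(log_lines):
    """Return the last non-empty console line of one captured log, or None."""
    for line in reversed(log_lines):
        if line.strip():
            return line.strip()
    return None


def tally_last_messages(logs):
    """Count how often each message was the last one seen before a hang."""
    messages = (last_message(log) for log in logs)
    return Counter(m for m in messages if m is not None)


def hang_rate(tests, hung):
    """Fraction of suspend tests that ended in a hung resume."""
    return hung / tests


# Summary counts taken from the results above.
TESTS, SUCCESSFUL, HUNG = 2701, 2571, 130
assert SUCCESSFUL + HUNG == TESTS
print(f"hang rate: {hang_rate(TESTS, HUNG):.1%}")  # hang rate: 4.8%
```

`tally_last_messages(logs).most_common(3)` would then list the three most frequent last messages, in the same order as the summary.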