Literacy Project/2012-05-02
Notes
Angela C, David N, David L, Rachel, Richard, Sam
- Timestamps
- Review of timestamp problems: 4 sets of stamps, (s, ms) x (1970, 2012), i.e. seconds or milliseconds, dated 1970 or 2012. Still unclear what's causing the reset; Richard leaves on the 10th.
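A minimal sketch of how the four sets of stamps could be tallied, assuming each record's unit (s or ms) is known from the field it came from. The names `classify` and `count_sets` and the 2012-01-01 cutoff are hypothetical, chosen for illustration only; they are not the project's actual filter.

```python
from collections import Counter
from datetime import datetime, timezone

# Assumed cutoff: anything before 2012-01-01 UTC is treated as a "1970" (reset) stamp.
CUTOFF = datetime(2012, 1, 1, tzinfo=timezone.utc).timestamp()

def classify(raw, unit):
    """Return (unit, era) for a raw timestamp whose unit is 's' or 'ms'."""
    if unit not in ("s", "ms"):
        raise ValueError("unit must be 's' or 'ms'")
    # Normalize to seconds so both units are compared against one cutoff.
    seconds = raw / 1000 if unit == "ms" else raw
    era = "2012" if seconds >= CUTOFF else "1970"
    return unit, era

def count_sets(stamps):
    """Tally (unit, era) pairs over an iterable of (raw, unit) tuples."""
    return Counter(classify(raw, unit) for raw, unit in stamps)
```

Note that magnitude alone cannot reliably tell a small millisecond value from a seconds value after a reset, which is why this sketch takes the unit as an input rather than guessing it.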
- Process of gathering data
RS - I would walk each directory, but copying that way takes forever (3-4G). I rsynced the whole thing instead, so the archive is a duplicate of what's on raw.
- No processing yet on week 6; just copied zip files into the raw directory.
- Data variance
DL - When I ran a filter over timestamps... the results seemed odd for week 6. Only 5% had bad timestamps.
- And week 4 had an order of magnitude fewer total records than the previous week
Rachel - b/t weeks 5 and 6, we had a large increase (65%) in photos taken.
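The week-over-week comparisons above could be checked with something like the following. This is only a sketch: `percent_change` and `flag_drops` are hypothetical helper names, and the factor of 10 stands in for "an order of magnitude".

```python
def percent_change(previous, current):
    """Percent change from previous to current (positive means an increase)."""
    if previous == 0:
        raise ValueError("previous count must be non-zero")
    return (current - previous) * 100.0 / previous

def flag_drops(counts, factor=10.0):
    """Return weeks whose record count fell by at least `factor` vs. the week before.

    counts maps week number -> total records for that week.
    """
    weeks = sorted(counts)
    flagged = []
    for prev, cur in zip(weeks, weeks[1:]):
        if counts[cur] * factor <= counts[prev]:
            flagged.append(cur)
    return flagged
```

For example, `percent_change(100, 165)` gives a 65% increase, and `flag_drops` would list any week with ten times fewer records than the one before it. The counts passed in are whatever the per-week totals turn out to be; none are assumed here.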
- Blank camera files
DL - still the same % of black files (3/4); it still does this when the screen is covered or it's dark out.
RS - fewer pictures would cut the total data size in half and make other crunching faster.
DN - I could cut the rate in half.
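One way the share of black files might be measured is by mean brightness, sketched below. It assumes each frame has already been decoded into a flat list of 0-255 grayscale values; the decoding step, the helper names, and the threshold of 10 are all assumptions for illustration.

```python
def is_black(pixels, threshold=10):
    """True if the mean grayscale value (0-255) is below the threshold."""
    pixels = list(pixels)
    if not pixels:
        return True  # an empty frame is treated as blank
    return sum(pixels) / len(pixels) < threshold

def black_fraction(frames, threshold=10):
    """Fraction of frames judged black; 0.0 when there are no frames."""
    frames = list(frames)
    if not frames:
        return 0.0
    return sum(is_black(f, threshold) for f in frames) / len(frames)
```

A mean-brightness test is crude: it would also flag legitimately dark scenes, so the threshold would need tuning against real camera files.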
- Similar deployments
Working with GA researchers (they are using wifi uploads, no SD cards). They may not want all raw data to sit on worldliteracy.media servers, and want to process it there.
- Richard's Ethiopia visit
I will bring an AP and 3G modem with me.
- The AP may have a serious power draw... they generally don't have a notion of suspend or low-power mode. It would sit there and draw power all day long.
- Wonchi has some spotty 3G. Wolonchete has more? Will double-check. Data rates aren't too high (by comparison).
- Audio transcription
- We are Turking transcription of 30-sec segments. WPI is doing this w/ triple counting.
TinkrBook data analysis
Some data summaries: