PDF collection management
Description of a project for managing collections of PDFs and book scans from archives.
- Make it easy for children to download high-quality content, e.g. digital library resources
- Make it easy for schools to distribute reading lists that link to specific resources
- Make it easy for children to share books or other resources, and to post comments about the resources
- Tie in tightly with metadata systems
- "Feeds"
  - Contain metadata and data/blobs
  - "Upstream" feeds: school syllabus; Google Book Search
  - "Peer" feeds (see "Share" below)
- Resource examples
  - PDFs (books / other documents)
  - Wikipedia pages, or other web pages, for offline reading
    - Management of recursive fetching
    - Browser integration: queuing of links for later download when reading offline
  - Images/video
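
As an illustration of the feed model above, here is a minimal sketch of a feed that carries metadata plus a pointer to each blob. The class names (`Resource`, `Feed`), the field names, and the example URLs are assumptions made for this sketch; the notes do not specify a schema.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(frozen=True)
class Resource:
    """One item in a feed: metadata plus a pointer to the data/blob."""
    resource_id: str
    title: str
    media_type: str   # e.g. "application/pdf", "text/html", "image/png"
    blob_url: str     # hypothetical location of the blob
    metadata: Dict[str, str] = field(default_factory=dict)


@dataclass
class Feed:
    """A named list of resources; `kind` is "upstream" or "peer"."""
    name: str
    kind: str
    resources: List[Resource] = field(default_factory=list)

    def add(self, resource: Resource) -> None:
        self.resources.append(resource)

    def by_media_type(self, media_type: str) -> List[Resource]:
        """Return the resources of one type, e.g. all PDFs."""
        return [r for r in self.resources if r.media_type == media_type]


# Usage: an upstream feed (a school syllabus) with two kinds of resource.
syllabus = Feed(name="Grade 5 reading list", kind="upstream")
syllabus.add(Resource("r1", "A scanned book", "application/pdf",
                      "http://example.org/blobs/r1.pdf",
                      {"category": "fiction"}))
syllabus.add(Resource("r2", "An encyclopedia page", "text/html",
                      "http://example.org/blobs/r2.html"))
pdfs = syllabus.by_media_type("application/pdf")
```

Keeping the blob behind a URL rather than inside the feed entry means a client can browse the metadata without downloading the data.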
- Browse
  - Metadata browser (browse by category etc.)
  - Google Book Search and other web-enabled browsers that allow previewing content without downloading it all
- Search
  - Structured: metadata search
  - Unstructured: keyword search
- Download
  - Download resources from a feed
  - Support resume, parallel downloads through a simple download manager, etc.
  - Server-side support for download of limited page ranges for large PDFs? (e.g. fetch 100 pages at a time, creating a smaller PDF)
- Share
  - Management and synchronization of resources
  - RSS/Atom/"playlists" with attached blob resources
  - Build on libraries for service discovery and file-sharing
  - Limited total storage space => need to be able to swap resources in and out like iTunes / Synchropated / Conduit
  - Create own feeds and re-publish resources to the feed
  - Allow kids to add their own "soft" metadata to their own feeds, kept separate from the actual resource metadata: personalized summary; opinions/comments/essays; star rating; categorization/tags/keywords; illustrations etc.
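
Two of the capabilities above can be sketched briefly: splitting a large PDF into page ranges for piecewise download, and swapping resources out when storage runs low. This is only a sketch under stated assumptions. The names `page_ranges` and `LocalStore` are hypothetical, and least-recently-used eviction is one possible policy chosen for the example, not one the notes prescribe.

```python
from collections import OrderedDict
from typing import Iterator, List, Tuple


def page_ranges(total_pages: int, chunk_size: int = 100) -> Iterator[Tuple[int, int]]:
    """Yield inclusive, 1-based (first, last) page ranges covering a document."""
    if total_pages < 0 or chunk_size < 1:
        raise ValueError("total_pages must be >= 0 and chunk_size >= 1")
    for first in range(1, total_pages + 1, chunk_size):
        yield first, min(first + chunk_size - 1, total_pages)


class LocalStore:
    """Tracks downloaded resources and evicts the least recently used ones."""

    def __init__(self, capacity_bytes: int) -> None:
        self.capacity_bytes = capacity_bytes
        # Maps resource id -> size; order runs from least to most recently used.
        self._sizes: "OrderedDict[str, int]" = OrderedDict()

    @property
    def used_bytes(self) -> int:
        return sum(self._sizes.values())

    def touch(self, resource_id: str) -> None:
        """Mark a stored resource as recently used, e.g. when it is opened."""
        self._sizes.move_to_end(resource_id)

    def add(self, resource_id: str, size_bytes: int) -> List[str]:
        """Record a new download and return the ids evicted to make room."""
        if size_bytes > self.capacity_bytes:
            raise ValueError("resource is larger than the total storage")
        self._sizes.pop(resource_id, None)  # re-adding replaces the old entry
        evicted: List[str] = []
        while self.used_bytes + size_bytes > self.capacity_bytes:
            old_id, _ = self._sizes.popitem(last=False)
            evicted.append(old_id)
        self._sizes[resource_id] = size_bytes
        return evicted


# Usage: a 250-page scan fetched 100 pages at a time.
chunks = list(page_ranges(250))

# Usage: a 100-byte store; "a" is opened again, so "b" is evicted first.
store = LocalStore(capacity_bytes=100)
store.add("a", 40)
store.add("b", 40)
store.touch("a")
evicted = store.add("c", 40)
```

An evicted resource would keep its feed entry and metadata, so it can be downloaded again later.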
- Simple, driven by capabilities above
- Add a book API to the Google GData APIs?
- Only highlight public domain / CC-BY content, but don't build any sort of content controls into the client (that's not the right place for it).
- Strong attention to proper attribution of the original author[s] and publisher[s]. As such, primary metadata (metadata from a primary upstream source) will not be editable in the client. "Soft" metadata, e.g. comments, user categorization etc., can be added but is kept separate from primary metadata.
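
The split between primary and "soft" metadata could be enforced in the client's data model. The sketch below is one possible approach, with hypothetical names (`CatalogEntry`, `SoftMetadata`) and hypothetical metadata keys (`author`, `publisher`): primary metadata is exposed through a read-only view, while soft metadata lives in a separate, editable object.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import List, Mapping, Optional


@dataclass
class SoftMetadata:
    """User-added metadata, stored apart from the primary record."""
    summary: str = ""
    comments: List[str] = field(default_factory=list)
    star_rating: Optional[int] = None
    tags: List[str] = field(default_factory=list)


class CatalogEntry:
    """Pairs read-only primary metadata with editable soft metadata."""

    def __init__(self, primary: Mapping[str, str]) -> None:
        # Copy the upstream record, then wrap it in a read-only view so
        # client code cannot alter attribution by accident.
        self._primary = MappingProxyType(dict(primary))
        self.soft = SoftMetadata()

    @property
    def primary(self) -> Mapping[str, str]:
        return self._primary

    def attribution(self) -> str:
        """Build a credit line from primary metadata only."""
        author = self._primary.get("author", "unknown author")
        publisher = self._primary.get("publisher", "unknown publisher")
        return f"{author} ({publisher})"


# Usage: soft metadata is editable, primary metadata is not.
entry = CatalogEntry({"title": "Example Book",
                      "author": "A. Author",
                      "publisher": "Example Press"})
entry.soft.tags.append("favourite")
entry.soft.star_rating = 5

try:
    entry.primary["author"] = "Someone Else"
    primary_was_editable = True
except TypeError:
    primary_was_editable = False
```

Because the credit line is built only from the primary record, user comments and tags can never overwrite the original author or publisher.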