PageCollector

PageCollector is a command-line script that fetches pages and rewrites their links so that the pages form a stand-alone bundle. It is currently hosted at http://svn.colorstudy.com/home/ianb/PageCollector/trunk and can be installed with easy_install http://svn.colorstudy.com/home/ianb/PageCollector/trunk on systems with a C compiler; it does not currently install directly on the laptop. The plan is to move it into a laptop.org git repository.
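To illustrate what rewriting links for a stand-alone bundle involves, here is a minimal sketch that computes the relative link from one saved file to another. The function name relative_link and the example paths are hypothetical and are not taken from PageCollector's own code.

```python
import posixpath


def relative_link(from_file, to_file):
    """Return the link to write into from_file so that it points at to_file.

    Both arguments are bundle-relative paths using forward slashes.
    Hypothetical helper for illustration; not PageCollector's API.
    """
    # Links are resolved relative to the directory of the page that contains them.
    start = posixpath.dirname(from_file) or "."
    return posixpath.relpath(to_file, start)


if __name__ == "__main__":
    print(relative_link("wiki.laptop.org/go/Developers",
                        "wiki.laptop.org/images/logo.png"))
```

Because every link is relative, the bundle keeps working when the whole directory is moved or copied elsewhere.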

The way URLs are mapped to filenames is pluggable. Further extension points can be added if they prove useful, for example for creating Content Bundles.
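As a rough idea of what a URL-to-filename mapping might look like, the sketch below places each page under a directory named after its host. The function name url_to_filename and the index.html default are assumptions made for illustration; PageCollector's actual plug-in interface may differ.

```python
import posixpath
from urllib.parse import urlsplit, unquote


def url_to_filename(url, default_name="index.html"):
    """Map a URL to a bundle-relative path rooted at the URL's host.

    Hypothetical mapper for illustration; not PageCollector's API.
    """
    parts = urlsplit(url)
    path = unquote(parts.path)
    # Drop empty, "." and ".." segments so the result stays inside the bundle.
    segments = [s for s in path.split("/") if s not in ("", ".", "..")]
    # A URL that names a directory (or nothing at all) needs a file name.
    if not segments or path.endswith("/"):
        segments.append(default_name)
    return posixpath.join(parts.netloc, *segments)


if __name__ == "__main__":
    print(url_to_filename("http://wiki.laptop.org/go/Category:Developers"))
```

A different mapper with the same signature, for instance one that flattens every page into a single directory, could be swapped in without touching the fetching logic.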

Basic usage looks something like: pagecollector http://wiki.laptop.org/go/Category:Developers. This fetches every page linked from the Developers category page and writes the pages into a wiki.laptop.org/ directory.

-- Ian Bicking