Bundle project: Difference between revisions
#redirect [[Bundle (activity)]]

== guidelines for bundling ==

* '''scripts''': capture and publish any scripts used to create the bundle. Others who notice missing metadata or other elements will want to rebuild the bundle with different parameters. Documenting these scripts carefully is also the easiest way to ensure that attribution and other licensing requirements are met.
* '''licenses''': note licensing and attribution as granularly as the original creators did: for every image in a collection of images, every article in a set of articles, every definition in a set of definitions. If there is a simple way to pass on the aggregate history of collaborative works, include it; otherwise include a link to the source history for the work (or a script with options for extracting history, latest author, date, and similar fields in the format of the original archive).
* '''other metadata''': see the [[#metadata]] section below. Capture the original URL or source, and as many of the intervening authors, uploaders, and upload dates as possible, to help identify the provenance of a work accurately.
* Check source archives for APIs for gathering such data. Many sites, including modern MediaWiki sites, have an API that directly provides most of the information you need without [[#screenscraping]].
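
As a rough illustration of the API route, the sketch below builds a query URL for the MediaWiki <code>imageinfo</code> module, which can return the uploader, upload timestamp, file URL, and license metadata for a file. The helper name <code>imageinfo_url</code>, the default Wikimedia Commons endpoint, and the example file title are assumptions made for illustration; titles containing characters such as <code>&</code> or <code>?</code> would need proper URL encoding, which this sketch does not attempt.

```shell
# Hypothetical helper (not part of any existing bundle script):
# print a MediaWiki API URL that asks for provenance metadata about one file.
#   $1 = page title, e.g. "File:Example.jpg"
#   $2 = API endpoint (optional; defaults to Wikimedia Commons, an assumption)
imageinfo_url() {
    api="${2:-https://commons.wikimedia.org/w/api.php}"
    # MediaWiki treats underscores and spaces in titles as equivalent.
    title=$(printf '%s' "$1" | tr ' ' '_')
    # %%7C is a literal, URL-encoded "|" separating the requested properties.
    printf '%s?action=query&format=json&prop=imageinfo&iiprop=url%%7Cuser%%7Ctimestamp%%7Cextmetadata&titles=%s\n' "$api" "$title"
}
```

A possible use is <code>curl -s "$(imageinfo_url 'File:Example.jpg')" > Example.jpg.json</code>, keeping the JSON response next to the file in the bundle so the attribution travels with it.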

== screenscraping ==

''...and regular expressions''

'''extracting licenses from Wikimedia Commons'''

:%s/<a href="[^h][^>]*>\([^<]*\)<\/a>/\1/gc

:%s/<hr\_p\{-}<p>.*<\/p>//c (removes the excess GFDL template text)

vim -c "%s///g" -c wq <file>

for a in *.txt; do mv "$a" "${a%.txt}.baz"; done (or, to strip a leading "proto-" instead: mv "$a" "${a#proto-}")
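
For batch runs it may be easier to apply the first substitution without opening an editor. The sketch below swaps in <code>sed</code> for vim and reuses the same basic regular expression: it replaces every link whose <code>href</code> does not start with <code>h</code> (that is, relative links rather than <code>http</code> ones) with its link text. The function name <code>strip_internal_links</code> is made up for this example, and the interactive confirm flag (<code>c</code>) is dropped because <code>sed</code> has no equivalent.

```shell
# Hypothetical filter: read HTML on stdin, write it to stdout with
# relative links (href not starting with "h") reduced to their link text.
# Absolute http/https links are left untouched.
strip_internal_links() {
    sed -e 's/<a href="[^h][^>]*>\([^<]*\)<\/a>/\1/g'
}
```

It can be combined with a loop like the one above, for example <code>for a in *.html; do strip_internal_links < "$a" > "${a%.html}.txt"; done</code>, where the file extensions are placeholders.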

== topics and scope ==

Try to pick a topic that can be covered elegantly in a compact bundle. Most laptop bundles should be under 10 MB in size. If you think you have a topic that can't possibly be covered this way, consider covering a smaller scope, the same scope with less depth, or the same topic at a different level of abstraction. Larger collections (up to 1 GB in size) can be packaged for a [[#school|school library]].
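
A quick way to check a bundle against these size targets is to measure its directory before packaging. The sketch below is one possible check, assuming the bundle lives in a single directory; the helper name <code>bundle_fits</code> is hypothetical, and the limit is given in kilobytes, so 10240 corresponds to the 10 MB laptop target.

```shell
# Hypothetical helper: succeed (exit status 0) if directory $1
# occupies at most $2 kilobytes on disk, as reported by du.
bundle_fits() {
    size_kb=$(du -sk "$1" | awk '{print $1}')
    [ "$size_kb" -le "$2" ]
}
```

For example, <code>bundle_fits ./my-bundle 10240 || echo "bundle is over the laptop size target"</code>, where <code>./my-bundle</code> is a placeholder path. Note that <code>du</code> reports disk usage, which can differ slightly from the size of the final compressed archive.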
Latest revision as of 21:31, 15 February 2008