Bundle project: Difference between revisions
(4 intermediate revisions by the same user not shown)
#redirect [[Bundle (activity)]]

== guidelines for bundling ==
* '''scripts''': capture and publish any scripts used to create the bundle. Others who notice missing metadata or other elements will want to rebuild the bundle with different parameters. Careful documentation of these scripts is also the easiest way to guarantee compatibility with attribution and other licensing requirements.
* '''licenses''': note licensing and attribution as granularly as the original creators did: for every image in a collection of images, every article in a set of articles, every definition in a set of definitions. If there is a simple way to pass on the aggregate history of collaborative works, include it; otherwise include a link to the source history for the work (or a script with options for extracting history, latest author, date, and similar fields in the format of the original archive).
* '''other metadata''': see the [[#metadata]] section below. Capture the original URL or source, and as many of the intervening authors, uploaders, and upload dates as possible, to help identify the provenance of a work accurately.
* check source archives for APIs for gathering such data. Many sites, including modern MediaWiki sites, have an API that directly provides most of the information you need without [[#screenscraping]].
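
As an illustration of the API route, here is a minimal Python sketch that asks a MediaWiki site for the uploader, upload date, source URL, and license of a file, then flattens the answer into one provenance record per file. It assumes the Wikimedia Commons endpoint and the <code>imageinfo</code> query module; the helper names (<code>build_imageinfo_url</code>, <code>provenance_records</code>) and the sample response are made up for this example and do not come from any bundle script. Check the field names against the site you are bundling from before relying on them.

```python
from urllib.parse import urlencode

# Assumed endpoint: Wikimedia Commons. Other MediaWiki sites have their own api.php.
API_URL = "https://commons.wikimedia.org/w/api.php"


def build_imageinfo_url(titles):
    """Build a query URL requesting source URL, uploader, date and license metadata."""
    params = {
        "action": "query",
        "format": "json",
        "prop": "imageinfo",
        "iiprop": "url|user|timestamp|extmetadata",
        "titles": "|".join(titles),
    }
    return API_URL + "?" + urlencode(params)


def provenance_records(response):
    """Flatten a decoded imageinfo response into one record per file revision."""
    records = []
    for page in response.get("query", {}).get("pages", {}).values():
        for info in page.get("imageinfo", []):
            meta = info.get("extmetadata", {})
            records.append({
                "title": page.get("title"),
                "source_url": info.get("descriptionurl"),
                "uploader": info.get("user"),
                "upload_date": info.get("timestamp"),
                "license": meta.get("LicenseShortName", {}).get("value"),
            })
    return records


# Hand-written sample in the general shape of an API answer (not real data).
SAMPLE_RESPONSE = {
    "query": {
        "pages": {
            "12345": {
                "title": "File:Example.jpg",
                "imageinfo": [{
                    "user": "ExampleUploader",
                    "timestamp": "2008-02-15T21:31:00Z",
                    "descriptionurl": "https://commons.wikimedia.org/wiki/File:Example.jpg",
                    "extmetadata": {"LicenseShortName": {"value": "CC BY-SA 3.0"}},
                }],
            }
        }
    }
}

if __name__ == "__main__":
    print(build_imageinfo_url(["File:Example.jpg"]))
    for record in provenance_records(SAMPLE_RESPONSE):
        print(record)
```

The sketch makes no network request: fetch the printed URL with whatever HTTP client the bundle script already uses, decode the JSON, and pass the result to <code>provenance_records</code>. Keeping one record per file matches the guideline above of noting licensing as granularly as the source does.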

== screenscraping ==
''...and regular expressions''

'''extracting licenses from Wikimedia Commons'''
:%s/<a href="[^h][^>]*>\([^<]*\)<\/a>/\1/gc

:%s/<hr\_p\{-}<p>.*<\/p>//c (removes the excess GFDL template text)
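
For scripted use, the two Vim substitutions above can be approximated in Python. This is a sketch rather than an exact port: the function names are invented for the example, Python replaces every match without the interactive confirmation that Vim's <code>c</code> flag gives, and <code>[\s\S]</code> stands in for Vim's <code>\_p</code> (printable characters plus line breaks), so it can match slightly more than the original.

```python
import re

# Approximates  :%s/<a href="[^h][^>]*>\([^<]*\)<\/a>/\1/gc
# Links whose href does not start with "h" (site-relative links rather than
# http ones) are replaced by their link text.
RELATIVE_LINK = re.compile(r'<a href="[^h][^>]*>([^<]*)</a>')

# Approximates  :%s/<hr\_p\{-}<p>.*<\/p>//c
# Removes everything from an <hr up to the first following <p>, plus the rest
# of that line through its last </p>.
GFDL_EXCESS = re.compile(r'<hr[\s\S]*?<p>.*</p>')


def unlink_relative(html):
    """Replace relative links with their plain link text."""
    return RELATIVE_LINK.sub(r'\1', html)


def strip_gfdl_excess(html):
    """Drop the block that runs from an <hr> through the next <p>...</p> line."""
    return GFDL_EXCESS.sub('', html)


if __name__ == "__main__":
    page = (
        '<p>Photo by <a href="/wiki/User:Example">Example</a>, '
        'see <a href="http://example.org">site</a></p>'
    )
    print(unlink_relative(page))
```

Because nothing asks for confirmation here, run the functions on a copy of the scraped pages and compare the output with the originals before trusting the extracted license text.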
Latest revision as of 21:31, 15 February 2008
Redirect to: