Hyperopia

From OLPC

Revision as of 06:56, 28 April 2011

Hyperopia is a planned wikieditor framework for editing a subset of a wiki while offline. It will support synchronizing with the source wikis when online again: pushing any local changes and/or pulling new updates from them.

Related bugs

Fixing the wikibrowse toolchain so creating new wikislices works
<trac>10510</trac> - make wikipedia.xo work on F11 and F14
<trac>10526</trac> - sync mwlib with the latest version upstream.


Default format

In theory, hyperopia could work with multiple formats. The current format being used is that of Wikibrowse. Documents are stored in wikimarkup to simplify updates and changes. Initial support is provided for MediaWiki, including templates and math markup.


Updates

Updates will be posted to the source wiki using three-way diffs to simplify the process. When this seems too complicated, the update can be posted to a new page, and a link to the diff between it and the latest revision posted to the article talk page.

'Complicated' is a customizable concept: in conservative cases it can mean 'when an intervening edit has occurred'; at the other end of the spectrum, 'when a merge conflict cannot be reasonably resolved'.
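The customizable threshold described above could be sketched roughly as follows. This is only an illustration, not part of any existing Hyperopia code: the names `Policy`, `PendingUpdate`, `is_complicated` and `choose_action` are hypothetical, and the sketch assumes the three-way merge itself has already been attempted elsewhere and has reported whether it left conflicts.

```python
from dataclasses import dataclass
from enum import Enum


class Policy(Enum):
    """Hypothetical settings for what counts as 'complicated'."""
    CONSERVATIVE = "conservative"  # any intervening edit is too complicated
    PERMISSIVE = "permissive"      # only an unresolved merge conflict is


@dataclass
class PendingUpdate:
    base_revision: int    # revision the offline copy was taken from
    latest_revision: int  # newest revision on the source wiki
    merge_conflict: bool  # True if the three-way merge left conflicts


def is_complicated(update: PendingUpdate, policy: Policy) -> bool:
    """Apply the chosen policy to one pending update."""
    intervening_edit = update.latest_revision != update.base_revision
    if policy is Policy.CONSERVATIVE:
        return intervening_edit
    return update.merge_conflict


def choose_action(update: PendingUpdate, policy: Policy) -> str:
    """Decide where the update should be posted."""
    if is_complicated(update, policy):
        # Post to a new page and link the diff from the article talk page.
        return "post to new page"
    return "post to article"
```

For example, `choose_action(PendingUpdate(10, 12, False), Policy.CONSERVATIVE)` selects the new-page route because someone edited the article in the meantime, while the same update under `Policy.PERMISSIVE` is posted directly because the merge was clean.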


Creating a new snapshot

There are few complete tools for creating new snapshots. Part of the Hyperopia framework will be simple methods for generating these in a suitable format.

Currently the Collections extension for MediaWiki makes it easy to create a ZIM export of a set of articles. That is a good example of an interface and workflow for compiling and downloading a snapshot, but it does not yet export to a format that would support lossless editing and republishing of changes.


Creating large snapshots

Snapshots such as Wikipedia for Schools, WikiBrowse, or Wikipedia 1.0 range from 100 MB to more than 10 GB in size.

We need a better workflow for creating these sorts of snapshots, which teams of people currently spend a lot of time creating, partly by hand and partly with one-off scripts. A sample interface might include the following options:

snapshot source material
"snapshot type" (wiktionary, abridged wikipedia, wikipedia by
category, wikisource, other/custom ...)

Here one could include some very specific custom options, such as "1000 articles every <PROJECTNAME> should have". An option to browse existing snapshots could replace this choice and the manual choice of parameters.

snapshot parameters
"language[s]"
"articles"  (trusted only, by popularity, by wp1.0 score, all)
"article stubs" (yes, no, only popular ones)
"article length" (1st para, lede, summary, full)
"image size" (none, thumbnails, full)
"target size"  (<50M, 200M, 1G, 4G, 16G, 64G, any size)
"image % of total"  (none, 20%, 50%, 80%)
"templates" (yes, no, oh please no)

export format[s]
"export format"  (zim, wikireader, woip, mw-xml, pdf, odt)

Here pdf and odt would simply be very long, somewhat unorganized collections, like a traditional encyclopedia, with autogenerated metadata such as TOCs and page numbers. There could also be more specific export formats, such as an XO format that wraps a woip or mw-xml export in the directory structure and zipfile needed for a new .xo file.
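A minimal sketch of the zipfile half of such an XO export is given below. It assumes that a .xo bundle is a zip archive whose contents sit under a single top-level `<name>.activity/` directory; the function name `wrap_as_xo` and its parameters are hypothetical, and the sketch does not generate the activity metadata a real bundle would also need.

```python
import zipfile
from pathlib import Path


def wrap_as_xo(export_dir: Path, bundle_name: str, out_path: Path) -> Path:
    """Zip an existing export directory into a .xo-style archive.

    Assumption: every file is stored under '<bundle_name>.activity/'.
    out_path should live outside export_dir so the archive does not
    try to include itself.
    """
    root = f"{bundle_name}.activity"
    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as bundle:
        for path in sorted(export_dir.rglob("*")):
            if path.is_file():
                relative = path.relative_to(export_dir).as_posix()
                bundle.write(path, f"{root}/{relative}")
    return out_path
```

A call such as `wrap_as_xo(Path("export"), "MySnapshot", Path("MySnapshot.xo"))` would then produce an archive in which an exported file `export/articles/a.txt` appears as `MySnapshot.activity/articles/a.txt`.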


Some of the choices above would limit the selection available for the others.
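One way such dependencies between choices might be modelled is sketched below. The option names and values are taken from the list above, but the single rule shown (choosing no images leaves only "none" for the image share) is a hypothetical example rather than something specified for Hyperopia, as are the names `OPTIONS`, `RULES` and `available_options`.

```python
# A few of the options listed above, with their possible values.
OPTIONS = {
    "image size": ["none", "thumbnails", "full"],
    "image % of total": ["none", "20%", "50%", "80%"],
    "export format": ["zim", "wikireader", "woip", "mw-xml", "pdf", "odt"],
}

# Hypothetical rules: a (option, value) choice maps to the values that
# remain allowed for other options.
RULES = {
    ("image size", "none"): {"image % of total": ["none"]},
}


def available_options(chosen: dict) -> dict:
    """Return the values still selectable for each option, given choices so far."""
    remaining = {name: list(values) for name, values in OPTIONS.items()}
    for (name, value), limits in RULES.items():
        if chosen.get(name) == value:
            for other, allowed in limits.items():
                remaining[other] = [v for v in remaining[other] if v in allowed]
    return remaining
```

With this table, `available_options({"image size": "none"})` narrows "image % of total" to `["none"]` and leaves the other options untouched. Keeping the rules as data would let an interface grey out unavailable choices without hard-coding each dependency.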

'WP by category' could include some of the larger sorts of snapshots that can currently be generated as books - especially if one can update those automatically with page-scoring and wikitrust data. New custom snapshots could start from existing snapshots, combining them or extending them to a different set of languages.

See also