This technique allows you to edit the encyclopedia articles in an existing Wikislice, using the WikiBrowse software.
You can then merge the edits into the compressed article database and prepare a new '.xo' file for distribution.
This is a very rough solution -- not a finished product.
Set up the "server"
Note: You can run this on a machine for "local" editing, or on a server. In the case of a server, there is currently no authentication or security of any kind in this mode -- use it only within a safe local network.
Get the latest wikiserver code
git clone git://dev.laptop.org/users/martin/wikiserver
Unpack Wikipedia-XX.xo. In this case, we will experiment with the Spanish Wikipedia-20 bundle from http://dev.laptop.org/~cjb/eswiki/0.84/Wikipedia-20.xo
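The unpack step can be sketched as follows. This assumes the bundle has already been downloaded into the current directory and that `unzip` is installed; an .xo bundle is a zip archive, so no special tool should be needed.

```shell
bundle=Wikipedia-20.xo   # assumption: already downloaded to the current directory

if [ -f "$bundle" ]; then
    # Extracts the bundle contents, including the Wikipedia.activity directory.
    unzip -q "$bundle"
else
    echo "$bundle not found; download it first" >&2
fi
```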
Now copy the data files from Wikipedia-20.xo to the directory where the wikiserver code is. The data files usually reside in a directory with a language-country prefix. In this case, es_PE -- because the very first Wikipedia bundle was made for Perú:
cp -r Wikipedia.activity/es_PE wikiserver/
Prepare a directory to store your edits
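A minimal sketch of this step, assuming you want to use the same ~/wikipediaedits location that the server command in the next section refers to:

```shell
# The directory name matches the one passed to server.py later on.
edits_dir="$HOME/wikipediaedits"
# -p makes this safe to re-run if the directory already exists.
mkdir -p "$edits_dir"
```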
Running and stopping the server
To run the server:
(python server.py es_PE/es_PE.xml.bz2 8000 ~/wikipediaedits/ 2>&1 ) | tee wikiserver.log
To stop the server, press Ctrl-C. Each run writes a log file (wikiserver.log). If you hit a bug or a problem, please include the log file in your report.
If you are running the above on a network server, open your web browser and go to http://<name-or-IP-of-server>:8000/ . The port must match the one passed to server.py (8000 in the command above).
If you are running it on a single machine, use http://localhost:8000/ instead.
With each page you will see an 'edit' link, leading to a form. Edit and submit your changes. The UI is extremely simple.
There is no web-based UI for change review at the moment. However, you can review the changes from the command line:
diff -ur ~/wikipediaedits/wiki.orig ~/wikipediaedits/wiki
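When there are many edits, a list of the changed articles may be easier to scan than the full diff. The sketch below wraps `diff -rq` in a hypothetical helper, `list_changed`, and assumes the `wiki.orig` and `wiki` layout shown above.

```shell
# Hypothetical helper: print one line per article that differs between
# wiki.orig and wiki inside the given edits directory.
list_changed() {
    # -r recurses into subdirectories, -q reports only which files differ.
    # diff exits non-zero when it finds differences, so that status is ignored.
    diff -rq "$1/wiki.orig" "$1/wiki" || true
}

edits_dir="$HOME/wikipediaedits"
if [ -d "$edits_dir/wiki.orig" ] && [ -d "$edits_dir/wiki" ]; then
    list_changed "$edits_dir"
fi
```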
Prepare/install the merge/update tools
Compile the tools -- this is only required once.
sudo yum install rubygem-RubyInline ruby \
    bzip2-devel automake autotools make gcc
cd woip/c
./bootstrap.sh
make lsearcher bzipreader blocks
cd ../../locate.freebsd
make all
Merge edits into data files
We will create a new set of data files, based on the old data files plus your edits.
# create a destination directory
mkdir es_PE_edited
# run the merge, this can take a long time
# (~/wikiedits is the directory holding your edited articles; adjust it to match your setup)
bzcat es_PE/es_PE.xml.bz2.processed | \
    tools/mergeupdates.py ~/wikiedits | bzip2 -c \
    > es_PE_edited/es_PE.xml.bz2.processed
This process takes a while. As it runs through the file, it will indicate when it is overriding a particular content file, for example:
Merging ~/wikiedit/Andorra
Merging ~/wikiedit/Física
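Because the merge runs for a long time, it may be worth checking that the output is a complete bzip2 stream before going further. This is only a sanity check using `bzip2 -t`; the helper name `check_processed` is hypothetical, and passing the check does not prove the merged content is correct.

```shell
# Hypothetical helper: succeed only if the file is a readable bzip2 stream.
check_processed() {
    # -t tests the compressed data for integrity without writing any output.
    bzip2 -t "$1"
}

merged=es_PE_edited/es_PE.xml.bz2.processed
if [ -f "$merged" ]; then
    check_processed "$merged" && echo "$merged looks intact"
fi
```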
Once the main data file (.processed) is ready, reindex.
The resulting files will be in es_PE_edited
Create a new Wikipedia.xo
- Replace the files in Wikipedia.activity/es_PE with the files from your es_PE_edited directory.
- Update the version string in activity.info -- please check with developers on firstname.lastname@example.org to pick a version number that will not conflict.
- Use zip to re-create the bundle file
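The copy and zip steps above could look roughly like this. It is a sketch under assumptions: it is run from the directory that contains both Wikipedia.activity and es_PE_edited, the output file name is made up for illustration, and the version string in activity.info still has to be updated by hand beforehand.

```shell
activity_dir=Wikipedia.activity
edited_dir=es_PE_edited
bundle=Wikipedia-20-edited.xo   # hypothetical output name

if [ -d "$activity_dir/es_PE" ] && [ -d "$edited_dir" ]; then
    # Overwrite the old data files with the merged and reindexed ones.
    cp -r "$edited_dir"/. "$activity_dir/es_PE/"
    # Zip the whole activity directory so it sits at the top of the archive.
    zip -q -r "$bundle" "$activity_dir"
else
    echo "Run this from the directory holding $activity_dir and $edited_dir" >&2
fi
```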
The use of this editing facility with a group of users can be coordinated with a web-based spreadsheet such as Google Docs. You can import the file es_PE.xml.bz2.index.txt into a spreadsheet to have a listing of all the pages to review.
The software works as is. Motivated programmers might be interested in tackling this informal TO DO list:
- Implement HTTP Auth 'Basic' for simple user/password protection
- Use git for history.
- Init a git repo, commit to it on every edit.
- We will have to add files opportunistically -- it is a huge cost to git add the whole dataset.
- Show file history via a gitweb/cgit cgi
- Better UI
- Track Seen/audited status for all pages for more integrated workflow
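As a starting point for the git history item in the list above, the per-edit commit could be sketched as a small shell helper. Nothing like this exists in wikiserver today: `commit_edit`, the commit message, and the committer identity are all hypothetical, and a real implementation would more likely be called from inside server.py.

```shell
# Hypothetical helper: record one edited article in a git history.
# Usage: commit_edit <edits-dir> <article-path-relative-to-edits-dir>
commit_edit() {
    edits_dir=$1
    article=$2
    (
        cd "$edits_dir" || exit 1
        # Create the repository on first use only.
        [ -d .git ] || git init -q .
        # Stage just the edited file; adding the whole dataset would be very slow.
        git add -- "$article"
        # Identity is passed inline so the sketch works without global git config.
        git -c user.name="wikiserver" -c user.email="wikiserver@localhost" \
            commit -q -m "Edit $article"
    )
}
```

For example, `commit_edit ~/wikipediaedits wiki/Andorra` would commit only that article, which keeps each commit cheap even though the full dataset is large.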