Wiki-ing the Vista Monograph

Latest revision as of 03:11, 16 February 2008

I started with the MS Word version of the VistA Monograph, vista_monograph2005_06.doc (http://www.va.gov/vista_monograph/docs/vista_monograph2005_06.doc), and then:

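  • opened it with OpenOffice
  • saved it as HTML
  • cleaned it up with Dave Raggett's HTML Tidy (http://tidy.sourceforge.net/)
    • the funny characters went away once I got the UTF-8 settings right in HTML Tidy
  • converted it to MediaWiki markup using HTML::WikiConverter (http://search.cpan.org/~diberri/HTML-WikiConverter-0.61/lib/HTML/WikiConverter.pm)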
  • cleaned out a lot of junk HTML, first by manual editing and later with sed scripts. Trust your browser to get things right without all this junk:
    • br
    • font
    • div
    • span
  • ran a script to remove excess blank lines
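
The sed scripts themselves are not posted on this page. As a rough illustration only, the junk-tag cleanup and the blank-line cleanup from the list above could look something like the following. This is a sketch in Python rather than sed, and the function names (strip_junk_tags, collapse_blank_lines, clean_wikitext) are made up for the example. It assumes the junk tags carry no useful information and that a single blank line between paragraphs is enough.

```python
import re

# The four tags named in the list above as junk.
JUNK_TAGS = ("br", "font", "div", "span")

# Matches an opening, closing, or self-closing tag for any junk tag name,
# with or without attributes, e.g. <font size="2">, </span>, <br />.
# Assumption: attribute values never contain a literal ">".
JUNK_TAG_RE = re.compile(
    r"</?(?:%s)\b[^>]*>" % "|".join(JUNK_TAGS),
    re.IGNORECASE,
)

# A newline followed by two or more blank (or whitespace-only) lines.
EXCESS_BLANK_RE = re.compile(r"\n(?:[ \t]*\n){2,}")


def strip_junk_tags(text):
    """Delete the junk tags themselves and keep the text between them."""
    return JUNK_TAG_RE.sub("", text)


def collapse_blank_lines(text):
    """Reduce any run of blank lines to a single blank line."""
    return EXCESS_BLANK_RE.sub("\n\n", text)


def clean_wikitext(text):
    """Hypothetical helper: apply both cleanup steps in order."""
    return collapse_blank_lines(strip_junk_tags(text))


if __name__ == "__main__":
    sample = (
        '<div class="x"><font size="2">Hello</font> '
        "<span>world</span></div><br />\n\n\n\nNext"
    )
    print(repr(clean_wikitext(sample)))
```

Note that deleting br outright joins the text on either side of it, so a real script might prefer to turn br into a newline instead.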

It's no wonder it has glitches. I'm really surprised it came out as well as it did.

I still need to polish the scripts and make them available here.