Wiki-ing the Vista Monograph
Edited by Drew.einhorn (page moved from "Wiki the Vista Monograph" to "Wiki-ing the Vista Monograph").
Latest revision as of 03:11, 16 February 2008
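I started with the MS Word version of the VistA Monograph, vista_monograph2005_06.doc (http://www.va.gov/vista_monograph/docs/vista_monograph2005_06.doc), then:
- opened it with OpenOffice
- saved it as HTML
- cleaned it up with Dave Raggett's HTML Tidy (http://tidy.sourceforge.net/)
  - the funny characters went away once I got the UTF-8 settings right in HTML Tidy
- converted it to MediaWiki markup using HTML::WikiConverter (http://search.cpan.org/~diberri/HTML-WikiConverter-0.61/lib/HTML/WikiConverter.pm)
- edited it by hand to clean out a lot of junk HTML, then replaced the hand editing with sed scripts that strip the tags below. Trust your browser to get things right without all this junk!
  - br
  - font
  - div
  - span
- ran a script to remove excess blank lines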
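The sed scripts for the tag stripping and blank-line cleanup are not posted here yet. As a stand-in, here is a minimal Python sketch of the same two steps. It is not the original script: the function names and regular expressions are my own assumptions, and it simply deletes the four tags listed above (br, font, div, span) and squeezes runs of blank lines down to one.

```python
import re

# Tags from the list above, treated as presentational junk (assumption:
# the tags are dropped but the text inside them is kept).
JUNK_TAGS = ("br", "font", "div", "span")

# Matches an opening, closing, or self-closing junk tag, with or without
# attributes, e.g. <font size="2">, </span>, <br />.
JUNK_TAG_RE = re.compile(
    r"</?(?:%s)\b[^>]*>" % "|".join(JUNK_TAGS),
    re.IGNORECASE,
)

# Matches a newline followed by two or more empty or whitespace-only lines.
BLANK_RUN_RE = re.compile(r"\n(?:[ \t]*\n){2,}")


def strip_junk_tags(text):
    """Delete the junk tags themselves, leaving their contents in place."""
    return JUNK_TAG_RE.sub("", text)


def collapse_blank_lines(text):
    """Reduce any run of blank lines to a single blank line."""
    return BLANK_RUN_RE.sub("\n\n", text)


def clean_wikitext(text):
    """Strip junk tags first, then tidy up the blank lines left behind."""
    return collapse_blank_lines(strip_junk_tags(text))


if __name__ == "__main__":
    sample = 'A <font size="2">big</font> <span class="x">word</span><br />\n\n\n\n<div>Next</div>\n'
    print(clean_wikitext(sample))
```

Two caveats: deleting a br between two words joins them with no space, and the simple pattern will misfire on an attribute value that contains a ">" character. Neither mattered for a sketch, but both are worth checking on the real output.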
It's no wonder it has glitches. I'm really surprised it came out as well as it did.
Still to do: polish the scripts and make them available here.
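Until the real scripts are up, the Tidy and conversion steps could be driven from something like the sketch below. It assumes the tidy command-line tool and the html2wiki command that ships with HTML::WikiConverter are both installed and on the PATH; I may well have called the Perl module differently at the time. The function names and all file names are hypothetical.

```python
import subprocess


def tidy_command(src, dst):
    """Build the HTML Tidy call: quiet, UTF-8 in and out, write to dst."""
    return ["tidy", "-q", "-utf8", "-o", dst, src]


def html2wiki_command(src):
    """Build the html2wiki call for the MediaWiki dialect (writes to stdout)."""
    return ["html2wiki", "--dialect", "MediaWiki", src]


def convert(src_html, tidy_html, wiki_out):
    """Run Tidy, then convert the tidied HTML to MediaWiki markup."""
    # Tidy exits with 1 when it only has warnings and 2 on errors,
    # so only an exit status above 1 is treated as a failure here.
    result = subprocess.run(tidy_command(src_html, tidy_html))
    if result.returncode > 1:
        raise RuntimeError("tidy reported errors in %s" % src_html)

    # html2wiki prints the converted markup, so capture it into a file.
    with open(wiki_out, "w", encoding="utf-8") as out:
        subprocess.run(html2wiki_command(tidy_html), stdout=out, check=True)


if __name__ == "__main__":
    # Hypothetical file names; the HTML is what OpenOffice saved.
    convert("monograph.html", "monograph.tidy.html", "monograph.wiki")
```

The -utf8 flag is the part that corresponds to the funny-character fix mentioned above.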