Forth Lesson 2

From OLPC
Revision as of 18:18, 3 January 2007 by IanOsgood (talk | contribs) (+links)

Review

In the previous lesson we learned to:

  • Display the stack with "showstack" and ".s"
  • Push numbers on the stack by typing them
  • Control the number base with "hex" and "decimal"
  • Execute words by typing their names

Stack Diagrams

Pay attention to this section because it introduces notation that will be used over and over.

Since Forth words use the stack for arguments and results, their description must tell you the arguments that they pop from the stack and the results that they push back on the stack. That is done with a "stack diagram".

 +  ( n1 n2 -- n3 )

That indicates that the Forth word "+", which we saw in the last lesson, pops two numbers n1 and n2 from the stack and pushes back one number n3. The items before the "--" are the arguments; the items after it are the results. In each list, the rightmost item is the top of the stack.

In general, Forth words can take any number of arguments and leave any number of results.

This is purely a notation convention. The Forth interpreter does not process stack diagrams other than to skip them. It knows to skip them because the Forth word "(" introduces a comment, which is terminated by the next ")".

Another example:

 showstack  ( -- )

That means that "showstack" has no net stack effect. It doesn't pop anything off the stack (no arguments) and it doesn't leave any extra items on the stack after it finishes. It might push and pop numbers while it is executing, but when it is done, the stack is the same as it was before.

 .  ( n -- )

The word "." (which displays a number) pops its one argument from the stack and leaves nothing in its place.
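
As a minimal sketch of how the diagrams for "+" and "." line up with what you type, the lines below can be entered one at a time. The values inside the parentheses are just this example's numbers written in stack-diagram style; they are annotations for the reader, not something the interpreter checks.

```forth
3 4     ( -- 3 4 )
+       ( 3 4 -- 7 )
.       ( 7 -- )
```

The "." on the last line displays 7 and leaves the stack as it was before the 3 and 4 were pushed.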

The names in the argument and result lists (e.g. "n1", "n2", "n3") are arbitrary, but as an aid to understanding, they are usually chosen to convey extra information. For example:

 type  ( adr len -- )

We haven't seen the word "type" yet, but clearly it takes two arguments, a length on the top of the stack and an address below that. It pops both of the arguments from the stack, leaving nothing in their place.
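
To make the "adr len" pair concrete, here is a small sketch. It assumes the word s" (not covered in this lesson), which in ANS Forth and in Open Firmware parses up to the next " and leaves the address and length of that text on the stack.

```forth
s" hello"    ( -- adr len )
type         ( adr len -- )
```

Typed at the prompt, this should display hello and leave the stack unchanged.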

Conventionally, stack item names beginning with "n" refer to signed integers, "u" to unsigned integers, "d" to double numbers (two stack numbers interpreted as a 64-bit integer), "adr" to addresses, and "flag" to values that are either true (0xffffffff, i.e. all bits set) or false (0). But that is not a hard and fast rule.
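
A quick way to see the "flag" convention is to run a comparison. This sketch assumes the standard words "<" ( n1 n2 -- flag ), "true" ( -- flag ) and "u." ( u -- ), none of which has been introduced in these lessons yet.

```forth
1 2 <  .                ( displays -1, the true flag, because 1 is less than 2 )
2 1 <  .                ( displays 0, the false flag )
true hex u. decimal     ( all bits set: a row of f digits, eight of them with 32-bit cells )
```

Note that the same true value shows up as -1 when displayed as a signed number and as all f digits when displayed as an unsigned hex number.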

How Comments Work

The following is in some sense an implementation detail, but it is worth understanding because it is key to Forth's approach to syntax, which is vastly different from that of most other languages.

In the section above I mentioned that the word "(" skips to the next ")". That might seem like an exception to the rule that the interpreter only parses whitespace-delimited words, but it is not.

What actually happens is that the main interpreter loop only sees the "(". The "(" must be followed by whitespace, otherwise the interpreter will not parse just "(" but rather some longer string beginning with "(". The interpreter then looks in the list of defined words for one named "(", and executes it.

The behavior of the "(" word is to call the parser, asking it to collect a sequence of characters delimited by ")", and then to discard the result. This illustrates several points:

  • Any word can call the input parser, not just the main interpreter.
  • The parser itself can use any character as a delimiter, not just whitespace. The main interpreter only asks the parser for whitespace-delimited words, but other words can and do parse using other delimiters.
  • This same approach (the main interpreter calls a word that then parses using a different delimiter) is also used for string literals, where the delimiter is the double-quote character (").
  • You can, if you wish, call the parser yourself from user code, to collect any kind of string you want. It's best to use this capability sparingly to avoid confusion, but it illustrates the fact that the entire Forth system is available to you; nothing is magic or hidden.
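
As a sketch of that last point, the definitions below call the parser from user code. They assume the ANS Forth words "parse" ( char -- adr len ) and "[char]", plus colon definitions, none of which has been covered yet. The names "rem|" and "say|" are made up for this illustration.

```forth
( rem| discards everything up to the next vertical bar )
: rem|   ( "text|" -- )
   [char] | parse    ( adr len )
   2drop             ( )
; immediate          ( immediate, so it also works inside definitions )

( say| collects text up to the next vertical bar and displays it )
: say|   ( "text|" -- )
   [char] | parse    ( adr len )
   type              ( )
;

1 2 rem| this text is skipped | + .
say| nothing is magic or hidden|
```

The first usage line displays 3, because the interpreter never sees the text between "rem|" and the closing bar. The second displays its text, using the same parser call but keeping the result instead of dropping it.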

Comment to End of Line

The "( .... )" comment form stops at the ")"; stuff after it will be interpreted as usual. To comment out everything else on the line, use "\".

 \ everything after the first \ will be ignored

Note that the "\" must be followed by whitespace, because "\" is just a word like anything else. The way that "\" works is to call the parser with a delimiter value (-1) that can't possibly match a character, discarding the result. The parser will stop at the end of the line, not having found a delimiter. (Recall from a previous lesson that input is processed a line at a time.)
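
The two comment forms can be mixed on one line. In this small sketch the numbers are arbitrary:

```forth
1 2 +  ( -- 3 )  .    \ displays 3; the rest of this line is ignored
( skipped ) 5 .       \ the 5 and the . come after the ")", so 5 is displayed
\ 6 .                 nothing on this line runs, so no 6 appears
```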

Thus endeth the lesson

Next Lesson