Forth Lesson 18

From OLPC
Revision as of 22:53, 6 October 2012 by FGrose (talk | contribs) (Proper name for Open Firmware)
Mitch Bradley's Forth and
Open Firmware Lessons:

Structs in Forth

As with the previous lesson, this is transcribed from an email on a mailing list.

In a nutshell, all a field word does is to add a fixed offset to an address on the stack.

 struct
 2 field >w
 4 field >l
 1 field >b
 constant /foo

is equivalent to:

 : >w  0 + ;
 : >l  2 + ;
 : >b  6 + ;
 7 constant /foo
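
To see the field words in use, here is a minimal sketch that allocates one instance and touches a field. It repeats the definitions of struct and field quoted later in this lesson so that it can also load on a Forth that lacks the Open Firmware versions. The instance name my-foo is made up for illustration, and only c! and c@ are used because the names of 16-bit and 32-bit access words vary between systems.

```forth
\ Definitions as quoted later in this lesson (an assumption for
\ systems that do not already provide Open Firmware's struct/field).
: struct  ( -- 0 )  0 ;
: field   ( offset size "name" -- offset' )  create over , +  does> @ + ;

struct
2 field >w
4 field >l
1 field >b
constant /foo

create my-foo  /foo allot     \ my-foo: hypothetical 7-byte instance

90 my-foo >b c!               \ store a byte in the >b field
my-foo >b c@ .                \ prints 90
my-foo >l my-foo - .          \ prints 2, the offset of >l
/foo .                        \ prints 7, the total size
```

The field word never looks at the instance itself; it only turns a base address into a field address, so the same words work on any number of instances.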

So struct/field are just syntactic sugar to keep track of a running offset. The running offset is kept on the stack at compile time (recall that the Forth compiler is just Forth). If you want to skip some bytes (e.g. a reserved field, or one you know you aren't going to use and don't want to burn space naming), you can just add to the running offset with, e.g.

 4 +

In the following example:

 struct
 4 +                \ >fw-req
 4 field >tx-stat
 ...

The packet descriptor starts with a 4-byte field that is handled at a different level of the protocol, so we don't waste space naming it in this module, but the comment "\ >fw-req" reminds us what it is for.
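
The effect of the skip can be checked with a small sketch. It covers only the two lines shown above, since the rest of the descriptor is elided. The definitions of struct and field are the ones quoted later in this lesson, repeated in case the host Forth lacks them, and /shown-so-far is a made-up name for the running offset at that point.

```forth
\ Definitions as quoted later in this lesson (an assumption for
\ systems that do not already provide Open Firmware's struct/field).
: struct  ( -- 0 )  0 ;
: field   ( offset size "name" -- offset' )  create over , +  does> @ + ;

struct
4 +                       \ >fw-req  (skipped, no word is created)
4 field >tx-stat
constant /shown-so-far    \ hypothetical name, not a complete descriptor

0 >tx-stat .              \ prints 4: >tx-stat starts after the skipped bytes
/shown-so-far .           \ prints 8: 4 skipped + 4 for >tx-stat
```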

How It Works

What follows is not necessary for use of the "struct" word, but interesting and useful in other contexts. In particular, the mechanism underlying it is used in the x86 assembler built into Open Firmware.

The implementation of struct is just ": struct 0 ;". It is a sugary alias for 0, i.e. the initial value of the running offset.

The implementation of field is a bit more complex, because it has to create new words (e.g. >l in the example above).

 ok see field
 : field
   create over , + does> @ +
 ;

Let's break that down, assuming that we are executing:

 4 field >l

When we get to that line, the current running offset (sum of all previous field sizes) is 2, which is sitting on the top of the stack.

The interpreter processes the "4" by pushing it on the stack, so the stack now has ( 2 4 ) . "field" is now interpreted, thus executing its contents:

 create

"create" makes a new word whose name is consumed from the input stream. In this case the next word in the input stream is ">l", so create makes a new word named ">l". The default runtime action of a newly created word is to push the address of the next available location in data space (i.e. the address of the new word's "body"), but we are going to override that default shortly. "create" has no effect on the stack, so the stack still has ( 2 4 ).

 over        ( 2 4 2 )

"over" makes a copy of the second item on the stack, which is the running offset.

 ,              ( 2 4 )

"," pops the stack and stores the number in the next free cell of data space, then advances the data-space pointer past it. Because nothing else has been allocated since "create", that cell is the body of the new word ">l".

 +             ( 6 )

"+" adds the item size (4) to the running offset (2), giving a new value (6) for the running offset.

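The compile-time half traced so far can be replayed by hand at the interpreter prompt, as a rough sketch. The names >l-body and new-offset are hypothetical and exist only so the results can be inspected; no does> is involved yet.

```forth
2 4                    \ running offset 2, field size 4     ( 2 4 )
create >l-body         \ hypothetical name; stack unchanged ( 2 4 )
over ,                 \ store the offset 2 in the body     ( 2 4 )
+                      \ new running offset                 ( 6 )
constant new-offset    \ hypothetical name; saves the 6     ( )

>l-body @ .            \ prints 2, the offset stored by ","
new-offset .           \ prints 6, the updated running offset
```

Here >l-body still has the default action of a created word, pushing its body address, which is why a plain @ can read the stored offset.
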
 does>

"does>" is sneaky. The combination of "create" and "does>" is what makes Forth a dynamic language. "does>" modifies the behavior of the most recently created word, extending its default action with the rest of the current definition. So the new action for ">l" is now "@ +", i.e. everything after "does>". "does>" then exits from its caller ("field"), so the "@ +" doesn't execute at compile time, but rather is attached to the new word and will execute later if you run ">l".

Recall that the default action of the new word is to push the address of its body, so when ">l" later executes, the stack looks like this:

 ( structure-instance-base-address  body-addr-of->l )

The "structure-instance-base-address" stack item is assumed to have already been placed there by external code, because field words are intended for adding offsets to base addresses. The "body-addr-of->l" is put there by the code that implements the default action of "create"d words.

 @   ( structure-instance-base-address 2 )

"@" reads the number that "," stuffed into the body earlier, i.e. the running offset that was on the stack when ">l" was created, i.e. 2.

 +   ( field-address )

"+" adds that offset to the base address, leaving the address of the field within the structure instance.

"create ... does>" is thus seen as a compact way to make a class of similar words that basically do the same thing, but with different parameters. In the case of "field", the action is "add an offset to a base address" and the parameter is the specific offset.

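The same pattern carries over to other families of words. As a small illustration that is not part of the original email, the sketch below uses a made-up defining word, scaler, whose children multiply by a stored factor instead of adding a stored offset. The names scaler, triple and tenfold are all hypothetical.

```forth
\ scaler is a hypothetical defining word: each child stores one
\ number at definition time and multiplies by it at run time.
: scaler  ( n "name" -- )
   create ,            \ definition time: store n in the new word's body
   does>  ( x body-addr -- x*n )
   @ *                 \ run time: fetch n and multiply
;

3  scaler triple
10 scaler tenfold

7 triple .             \ prints 21
7 tenfold .            \ prints 70
```

Compared with field, only the code after does> and the stored parameter change; the create / "," / does> skeleton is identical.
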
Thus endeth the lesson.

Next Lesson: Debugging Notes et alia.