From: wmb@Sun.COM
Subject: FORML paper
Date: 9 November 1987 at 08:10:54 GMT+1
To: Don Hopkins

Here's an ascii copy of a paper I'm presenting at the FORML conference in a few weeks.

Interpreting Control Structures - The Right Way

Abstract

A very simple modification allows the Forth interpreter to execute conditionals and loops in interpret state as well as in compile state. Interpreted loops run at the same speed as compiled loops.

The Forth 83 Standard says that control structures (conditionals and loops) are compiled inside colon definitions. There is an easy way to remove this restriction so that control structures work just as well from interpret state.

The idea is very simple: when a control structure is encountered while interpreting, switch to compile state and begin compiling an unnamed temporary colon definition. When that control structure is finished, execute the unnamed colon definition and then forget it.

Nested control structures can be easily handled. Each word which begins a control structure increments a variable, and each word which ends a control structure decrements the variable. When that variable changes from 0 to 1, begin compiling the unnamed colon definition. When the variable changes from 1 to 0, execute the unnamed colon definition and forget it.

This can easily be implemented with 4 words:

SAVED-DP  ( -- adr )
    The address of a variable which contains the starting address of the temporary colon definition, if one is being compiled.

LEVEL  ( -- adr )
    The address of a variable which contains the current control structure nesting level.

+LEVEL  ( -- )
    Increments the value contained in the variable LEVEL . If STATE is interpreting and LEVEL was 0 before being incremented, switch to compile state and begin compiling an unnamed temporary colon definition.

-LEVEL  ( -- )
    Decrements the value contained in the variable LEVEL . If LEVEL is 0 after having been decremented, execute the temporary colon definition, discard it, and return to interpret state.
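As a rough illustration of how LEVEL moves, here is a hand trace of one line typed at the interpreter prompt. It assumes the control structure words have already been wired to +LEVEL and -LEVEL as described above. The trace is a sketch of the intended behavior, not output captured from a real system.

```forth
\ Typed in interpret state, with LEVEL initially 0.
\ Assumes DO IF THEN LOOP call +LEVEL / -LEVEL as described above.

3 0 do  i 1 = if  ." one "  then  loop

\ word   LEVEL     what happens
\ 3 0    0         interpreted normally; both numbers go on the data stack
\ do     0 -> 1    +LEVEL enters compile state and starts the temporary definition
\ if     1 -> 2    nested structure; compiled into the temporary definition
\ then   2 -> 1    still inside DO ... LOOP, so nothing is executed yet
\ loop   1 -> 0    -LEVEL runs the temporary definition, then discards it
```

The loop limits 3 and 0 are consumed from the data stack only when the temporary definition runs, which is why they can be typed before DO exactly as they would be in a colon definition.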

These words are easy to implement on most systems. A "standard" implementation does not appear to be possible, because of differences in the way that different systems compile colon definitions. Another implementation dependency arises from different interpreter organizations; in some systems, the interpreter and compiler are separate loops; in other systems, there is only one loop whose behaviour changes according to the value of STATE . Nevertheless, a person familiar with the internals of a particular system should have little difficulty figuring out how to do it for that system.

Here is an implementation for F83:

    variable saved-dp
    variable level

    : +level  ( -- )
       level @  if                    \ If in compile state, just increment level
          1 level +!
       else
          state @ 0=  if              \ If in interpret state, switch to compile state
             1 level !
             here saved-dp !          \ Remember the start
             r>  ['] ] >body >r  >r   \ XXX Execute ] after the caller
          then
       then
    ;

    : -level  ( -- )
       state @ 0=  abort" Conditionals not paired"
       level @  if
          -1 level +!
          level @ 0=  if
             compile exit             \ Finish the definition
             saved-dp @ here - allot  \ Reclaim the memory
             [compile] [              \ Enter interpret state
             here >r                  \ YYY Execute the definition
          then
       then
    ;

    : begin   +level  [compile] begin   ; immediate
    : do      +level  [compile] do      ; immediate
    : ?do     +level  [compile] ?do     ; immediate
    : if      +level  [compile] if      ; immediate
    : then    [compile] then    -level  ; immediate
    : loop    [compile] loop    -level  ; immediate
    : +loop   [compile] +loop   -level  ; immediate
    : until   [compile] until   -level  ; immediate
    : again   [compile] again   -level  ; immediate
    : repeat  [compile] repeat  -level  ; immediate

The words SAVED-DP , LEVEL , +LEVEL , and -LEVEL take up about 200 bytes of dictionary space. If the calls to +LEVEL and -LEVEL are added to the kernel versions of the control structure words BEGIN , IF , LOOP , etc., instead of redefining them, the total increase in the size of the dictionary is just over 200 bytes.
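
One way to check an installation is a short interactive session. The following is a sketch that assumes the F83 definitions above have been loaded and that nothing else has redefined the control structure words. The name COUNT-UP is made up for the example.

```forth
\ Interpreted loop: should print 0 1 2 3 4
5 0 do  i .  loop

\ The temporary definition should leave no dictionary space allocated,
\ so HERE is the same before and after; should print a true flag.
here  5 0 do loop  here =  .

\ Colon definitions should still compile as before (LEVEL stays 0 throughout).
: count-up  ( n -- )  0 ?do  i .  loop ;
3 count-up

\ LEVEL should be back at 0 once every structure has been closed.
level @ .
```

If LEVEL is nonzero at the end of such a session, a begin word and an end word have probably been redefined inconsistently.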

Implementation Notes:

In the above example, there are two lines of code which are not entirely portable.

The line marked XXX executes the word " ] " (right-bracket) after the caller of +LEVEL . This is necessary in F83, because in F83 " ] " is the compiler loop. If " ] " were executed directly within +LEVEL , the rest of the control structure would be compiled before the beginning run-time word. For instance, in the case of IF , the rest of the control structure would be compiled before " [compile] if".

For systems in which " ] " simply sets the variable STATE (as in FIG Forth and MVP-Forth), the phrase:

    r>  ['] ] >body >r  >r   \ XXX Execute ] after the caller

may be replaced by:

    ]

The line marked YYY causes the unnamed temporary colon definition to be executed when -LEVEL returns. For most threaded code Forth implementations, pushing the parameter field address on the return stack is a convenient (but not "standard") way to execute an unnamed colon definition. Other systems might need to use a different technique.

Compiler Extension Words

It is possible to add the +LEVEL function to the control structure defining words <MARK and >MARK (see Forth-83 Standard, Chapter 15), and to add the -LEVEL function to <RESOLVE and >RESOLVE , thus making the interpreted control structure behavior automatic for any words which use xMARK and xRESOLVE .

Here is an implementation of these system extension words, with the additional features of compiler security and automatic compilation of the run-time word.

    : +>mark     (s acf -- adr )  +level  ,  >mark  ;
    : ->resolve  (s adr chk2 chk1 -- )  ?pairs  >resolve  -level  ;
    : -mark  3 ; immediate
    : ?do    ['] (?do)    +>mark  3  ; immediate
    : if     ['] ?branch  +>mark  2  ; immediate
    : else   ['] branch   +>mark  2  2swap  2 ->resolve  ; immediate
    : then   2 ->resolve  ; immediate
    : loop   compile (loop)   over 2+  resolve  ; immediate
    : +loop  compile (+loop)  over 2+  resolve  ; immediate
    : until  ['] ?branch  1 -