**** assem68k.f:

	A Forth-style assembler for the 68000.

Register Names:  D0  D1  D2  D3  D4  D5  D6  D7
                 A0  A1  A2  A3  A4  A5  A6  A7
		 SR CCR
		 UP   W  IP  RP  SP

Register usage in the Forth virtual machine:

	D0 D1 D2 D3	Scratch
	D4		Floating Point Stack pointer
	D5 D6		Scratch
	D7		Relocation Base.  16 bit token versions only
	A0 A1 A2	Scratch
	UP (A3) 	Base address of User Vector
	W  (A4) 	Word pointer.
	IP (A5)		Interpreter Pointer.
	RP (A6)		Return stack Pointer
	SP (A7)		Parameter Stack Pointer

Addressing Modes:

	Forth 		Motorola	Name
	Syntax		 Syntax

	Dn		Dn		Data Register Direct
	An		An		Address Register Direct
	An )		(An)		Indirect
	An )+		(An)+		Indirect Postincrement
	An -)		-(An)		Indirect Predecrement
	d An D)		d(An)		Indirect Displaced
	d Ri An DI)	d(An,Ri.W)	Indirect Indexed (Short)
	d Ri An DI)L	d(An,Ri.L)	Indirect Indexed (Long)
	xxxx #		#xxxx		Immediate (Short)
	xxxxxxxx L#	#xxxxxxxx	Immediate (Long)
	xxxx #)		xxxx.W		Absolute (Short)
	xxxxxxxx L#)	xxxxxxxx.L	Absolute (Long)
	d PCD)		d(PC)		PC relative
	d Ri PCDI)	d(PC,Ri)	PC relative indexed

PCD) and PCDI) are not currently implemented (Forth very seldom needs
to use pc-relative addressing).

Operation names:

	The operation names are the same as in the Motorola manual.

Overall Syntax:

	Motorola:		op	src,dst
	Forth:	    src   dst   op
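
	For illustration, here are a few instructions in both notations.
	This is a sketch built from the addressing mode table above; the
	operands are arbitrary, and the operation size is whatever the
	current size happens to be.

		Motorola		Forth
		ADD   D0,D1		D0  D1  ADD
		MOVE  (A0)+,D2		A0 )+  D2  MOVE
		MOVE  4(A1),-(SP)	4 A1 D)  SP -)  MOVE
		MOVE  #10,D0		10 #  D0  MOVE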

Operation Size:

	Motorola: Every operation explicitly specifies its size,
		  as in ADD.W  or  ADD.L

	Forth:	There is a "current size" that is used for operations.
		The current size may be changed with the words
		BYTE    WORD     LONG    NORMAL
		Once the size has been set, it remains that way
		until changed.  NORMAL is either WORD or LONG,
		depending on the width of the Forth stack.
		For the 16 bit Forth system, NORMAL is the same as WORD.
		For the 32 bit Forth system, NORMAL is the same as LONG.

		As a special case, there are 4 versions of the move
		instruction:  MOVE  BMOVE  WMOVE  LMOVE
		MOVE is dependent on the current size.
		BMOVE is always byte move.
		WMOVE is always word move.
		LMOVE is always long move.
		BMOVE, WMOVE, and LMOVE do not affect the current size.
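
	The following sketch shows how the current size carries from
	one instruction to the next.  The operands are arbitrary; the
	right-hand column shows the Motorola instruction that each line
	is expected to assemble.

		Forth				Assembles
		BYTE  A0 )+  D0  MOVE		MOVE.B  (A0)+,D0
		      D0     D1  ADD		ADD.B   D0,D1
		LONG  D1     D2  ADD		ADD.L   D1,D2
		      A1 )   D3  WMOVE		MOVE.W  (A1),D3
		      D3     D2  ADD		ADD.L   D3,D2

	The WMOVE in the fourth line does not disturb the LONG setting,
	so the final ADD is still long-sized.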

Conditionals:

	The normal Motorola conditionals are provided
	(e.g. BLE, SVC, DBNE, etc.) but their use is discouraged.
	Instead, a better facility is provided.  The condition and
	the operation may be specified independently.  Here are the
	condition names; both sets (Forth and Motorola) are provided.

	Forth:    ALWAYS  NEVER  U>  U<=  U>=  U<  0<>  0=
	Motorola: ALWAYS  NEVER  HI  LS   CC   CS  NE   EQ

	Forth:    VC  VS  0>=  0<  >=  <   >   <=
	Motorola: VC  VS  PL   MI  GE  LT  GT  LE

	Any of these condition names can be used preceding one of
	the following conditional operations:

	BRIF   ( where condition -- )
		Assembles a conditional branch to "where"
	SETIF  ( operand condition -- )
		Assembles a "Set according to condition codes"
	DBUNTIL ( where condition data-register -- )
		Assembles a "Decrement and Branch according to Condition
		Codes"
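
	As a sketch, each line below pairs a use of one of these words
	with the Motorola instruction it is expected to correspond to,
	judging from the stack diagrams above.  "loop" is a hypothetical
	name standing for a branch target address already on the stack
	(for instance, one saved earlier with HERE); it is not a word
	supplied by the assembler.

		Forth				Motorola
		loop  0=   BRIF			BEQ   loop
		D0    U<   SETIF		SCS   D0
		loop  0<>  D1  DBUNTIL		DBNE  D1,loop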

Structured conditionals:

	The assembler provides structured conditionals.  <condition>
	is one of the condition names described in the previous section.

	<condition>  IF    ...   THEN
	<condition>  IF    ...   ELSE   ...   THEN
	BEGIN   ...   <condition> UNTIL
	BEGIN   ...   AGAIN
	BEGIN   ...   <condition> WHILE   ...   REPEAT
	BEGIN   ...   <condition>  Dn  DBUNTIL

	The structures do not have to fit on one line.  They may
	be nested.

	The branches assembled by these structures are short branches
	with a range of about +- 128 bytes, so the structure must not be
	extremely large.  A conditional structure that large ought to be
	coded in high-level Forth anyway.
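
	Here is a minimal sketch of the structures in use.  It assumes,
	as is conventional for Forth assemblers, that the code between
	IF and THEN is executed when the condition is true.  The
	register choices are arbitrary.

		D0 TST
		0< IF   D0 NEG   THEN	( D0 is now its absolute value )

		BEGIN			( copy D1 items from A0 to A1 )
		   A0 )+  A1 )+  MOVE
		   1  D1  SUBQ
		0= UNTIL

	In the loop, the SUBQ comes last so that the condition codes
	tested by UNTIL reflect the counter and not the data just moved.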

Starting assembly:

	The assembler is really just a Forth vocabulary which contains
	the words for assembling 68000 code.  It is "activated"
	by adding the assembler vocabulary to the search order.
	There are also some common ways to control assembly
	which do more than just put the assembler vocabulary in the 
	search order.

ASSEMBLER ( -- )
	Makes the assembler vocabulary the CONTEXT vocabulary (i.e. puts
	it at the top of the search order), thus making all the
	assembler words accessible.

	See:  VOCABULARY

CODE ( -- )  ( Input Stream: word-name )
	Defines the word whose name is taken from the input stream
	and invokes the assembler.  When that word is later executed,
	the machine code that was generated by the assembler will be
	executed.

	The word is added to the CURRENT vocabulary, and the CONTEXT
	vocabulary (the one that is first in the search order) is
	initially set to ASSEMBLER.  The operand size is initially
	set to NORMAL.

	This is the most common way to begin assembly.

C;  ( -- )						"c-semi-colon"
	Terminates a code definition and allows the name of the
	corresponding definition to be found in the dictionary.

	The CONTEXT vocabulary is set to the same as the CURRENT vocabulary
	(which removes the ASSEMBLER vocabulary from the search order,
	unless you have explicitly done something funny to the search order
	while assembling the code).

	The "next" routine, which causes the Forth interpreter to continue
	execution with the next word, is automatically added to the end of
	the word being defined.

	This is the most common way to end assembly.
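
	For example, here is a minimal sketch of two complete code
	definitions.  The names MY-DROP and MY-1+ are hypothetical,
	chosen to avoid redefining existing words.  Because CODE sets
	the operand size to NORMAL, the same source should suit both
	stack widths.

		CODE MY-DROP  ( n -- )
		   SP )+  D0  MOVE
		C;

		CODE MY-1+  ( n -- n+1 )
		   1  SP )  ADDQ
		C;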

END-CODE ( -- )
	Terminates a code definition and allows the name of the
	corresponding definition to be found in the dictionary.

	The CONTEXT vocabulary is set to the same as the CURRENT vocabulary
	(which removes the ASSEMBLER vocabulary from the search order,
	unless you have explicitly done something funny to the search order
	while assembling the code).

	The "next" routine is NOT automatically added to the end of the
	code definition.  Usually you want "next" to be at the end of
	the definition, but sometimes the last thing in the definition
	is a branch to somewhere else, so the "next" at the end is not needed.
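
	As a sketch, the hypothetical word HANG below loops forever.
	The last instruction assembled is the backward branch generated
	by AGAIN, so a trailing "next" could never be reached and
	END-CODE is used instead of C;.

		CODE HANG  ( -- )
		   BEGIN  NOP  AGAIN
		END-CODE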

;CODE  ( -- )						"semi-colon-code"
	Used in the form:
	    : <defining-word-name>  ... CREATE ... ;CODE ... C; (or END-CODE)
	Stops compilation, terminates the defining word <defining-word-name>,
	executes ASSEMBLER, and sets the operand size to NORMAL.

	When <defining-word-name> is later executed in the form:
		<defining-word-name> <word-name>
	to define the new <word-name>, the still later execution of
	<word-name> will cause the machine code sequence following the
	;CODE to be executed.

	This is analogous to DOES>, except that the behavior of the
	defined words <word-name> is specified in assembly language instead
	of high-level Forth.

	See:  CODE  DOES>
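
	As a sketch, a constant-like defining word could be written as
	follows.  The name MY-CONSTANT is hypothetical.  The example
	relies on W holding the address of the child word's data area
	when the code after ;CODE starts executing.  The "," before
	;CODE is the normal Forth version, since the assembler is not
	yet active at that point.

		: MY-CONSTANT  ( n -- )  ( Input Stream: word-name )
		   CREATE  ,
		   ;CODE   W )  SP -)  MOVE
		C;

		5 MY-CONSTANT FIVE

	Executing FIVE is then expected to push 5 on the parameter stack.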

NEXT	( -- )
	Assembler macro which assembles the "next" routine.  For the
	16 bit version, this actually assembles a branch instruction
	to the "next" code.  For the 32 bit version, it assembles the
	"next" routine in-line.

Forth Virtual Machine Considerations:

	The Forth parameter stack is implemented with A7, but
	the name "SP" should be used instead of "A7", in case
	the virtual machine implementation should change.

	The return stack is implemented with A6, and the name
	"RP" should be used to refer to it.

	Items are pushed on the stack with the "-)" addressing
	mode, and removed from the stack with the ")+" addressing mode.

	The base address of the user area is in A3 ("UP").
	User variable number 124 (for instance) may be accessed
	with the "124 UP D)" addressing mode.
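
	A short sketch of these conventions, with arbitrary register
	choices:

		SP )+      D0     MOVE	( pop the top stack item into D0 )
		D0         SP -)  MOVE	( push D0 )
		SP )       SP -)  MOVE	( duplicate the top stack item )
		124 UP D)  SP -)  MOVE	( push the contents of user variable 124 )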

	The interpreter pointer is A5 ("IP").  The interpreter is
	post-incrementing, so when a code definition is being executed,
	IP points to the token after the one being executed.  A "token"
	is the number that is compiled into the dictionary for each
	Forth word in a definition.  For the 32 bit token Forth, a token
	is a 32 bit absolute address.  For the 16 bit token Forth, a token
	is a 16 bit address that is relative to the address in the high
	word of D7.

	Registers D0, D1, D2, A0, A1, and A2 may be used freely within
	code definitions.  Their values are not expected to be preserved
	from one word to the next.

	When defining a code sequence using ;CODE, it is usually necessary
	to access some data that is placed in the child word.  The "W"
	register (A4) is used for this purpose.  When the code sequence starts
	executing, the W register contains the starting address of the
	data area of the child word.

Stack Item Sizes and Relocation:

	In the Forth versions which use 32 bit stacks, numbers on the
	stack are always 32 bits wide.  Addresses on either stack
	are 32 bit absolute addresses.

	In the version with 16 bit stacks, numbers on the stack are
	normally 16 bits.  "Long" numbers are 32-bits, stored with
	the most-significant 16 bits at the lower address.  This makes
	it convenient to access the long stack items with a
	68000 long-sized instruction.  In the 16 bit stack version,
	normal addresses on the stack are 16 bit offsets from the
	base of the Forth system.  The base is constrained to start
	on a 64K boundary, and the most significant 16 bits of this
	base address are always kept in the high word of D7.  Before
	a 16 bit address may be used by the machine language, it
	must be relocated.  This is accomplished with the following
	two assembly language instructions:
		<source>   D7   WMOVE
		    D7     An   LMOVE
	<source> is usually  "SP )+" and An is whichever address register
	is going to be used to actually access the operand, frequently A0.
	This works because a word-sized move to a data register does
	not change the high word of the data register, thus the base
	address portion of the data register is not affected by the
	WMOVE.

	It is possible to write assembly code that will work on either
	the 32 bit or the 16 bit Forth systems.  Whenever you need to
	take an address off of the stack, use the following word:

AMOVE	( source destination -- )
	Assembles an appropriate sequence of instructions to move
	the address specified by the source operand to the destination
	operand, taking into account the size of the stacks on the
	Forth system.  For the 16 bit stack system, this generates:
		source   D7    WMOVE
		D7 destination LMOVE
	For the 32 bit stack system, it generates:
		source destination LMOVE

	Usually used in the form:
		SP )+  A0  AMOVE
	to get an address from the stack into an address register where
	it may be used.
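
	As a sketch, a fetch word written this way should assemble on
	either system.  The name MY-@ is hypothetical.

		CODE MY-@  ( adr -- n )
		   SP )+  A0     AMOVE
		   A0 )   SP -)  MOVE
		C;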
	
Immediate Operands:

	The following word is intended to assist in the writing of
	assembler code which assembles without change on both 16 bit
	and 32 bit systems:

N#	( immediate-operand -- )
	On a 16 bit system, equivalent to #.  On a 32 bit system,
	equivalent to L#.
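
	As a sketch, the hypothetical word PUSH-1234 below pushes a
	literal that is the width of a normal stack item on either
	system.  With # the operand would be too narrow on a 32 bit
	system, and with L# it would be too wide on a 16 bit system.

		CODE PUSH-1234  ( -- n )
		   1234 N#  SP -)  MOVE
		C;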

Coping with Missing Instructions:

	There are some 68010 instructions that this assembler doesn't
	know about.  These instructions may be assembled by writing
	them in hex followed by ",".

,	( 16-bit-value -- )
	Takes a number off the stack and places the low-order 16 bits
	into the dictionary at HERE, and advances HERE to point to the
	following location.  This version of ",", which is in the assembler
	vocabulary, is distinct from the normal Forth version.  This
	version always puts a 16 bit number in the dictionary, whereas
	the normal Forth version puts either a 16 bit or a 32 bit number,
	depending on the size of a normal stack item.
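
	As a sketch, the 68010 RTD instruction (which does not appear in
	the opcode list in this document) consists of the opcode word
	4E74 hex followed by a 16 bit displacement, so "RTD #8" could be
	assembled as shown here.  The example assumes that the assembler
	vocabulary is active, so that "," is the 16 bit version, and it
	leaves the number base set to hexadecimal.

		HEX
		4E74 ,   0008 ,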

Macros:

	Assembler macros may be written as Forth : definitions.
	For example, here is a macro which will generate code to
	relocate a 16 bit address to 32 bits, using D7 as the base
	address (with the base address constrained to start on a
	64K boundary).

	: rmove  ( src-mode dest-mode -- )
	   [ forth ] swap [ assembler ] d7 wmove
	   d7 [ forth ] swap [ assembler ] lmove
	;

	This could be used while assembling as:  "sp )+ a1 rmove"

	Note the switching to the forth vocabulary in order to get
	the Forth version of SWAP.  This is necessary because the
	68000 assembly language also contains a SWAP opcode, and
	the assembler vocabulary has the 68000 opcode version of SWAP.

Oddities, bugs, etc:

	In conventional 68000 assemblers, the numeric operand to
	the "quick" instructions (MOVEQ, ADDQ, etc) is explicitly
	specified as an immediate operand, as in
		MOVEQ  #4,D0
	In the Forth assembler, the immediate operator is not
	necessary or even permitted; the correct syntax is:
		4  D0  MOVEQ
	Note that in these instructions, the numeric operand is not
	really an immediate operand, in the narrow sense that an
	immediate operand is specified as Mode 7 Register 4, with
	the immediate data following the opcode as 1 or 2 words.

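	The register list for a MOVEM instruction is specified as
	an immediate bit mask (shown here in hex), as in:
		00ff #  sp -) movem
	which pushes all the address registers.  The complementary instruction
		sp )+  ff00 # movem
	pops the stack back into the address registers.  The two masks
	differ because the 68000 reverses the bit order of the register
	mask for the predecrement mode.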

	The assembler ought to be smarter in deciding automatically
	whether to use # or L#.  The real problem occurs for a 16 bit
	Forth, where a 32 bit number is stored on the stack as 2 separate
	16 bit stack items.  Since the interpretation of an immediate
	operand as either word-sized or long-sized depends both on the
	current size and also the particular opcode, the # word doesn't
	know whether it needs to take a word (16 bits) or a long (32 bits)
	off the stack, since it hasn't seen the opcode yet.

	Every now and then it would be nice if the assembler had labels.
	Usually the structured conditionals provide a nicer syntax and
	better looking code than labels, but there are times when you
	want to do something unstructured.  You can usually work around
	this by pushing an address on the stack and backpatching a
	branch offset into that address later, but it's ugly.

	The PCD) and PCDI) addressing modes are not currently implemented;
	Forth very seldom needs to use pc-relative addressing.


	List of opcodes:

MOVE  LMOVE  WMOVE  BMOVE    MOVEM MOVEP   MOVEQ

LINK UNLK
TRAP  TRAPV  ILLEGAL RESET   NOP  STOP

DBCC   DBGE   DBLS  DBPL DBCS   DBGT   DBLT  DBT
DBEQ   DBHI   DBMI  DBVC DBF    DBLE   DBNE  DBVS
DBRA
DBUNTIL

BRA BSR BHI BLS BCC BCS BNE BEQ
BVC BVS BPL BMI BGE BLT BGT BLE

RTE  RTS  RTR
JSR JMP

NEGX NEG NBCD
CLR NOT TST
PEA LEA
EXG  SWAP  EXT  EXTW  EXTL

SCC  SCS  SEQ SF  SGE SGT SHI SLE
SLS  SLT  SMI SNE SPL ST  SVS SVC

BTST BSET BCHG BCLR
CHK  TAS

DIVS DIVU  MULS MULU

ADD SUB  ADDQ SUBQ  ABCD SBCD  ADDX SUBX   ADDI SUBI   ADDA SUBA

CMP  OR   AND  EOR  CMPI  ORI  ANDI EORI

CMPM  ( A-REG-POSTINC  A-REG-POSTINC -- )
CMPA

ROL ROR ROXL  ASL ASR ROXR  LSL LSR

	Condition Names:

ALWAYS NEVER HI  LS  CC  CS  NE  EQ
VC     VS    PL  MI  GE  LT  GT  LE

             U>  U<= U>= U< 0<>  0=
             0>= 0<  >=   <   >  <=

	Conditionals:

BRIF ( where condition -- )
SETIF ( ea condition -- ) ( set according to condition codes )
DBUNTIL

IF    ELSE   THEN
BEGIN  WHILE  REPEAT  UNTIL AGAIN