9                 Structured Data with Bit Fields
9


9                          Mitch Bradley
9

    A structured data item is a data  item  composed  of  several
subfields of various types, analogous to the `struct' declaration
of C or the `RECORD' types of Pascal and Modula-2.

    Structured data types are trivial to implement  in  Forth  if
each  subfield  is  constrained  to  occupy an integral number of
machine bytes.  The problem becomes more interesting if subfields
may  include  groups  of bits within a byte or word, as in C `bit
fields' or Modula-2 `BITSET's.  Such bit fields arise  frequently
when  dealing with hardware registers, which often contain groups
of bits which control independent hardware functions.

    This paper presents a set of words for conveniently  defining
and  accessing  structured  data,  which  may contain bit fields.
Their utility is shown with a real example -- manipulation  of  a
disk controller.

Introduction

    A  number  of  proposals  for  data  structuring  words  have
recently  appeared  in  the  literature  [WALK][BASI][JOOS].  The
scheme that I present is simpler than most of these, and probably
not  as powerful.  My particular emphasis in this paper is not on
data structures in general, but on data structures with  imbedded
bit fields.

    Structures or records are  usually  used  to  group  together
related  data items.  As an example, consider the following frag-
ment of C [KERN] code.

        struct controller-registers {
               unsigned short   data;
               unsigned short   command;
               unsigned long    dma-base;
               unsigned short   dma-count;
        };

This structure declares that there is  a  structure  type  called
`controller-registers'.   This  structure  contains  four fields,
three of them short unsigned integers (2 bytes each), and one  of
the a long unsigned integer (4 bytes)[1].
_________________________
9  [1] Those of you familiar with C will no doubt  note  that


9                         March 20, 1986


                              - 2 -


    The structure could later be used by declaring a  pointer  to
that structure, as in:

        struct controller-registers *ptr;


    To access the dma-count register, one could then use

        ptr->dma-count


    From a code generation standpoint, this is not hard to imple-
ment.   All  the  compiler  has to do is to remember what type of
object is represented by `dma-count'  and  its  offset  from  the
start of the structure.

    When Forth programmers are faced with accessing a data struc-
ture like this one, the tendency is to use numerical offsets into
the data structure.  The  offset  calcualation  is  usually  done
explicitly  when  the  object is referenced, using perhaps `8 +'.
Explicitly calculating offsets is markedly inferior to having  it
done  automatically  for you, though.  Explicit offsets buried in
code make the code hard to  understand  and  especially  hard  to
modify.

    A structure capability that automatically calculates  offsets
is extremely easy to implement in Forth.

        : field ( offset size --- ofset+size )
          create over , +
          does> ( fd pfa --- fd+offset )
                 @ +
        ;
        : struct 0 ;

This may be used as follows:

        struct ( controller-registers )
            2 field   data
            2 field   command
            4 field   dma-base
            2 field   dma-count
        constant  controller-registers-size

The word `field' just keeps a  running  account  of  the  current
offset  at  compile  time, and creates a new word which when exe-
cuted will add its offset to the number on the stack.   The  word
`struct'  simply  initializes the current offset to 0.  After the
structure definition is finished, it is  convenient  to  remember
_________________________
dma-base  should  probably  be a (char *) instead of an (un-
signed long).  However,  it  is  easier  to  explain  as  it
stands.


                         March 20, 1986


                              - 3 -


the total size of the structure by defining a constant, using the
size that is already on the stack.

    Later the fields may be used as follows, assuming we  have  a
variable  `ptr' which contains the address of an instance of this
structure.

                  ptr @ data @    ( fetches the contents of the data field )

        100000.   ptr @ dma-base 2!   ( sets the dma-base field to 100000. )


Bitfields

    So far this has been pretty trivial, and hardly worth writing
a  paper about.  So lets make the problem a little more interest-
ing.  Suppose that we want to define some  fields  which  do  not
occupy  an  integral  number  of bytes.  This situation can arise
when there are data items that are very short, say 1 bit, and you
don't  want  to  waste  an  entire byte just to store 1 bit.  The
situation also  frequently  arises  when  dealing  with  hardware
registers,  which often pack several barely-related bits within a
single byte or word.  Coping with  single  bits  isn't  too  bad,
because  it  is  easy  to  operate  on  the bit with a mask and a
boolean operator, setting or clearing the bit  as  desired.   The
real  pain  comes when there is something like a 3-bit field that
represents a number, and that field is somewhere in the middle of
a byte.  Then the problem involves a lot of masking and shifting,
which is annoying to write and hard to read.

    C copes with this problem by providing a  `bitfield'  defini-
tion  capability within structures.  Bitfields allow the program-
mer to give a symbolic name to a group of bits within a  word  or
byte,  and  then to access that group of bits as though it were a
small integer, without having to do explicit chifting  and  mask-
ing.   I  have attempted to provide this capability within Forth,
and  have  succeeded  to  some  extent.   Here's  how  it  works,
described  by  example.  This segment of code comes from a driver
for a Xylogics Model 450 disk controller controller board[2].


9_________________________
9  [2] The funny byte numbering (1 first, then 0) is a result
of  the fact that the controller manual numbers the bytes in
`little endian' order, with the least significant byte of  a
word  at the lowest address, whereas the code is written for
a 68000 machine, which is `big endian', with the most signi-
ficant  byte  of a word at the lowest address.  This problem
is well-known to anybody who has ever tried to write  `port-
able' code to deal with real I/O devices.  It ranks as a ma-
jor nuisance.


                         March 20, 1986


                              - 4 -


        struct ( xyiopb )
        ( Byte 1 )
          1 resbits
          1 bits  intrall ( interrupt on all iopbs (450)
          1 bits  intrerr ( interrupt on error )
          1 bits  reserve ( reserve dual port drive (450)
          1 bits  recal   ( recalibrate on seek errors (450)
          1 bits  enabext ( enable extensions (450)
          2 bits  eccmode ( ECC actions )
        ( Byte 0 )
          1 bits  autoup  ( auto update of IOPB )
          1 bits  reloc   ( use relocation )
          1 bits  chain   ( command chaining )
          1 bits  ie      ( interrupt enable )
          4 bits  cmd     ( command )
        ( Byte 3 )
          1 field errno   ( error number )
        ( Byte 2 )
          1 bits  iserr   ( error indicator )
          2 resbits
          3 bits  ctype   ( controller type )
          1 resbits
          1 bits  complete ( completion code valid (450)
        ( Byte 5 )
          2 bits  drive   ( drive type )
          4 resbits
          2 bits  unit    ( unit number )
        ( Byte 4 )
          1 bits  bytebus ( use byte transfers )
          4 bits  intrlv  ( interleave - 1 (450)
          3 bits  throttle ( throttle control )
        ( remaining are not bit fields )
          1 field sector  ( 7: sector number )
          1 field head    ( 6: head number )
          2 field cylinder ( 9,8: cylinder number )
          0 field nsect   ( b,a: sector count )
          2 field status  ( low byte is status )
          2 field bufoff  ( d,c: buffer offset )
          2 field bufrel  ( f,e: buffer offset )
          1 +  ( 11: reserved byte )
          1 field bhead   ( 10: base head (450)
          2 field nxtoff  ( 13,12: next iopb offset )
          2 field eccpatt ( 15,14: ECC pattern )
          2 field eccaddr ( 17,16: ECC address )
        constant xyiopbsize


    Take a look at `Byte 1'.  The 8 bits of this byte  control  6
different  functions.   The  most-significant-bit  is not used (1
resbits).  The next 5 bits control  5  separate  modes,  and  the
final  2  bits  are  encoded  to  select one of four possible ecc
modes.  Other bytes have different configurations of  bits.   The
code  to  control  this disk controller could obviously get quite


                         March 20, 1986


                              - 5 -


contorted if the programmer had to do explicit shifting and mask-
ing in order to manipulate all these little pieces of bytes.

    Here is a section of the code which accesses this  structure.
Note  how  the use of the bitfield names has made the function of
the code obvious.

        : setiopb ( buf. cnt sec head cyl cmd -- )
          iopb xyiopbsize    0 fill
          ( ... cmd ) iopb cmd      bit!
          ( ... cyl ) iopb cylinder !
          ( ... head) iopb head     c!
          ( ... sec ) iopb sector   c!
          ( ... cnt ) iopb nsect    !
          ( ... bufhi ) 15 and   12  shift
                        iopb bufrel !
          ( ... buflo ) iopb bufoff !

          drivetype  @ iopb drive    bit!
          unitno     @ iopb unit     bit!
          xythrottle @ iopb throttle bit!

          1 iopb autoup  bit!
          1 iopb reloc   bit!
        ( specific to the 450 -- not on the 440 )
          1 iopb enabext bit!
          basehead @ iopb bhead  c!
          2 iopb eccmode bit!
        ;


    The rabid Forth programmer will no doubt complain  that  this
definition  is  too  long.  This is a valid point.  The excuse is
that this bit of code  was  translated  almost  verbatim  from  a
known-working C program, and I had no motivation whatsoever to do
a stylistic transformation strictly for the sake of `purity'.  If
it isn't broken, don't fix it.

Implementation

    The code to implement bit field defining, fetching, and stor-
ing  follows.  It is deceptively simple, but its use represents a
major improvement in the writeability  and  understandability  of
programs that need to deal with such structures.  Both high-level
and 68000 assembly language versions are given.  It was  actually
easier  to  write `bit!' in assembly language than in high-level,
because the stack manipulations were complicated.

    This package specifically does not allow bit fields to  cross
byte  boundaries.  It would not be hard to extend it to allow bit
fields to be anywhere within a word.  This particular application
benefited  from  the byte-only restriction, because the disk con-
troller hardware had some funny byte-only  registers.   Undesire-
able  side  effects  resulted  from  using a word operation which


                         March 20, 1986


                              - 6 -


accessed 2 of them at once.

    I suspect that allowing bit  fields  to  cross  n-byte  boun-
daries,  where  n is the number of bytes in the machines hardware
data paths, would make the code a lot more complicated.

Acknowledgement

    The syntax for defining fields, where the current  offset  is
kept on the stack at compile time, was suggested at a FIG meeting
by (I believe) Glenn Tenney.

References

[BASI] Basile, James, Implementing Datastructures in FORTH  _J_o_u_r.
_F_O_R_T_H _A_p_p_l. _R_e_s. 1, 2, December, 1983.  pp.5-16.

[JOOS] Joosten, Rieks, and Nieuwenhuysen, Hans, Vectoring  Arrays
of  Structures _J_o_u_r. _F_O_R_T_H _A_p_p_l. _R_e_s. 1, 2, December 1983. pp.35-
54

[KERN]  Kernighan,  B.,  and  Ritchie,  D.,   _T_h_e  _C  _P_r_o_g_r_a_m_m_i_n_g
_L_a_n_g_u_a_g_e Prentice-Hall, 1978.

[WALK] Walker, Bruce W., PL/1 Data Structures _F_O_R_T_H _D_i_m_e_n_s_i_o_n_s 5,
6, March/April 1984, pp.8-12.


                         March 20, 1986


                              - 7 -


        ( Forth 83 )
        ( structures with bit fields )
        variable bitoffset

        : bitalign ( offset size -- offset' size )
        ( moves to next byte boundary if the bit offset is nonzero )
          bitoffset @ if
            0 bitoffset !   swap 1+ swap
          then
        ;

        : struct 0 bitoffset ! 0 ;
        : field ( offset size -- newoffset )
            create bitalign over , +
            does> ( structure-base -- structure-member-address )
                  @ +
        ;
        ( reserves an n-bits field )
        : resbits ( offset #bits -- newoffset )
          bitoffset @ +   8 /mod  ( offset bitoffset_mod_8 bitoffset/8 )
          swap bitoffset !  +
        ;
        ( bits are allocated from left-right, or from msb to lsb. )
        ( however, they are numbered from lsb=0 to msb=7 )
        ( left-right generates the correct bit number )
        : left-right-bit# ( #bits -- #bits bit# )
          bitoffset @ over +  8 swap -
          dup 0< abort" Bit fields can't cross byte boundaries"
        ;

        ( structure describing the layout of information in the parameter field )
        ( of a word created by "bits" )

        struct ( bit-record )
          2 field bit-offset
          2 field bit#
          2 field #bits
        constant bit-record-size

        ( defines an n-bit field )
        : bits ( offset #bits --namex newoffset )
          create  over , left-right-bit# , dup ,  ( offset #bits )
          resbits
          does> ( struct_addr pfa -- byte_addr bit# #bits )
             >r
             r@ bit-offset @ +  ( byte_addr )
             r@ bit#       @    ( byte_addr bit# )
             r> #bits      @    ( byte_addr bit# #bits )
        ;

        hex
        create masks
          0 c, 1 c, 3 c, 7 c, 0f c, 1f c, 3f c, 7f c, ff c,


                         March 20, 1986


                              - 8 -


        decimal

        : bitmask ( n -- mask ) masks + c@ ;

        : bit@ ( byte_addr bit# #bits -- n )
          rot c@           ( bit# #bits byte )
          rot negate shift ( #bits shifted-byte )
          swap bitmask and
        ;
        : bit! ( n byte_addr bit# #bits -- )
          bitmask         ( n byte_addr bit# mask )
          rot >r          ( n bit# mask )
          rot over and    ( bit# mask masked-n )
          2 pick shift    ( bit# mask shifted-masked-n )
          -rot            ( shifted-masked-n bit# mask )
          swap shift      ( shifted-masked-n shifted-mask )
          not             ( shifted-masked-n shifted-inverted-mask )
          r@              ( shifted-masked-n shifted-inverted-mask byte_addr )
          c@ and          ( shifted-masked-n masked-old-value )
          or              ( new-value )
          r>  c!
        ;

         68000 assembly language version of bit@ and bit!

         code bit@ ( byte_addr bit# #bits -- n )
           word
           sp )+ d0 move  ( d0 is #bits )
           masks # d7 move   d7 a0 lmove  ( relocated address of mask table )
           byte
           0  d0  a0  di)  d0 move        ( mask to d0 )
           word
           sp )+ d1 move  ( d1 is bit# )
           sp )+ d7 move   d7 a0 lmove   ( relocate 16 bit address to 32 bits )
           long d2 clr
           byte a0 ) d2 move
           d1 d2 lsr
           d0 d2 andw
           word d2 sp -) move
         c;

         code bit! ( n byte_addr bit# #bits -- )
           word
           sp )+ d0 move  ( d0 is #bits )
           masks # d7 move   d7 a0 lmove  ( relocated address of mask table )
           byte
           0  d0  a0  di)  d0 move        ( mask to d0 )
           word
           sp )+ d1 move  ( d1 is bit# )
           sp )+ d7 move   d7 a0 lmove   ( relocate 16 bit address to 32 bits )
           sp )+ d2 move  ( d2 is n )
           d0 d2 andw
           d1 d2 lsl ( masked, shifted, n in d2 )
           d1 d0 lsl ( shifted mask in d0 )


                         March 20, 1986


                              - 9 -


           d0 notw ( inverted mask in d0 )
           byte d0 a0 ) andw
                d2 a0 ) orw
           word
         c;


                         March 20, 1986