322 lines
11 KiB
ReStructuredText
322 lines
11 KiB
ReStructuredText
|
Write You a Forth, 0x06
|
||
|
-----------------------
|
||
|
|
||
|
:date: 2018-02-28 22:55
|
||
|
:tags: wyaf, forth
|
||
|
|
||
|
Lots of updates last night; SLOC-wise, I added a bunch of new definitions:
|
||
|
|
||
|
+ ``DEPTH``, ``.`` and ``.S`` to inspect the stack
|
||
|
+ ``/MOD``, ``*/``, and ``*/MOD``, which required adding some idea of a long
|
||
|
type
|
||
|
+ ``0<``, ``0=``, ``0>``, ``<``, ``=``, and ``>`` for conditionals
|
||
|
+ ``DUP`` and ``?DUP``
|
||
|
+ the logical operators ``AND``, ``OR``, and ``NEGATE``
|
||
|
+ ``ABS``
|
||
|
+ ``BYE`` moved from an interpreter hack to a defined word
|
||
|
+ ``D+`` and ``D-`` started me off on the concept of double numbers
|
||
|
+ ``DROP``, ``OVER``, and ``ROLL`` are more stack manipulation functions
|
||
|
|
||
|
It's starting to feel a lot like a Forth...
|
||
|
|
||
|
Speaking of SLOC, for shits and grins I decided to see how the code base has
|
||
|
grown:
|
||
|
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| revision | lines of code | growth | focus | exec size (b) |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| 0x02 | 133 | n/a | starting point | 38368 |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| 0x03 | 245 | 1.8x | parsing | 40920 |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| 0x04 | 369 | 1.5x | stack / numerics | 48736 |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| 0x05 | 677 | 1.8x | initial dictionary | 62896 |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
| 0x06 | 1436 | 2.1x | expanding vocabulary | 85256 |
|
||
|
+-----------+---------------+--------+----------------------+---------------+
|
||
|
|
||
|
Note that the executable is compiled with ``-O0 -g`` on the
|
||
|
``x86_64-linux-gnu`` target.
|
||
|
|
||
|
It makes sense that expanding the vocabulary is going to be a huge code
|
||
|
expansion. I did do more than that; so, I'm not really going to show most of
|
||
|
the work I did for the new words (a lot of it is repetative and mechanical).
|
||
|
|
||
|
System updates
|
||
|
^^^^^^^^^^^^^^
|
||
|
|
||
|
Before I started expanding the dictionary, though, I made some changes to
|
||
|
the ``System``::
|
||
|
|
||
|
$ git diff HEAD^ system.h
|
||
|
diff --git a/system.h b/system.h
|
||
|
index 00f4a34..91aa1fa 100644
|
||
|
--- a/system.h
|
||
|
+++ b/system.h
|
||
|
@@ -5,11 +5,24 @@
|
||
|
#include "io.h"
|
||
|
#include "stack.h"
|
||
|
|
||
|
+typedef enum _SYS_STATUS : uint8_t {
|
||
|
+ STATUS_OK = 0,
|
||
|
+ STATUS_STACK_OVERFLOW = 1,
|
||
|
+ STATUS_STACK_UNDERFLOW = 2,
|
||
|
+ STATUS_EXECUTION_FAILURE = 3,
|
||
|
+ STATUS_UNKNOWN_WORD = 4
|
||
|
+} SYS_STATUS;
|
||
|
+
|
||
|
+class Word;
|
||
|
+
|
||
|
typedef struct _System {
|
||
|
Stack<KF_INT> dstack;
|
||
|
IO *interface;
|
||
|
- struct Word *dict;
|
||
|
+ Word *dict;
|
||
|
+ SYS_STATUS status;
|
||
|
} System;
|
||
|
|
||
|
+void system_clear_error(System *sys);
|
||
|
+void system_write_status(System *sys);
|
||
|
|
||
|
#endif // __KF_CORE_H__
|
||
|
\ No newline at end of file
|
||
|
|
||
|
I've started adding a notion of system state, which I've deliberately kept
|
||
|
separate from the parser state. The new functions aren't particularly
|
||
|
interesting; they just write a string to the ``interface`` field so you
|
||
|
get things like::
|
||
|
|
||
|
$ ./kforth
|
||
|
kforth interpreter
|
||
|
? swap
|
||
|
stack underflow (error code 2).
|
||
|
? what-word?
|
||
|
unknown word (error code 4).
|
||
|
? 2
|
||
|
ok.
|
||
|
|
||
|
Note that this is separate from the parser errors::
|
||
|
|
||
|
$ ./kforth
|
||
|
kforth interpreter
|
||
|
? AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA
|
||
|
ok.
|
||
|
parse error: token too long
|
||
|
? unknown word (error code 4).
|
||
|
?
|
||
|
|
||
|
Though this test does show that the interpreter could be made more robust.
|
||
|
|
||
|
I/O updates
|
||
|
^^^^^^^^^^^
|
||
|
|
||
|
The next thing I did was move the ``write_dstack`` function into ``io.cc``;
|
||
|
this is needed to implement ``.S``. While I was at it, I decided to make
|
||
|
``write_num`` finally work well and correctly, and I think I've got the final
|
||
|
version done::
|
||
|
|
||
|
void
|
||
|
write_num(IO *interface, KF_INT n)
|
||
|
{
|
||
|
char buf[nbuflen];
|
||
|
uint8_t i = nbuflen - 1;
|
||
|
memset(buf, 0, nbuflen);
|
||
|
|
||
|
if (n < 0) {
|
||
|
interface->wrch('-');
|
||
|
}
|
||
|
|
||
|
I'm still not proud of this hack, but it seems to be the best way to deal with
|
||
|
this right now::
|
||
|
|
||
|
else if (n == 0) {
|
||
|
interface->wrch('0');
|
||
|
return;
|
||
|
}
|
||
|
|
||
|
while (n != 0) {
|
||
|
char x = n % 10;
|
||
|
|
||
|
This was the magic that finally got it right: negating the digits as they're
|
||
|
going into the buffer. No more trying to invert the whole number, just each
|
||
|
digit::
|
||
|
|
||
|
x = x < 0 ? -x : x;
|
||
|
x += '0';
|
||
|
buf[i--] = x;
|
||
|
n /= 10;
|
||
|
}
|
||
|
|
||
|
interface->wrbuf(buf+i, nbuflen - i);
|
||
|
}
|
||
|
|
||
|
My first pass at this wrote the string forwards, then reversed it. I didn't
|
||
|
like that; while performance isn't my first concern, it just seemed like a
|
||
|
fun challenge to get the reversed buffer written correctly.
|
||
|
|
||
|
While I was in the I/O subsystem, I also decided to update the ``IO`` base
|
||
|
class to include a ``newline`` method; I had a few instances of
|
||
|
``interface->wrch('\n')`` sprinkled throughout, but that won't necessarily be
|
||
|
correct elsewhere.
|
||
|
|
||
|
Miscellaneous updates
|
||
|
^^^^^^^^^^^^^^^^^^^^^^
|
||
|
|
||
|
I add a new definition to the ``defs.h`` files: a ``KF_LONG`` type to prepare
|
||
|
for the double numbers mentioned in the next section, and switched to static
|
||
|
compilation.
|
||
|
|
||
|
New words!
|
||
|
^^^^^^^^^^
|
||
|
|
||
|
Finally, I started adding the new words in. I'm still trying to figure out a
|
||
|
good way to handle the address types (I think I'll just introduce a ``KF_ADDR``
|
||
|
type) so I've punted on those for now.
|
||
|
|
||
|
.. _pforth: http://www.softsynth.com/pforth/
|
||
|
.. _gforth: https://www.gnu.org/software/gforth/
|
||
|
|
||
|
One of the interesting challenges is dealing with the double numbers. These are
|
||
|
on the stack as a pair of smaller numbers, e.g. if the double number type is 64
|
||
|
bits and the standard number type is 32 bits, then you might see something like
|
||
|
this (via pforth_)::
|
||
|
|
||
|
0 1 0 1 D+
|
||
|
ok
|
||
|
Stack<10> 0 2
|
||
|
|
||
|
So, how to deal with this? There's a ``D.`` word, which I don't have
|
||
|
implemented yet, that will let me see what pforth_ and gforth_ do::
|
||
|
|
||
|
$ pforth -q
|
||
|
Begin AUTO.INIT ------
|
||
|
0 1 D. 1 0 D.
|
||
|
4294967296 1
|
||
|
^C
|
||
|
$ gforth
|
||
|
Gforth 0.7.2, Copyright (C) 1995-2008 Free Software Foundation, Inc.
|
||
|
Gforth comes with ABSOLUTELY NO WARRANTY; for details type `license'
|
||
|
Type `bye' to exit
|
||
|
0 1 D. 1 0 D. 18446744073709551616 1 ok
|
||
|
|
||
|
So, it looks like the first number on the stack is the low part, and the second
|
||
|
is the high part. This is, once again, pretty straightforward: I'll need to
|
||
|
shift the first number by the appropriate number of bits and then add the
|
||
|
second number to it.
|
||
|
|
||
|
constexpr size_t dshift = (sizeof(KF_INT) * 8) - 1;
|
||
|
|
||
|
static bool
|
||
|
pop_long(System *sys, KF_LONG *d)
|
||
|
{
|
||
|
KF_INT a = 0;
|
||
|
KF_INT b = 0;
|
||
|
if (!sys->dstack.pop(&a)) {
|
||
|
sys->status = STATUS_STACK_UNDERFLOW;
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
if (!sys->dstack.pop(&b)) {
|
||
|
sys->status = STATUS_STACK_UNDERFLOW;
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
*d = static_cast<KF_LONG>(a) << dshift;
|
||
|
*d += static_cast<KF_LONG>(b);
|
||
|
sys->status = STATUS_OK;
|
||
|
return true;
|
||
|
}
|
||
|
|
||
|
This function also shows off the new status work and how that turns out. I've
|
||
|
kept the exec interface as a boolean to indicate success or failure.
|
||
|
|
||
|
To push the results back onto the stack, I needed to first write a masking
|
||
|
function to make sure to clear out any lingering bits::
|
||
|
|
||
|
static inline KF_INT
|
||
|
mask(size_t bits)
|
||
|
{
|
||
|
KF_INT m = 0;
|
||
|
|
||
|
for (size_t i = 0; i < bits; i++) {
|
||
|
m += 1 << i;
|
||
|
}
|
||
|
|
||
|
return m;
|
||
|
}
|
||
|
|
||
|
I should probably check `Hacker's Delight <http://hackersdelight.org/>`_ to see
|
||
|
if there's any tricks for this.
|
||
|
|
||
|
With the mask available, getting a long into a pair of ints requires shifting
|
||
|
and clearing for the high part and clearing for the low part::
|
||
|
|
||
|
static bool
|
||
|
push_long(System *sys, KF_LONG d)
|
||
|
{
|
||
|
KF_INT a = static_cast<KF_INT>((d >> dshift) & mask(dshift));
|
||
|
KF_INT b = static_cast<KF_INT>(d & mask(dshift));
|
||
|
|
||
|
if (!sys->dstack.push(b)) {
|
||
|
sys->status = STATUS_STACK_OVERFLOW;
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
if (!sys->dstack.push(a)) {
|
||
|
sys->status = STATUS_STACK_OVERFLOW;
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
sys->status = STATUS_OK;
|
||
|
return true;
|
||
|
}
|
||
|
|
||
|
One of the words that interacts with doubles is ``D+``::
|
||
|
|
||
|
static bool
|
||
|
dplus(System *sys)
|
||
|
{
|
||
|
KF_LONG da, db;
|
||
|
|
||
|
if (!pop_long(sys, &da)) {
|
||
|
// Status is already set.
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
if (!pop_long(sys, &db)) {
|
||
|
// Status is already set.
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
da += db;
|
||
|
|
||
|
if (!push_long(sys, da)) {
|
||
|
// Status is already set.
|
||
|
return false;
|
||
|
}
|
||
|
|
||
|
// Status is already set.
|
||
|
return true;
|
||
|
}
|
||
|
|
||
|
The only other thing I really did was to add a ``remove`` method to the Stack
|
||
|
class to support ``ROLL``.
|
||
|
|
||
|
Huge diff, but not as much to say about it --- next up, I think I'm going to
|
||
|
introduce the ``KF_ADDR`` type and start working on some of the address
|
||
|
interaction stuff. I'll also add more of the double number words, too. The
|
||
|
words I still have to implement from the `FORTH-83 standard`_ nuclear layer
|
||
|
are:
|
||
|
|
||
|
+ ``!``, ``+!``, ``@``, ``C!``, ``C@``, ``CMOVE``, ``CMOVE>``, ``COUNT``,
|
||
|
``FILL``: memory manipulation words
|
||
|
+ ``DNEGATE``, ``MAX``, ``MIN``, ``MOD``, ``XOR``: more arithmetic words
|
||
|
+ ``EXECUTE``, ``EXIT``, ``I``, ``J``, ``PICK``: various words
|
||
|
+ ``>R``, ``R>``, ``R@``: return stack words
|
||
|
+ ``U<``, ``UM*``, ``UM/MOD``: unsigned math words
|
||
|
|
||
|
.. _FORTH-83 standard: http://forth.sourceforge.net/standard/fst83/fst83-12.htm
|
||
|
|
||
|
As before, the snapshot for this update is tagged `part-0x06
|
||
|
<https://github.com/kisom/kforth/tree/part-0x06>`_.
|