1. Strategy
Larceny has six representations for
numbers:
The strategy taken was to generate in-line code for + and -,
by far the most common operations. The check that both operands are
fixnums is performed, and if it succeeds, the operation is performed
in-line. If the operation overflows or the tag-check fails, a millicode call is made to a subroutine for
the operation.
In addition, in-line code is generated for the arithmetic ordering
predicates and for some other predicates, although the exact set of
optimized primitive operations has been changing over time; there is
also the question of what kinds of optimizations the compiler does. For
example, if the compiler replaces (abs x) with the equivalent of
In the future, we may also generate in-line code for *,
quotient, and remainder; when Larceny was first designed
it was pointless to do so because current systems did not implement
multiplication and division in hardware.
When the millicode routine for an arithmetic operation is entered, the
arguments are in the registers RESULT and SECOND. The
millicode routine must figure out the representations of its arguments
as quickly as possible, and then perform the desired operation; this is
a dispatch on the two tags of the arguments. The dispatch is implemented
as an open-coded decision tree:
Note the code for mixed flonum/compnum arithmetic where the imaginary part
of the compnum is 0.0.
By carefully hand-coding the dispatch code in assembly language and
tuning it, the dispatch is reasonably efficient, and is not the
dominating factor in adding two flonums, for example (the time there
being spent loading data and allocating the resulting flonum on the
heap). For those really curious, here is the assembly code for generic add
from Larceny v0.25.
2. Implementation
3. Mixed-representation arithmetic
4. Bignum implementation
1. Strategy
(It is assumed that the reader has a fair understanding of numbers and
generic arithmetic in Scheme.)
fixnum exact fixed-precision integer, 29 bits signed
bignum exact arbitrary-precision integer outside the fixnum range
ratnum exact rational, represented as two exact integers
flonum inexact rational, represented as an IEEE double-precision
floating-point number
rectnum exact complex number, represented as two exact rationals
compnum inexact complex number, represented as two IEEE double-precision
floating-point numbers
From the outset it was decided that we should make certain kinds of
generic arithmetic as fast as we could without having any type
information. It was further decided that remaining kinds of generic
arithmetic should be as fast as it was convenient to make them. The
optimized cases were to be same-representation arithmetic on fixnums,
flonums, and compnums (all operations that are not unary are binary; the
compiler translates from e.g. (+ a b c) to (+ (+ a b) c)).
The operations on these combinations of arguments that were to be
optimized were primarily the common arithmetic operations. In addition,
certain efficiency concessions were to be made when operating on one flonum
and one compnum whose imaginary part is 0.0.
(let ((t x))
(if (< t 0)
(- t)
t)
then writing code generator macros that generate in-line code for
abs is pretty silly, since the back-end won't ever get to use them.
2. Implementation
On the SPARC, we use the architecture's tag bits for the fixnum tag, and
we can therefore use the tagged arithmetic instructions as well. Tagged
adds and subtracts are therefore rather cheap:
taddcc %REG1, %REG2, %RESULT
bvc,a L1
nop
overflow/non-fixnum code goes here
L1:
The branch delay slot is usually filled with the instruction at L1, so
in practice, a tagged add or subtract costs only an extra branch,
assuming taddcc is as efficiently implemented as
(let ((tag1 (tag-of RESULT)))
(cond ((= tag1 bytevector-like-tag)
(let ((hdrtag1 (hdrtag-of RESULT)))
(cond ((= hdrtag1 flonum-hdrtag)
(if (= (tag-of SECOND) bytevector-like-tag)
(cond ((= (hdrtag-of SECOND) flonum-hdrtag)
flonum operation, in-line)
((and (= (hdrtag-of SECOND) compnum-hdrtag)
(= 0.0 (imag-part SECOND)))
flonum operation, in-line)
(else
slow case))))
((= hdrtag1 compnum-hdrtag)
(if (= (tag-of SECOND) bytevector-like-tag)
(cond ((= (hdrtag-of SECOND) compnum-hdrtag)
compnum operation, in-line)
((and (= (hdrtag-of SECOND) flonum-hdrtag)
(= 0.0 (imag-part FIRST)))
flonum operation, in-line)
(else
slow case))))
((= hdrtag1 bignum-hdrtag)
(if (and (= (tag-of SECOND) bytevector-like-tag)
(= (hdrtag-of SECOND) bignum-hdrtag))
bignum operation, out-of-line
slow case))
(else
slow case))))
((= tag1 vector-like-tag)
rectnum and ratnum cases as above, out-of-line)
((= tag1 fixnum-tag)
fixnum case, in-line)
(else
slow case)))
In the above, the "slow case" makes a call to Scheme code that contains
a dispatch procedure not unlike the one exibited here, but that also
deals with non-numbers and mixed-representation arithmetic. Cases
marked "out-of-line" above are implemented in Scheme; the dispatch
procedure simply calls a known Scheme procedure via a vector installed
in the run-time system at startup-time. This is also true for bignum
operations, which are explained in more detail below.
3. Mixed-representation arithmetic
Mixed-representation arithmetic is implemented in Scheme by first
converting the operands to a common representation and then applying the
original operator again. The rules for representation conversion are
mostly straightforward -- you convert to the representation that will
allow both numbers to be represented, if necessary sacrificing exactness
and precision by going to an inexact representation -- but are made
complicated by the insistence of the IEEE standard and the R4RS that
operations be transitive. For this reason, bignum/flonum mixed
arithmetic has to be handled specially: If the flonum represents an
integer, then rather than converting the bignum to a flonum as most
languages (including Common Lisp) would do, thereby losing some
precision of the bignum, the flonum should be converted to a bignum, the
operation should be performed on bignums, and the result (if a number)
should then be converted to a flonum. Otherwise, the bignum operand is
converted to a flonum and the operation is performed using flonums,
yielding a flonum result (if a number).
4. Bignum implementation
(FIXME: Needs to be finished.)
Arithmetic on bignums (arbitrary-implementation exact integers) is
implemented exclusively in Scheme. The algorithms used in the
implementation are the "Classical Algorithms" from Knuth Volume II
(FIXME: need section reference), and the implementation is structured as
suggested in the following paper:
Jon L White, Reconfigurable, Retargetable Bignums:
A Case Study in Efficient, Portable Lisp System Building,
Proceedings of the ACM conference on Lisp & FP, 1986.
(One extension to the system millicode was necessary in order to support
a correct and workable implementation.)
$Id: note3-arithmetic.html 87 1998-11-25 14:38:41Z lth $
larceny@ccs.neu.edu