Table of Contents
Larceny implements the Scheme programming language as defined by IEEE Standard 1178-1990, the Revised5 Report, the Revised6 Report, and ERR5RS. Those language standards serve as Larceny's primary documentation.
This manual describes aspects of Larceny that are not described by the IEEE standard, the Revised Reports, or ERR5RS. For the most current version of this manual, please see Larceny's online documentation page. For links to the Common Larceny User Manual and the Larceny mailing list, please visit Larceny's main web page.
To report bugs, please send email to the Larceny
developers at <larceny@ccs.neu.edu>
, or submit a
bug ticket using Larceny's
Trac system.
There are three main varieties of Larceny.
Native Larceny is the fastest and most convenient variety of Larceny. It compiles directly to native machine code for Intel x86-32 or SPARC microprocessors running Windows, Linux, MacOS X, or Solaris operating systems.
Petit Larceny compiles to C instead of machine code. It runs on most Unix machines, including PowerPC Macintoshes with MacOS X.
Common Larceny compiles to JIT-compiled IL on Microsoft's Common Language Runtime (CLR) or Mono. It provides access to the .NET libraries from Scheme.
The current versions of Larceny are available for download at Larceny's main web page.
Twobit and Larceny are distributed in two forms: as a precompiled binary, or as source code that can be used to reconstruct any of the precompiled binary distributions. Unless you intend to modify Larceny yourself, you do not need to download the source code.
If you are installing or running Common Larceny, please consult the Common Larceny User Manual instead of becoming confused by the instructions in this manual.
Unpack the distribution files with an appropriate command such as one of the following:
tar -xzf larceny-X.Y-bin-native-sparc-solaris.tar.gz tar -xzf larceny-X.Y-bin-native-ia32-macosx.tar.gz tar -xzf larceny-X.Y-bin-native-ia32-linux86.tar.gz tar -xzf larceny-X.Y-bin-native-ia32-win32.tar.gz tar -xzf larceny-X.Y-bin-petit-stdc-macosx.tar.gz tar -xzf larceny-X.Y-src.tar.gz
That will create a directory with a similar name (but without
the .tar.gz
suffix) in your current working directory.
That is the Larceny root directory, which you may rename
to something shorter, such as larceny
; the rest of this
section will refer to it by that name.
Assuming you have unpacked a binary distribution, the larceny
directory will contain the following files:
larceny.bin Run-time system larceny.heap Heap image with all libraries, FFI, and compiler larceny Shell script that runs larceny.heap scheme-script Shell script that runs Scheme scripts compile-stale Scheme script that compiles ERR5RS/R6RS libraries startup.sch Pathnames for the require and autoload features
If you unpacked a binary distribution, then you should be able to
run it immediately by making the larceny
directory your current
working directory and invoking ./larceny
or perhaps just larceny
.
If you unpacked the source code there will be many other files and
directories, but larceny.bin
and larceny.heap
will not be present.
You can reconstruct the larceny.bin
and larceny.heap
files from their
source code, but that process requires a working version of Larceny
or PLT Scheme v37x; unless you're porting Larceny or Petit Larceny to a
brand new target architecture, it's easier to obtain those
files from a binary distribution of Larceny.
You may add the larceny
directory to your standard path,
or you may install Larceny into a directory that is already
part of your standard path.
Suppose, for example, that you want to install Larceny
in /usr/local/bin
and /usr/local/lib/larceny
.
Copy larceny
and scheme-script
to /usr/local/bin
and edit the definition of LARCENY_PATH
at the head
of each file to point to the correct directory:
LARCENY_PATH=/usr/local/lib/larceny
Then move the entire larceny
directory to /usr/local/lib/larceny
,
or copy larceny.bin
, larceny.heap
, startup.sch
, and the lib
directory to /usr/local/lib/larceny
.
You should now be able to run Larceny from any directory
by typing "larceny
" at a prompt.
Before you can run Larceny in ERR5RS or R6RS modes, you may
have to compile the ERR5RS/R6RS runtime and standard libraries.
This step is definitely required if you are using Petit Larceny
or building any variety of Larceny from source code.
With the prebuilt native varieties of Larceny, however, this step
should not be necessary unless you change one of the files in
lib/R6RS
or lib/SRFI
.
If the lib/R6RS
directory and its subdirectories are read-only,
then the standard libraries will not be touched, modified, or
recompiled by accident.
Compiling the ERR5RS/R6RS runtime and standard libraries is accomplished as follows:
$ ./larceny Larceny v0.96 "Fluoridation" (...) > (require 'r6rsmode) > (larceny:compile-r6rs-runtime) > (exit)
Compiling the ERR5RS/R6RS runtime as shown above causes all previously compiled ERR5RS/R6RS libraries and top-level programs to become stale. That means those previously compiled files will need to be recompiled or removed.
Larceny runs in any of four distinct modes:
R5RS traditional read/eval/print loop (the default) ERR5RS ERR5RS read/eval/print loop R6RS batch execution of R6RS top-level programs Scheme script batch execution of R6RS Scheme scripts
These modes correspond to the four distinct kinds of Scheme programs that are described by the current de facto standards for Scheme. (IEEE/ANSI Std 1178 is so similar to the R5RS standard that Larceny implements both languages in R5RS mode.)
When you start Larceny in R5RS mode (the default), you will be presented with a banner message and the read-eval-print loop's prompt:
% larceny Larceny vX.Y "<version_name>" (MMM DD YYYY HH:MM:SS, ...) larceny.heap, built ... >
You can enter a Scheme expression at the prompt. After a complete expression has been read, it will be evaluated and its results printed.
In native Larceny, the expression is evaluated by compiling it to native machine code, which is then executed. In Petit Larceny, the expression is evaluated by an interpreter because compiling to C, running the C compiler, and loading the compiled C code would take too long. Interpreted code behaves like compiled code, so most of what this manual says about the compiler is also true of Petit Larceny's interpreter.
By default, Larceny's Twobit compiler makes several assumptions that allow it to generate faster code; for example, the compiler assumes Scheme's standard procedures will not be redefined. To obtain strict conformance to R5RS semantics, see the section of this user manual devoted to performance.
To interact with Larceny's ERR5RS read/eval/print loop,
specify the -err5rs
option on Larceny's command line:
% larceny -err5rs Larceny v0.95 "First Safety" (...) ERR5RS mode (no libraries have been imported)
Since no libraries have been imported, the only forms you can evaluate are constant literals, variable references (but no variables have been imported!), procedure calls (but no procedure values are accessible!), library definitions, and import forms. The first thing you'll want to do is to import some libraries, such as:
> (import (rnrs base) (rnrs io simple) (err5rs records syntactic))
Once you have imported (rnrs base)
or a composite library
that includes it, you can evaluate definitions and use
all other syntax and variables you have imported.
As an extension to ERR5RS, Larceny attempts to load libraries automatically when they are first imported. Autoloading makes interactive development and separate compilation much more convenient.
All of Larceny's predefined libraries can be autoloaded.
To enable autoloading of other ERR5RS/R6RS libraries, you can:
-path
command-line option
LARCENY_LIBPATH
environment variable
current-require-path
startup.sch
in Larceny's root directory
lib
directory
Larceny automatically loads ERR5RS/R6RS libraries when
they are first imported. This is usually the most
convenient way to load a library, but autoloading
can't be used to load a top-level program. Explicit
loading is needed for top-level programs, for libraries
that don't reside in Larceny's
current-require-path
,
and for libraries that are defined in files whose names
do not follow Larceny's
standard naming conventions.
Explicit loading is the only portable way for ERR5RS programs to load a library.
Explicit loading also makes it possible to write portable programs whose source files conform to both the R5RS and ERR5RS. Two different configuration files, one for the R5RS and one for ERR5RS, can perform all of the imports and loads needed to run the program.
For explicit loading of nonstandard libraries, top-level programs, or unadorned R5RS-style code from a file, you must first import a suitable load procedure:
> (import (err5rs load))
Loading a library does not automatically import it. To use the variables and syntax that are exported by a library, you must import that library explicitly:
> (load "lib/R6RS/larceny/benchmarking.sls") > (import (larceny benchmarking)) > (time (vector-for-each + (make-vector 1000000 0))) Words allocated: 3095752 Words reclaimed: 0 Elapsed time...: 111 ms (User: 104 ms; System: 8 ms) Elapsed GC time: 4 ms (CPU: 4 in 8 collections.)
In Larceny, you may omit the call to load
because the
(larceny benchmarking)
library will be autoloaded when
it is imported. In other ERR5RS systems, however, you
may have to load all of the nonstandard libraries that will
be imported by a top-level program or library before you
load that top-level program or library.
You do not have to import those libraries into the ERR5RS top level, however, unless you want to use the variables and syntax exported by those libraries in the expressions and definitions you evaluate at the top level.
Larceny predefines several nonstandard libraries in addition to the standard ERR5RS and R6RS libraries, and autoloads them for your convenience. The predefined, autoloadable libraries include:
R6RS standard libraries:
(rnrs base (6)) ; R6RS chapter 9 (rnrs unicode (6)) ; R6RS library chapter 1 (rnrs bytevectors (6)) ; R6RS library chapter 2 (rnrs lists (6)) ; R6RS library chapter 3 (rnrs sorting (6)) ; R6RS library chapter 4 (rnrs control (6)) ; R6RS library chapter 5 (rnrs exceptions (6)) ; R6RS library section 7.1 (rnrs conditions (6)) ; R6RS library sections 7.2 and 7.3 (rnrs io ports (6)) ; R6RS library sections 8.1 and 8.2 (rnrs io simple (6)) ; R6RS library sections 8.1 and 8.3 (rnrs files (6)) ; R6RS library chapter 9 (rnrs programs (6)) ; R6RS library chapter 10 (rnrs arithmetic fixnums (6)) ; R6RS library section 11.2 (rnrs arithmetic flonums (6)) ; R6RS library section 11.3 (rnrs arithmetic bitwise (6)) ; R6RS library section 11.4 (rnrs syntax-case (6)) ; R6RS library chapter 12 (rnrs hashtables (6)) ; R6RS library chapter 13 (rnrs enums) ; R6RS library chapter 14 (rnrs (6)) ; R6RS library chapter 15 (rnrs eval (6)) ; R6RS library chapter 16 (rnrs mutable-pairs (6)) ; R6RS library chapter 17 (rnrs mutable-strings (6)) ; R6RS library chapter 18 (rnrs r5rs (6)) ; R6RS library chapter 19
R6RS standard libraries that are autoloadable but deprecated in Larceny:
(rnrs records procedural (6)) ; R6RS library section 6.3 (rnrs records inspection (6)) ; R6RS library section 6.4 (rnrs records syntactic (6)) ; R6RS library section 6.2
SRFI libraries (which follow the SRFI 97 naming convention):
(srfi :1 lists) ; list library (srfi :2 and-let*) ; extended `and` and `let*` (srfi :5 let) ; extended version of `let` (srfi :6 basic-string-ports) ; basic string ports (srfi :8 receive) ; binding to multiple values (srfi :9 records) ; defining record types (srfi :11 let-values) ; syntax for multiple values (srfi :13 strings) ; string libraries (srfi :14 char-set) ; character-set library (srfi :16 case-lambda) ; syntax for variable arity (srfi :17 generalized-set!) ; generalized set! (srfi :19 time) ; time data types and procedures (srfi :23 error) ; error reporting mechanism (srfi :25 multi-dimensional-arrays) ; multi-dimensional array primitives (srfi :26 cut) ; specializing without currying (srfi :27 random-bits) ; sources of random bits (srfi :28 basic-format-strings) ; basic format strings (srfi :29 localization) ; localization (srfi :38 with-shared-structure) ; i/o for data with shared structure (srfi :39 parameters) ; parameter objects (srfi :41 streams) ; streams (srfi :42 eager-comprehensions) ; eager comprehensions (srfi :43 vectors) ; vector library (srfi :45 lazy) ; iterative lazy algorithms (srfi :48 intermediate-format-strings) ; format (srfi :51 rest-values) ; rest values hackery (srfi :54 cat) ; still more formatting (srfi :59 vicinities) ; vicinity (srfi :61 cond) ; a more general cond clause (srfi :63 arrays) ; homogeneous, heterogeneous arrays (srfi :64 testing) ; an API for test suites (srfi :67 compare-procedures) ; three-way comparison procedures (srfi :78 lightweight-testing) ; lightweight testing (srfi :87 case) ; a more general case clause (srfi :98 os-environment-variables) ; environment variables (srfi :99 records) ; ERR5RS records (composite library) (srfi :99 records procedural) ; ERR5RS records (procedural API) (srfi :99 records inspection) ; ERR5RS records (inspection API) (srif :99 records syntactic) ; ERR5RS records (syntactic API)
SRFI libraries that are autoloadable but deprecated in Larceny, usually because they have been superseded in whole or in part by R6RS syntax or libraries:
(srfi :60 integer-bits) ; integers as bits (srfi :66 octet-vectors) ; octet vectors (srfi :69 basic-hash-tables) ; basic hash tables (srfi :71 let) ; extensions of let, let*, letrec (srfi :74 blobs) ; octet-addressed binary blocks (srfi :86 mu-and-nu) ; mu and nu simulating values etc (srfi :95 sorting-and-merging) ; sorting and merging
Other autoloadable libraries:
(err5rs records procedural) ; ERR5RS records (procedural API) (err5rs records inspection) ; ERR5RS records (inspection API) (err5rs records syntactic) ; ERR5RS records (syntactic API) (err5rs load) ; ERR5RS load procedure (rnrs load) ; equivalent to (err5rs load) (larceny load) ; extension of (err5rs load) (larceny compiler) ; separate compilation (ERR5RS/R6RS) (larceny benchmarking) ; timing facilities (larceny profiling) ; profiling of Scheme code (larceny records printer) ; custom printing of records (larceny shivers-syntax) ; syntax favored by Olin Shivers (r5rs) ; approximates the R5RS top level (explicit-renaming) ; macros with explicit renaming
Larceny's autoload feature locates ERR5RS/R6RS libraries
by performing a depth-first search of the directories
that belong to Larceny's
current-require-path
.
Libraries will not be autoloaded unless they are defined
in files whose names follow
Larceny's standard conventions.
The
current-require-path
is initialized by the startup.sch
file in Larceny's root
directory.
Larceny's -path
command-line option
adds one or more directories to the directories in the
current-require-path
.
The LARCENY_LIBPATH
environment variable can also be used to add one or more directories
to the directories in the
current-require-path
.
Any of Larceny's R5RS-mode top-level procedures can be imported
into an ERR5RS or R6RS library or program by using an import
declaration with a primitives
clause that names the R5RS
procedures to be imported. For example:
(import (primitives random current-seconds getenv setenv system current-directory file-modification-time system-features vector-like-cas!) (rnrs base) (rnrs control))
This feature is non-portable. Other implementations of ERR5RS or the R6RS may not even have an underlying implementation of the R5RS.
To execute a top-level R6RS program that is contained
within a file named pgm
, type:
larceny -r6rs -program pgm
The -program
option can be omitted, in which case
Larceny will read the top-level program from standard
input:
larceny -r6rs < pgm
If you omit the -program
option and do not redirect
standard input, then Larceny will wait patiently
for you to type a complete top-level program into
standard input, terminating it with an end-of-file.
You probably don't want to do that. Had you wanted to type R6RS code at Larceny, you'd be using ERR5RS mode instead.
By default, Larceny's Twobit compiler uses settings
that make good sense for production code but violate
some absolute requirements of the R6RS. For maximal
adherence to R6RS requirements (at the expense of
portability, interoperability, and/or performance),
see the discussion of
compiler-switches
in the section on the
(larceny compiler)
library.
The R6RS standard does not specify any way for a top-level program to define its own libraries. Portable R6RS programs are therefore limited to importing a subset of the R6RS standard libraries.
As an extension to the R6RS, Larceny allows R6RS top-level programs and Scheme scripts to import any libraries that are predefined in Larceny's ERR5RS mode.
As another extension to the R6RS, Larceny allows
R6RS top-level programs to import any libraries
that can be found in the directories specified by the
-path
option on Larceny's command line using
Larceny's standard translation
from library names to file names.
On most systems, you can specify multiple directories
by separating them with a colon; under Windows, use a
semicolon as separator instead. The first directory
listed will be searched first.
If you have a set of portable libraries that run under more than
one implementation of the R6RS, and you want to have a special
version of some of those libraries for Larceny, you can put all
your portable versions in one directory and the Larceny-specific
versions in another. When you run Larceny, use the -path
option and specify the Larceny-specific directory first.
The -path
option cannot be used by Scheme scripts,
because command-line options are passed along to the Scheme
script without being interpreted by the scheme-script
processor.
We emphasize that this extension is non-portable.
Other implementations of the R6RS may not provide
anything comparable to Larceny's -path
option.
Even if they do, their mappings from library names
to file names may be incompatible with Larceny's.
As another extension to the R6RS, Larceny allows
R6RS top-level programs to import any libraries
that can be found in directories specified by the
LARCENY_LIBPATH
environment variable.
The value of that environment variable should be
a string in the same format as for Larceny's -path
command-line option, described above.
We emphasize that this extension is non-portable.
Other implementations of the R6RS may not support
anything comparable to Larceny's LARCENY_LIBPATH
environment variable.
Even if they do, their mappings from library names
to file names may be incompatible with Larceny's.
As an extension to the R6RS, Larceny allows a top-level program or Scheme script to define R6RS libraries within the file that contains the top-level program or Scheme script, before the import form that begins the top-level program. These libraries must be arranged so that no library depends upon libraries that come later in the file.
We emphasize that this extension is non-portable. Other implementations of the R6RS may not allow R6RS programs to define their own libraries.
As an extension to the R6RS, Larceny allows any of the
procedures that are predefined at top level in Larceny's
R5RS mode to be imported by using
a primitives
clause
in an import
declaration.
This extension is highly non-portable. Other implementations of the R6RS may not even have an underlying implementation of the R5RS.
On most Unix systems (including MacOS X and Linux), Larceny's
scheme-script
will execute Scheme scripts as described in R6RS
non-normative appendix D, with or without the optional script
header. To make Scheme scripts executable in their own
right, without executing scheme-script
directly, add Larceny's
root directory to your path as described in doc/HOWTO-INSTALL
.
Suppose, for example, that /home/myself/hello
is an R6RS
Scheme script whose first line is the optional script header
shown below:
#!/usr/bin/env scheme-script
If you do not have execute permission for this script, or Larceny's root directory is not in your path, you can still run the script from Larceny's root directory as follows:
% ./scheme-script /home/myself/hello
If you have execute permission for the script, and Larceny's root directory is in your path, you can also run the script as follows:
% /home/myself/hello
If, in addition, the directory that contains the script is in your path, you can run the script as follows:
% hello
You may also pass command-line arguments to a Scheme script.
We emphasize that Scheme scripts are not portable. Scheme scripts are specified only by a non-binding appendix to the R6RS, not by the R6RS proper. Other implementations of the R6RS may not support Scheme scripts at all, or may give them a semantics incompatible with Larceny's.
On Unix systems, standard input and output can be redirected
in the usual way. In Larceny, standard input corresponds to
the textual port initially returned by current-input-port
,
and standard output corresponds to the textual port initially
returned by current-output-port
.
We emphasize that redirection of standard input and output
is non-portable.
Other implementations of the R6RS may not allow redirection,
or may identify the standard input and output with ports
other than those initially returned by current-input-port
and current-output-port
.
Suppose hello.sch
contains the following R5RS code:
(display "Hello world!") (newline) (exit)
You can run hello.sch
as a script by executing Larceny as
follows:
% larceny -nobanner -- hello.sch
You can redirect Larceny's standard input, in which case you may want to eliminate the herald announcement and the read/eval/print loop's prompt:
% larceny -nobanner -- -e "(begin (herald #f) (repl-prompt values))" \ < hello.sch
For an explanation of why that works, which may suggest other creative uses of Larceny, ask for help:
% larceny -help
In R6RS modes, errors should result in an error message followed by a clean exit from the program.
If your program encounters an error in an interactive mode (R5RS or ERR5RS), it will enter the debugger; this is believed to be a feature.
Despite its crudity, and to some extent because of it, Larceny's debugger works at least as well with optimized compiled code as with interpreted code.
If you type a question mark at the debugger prompt, the debugger will print a help message. That message is more helpful if you understand the Twobit compiler and Larceny's internal representations and invariants, but this manual is not the place to explain those things.
The debugging context is saved so you can exit the debugger and re-enter it from the main read/eval/print loop's prompt:
> (debug)
The debugger is pretty much a prototype; you don't need to tell us how bad it is.
When attempting to run an R6RS program, you may see
a warning about "loading source in favor of stale
fasl file
",
following by a long series of error messages about
syntactic keywords used as a variable, ending with
the kind of error you'd expect to see when a large
R6RS program is fed to a Scheme compiler that was
expecting to see R5RS-compatible code. That means
the R6RS runtime and standard libraries were not
installed correctly, or their source files have been
touched or modified since they were last compiled.
To fix the problem,
recompile the R6RS standard libraries as described in step 4 of doc/HOWTO-BUILD
.
The precompiled binary forms of Larceny should run on
most machines with the appropriate processor and operating
system, but the executable program "larceny.bin
" may be
incompatible with very old or with very new versions of
the processor or operating system. If that appears to be
the case, you should see whether a newer version of Larceny
fixes the problem. If not, please report the problem
to us at larceny@ccs.neu.edu
.
Please report success stories as well.
If something goes wrong while
compiling the ERR5RS/R6RS runtime,
make sure you are running the copy of Larceny you think
you are running and have read and write permission
for lib/R6RS
, lib/SRFI
, and all their subdirectories and files.
If you get an error message about something being
"expanded against a different build of this library
",
then one or more of the compiled files in lib/R6RS
or lib/SRFI
or its subdirectories has gone
stale.
Removing all .slfasl
files from lib/R6RS
and lib/SRFI
and
their subdirectories will eliminate the stale file(s).
Don't remove the .sch
or .sls
files.
If Larceny attempts to autoload an imported ERR5RS/R6RS
library but cannot find the library, then the library
may be defined in a file that doesn't follow
Larceny's standard naming conventions.
Another possibility is that the -path
option was
omitted or incorrect.
If an ERR5RS/R6RS library is compiled, then all compiled
libraries and top-level programs that depend upon it must
also be recompiled. In particular, recompiling the standard
R6RS runtime will invalidate all compiled libraries and
top-level programs. Larceny's compile-stale
script
and the
compile-stale-libraries
procedure of (larceny compiler)
make it convenient
to recompile all of the libraries and top-level
programs within any given directory in an order
consistent with their dependencies.
Please report all crashes with as much information is possible; a backtrace from a debugger or a core dump is ideal (but please do not mail the core dump without contacting us first). Larceny's run-time system is compiled with full debugging information by default and a debugger like GDB should be able to provide at least some clues.
By default, Larceny's Twobit compiler makes several assumptions that allow it to generate faster code; for example, the compiler assumes Scheme's standard procedures will not be redefined.
To disable certain compiler optimizations that are
incompatible with the R6RS, see the section that
describes the
(larceny compiler)
library.
To obtain strict conformance to R5RS semantics at the expense of slower code, evaluate the expression
(compiler-switches 'standard)
To make the compiler generate faster code, you can promise not to redefine standard procedures and not to redefine any top-level procedure while it is running. To make this promise, evaluate
(compiler-switches 'fast-safe)
To view the current settings of Twobit's numerous compiler switches, evaluate
(compiler-switches)
All of Twobit's compiler switches are procedures whose setting can be changed by passing the new value of the switch as an argument.
For more information, evaluate
(help)
Some of the help information that will be printed may be irrelevant to the heap image you are using.
Larceny's default lexical syntax extends the lexical syntax required by the R5RS, R6RS, and ERR5RS standards.
The R6RS forbids most lexical extensions, however, so Larceny provides several mechanisms for turning its lexical extensions on and off.
By default, Larceny recognizes several Larceny-specific flags of the form permitted by the R6RS. The flag you are most likely to encounter represents one of Larceny's unspecified values:
#!unspecified
Certain other flags have special meanings to Larceny's
read
and get-datum
procedures. They are described below.
By default, Larceny is case-sensitive.
This global default can be overridden by specifying
—foldcase
or —nofoldcase
on Larceny's command line,
or by changing the value of Larceny's case-sensitive?
parameter.
The case-sensitivity of a particular textual input port
is affected by reading one of the following flags from
the port using the read
or get-datum
procedures:
#!fold-case #!no-fold-case
The #!fold-case
flag enables case-folding on data read from
the port by the read
and get-datum
procedures, while the
#!no-fold-case
flag disables case-folding. The behavior
established by one of these flags extends to the next flag
read from the port by read
or get-datum
.
Both #!fold-case
and #!no-fold-case
evaluate to an
unspecified value. To obtain the effect of one of these flags
while treating it as a comment, place #;
in front of the flag.
When a port is first opened, the Larceny-specific lexical extensions that are accepted on the port are determined by Larceny's lexical parameters.
The following flags change the case-sensitivity and lexical extensions on the specific port from which they are read:
#!r6rs ; implies #!no-fold-case, negates other flags #!r5rs ; implies #!fold-case, #!err5rs #!err5rs ; allows Larceny-specific extensions #!larceny ; implies #!no-fold-case, #!err5rs
The #!r6rs
flag is a comment, while all of Larceny's other
flags evaluate to an unspecified value. To obtain the effect
of a flag other than #!r6rs
while treating it as a comment,
place #;
in front of the flag.
The R6RS requires implementations to treat #!r6rs
as a
comment; it is the only flag that implementations of the R6RS
are required to treat as a comment. Since the #!r6rs
flag
behaves differently from all other flags, it is deprecated.
When given no argument, these parameters return the current default for some aspect of the lexical syntax that will be accepted on newly created ports. When given an argument, these procedures change the default as specified by the argument.
(case-sensitive? boolean)
Determines whether newly created textual ports default to case-sensitive.
Procedure read-larceny-weirdness?
(read-larceny-weirdness? ) => boolean
(read-larceny-weirdness? boolean)
Determines whether newly created textual ports allow
Larceny's usual extensions to R5RS lexical syntax.
This parameter also determines whether newly created
ports allow #
as an insignificant digit; this is
required by the R5RS, but disallowed by the R6RS.
Procedure read-traditional-weirdness?
(read-traditional-weirdness? ) => boolean
(read-traditional-weirdness? boolean)
Determines whether newly created textual ports allow certain lexical extensions that are deprecated in Larceny. These extensions include symbols enclosed by vertical bars and read-time evaluation.
For the current semantics of these parameters, please consult the Larceny developers' web page that describes Larceny's lexical syntax.
In Larceny, file names generally follow Unix conventions, even on Windows. The following suffixes have special meanings to some components of Larceny.
.sls
is the preferred suffix for files that consist
of ERR5RS/R6RS-compatible library definitions.
.sch
is the preferred suffix for files that contain R5RS
source code.
.scm
is an alternative suffix for files that contain R5RS source code.
.slfasl
is the suffix for files that contain the pre-compiled
form of ERR5RS/R6RS-compatible code.
.fasl
is the suffix for files that contain the pre-compiled form of R5RS code.
.mal
is the preferred suffix for files that contain MacScheme
assembly language in symbolic form.
.lap
is the suffix for files that contain MacScheme assembly language.
.lop
is the suffix for files that contain machine code
segments in the form expected by Larceny's heap linker.
.heap
is the suffix for files that contain an executable heap
image (must be combined with the
larceny.bin
runtime).
Larceny's root directory should contain the following files:
larceny scheme-script larceny.bin larceny.heap startup.sch
The following subdirectories are also essential for correct operation of some features of some modes in some varieties of Larceny:
include lib lib/Base lib/Debugger lib/Ffi lib/MzScheme lib/R6RS lib/SRFI lib/Standard lib/TeachPacks
The include
subdirectory is used when compiling files with
Petit Larceny.
The startup.sch
file tells Larceny's require
procedure to
search some of the lib
subdirectories for libraries that are
loaded dynamically.
The R6RS does not specify any mapping from library names to files or to other locations at which the code for a library might be found. As R6RS non-normative appendix E puts it:
Implementations may take radically different approaches to storing source code for libraries, among them: files in the file system where each file contains an arbitrary number of library forms, files in the file system where each file contains exactly one library form, records in a database, and data structures in memory….Implementations may provide a means for importing libraries…. | ||
-- |
In other words, implementations are allowed to extend the R6RS with arbitrary mechanisms for resolving references to imported libraries, but R6RS programs that rely on such mechanisms are not portable. In particular, R6RS libraries are not portable.
Larceny provides five distinct Larceny-specific mechanisms that non-portable R6RS programs can use to import or to define libraries:
(larceny compiler)
,
may be placed in one of the directories that are searched
by Larceny's
autoload feature, provided
those libraries are located in files that follow Larceny's
standard naming conventions as described in the next section.
-path
option
to specify directories that contain other libraries
the program may import, provided those libraries are
located in files that follow Larceny's standard naming
conventions as described in the next section.
LARCENY_LIBPATH
environment variable
to specify directories that contain other libraries
the program may import, provided those libraries are
located in files that follow Larceny's standard naming
conventions as described in the next section.
ERR5RS programs may use any of those five mechanisms, and may also use a sixth mechanism: An ERR5RS program can be written as a little configuration program that loads the program's libraries from files before any libraries are imported. This sixth mechanism is portable, but is not available to R6RS programs.
Suppose Larceny's -path
option is used to specify
a certain directory, and the program imports a
nonstandard library whose name is of the form
(name1 name2 … lastname)
.
Larceny will search for that library in the following
files:
directory/name1/name2/…/lastname.larceny.slfasl
directory/name1/name2/…/lastname.larceny.sls
directory/name1/name2/…/lastname.slfasl
directory/name1/name2/…/lastname.sls
directory/name1/name2.larceny.slfasl
directory/name1/name2.larceny.sls
directory/name1/name2.slfasl
directory/name1/name2.sls
directory/name1.larceny.slfasl
directory/name1.larceny.sls
directory/name1.slfasl
directory/name1.sls
The search starts with the first of those file names, continues with the following file names in order, and ends when a file with one of those names is found. The imported library must be one of the libraries defined within the first file found by this search, since the search is not continued after that first file is found (except as noted in the next paragraph).
If the search ends by finding a file whose name ends
with .slfasl
, then Larceny checks to see whether
there is a file in the same directory with the same
root name but ending with .sls
instead of .slfasl
.
If the .sls
file has been modified since the .slfasl
file was last modified, then a warning is printed and
the .sls
file is loaded instead of the .slfasl
file.
Otherwise the .slfasl
file is loaded.
The R6RS allows arbitrary mappings from library names to library
code. Larceny takes advantage of this by ignoring version
numbers when mapping library names to files, and by (virtually)
rewriting any version number that may be specified in the
definition of a library so it matches any version specification
that appears within the import
form. Furthermore Larceny
allows different versions of the same library to be imported,
but Larceny's algorithm for resolving library references
ensures that the different versions of a library will be
identical except for their version numbers, which have no
meaningful semantics. Although Larceny's treatment of versions
conforms to the R6RS specification, it should be clear that
version numbers serve no purpose in Larceny. Since the R6RS
version feature has no usefully portable semantics, it is
deprecated.
In R5RS mode, Larceny's -path
option
and LARCENY_LIBPATH
environment variable
may be used to
specify directories to be searched by the require
procedure, which takes a single symbol libname as
its argument.
The require
procedure will search for the following
files in every directory that is part of the current
require path, starting with the directories specified
by LARCENY_LIBPATH and the -path
option:
libname.fasl
libname.sch
libname.scm
These files are expected to contain R5RS code, not library definitions. Otherwise the search proceeds much the same as when searching for an ERR5RS/R6RS library.
The require
path is specified by startup.sch
in Larceny's
root directory, but may be changed dynamically using the
current-require-path
parameter. Changing the require
path
is not recommended, however, because Larceny relies on the
require
path for dynamic loading of libraries used by several
important features of Larceny, notably ERR5RS and R6RS modes.
libname must be a symbol that names an R5RS-compatible library within the current require path.
If the library has not already been loaded, then it is
located and loaded. If the library is found and loaded
successfully, then require
returns true; otherwise an
error is signalled.
If the library has already been loaded, then require
returns false without loading the library a second time.
Procedure current-require-path
(current-require-path ) => stringlist
(current-require-path stringlist)
The optional argument is a list of directory names
(without slashes at the end) that should be searched
by require
and (in ERR5RS/R6RS modes)
by Larceny's autoload
feature.
Returns the list of directory names that will be
searched.
This chapter explains how you can use Larceny to compile Scheme source code to native machine code.
The native varieties of Larceny have a just-in-time compiler that compiles to native code automatically whenever you evaluate an expression, load a source file, or import a source library. Even so, files will load faster if they are compiled ahead of time.
Petit Larceny does not have a just-in-time compiler, so compiling ahead of time is the only way to enjoy the speed of native machine code in Petit Larceny.
Common Larceny uses an interpreter for expressions
that are typed at the interactive read/eval/print
loop, but files will be compiled as they are loaded
if you specify Larceny.fasl
on the command line.
For more information on compiling files in Common
Larceny, please consult the
Common Larceny User Manual.
The main disadvantage of compiling files and libraries is that compiled code goes stale when its original source code is changed or when a library on which the compiled code depends is changed or recompiled. Stale compiled code can be dangerously inconsistent with libraries on which it depends, so Larceny checks for staleness and refuses to execute a stale library or program.
On Unix machines, the most convenient way to compile
a group of ERR5RS/R6RS libraries and top-level programs
is to use the compile-stale
script in Larceny's root
directory.
If Larceny's root directory is in your execution path,
then there are just two steps:
cd
to change to the directory that contains
the ERR5RS/R6RS files you want to compile. (Files
that lie within subdirectories of that directory will
be compiled also.)
compile-stale
script.
For example:
% cd lib/R6RS % compile-stale
On non-Unix machines, you can accomplish the same thing
using Larceny's ERR5RS mode and the (larceny compiler)
library:
% pushd lib\R6RS % ..\..\larceny.bat -err5rs Larceny v0.96 "Fluoridation" ERR5RS mode (no libraries have been imported) > (import (larceny compiler)) > (compile-stale-libraries)
To compile individual files, use the compile-file
or
compile-library
procedures that are exported by
(larceny compiler)
.
Compiles sourcefile, which must be a string naming
a file that contains R5RS source code.
If faslfile is supplied as a second argument,
then it must be a string naming the file that will
contain the compiled code; otherwise the name of
the compiled file is obtained from sourcefile
by replacing the ".sch
" or ".scm
" suffix with
".fasl
".
For ERR5RS/R6RS libraries and top-level programs, see above.
ERR5RS is still being developed, so the specifications described below are subject to change as its standard libraries are revised.
This section describes the (err5rs load)
library.
Loads ERR5RS code from filename, evaluating each form as though it had been entered at the interactive read/eval/print loop.
The load
procedure should be used only at an interactive top
level and in files that will be loaded into an interactive top
level. Calls to the load
procedure have no effect at compile
time, and should not appear in files that will be compiled
separately; use the library
and import
syntaxes instead.
When a procedure is said to be equivalent to an R6RS procedure, the equivalence holds only when all arguments have the properties required of them by the R6RS specification. ERR5RS does not mandate R6RS exception semantics for programs that violate the specification.
This section describes the (err5rs records procedural)
library.
(make-rtd name fieldspecs parent-rtd)
(make-rtd name fieldspecs parent-rtd option …)
name is a symbol, which matters only to the
rtd-name
procedure of the inspection layer.
fieldspecs is a vector of field specifiers, where
each field specifier is one of
(mutable name)
,
where name is a symbol naming the mutable field;
(immutable name)
,
where name is a symbol naming the immutable field.
The optional parent is an rtd or #f
. It is an error for
any of the symbols in fieldspecs to name more than one
of the fields specified by fieldspecs, but the field names
in fieldspecs may shadow field names in the parent rtd.
make-rtd
returns an R6RS-compatible record-type
descriptor.
Larceny allows the following optional arguments to follow the optional parent-rtd argument:
sealed
means the new rtd cannot be used
as the parent of other rtds;
opaque
means the record?
predicate
will not recognize instances of the new rtd;
uid
, followed by another symbol id,
means the new rtd is non-generative with uid id; the
semantics of this extension is the same as in the R6RS.
These Larceny-specific options may be used in any combination, giving Larceny's ERR5RS records the same expressive power as R6RS records, with which they are fully interoperable.
This predicate returns true if and only if its argument
is a record-type descriptor.
rtd?
is equivalent to the record-type-descriptor?
procedure of the R6RS.
(rtd-constructor rtd fieldspecs)
rtd is a record-type descriptor, and fieldspecs is an optional vector of symbols.
If no fieldspecs argument is supplied,
then rtd-constructor
returns a procedure that expects
one argument for each field of the record-type described
by rtd and returns an instance of that record-type
with its fields initialized to the corresponding
arguments.
Arguments that correspond to the fields of the
record-type's parent (if any) come first.
If fieldspecs is supplied, then rtd-constructor
returns a procedure that expects one argument for each
element of fieldspecs and returns an instance of the
record-type described by rtd with the named fields
initialized to the corresponding arguments.
It is an error if some symbol occurs more than once in fieldspecs. Fields of a derived record-type shadow fields of the same name in its parent; the fieldspecs argument cannot be used to initialize a shadowed field.
Equivalent to the record-predicate
procedure of the R6RS.
field is a symbol that names a field of the record-type described by the record-type descriptor rtd. Returns a unary procedure that accepts instances of rtd (or any record-type that inherits from rtd) and returns the current value of the named field.
Fields in derived record-types shadow fields of the same name in a parent record-type.
field is a symbol that names a field of the record-type described by the record-type descriptor rtd. Returns a binary procedure that accepts instances of rtd (or any record-type that inherits from rtd) and a new value to be stored into the named field, performs that side effect, and returns an unspecified value.
Fields in derived record-types shadow fields of the same name in a parent record-type.
This section describes the (err5rs records inspection)
library.
Equivalent to its R6RS namesake.
Equivalent to its R6RS namesake.
Equivalent to the record-type-name
procedure of the R6RS.
Equivalent to the record-type-parent
procedure of the R6RS.
Equivalent to the record-type-field-names
procedure of the R6RS.
(That is, it returns a vector of the symbols that name the fields
of the record-type represented by rtd, excluding the fields of
parent record-types.)
Returns a vector of the symbols that name the fields of
the record-type represented by rtd, including the fields
of its parent record-types, if any, with the fields of
parent record-types coming before the fields of its children,
with each subsequence in the same order as in the vectors
that would be returned by calling rtd-field-names
on rtd and on all its ancestral record-type descriptors.
(rtd-field-mutable? rtd field)
rtd is a record-type descriptor, and field is a
symbol naming a field of the record-type described by rtd.
Returns #t
if the named field is mutable; otherwise returns #f
.
This section describes the (err5rs records syntactic)
library.
The syntactic layer consists of SRFI 9 extended with single inheritance and (optional) implicit naming.
All ERR5RS record-type definitions are generative (unless
Larceny's optional uid
feature is used), but
ERR5RS drops the SRFI 9 restriction to top level, mainly
because the R6RS allows generative definitions wherever
a definition may appear.
The syntax of an ERR5RS record-type definition is
<definition> -> <record type definition> ; addition to 7.1.6 in R5RS <record type definition> -> (define-record-type <type spec> <constructor spec> <predicate spec> <field spec> ...) <type spec> -> <type name> -> (<type name> <parent>) <constructor spec> -> #f -> #t -> <constructor name> -> (<constructor name> <field name> ...) <predicate spec> -> #f -> #t -> <predicate name> <field spec> -> <field name> -> (<field name>) -> (<field name> <accessor name>) -> (<field name> <accessor name> <mutator name>) <parent> -> <expression> <type name> -> <identifier> <constructor name> -> <identifier> <predicate name> -> <identifier> <accessor name> -> <identifier> <mutator name> -> <identifier> <field name> -> <identifier>
The semantics of a record type definition is the same as in SRFI 9: the record type definition macro-expands into a cluster of definitions that
<type name>
as the record-type descriptor
for the new record-type;
#f
);
#f
);
An ERR5RS record type definition extends SRFI 9 with the following additional options:
<parent>
expression is specified, then it must
evaluate to an rtd that serves as the parent record-type
for the record-type being defined.
#f
is specified for the constructor or predicate,
then no constructor or predicate procedure is defined.
(This is useful when the record-type being defined will
be used as an abstract base class.)
#t
is specified for the constructor or predicate,
then the name of the constructor is the type name prefixed
by make-
, and the name of the predicate is the type name
followed by a question mark (?
).
#t
or as an
identifier, then the constructor's arguments correspond
to the fields of the parent (if any) followed by the new
fields added by this record-type definition.
If a field spec consists of a single identifier, then
-
) followed by the field name.
If a field spec consists of a list of one identifier, then
-
) followed by the field name;
-
) followed by the field name followed by -set!
.
Two ERR5RS records with fields are eqv?
if and only if
they were created by the same (dynamic) call to some
record constructor.
Two ERR5RS records are eq?
if and only if they are eqv?
.
Apart from the usual constraint that equivalence according
to eqv?
implies equivalence according to equal?
, the
behavior of equal?
on ERR5RS records is unspecified.
(This is compatible with the R6RS.)
A define-record-type
form macro-expands into code that
calls make-rtd
each time the expanded record-type
definition is executed.
Two ERR5RS record-type descriptors are eqv?
if and only if
they were created by the same (dynamic) call to
make-rtd
.
This chapter explains which features of the R6RS standard libraries are available in each of Larceny's major modes of execution.
Larceny was the first substantially complete implementation of the R6RS. Any features that are missing from R6RS modes are missing because of bugs or because the features are deprecated in Larceny. The most up-to-date listing of Larceny's known deviations from the R6RS standard can be found on the web page that describes the current status of Larceny's R6RS-compatible mode.
Larceny is R6RS-compatible but not R6RS-conforming. When Larceny is said to support a feature of the R6RS, that means the feature is present and will behave as specified by the R6RS so long as no exception is raised. Larceny does not always raise the specific conditions specified by the R6RS, and does not perform all of the checking for portability problems that is mandated by the R6RS. These deviations do not affect the execution of production code, and do not compromise Larceny's traditional safety.
ERR5RS and R6RS modes support all procedures and syntaxes
exported by the (rnrs base)
library.
Larceny's R5RS mode does not support library
, import
, or
identifier-syntax
.
The semantics of quasiquote
, let-syntax
, and letrec-syntax
differ between the R5RS and the R6RS. Larceny's R5RS mode still
supports the R5RS semantics.
All of Larceny's modes support all features of the (rnrs unicode)
library.
Larceny v0.97 conforms to The Unicode Standard, Version 5.0.
ERR5RS and R6RS modes support all procedures and syntaxes
exported by (rnrs bytevectors)
, but the endianness
syntax is deprecated because it is redundant with quote
.
Larceny's R5RS mode does not support endianness
.
In Larceny, any symbol names a supported endianness.
The symbols big
and little
have their expected meanings.
All other symbols mean (native-endianness)
with respect
to integer operations, but mean the opposite of
(native-endianness)
with respect to
IEEE-754 operations.
For string operations, the endianness must be the symbol
big
or the symbol little
. All of these extensions are
permitted by the R6RS standard.
Larceny's utf16->string
and utf32->string
accept one,
two, or three arguments. The R6RS specification of these
procedures does not allow them to accept a single argument,
but that is believed to be an error in the R6RS.
ERR5RS and R6RS modes support all procedures and syntaxes
exported by
(rnrs records procedural)
,
(rnrs records inspection)
, and
(rnrs records syntactic)
.
Those libraries are deprecated, however;
the make-record-constructor-descriptor
procedure does
not simplify unusually complex cases enough to justify
the complexity it adds to typical cases, and
the entire syntactic layer is gratuitously incompatible
with the procedural layer.
Larceny's R5RS mode supports all features of the deprecated
(rnrs records procedural)
and (rnrs records inspection)
libraries. R5RS mode does not support (rnrs records syntactic)
.
All of Larceny's modes support all features of the
(err5rs records procedural)
and
(err5rs records inspection)
libraries. ERR5RS and R6RS modes also support the
(err5rs records syntactic)
library.
These libraries are equivalent to the
(srfi :99 records procedural)
,
(srfi :99 records inspection)
, and
(srfi :99 records syntactic)
libraries.
The record definition syntax of
SRFI 9
is a proper subset
of the syntax provided by the (err5rs records syntactic)
library. In R5RS mode, SRFI 9 can be loaded dynamically
using the
require
procedure:
> (require 'srfi-9)
We recommend the ERR5RS and/or SRFI 9 libraries be used instead of the corresponding R6RS libraries.
The R6RS spouts some tendentious nonsense about procedural records being slower than syntactic records, but this is not true of Larceny's records, and is unlikely to be true of other implementations either.
Larceny continues to support its old-style records, which are almost but not quite compatible with ERR5RS and R6RS records. This can be confusing, since some of Larceny's procedures have the same names as R6RS procedures. That has made it necessary to overload those procedures to work with both old-style and R6RS records. We apologize for the mess.
All of Larceny's modes support all features of the
(rnrs exceptions)
and (rnrs conditions)
libraries.
ERR5RS and R6RS modes support all names exported by the
(rnrs io ports)
, (rnrs io simple)
, and
(rnrs files)
libraries.
The buffer-mode
, eol-style
, and error-handling-mode
syntaxes are deprecated because they are redundant
with quote
. These deprecated syntaxes may be provided
in the form of procedures rather than syntax, but this
deviation from R6RS semantics cannot be detected by
portable R6RS programs.
Larceny's R5RS mode supports all non-deprecated features of those libraries.
Larceny supports four distinct buffer modes: none
,
line
, datum
, and block
. The R6RS requires
the buffer-mode
syntax to raise an exception for the
datum
buffer mode, which is the buffer mode Larceny
uses for interactive output ports.
In Larceny, any symbol names a supported end-of-line style. All end-of-line and error-handling-mode symbols whose meanings are not described by the R6RS have locale-dependent meanings, which is an extension permitted by the R6RS standard.
Although Larceny supports the UTF-16 codec, it is not really useful on Windows machines (where it should be most useful) because Larceny's low-level file system mimics a byte-oriented Unix file system even on Windows. This problem should be addressed in some future version of Larceny.
The most up-to-date list of known deviations from R6RS io semantics can be found on the web page that describes the current status of Larceny's R6RS-compatible mode.
ERR5RS and R6RS modes support the (rnrs programs)
library.
Larceny's R5RS mode provides the exit
procedure but
not the command-line
procedure of that library.
Larceny's traditional command-line-arguments
procedure
can be used to implement an approximation to command-line
.
For a definition, see lib/R6RS/rnrs/programs.sls
.
All of Larceny's modes support all features of the
(rnrs arithmetic fixnums)
,
(rnrs arithmetic flonums)
, and
(rnrs arithmetic bitwise)
libraries.
R6RS fixnum and flonum operations may be slower than the corresponding generic operations, since the fixnum and flonum operations are required to check their arguments and may also have to check their results. Isolated operations in small micro-benchmarks are likely to be slower than groups of similar operations in larger programs, however, because the Twobit compiler removes redundant checks and propagates type information.
ERR5RS and R6RS modes support the (rnrs syntax-case)
library.
Larceny's R5RS mode does not.
All of Larceny's modes support all features of the
(rnrs hashtables)
library.
Larceny's traditional make-hashtable
procedure has been
renamed to make-oldstyle-hashtable
.
When you use Larceny's R5RS or ERR5RS mode to dump a heap image
that contains eq?
or eqv?
hashtables you have created, they
are automatically reset so they will
rehash themselves whenever you begin a new session with the
dumped heap.
ERR5RS and R6RS modes support the (rnrs enums)
library.
Larceny's R5RS mode provides all of the procedures exported by
(rnrs enums)
but does not provide the define-enumeration
syntax.
ERR5RS and R6RS modes support the (rnrs eval)
library.
Larceny's R5RS mode provides an R5RS-compatible eval procedure,
not an R6RS-compatible eval procedure, and does not provide the
environment
procedure.
All of Larceny's modes support all features of the
(rnrs mutable-pairs)
and (rnrs mutable-strings)
libraries.
Larceny provides libraries for loading and compiling ERR5RS/R6RS libraries and for timing benchmarks. Future versions of Larceny will offer more ERR5RS/R6RS libraries.
The (larceny load)
library exports both the
load
procedure of (err5rs load)
and r5rs:require
, which is a renaming of
the require
procedure used by
Larceny's R5RS mode.
In Larceny's ERR5RS mode,
the load
procedure can load
R5RS libraries and programs as well as ERR5RS/R6RS
libraries.
The r5rs:require
procedure should be used only for dynamic loading of
R5RS libraries into Larceny's underlying R5RS system. The
variables defined by that library can be imported into
an ERR5RS session or ERR5RS/R6RS library or program using a
primitives
clause in an import
form.
These procedures should be used only at an interactive top
level and in files that will be loaded into an interactive top
level. Calls to these procedures have no effect at compile
time, and should not appear in files that will be compiled
separately; use the library
and import
syntaxes instead.
The (larceny compiler)
library exports the
load
and r5rs:require
procedures of (larceny load)
,
the
current-require-path
procedure, the
compile-file
,
compile-library
, and
compile-stale-libraries
procedures described below,
and the
compiler-switches
procedure.
These procedures can be used to compile ERR5RS/R6RS libraries and top-level programs before they are imported or executed. This is especially important for Petit Larceny, which would otherwise use an interpreter. For native Larceny, whose just-in-time compiler generates native machine code as source libraries and programs are loaded, imported, or executed, the main advantage of separate compilation is that compiled libraries and programs will load much faster than source libraries and programs.
The main disadvantage of separate compilation is that
compiled libraries and programs go stale when their
source code is changed or when a library on which they
depend is changed or recompiled. Stale libraries and
programs can be dangerously inconsistent with libraries
on which they depend, so Larceny checks for staleness
and refuses to execute a stale library or program.
The compile-stale-libraries
procedure provides a convenient way to recompile stale
libraries and programs.
(compile-file sourcefile [slfaslfile])
Compiles sourcefile, which must be a string naming
a file that contains source code for one or more
ERR5RS/R6RS libraries or a top-level program.
If slfaslfile is supplied as a second argument,
then it must be a string naming the file that will
contain the compiled code; otherwise the name of
the compiled file is obtained from sourcefile
by replacing the ".sls
" suffix with ".slfasl
".
(compile-library sourcefile [slfaslfile])
Compiles sourcefile, which must be a string naming
a file that contains source code for one or more
ERR5RS/R6RS libraries.
Apart from its unwillingness to compile top-level
programs, compile-library
behaves the same as
compile-file
above.
Procedure compile-stale-libraries
(compile-stale-libraries changedfile)
If no argument is supplied, then all ".sls
" files that
lie within the current directory or a subdirectory are
recompiled.
If changedfile is supplied, then it must be a string giving the absolute pathname of a file. (In typical usage, changedfile is a source file that has been modified, making it necessary to recompile all files that depend upon it.) Compiles all ERR5RS/R6RS library files that lie within the same directory as changedfile or a subdirectory, and have not yet been compiled or whose compiled files are older than changedfile.
In future versions of Larceny, compile-stale-libraries
might compile only the source files that depend upon
changedfile.
(compiler-switches mode)
If no argument is supplied, then the current settings
of all compiler switches are displayed. Each of those
switches is itself a parameter that is exported by the
(larceny compiler)
library. Calling any individual
compiler switch with no arguments will return its current
setting. Calling any individual compiler switch with an
argument (usually a boolean) will change its setting to
that argument.
The compiler-switches
procedure may also be called with
one of the following symbols as its argument:
default
sets most compiler switches to their default settings.
fast-safe
enables all optimizations but continues to generate
code to perform all run-time type and range checks that
are needed for safety
(in the traditional sense, not the R6RS sense).
fast-unsafe
enables all optimizations and also disables type and
range checking. This setting is deprecated because it
compromises safety (in the traditional sense).
slow
turns off all optimizations.
standard
sets compiler switches for maximal conformance to the
R5RS and R6RS standards.
The standard
setting is deprecated because it generates
very slow code (because the R5RS makes it difficult to
inline standard procedures), disables most compile-time
checking (because the R6RS forbids rejection of programs
with obvious errors unless the R6RS classifies the errors
as syntactic), and may also compromise the portability or
interoperability of ERR5RS/R6RS libraries and programs
(because the R6RS outlaws several extensions that Larceny
uses to improve its compatibility with other implementations
of the R5RS and R6RS as well as interoperability between
Larceny's own R5RS and ERR5RS/R6RS modes).
Selective toggling of compiler switches is almost always
better than using the standard
setting.
To improve R5RS conformance without sacrificing too much
performance, set the benchmark-mode
switch to false and
set the integrate-procedures
switch to false only when
compiling files that need to be sensitive to redefinitions
of standard procedures.
For R6RS libraries and programs, setting the benchmark-mode
and global-optimization
switches to false will eliminate a
couple of minor conformance issues with only a small loss
of performance and without sacrificing compile-time checking
or portability.
The (larceny benchmarking)
library exports the
time
syntax and run-benchmark
procedure described
below.
Syntax time
(time expression)
Evaluates expression and returns its result after printing approximations to the storage allocated and time taken during evaluation of expression.
> (time (fib 30)) Words allocated: 0 Words reclaimed: 0 Elapsed time...: 49 ms (User: 48 ms; System: 0 ms) Elapsed GC time: 0 ms (CPU: 0 in 0 collections.) 832040
(run-benchmark name iterations thunk predicate)
Given the name of a benchmark, the number of iterations to be performed, a zero-argument procedure thunk that runs the benchmark, and a unary predicate that checks the result of thunk, prints approximations to the storage allocated and time taken by iterations calls to thunk.
> (run-benchmark "fib30" 100 (lambda () (fib 30)) (lambda (x) (= x 832040))) -------------------------------------------------------- fib30 Words allocated: 0 Words reclaimed: 0 Elapsed time...: 4828 ms (User: 4824 ms; System: 4 ms) Elapsed GC time: 0 ms (CPU: 0 in 0 collections.)
The (larceny records printer)
library exports the
two procedures described below. These procedures
can be used to override Larceny's usual printing
of records and opaque types that were defined using
the records libraries.
(rtd-printer rtd) => maybe-procedure
Given a record type descriptor, returns its custom print procedure, or returns false if the rtd has no custom print procedure.
(rtd-printer-set! rtd printer)
Given a record type descriptor rtd and a printer for instances of that rtd, installs printer as a custom print procedure for rtd. The printer should be a procedure that, given an instance of the rtd and a textual output port, writes a representation of the instance to the port.
The procedures described in this chapter are nonstandard. Some are deprecated after being rendered obsolete by ERR5RS or R6RS standard libraries. Others still provide useful capabilities that the standard libraries don't.
Larceny provides Unicode strings with R6RS semantics.
The string-downcase
and string-upcase
procedures
perform Unicode-compatible case folding, which can result
in a string whose length is different from that of the original.
Larceny may still provide string-downcase!
and string-upcase!
procedures, but they are deprecated.
A bytevector is a data structure that stores bytes — exact 8-bit unsigned integers. Bytevectors are useful in constructing system interfaces and other low-level programming. In Larceny, many bytevector-like structures — bignums, for example — are implemented in terms of a lower-level bytevector-like data type. The operations on generic bytevector-like structures are particularly fast but useful largely in code that manipulates Larceny's data representations.
The (rnrs bytevectors)
library now
provides a large set of procedures that, in Larceny, are
defined using the procedures described below.
Integrable procedure make-bytevector
(make-bytevector length) => bytevector
(make-bytevector length fill) => bytevector
Returns a bytevector of the desired length. If no second argument is given, then the bytevector has not been initialized and most likely contains garbage.
Operations on bytevector structures
(bytevector? obj) => boolean
(bytevector-length bytevector) => integer
(bytevector-ref bytevector offset) => byte
(bytevector-set! bytevector offset byte) => unspecified
(bytevector-equal? bytevector1 bytevector2) => boolean
(bytevector-fill! bytevector byte) => unspecified
(bytevector-copy bytevector) => bytevector
These procedures do what you expect.
All are integrable, except bytevector-equal?
and bytevector-copy
.
The bytevector-equal?
name is deprecated, since the
R6RS calls it bytevector=?
.
Operations on bytevector-like structures
(bytevector-like? obj) => boolean
(bytevector-like-length bytevector) => integer
(bytevector-like-ref bytevector offset) => byte
(bytevector-like-set! bytevector offset byte) => unspecified
(bytevector-like-equal? bytevector1 bytevector2) => boolean
(bytevector-like-copy bytevector) => bytevector
A bytevector-like structure is a low-level representation for indexed arrays of uninterpreted bytes. Bytevector-like structures are used to represent types such as bignums and flonums.
There is no way to construct a "generic" bytevector-like structure; use the constructors for specific bytevector-like types.
The bytevector-like operations operate on all bytevector-like
structures. All are integrable, except bytevector-like-equal?
and bytevector-like-copy
. All are deprecated because they
violate abstraction barriers and make your code
representation-dependent; they are useful mainly to
Larceny developers, who might otherwise be tempted to
write some low-level operations in C or assembly language.
(vector-copy vector) => vector
Returns a shallow copy of its argument.
Operations on vector-like structures
(vector-like? object) => boolean
(vector-like-length vector-like) => fixnum
(vector-like-ref vector-like k) => object
(vector-like-set! vector-like k object) => unspecified
A vector-like structure is a low-level representation for indexed arrays of Scheme objects. Vector-like structures are used to represent types such as vectors, records, symbols, and ports.
There is no way to construct a "generic" vector-like structure; use the constructors for specific data types.
The vector-like operations operate on all vector-like structures. All are integrable. All are deprecated because they violate abstraction barriers and make your code representation-dependent; they are useful mainly to Larceny developers, who might otherwise be tempted to write some low-level operations in C or assembly language.
Operations on procedures
(make-procedure length) => procedure
(procedure-length procedure) => fixnum
(procedure-ref procedure offset) => object
(procedure-set! procedure offset object) => unspecified
These procedures operate on the representations of procedures and allow user programs to construct, inspect, and alter procedures.
(procedure-copy procedure) => procedure
Returns a shallow copy of the procedure.
The procedures above are deprecated because they violate abstraction barriers and make your code representation-dependent; they are useful mainly to Larceny developers, who might otherwise be tempted to write some low-level operations in C or assembly language.
The rest of this section describes some procedures that reach through abstraction barriers in a more controlled way to extract heuristic information from procedures for debugging purposes.
The following text is copied from a straw proposal authored by Will Clinger and sent to rrr-authors on 09 May 1996. The text has been edited lightly. See the end for notes about the Larceny implementation.
The procedures that extract heuristic information from procedures are permitted to return any result whatsoever. If the type of a result is not among those listed below, then the result represents an implementation-dependent extension to this interface, which may safely be interpreted as though no information were available from the procedure. Otherwise the result is to be interpreted as described below.
Returns information about the arity of proc. If the result is #f
,
then no information is available. If the result is an exact
non-negative integer k, then proc requires exactly k
arguments. If the result is an inexact non-negative integer n, then
proc requires n or more arguments. If the result is a pair, then
it is a list of non-negative integers, each of which indicates a
number of arguments that will be accepted by proc; the list is not
necessarily exhaustive.
Procedure procedure-documentation-string
(procedure-documentation-string proc)
Returns general information about proc. If the result is #f
, then no
information is available. If the result is a string, then it is to be
interpreted as a "documentation string" (see Common Lisp).
Returns information about the name of proc. If the result is #f
,
then no information is available. If the result is a symbol or string,
then it represents a name. If the result is a pair, then it is a list
of symbols and/or strings representing a path of names; the first
element represents an outer name and the last element represents an
inner name.
Procedure procedure-source-file
Returns information about the name of a file that contains the source
code for proc. If the result is #f
, then no information is
available. If the result is a string, then the string is the name of a
file.
Procedure procedure-source-position
(procedure-source-position proc)
Returns information about the position of the source code for proc
whithin the source file specified by procedure-source-file. If the
result is #f
, then no information is available. If the result is an
exact integer k, then k characters precede the opening parenthesis
of the source code for proc within that source file.
Procedure procedure-expression
Returns information about the source code for proc. If the result is
#f
, then no information is available. If the result is a pair, then it
is a lambda expression in the traditional representation of a list.
Procedure procedure-environment
Returns information about the environment of proc. If the result is
#f
, then no information is available. In any case the result may be
passed to any of the environment inquiry functions.
Notes on the Larceny implementation
Twobit does not yet produce data for all of these functions, so some
of them always return #f
.
The (rnrs lists)
library now
provides a set of procedures that may supersede some
of the procedures described below.
If one of Larceny's procedures duplicates the semantics of
an R6RS procedure whose name is different, then Larceny's
name is deprecated.
(append! list1 list2 … obj) => object
append!
destructively appends its arguments, which must be lists, and
returns the resulting list. The last argument can be any object. The
argument lists are appended by changing the cdr of the last pair of
each argument except the last to point to the next argument.
(every? procedure list1 list2 …) => object
every?
applies procedure to each element tuple of list_s in
first-to-last order, and returns #f
as soon as _procedure returns
#f
. If procedure does not return #f
for any element tuple of
list_s, then the value returned by _procedure for the last element
tuple of _list_s is returned.
(last-pair list-structure) => pair
last-pair
returns the last pair of the list structure, which must be
a sequence of pairs linked through the cdr fields.
list-copy
makes a shallow copy of the list and returns that copy.
Each of these procedures returns a new list which contains all the
elements of list in the original order, except that those elements of
the original list that were equal to key (or that satisfy pred?) are
not in the new list. Remove uses equal?
as the equivalence predicate;
remq
uses eq?
, and remv
uses eqv?
.
These procedures are like remove
, remq
, remv
, and remp
,
except they modify list instead of returning a fresh list.
reverse!
destructively reverses its argument and returns the reversed
list.
(some? procedure list1 list2 …) => object
some?
applies procedure to each element tuple of list_s in
first-to-last order, and returns the first non-false value returned by
_procedure. If procedure does not return a true value for any
element tuple of _list_s, then some? returns #f
.
The (rnrs sorting)
library now
provides a small set of procedures that supersede most
of the procedures described below.
All of the procedures described below are therefore
deprecated.
Procedures sort and sort!
(sort list less?) => list
(sort vector less?) => vector
(sort! list less?) => list
(sort! vector less?) => vector
These procedures sort their argument (a list or a vector) according to the predicate less?, which must implement a total order on the elements in the data structures that are sorted.
sort
returns a fresh data structure containing the sorted data;
sort!
sorts the data structure in-place.
Larceny's records have been extended to implement all ERR5RS and R6RS procedures from
(err5rs records procedural) (err5rs records inspection) (rnrs records procedural) (rnrs records inspection)
We recommend that Larceny programmers use the ERR5RS APIs instead of the R6RS APIs. This should entail no loss of portability, since the standard reference implementation of ERR5RS records should run efficiently in any implementation of the R6RS that permits new libraries to defined at all.
Larceny now has two kinds of records: old-style and ERR5RS/R6RS. Old-style records cannot be created in R6RS-conforming mode, so our extension of R6RS procedures to accept old-style records does not affect R6RS conformance.
The following specification describes Larceny's old-style record API, which is now deprecated. It is based on a proposal posted by Pavel Curtis to rrrs-authors on 10 Sep 1989, and later re-posted by Norman Adams to comp.lang.scheme on 5 Feb 1992. The authorship and copyright status of the original text are unknown to me.
This document differs from the original proposal in that its record types are extensible, and that it specifies the type of record-type descriptors.
(make-record-type type-name field-names)
Returns a "record-type descriptor", a value representing a new data type, disjoint from all others. The type-name argument must be a string, but is only used for debugging purposes (such as the printed representation of a record of the new type). The field-names argument is a list of symbols naming the "fields" of a record of the new type. It is an error if the list contains any duplicates.
If the parent-rtd argument is provided, then the new type will be a subtype of the type represented by parent-rtd, and the field names of the new type will include all the field names of the parent type. It is an error if the complete list of field names contains any duplicates.
Record-type descriptors are themselves records. In particular,
record-type descriptors have a field printer that is either #f
or a
procedure. If the value of the field is a procedure, then the
procedure will be called to print records of the type represented by
the record-type descriptor. The procedure must accept two arguments:
the record object to be printed and an output port.
Returns a procedure for constructing new members of the type represented by rtd. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in that list are unspecified. The field-names argument defaults to the list of field-names in the call to make-record-type that created the type represented by rtd; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.
Returns a procedure for testing membership in the type represented by rtd. The returned procedure accepts exactly one argument and returns a true value if the argument is a member of the indicated record type or one of its subtypes; it returns a false value otherwise.
(record-accessor rtd field-name)
Returns a procedure for reading the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd, or a member of the field-names of the parent type of the type represented by rtd.
(record-updater rtd field-name)
Returns a procedure for writing the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the updater procedure is unspecified. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd, or a member of the field-names of the parent type of the type represented by rtd.
(record? obj)
Returns a true value if obj is a record of any type and a false value
otherwise. Note that record?
may be true of any Scheme value; of
course, if it returns true for some particular value, then
record-type-descriptor
is applicable to that value and returns an
appropriate descriptor.
Procedure record-type-descriptor
(record-type-descriptor record)
Returns a record-type descriptor representing the type of the given record. That is, for example, if the returned descriptor were passed to record-predicate, the resulting predicate would return a true value when passed the given record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created the given record.
Returns the type-name associated with the type represented by rtd. The returned value is eqv? to the type-name argument given in the call to make-record-type that created the type represented by rtd.
Procedure record-type-field-names
Returns a list of the symbols naming the fields in members of the type represented by rtd.
Returns a record-type descriptor for the parent type of the type represented by rtd, if that type has a parent type, or a false value otherwise. The type represented by rtd has a parent type if the call to make-record-type that created rtd provided the parent-rtd argument.
Procedure record-type-extends?
(record-type-extends? rtd1 rtd2)
Returns a true value if the type represented by rtd1 is a subtype of the type represented by rtd2 and a false value otherwise. A type s is a subtype of a type t if s=t or if the parent type of s, if it exists, is a subtype of t.
The R6RS spouts some tendentious nonsense about procedural records being slower than syntactic records, but this is not true of Larceny's records, and is unlikely to be true of other implementations either. Larceny's procedural records are fairly efficient already, and will become even more efficient in future versions as interlibrary optimizations are added.
The (rnrs io ports)
and (rnrs files)
libraries now
provide a set of procedures that may supersede some
of the procedures described below.
If one of Larceny's procedures duplicates the semantics of
an R6RS procedure whose name is different, then Larceny's
name is deprecated.
(close-open-files ) => unspecified
Closes all open files.
(console-input-port ) => input-port
Returns a character input port such that no read from the port has signalled an error or returned the end-of-file object.
Rationale: console-input-port and console-output-port are artifacts of Unix interactive I/O conventions, where an interactive end-of-file does not mean "quit" but rather "done here". Under these conventions the console port should be reset following an end-of-file. Resetting conflicts with the semantics of ports in Scheme, so console-input-port and console-output-port return a new port if the current port is already at end-of-file.
Since it is convenient to handle errors in the same manner as end-of-file, these procedures also return a new port if an error has been signalled during an I/O operation on the port.
Console-input-port and console-output-port simply call the port generators installed in the parameters console-input-port-factory and console-output-port-factory, which allow user programs to install their own console port generators.
(console-output-port ) => output-port
Returns a character output port such that no write to the port has signalled an error.
See console-input-port for a full explanation.
Parameter console-input-port-factory
The value of this parameter is a procedure that returns a character input port such that no read from the port has signalled an error or returned the end-of-file object.
See console-input-port for a full explanation.
Parameter console-output-port-factory
The value of this parameter is a procedure that returns a character output port such that no write the port has signalled an error.
See console-input-port for a full explanation.
The value of this parameter is a character input port.
The value of this parameter is a character output port.
(delete-file filename) => unspecified
Deletes the named file. No error is signalled if the file does not exist.
(eof-object ) => end-of-file object
Eof-object returns an end-of-file object.
(file-exists? filename) => boolean
File-exists? returns #t if the named file exists at the time the procedure is called.
Procedure file-modification-time
(file-modification-time filename) => vector or #f
File-modification-time returns the time of last modification of the file as a vector, or #f if the file does not exist. The vector has six elements: year, month, day, hour, minute, second, all of which are exact nonnegative integers. The time returned is relative to the local timezone.
(file-modification-time "larceny") => #(1997 2 6 12 51 13)
(file-modification-time "geekdom") => #f
(flush-output-port ) => unspecified
(flush-output-port port) => unspecified
Write any buffered data in the port to the underlying output medium.
(get-output-string string-output-port) => string
Retrieve the output string from the given string output port.
(open-input-string string) => input-port
Creates an input port that reads from string. The string may be shared with the caller. A string input port does not need to be closed, although closing it will prevent further reads from it.
(open-output-string ) => output-port
Creates an output port where any output is written to a string. The accumulated string can be retrieved with get-output-string at any time.
Tests whether its argument is a port.
Returns the name associated with the port; for file ports, this is the file name.
(port-position port) => fixnum
Returns the number of characters that have been read from or written to the port.
(rename-file from to) => unspecified
Renames the file from and gives it the name to. No error is signalled if from does not exist or to exists.
(reset-output-string port) => unspecified
Given a port created with open-output-string, deletes from the port all the characters that have been output so far.
Procedure with-input-from-port
(with-input-from-port input-port thunk) => object
Calls thunk with current input bound to input-port in the dynamic extent of thunk. Returns whatever value was returned from thunk.
(with-output-to-port output-port thunk) => object
Calls thunk with current output bound to output-port in the dynamic extent of thunk. Returns whatever value was returned from thunk.
Procedure command-line-arguments
(command-line-arguments ) => vector
Returns a vector of strings: the arguments supplied to the program by the user or the operating system.
(dump-heap filename procedure) => unspecified
Dump a heap image to the named file that will start up with the
supplied procedure. Before procedure is called, command line
arguments will be parsed and any init procedures registered with
add-init-procedure!
will be called.
Note: Currently, heap dumping is only available with the
stop-and-copy collector (-stopcopy
command line option), although the
heap image can be used with all the other collectors.
Procedure dump-interactive-heap
(dump-interactive-heap filename) => unspecified
Dump a heap image to the named file that will start up with the
standard read-eval-print loop. Before the read-eval-print loop is
called, command line arguments will be parsed and any init procedures
registered with add-init-procedure!
will be called.
Note: Currently, heap dumping is only available with the
stop-and-copy collector (-stopcopy
command line option), although the
heap image can be used with all the other collectors.
Returns the operating system environment mapping for the string key,
or #f
if there is no mapping for key.
Send the command to the operating system's command processor and return the command's exit status, if any. On Unix, command is a string and status is an exact integer.
Fixnums are small exact integers that are likely to be represented without heap allocation. Larceny never represents a number that can be represented as a fixnum any other way, so programs that can use fixnums will do so automatically. However, operations that work only on fixnums can sometimes be substantially faster than generic operations, and the following primitives are provided for use in those programs that need especially good performance.
The (rnrs arithmetic fixnums)
library now
provides a large set of procedures that, in Larceny, are
defined using the procedures described below.
If one of Larceny's procedures duplicates the semantics of
an R6RS procedure whose name is different, then Larceny's
name is deprecated.
All arguments to the following procedures must be fixnums.
Returns #t
if its argument is a fixnum, and #f
otherwise.
Returns the fixnum sum of its arguments. If the result is not representable as a fixnum, then an error is signalled (unless error checking has been disabled).
Returns the fixnum difference of its arguments. If the result is not representable as a fixnum, then an error is signalled.
Returns the fixnum negative of its argument. If the result is not representable as a fixnum, then an error is signalled.
Returns the fixnum product of its arguments. If the result is not representable as a fixnum, then an error is signalled.
Returns #t
if its arguments are equal, and #f
otherwise.
Returns #t
if fix1 is less than fix2, and #f
otherwise.
Returns #t
if fix1 is less than or equal to fix2, and #f
otherwise.
Returns #t
if fix1 is greater than fix2, and #f
otherwise.
Returns #t
if fix1 is greater than or equal to fix2, and #f
otherwise.
Returns #t
if its argument is less than zero, and #f
otherwise.
Returns #t
if its argument is greater than zero, and #f
otherwise.
Returns #t
if its argument is zero, and #f
otherwise.
(fxlogand fix1 fix2) => fixnum
Returns the bitwise and of its arguments.
(fxlogior fix1 fix2) => fixnum
Returns the bitwise inclusive or of its arguments.
Returns the bitwise not of its argument.
(fxlogxor fix1 fix2) => fixnum
Returns the bitwise exclusive or of its arguments.
Returns fix1 shifted left fix2 places, shifting in zero bits at the low end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Procedure most-positive-fixnum
(most-positive-fixnum ) => fixnum
Returns the largest representable positive fixnum.
Procedure most-negative-fixnum
(most-negative-fixnum ) => fixnum
Returns the smallest representable negative fixnum.
Returns fix1 shifted right fix2 places, shifting in a copy of the sign bit at the left end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Returns fix1 shifted right fix2 places, shifting in zero bits at the high end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Larceny has six representations for numbers: fixnums are small, exact integers; bignums are unlimited-precision exact integers; ratnums are exact rationals; flonums are inexact rationals; rectnums are exact complexes; and compnums are inexact complexes.
Number-representation predicates
(fixnum? obj) => boolean
(bignum? obj) => boolean
(ratnum? obj) => boolean
(flonum? obj) => boolean
(rectnum? obj) => boolean
(compnum? obj) => boolean
These predicates test whether an object is a number of a particular
representation and return #t
if so, #f
if not.
(random limit) => exact integer
Returns a pseudorandom nonnegative exact integer in the range 0 through limit-1.
Hashtables represent finite mappings from keys to values. If the hash function is a good one, then the value associated with a key may be looked up in constant time (on the average).
The R6RS hashtables library are a big improvement over Larceny's traditional hash tables, and should be used instead of the API described below.
To resolve a clash of names and semantics with the
R6RS make-hashtable
procedure, Larceny's traditional
make-hashtable
procedure has been renamed to
make-oldstyle-hashtable
.
Procedure make-oldstyle-hashtable
(make-oldstyle-hashtable hash-function bucket-searcher size) => hashtable
Returns a newly allocated mutable hash table using hash-function as
the hash function and bucket-searcher, e.g. assq
, assv
, assoc
, to
search a bucket with size buckets at first, expanding the number of
buckets as needed. The hash-function must accept a key and return a
non-negative exact integer.
(make-oldstyle-hashtable hash-function bucket-searcher) => hashtable
Equivalent to (make-oldstyle-hashtable hash-function bucket-searcher n)
for
some value of n chosen by the implementation.
(make-oldstyle-hashtable hash-function) => hashtable
Equivalent to (make-oldstyle-hashtable hash-function assv)
.
(make-oldstyle-hashtable ) => hashtable
Equivalent to (make-oldstyle-hashtable object-hash assv)
.
(hashtable-contains? hashtable key) => bool
Returns true iff the hashtable contains an entry for key.
(hashtable-fetch hashtable key flag) => object
Returns the value associated with key in the hashtable if the hashtable contains key; otherwise returns flag.
(hashtable-get hashtable key) => object
Equivalent to (hashtable-fetch #f)
.
(hashtable-put! hashtable key value) => unspecified
Changes the hashtable to associate key with value, replacing any existing association for key.
(hashtable-remove! hashtable key) => unspecified
Removes any association for key within the hashtable.
(hashtable-clear! hashtable) => unspecified
Removes all associations from the hashtable.
(hashtable-size hashtable) => integer
Returns the number of keys contained within the hashtable.
(hashtable-for-each procedure hashtable) => unspecified
The procedure must accept two arguments, a key and the value associated with that key. Calls the procedure once for each key-value association in hashtable. The order of these calls is indeterminate.
(hashtable-map procedure hashtable)
The procedure must accept two arguments, a key and the value associated with that key. Calls the procedure once for each key-value association in hashtable, and returns a list of the results. The order of the calls is indeterminate.
(hashtable-copy hashtable) => hashtable
Returns a copy of the hashtable.
The hash values returned by these functions are nonnegative exact integer suitable as hash values for the hashtable functions.
(equal-hash object) => integer
Returns a hash value for object based on its contents.
(object-hash object) => integer
Returns a hash value for object based on its identity.
This hash function performs extremely poorly on pairs,
vectors, strings, and bytevectors, which are the objects
with which it is mostly likely to be used.
For efficient hashing on object identity, create the
hashtable with make-eq-hashtable
or make-eqv-hashtable
of the (rnrs hashtables)
library.
(string-hash string) => fixnum
Returns a hash value for string based on its content.
(symbol-hash symbol) => fixnum
Returns a hash value for symbol based on its print name.
The symbol-hash
is very fast, because the hash code is cached in the symbol data
structure.
Parameters are procedures that serve as containers for values; parts of the system that do not operate in the same namespace can still share parameters and thereby read and write shared state.
A parameter takes zero or one arguments. If called with no arguments, it returns the current value of the parameter. If called with one argument, it sets the parameter's value to that of the argument and returns the new value.
(make-parameter name value [predicate]) => procedure
Create a parameter with name name, initial value value, and
optional setter predicate predicate. When the parameter is set the
new value is first passed to predicate,, and if it returns #f
then
an error is signalled. Name can be a symbol or a string.
Syntax parameterize
(parameterize ((parameter0 value0) …) expr0 expr1 …)
Parameterize overrides the values of a set of parameters in a dynamic scope — it is like fluid-let for parameters.
The following list of parameters does not yet include the reader or compiler switches, which are also parameters.
Parameter console-input-port-factory
Parameter console-output-port-factory
Parameter interaction-environment
Parameter keyboard-interrupt-handler
The property list of a symbol is an association list that is attached to that symbol. The association list maps properties, which are themselves symbols, to arbitrary values.
(putprop symbol property obj) => unspecified
If an association exists for property on the property list of symbol, then its value is replaced by the new value obj. Otherwise, a new association is added to the property list of symbol that associates property with obj.
(getprop symbol property) => obj
If an association exists for property on the property list of
symbol, then its value is returned. Otherwise, #f
is returned.
(remprop symbol property) => unspecified
If an association exists for property on the property list of symbol, then that association is removed. Otherwise, this is a no-op.
Gensym returns a new uninterned symbol, the name of which contains the given string.
Oblist returns the list of interned symbols.
(oblist-set! list) => unspecified
(oblist-set! list table-size) => unspecified
oblist-set!
sets the list of interned symbols to those in the given
list by clearing the symbol hash table and storing the symbols in
list in the hash table. If the optional table-size is given, it is
taken to be the desired size of the new symbol table.
See also: symbol-hash.
(collect generation) => unspecified
(collect generation method) => unspecified
Collect initiates a garbage collection. If the system has multiple generations, then the optional arguments are interpreted as follows. The generation is the generation to collect, where 0 is the youngest generation. The method determines how the collection is performed. If method is the symbol collect, then a full collection is performed in that generation, whatever that means — in a normal multi-generational copying collector, it means that all live objects in the generation's current semispace and all live objects from all younger generations are copied into the generation's other semispace. If method is the symbol promote, then live objects are promoted from younger generations into the target generation — in our example collector, that means that the objects are copied into the target generation's current semispace.
The default value for generation is 0, and the default value for method is collect.
Note that the collector's internal policy settings may cause it to perform a more major type of collection than the one requested; for example, an attempt to collect generation 2 could cause the collector to promote all live data into generation 3.
gc-counter returns the number of garbage collections performed since startup. On a 32-bit system, the counter wraps around every 1,073,741,824 collections.
gc-counter is a primitive and compiles to a single load instruction on the SPARC.
major-gc-counter returns the number of major garbage collections performed since startup, where a major collection is defined as a collection that may change the address of objects that have already survived a previous collection. On a 32-bit system, the counter wraps around every 1,073,741,824 collections.
major-gc-counter is a primitive and compiles to a single load instruction on the SPARC. Its primary use to implement efficient hashtables that hash on object identity (make-eq-hashtable and make-eqv-hashtable).
(gcctl heap-number operation operand) => unspecified
[GCCTL is largely obsolete in the new garbage collector but may be resurrected in the future. It can still be used to control the non-predictive collector.]
gcctl controls garbage collection policy on a heap-wise basis. The heap-number is the heap to operate on, like for the command line switches: heap 1 is the youngest. If the given heap number does not correspond to a heap, gcctl fails silently.
The operation is a symbol that selects the operation to perform, and the operand is the operand to that operation, always a number. For the non-predictive garbage collector, the following operator/operand pairs are meaningful:
Example: if the non-predictive heap is heap number 2, then the expressions
(gcctl 2 'j-fixed 0) (gcctl 2 'incr-fixed 1)
makes the non-predictive collector simulate a normal stop-and-copy collector (because j is always set to 0), and grows the heap only one step at a time as necessary. This may be useful for certain kinds of experiments.
Example: ditto, the expressions
(gcctl 2 'j-percent 50) (gcctl 2 'incr-percent 20)
selects the default policy settings.
Note: The gcctl facility is experimental. A more developed facility will allow controlling heap contraction policy, as well as setting all the watermarks. Certainly one can envision other uses, too. Finally, it needs to be possible to get current values.
Note: Currently the non-predictive heap (np-sc-heap.c) and the standard stop-and-copy "old" heap (old-heap.c) are supported, but not the standard "young" heap (young-heap.c), nor the stop-and-copy collector (sc-heap.c).
(sro pointer-tag type-tag limit) => vector
SRO ("standing room only") is a system primitive that traverses the entire heap and returns a vector that contains all live objects in the heap that satisfy the constraints imposed by its parameters:
For example, (sro -1 -1 -1) returns a vector that contains all live objects (not including the vector), and (sro 5 2 3) returns a vector containing all live flonums (bytevector-like, with typetag 2) that are referred to in no more than 3 places.
(stats-dump-on filename) => unspecified
Stats-dump-on turns on garbage collection statistics dumping. After each collection, a complete RTS statistics dump is appended to the file named by filename.
The file format and contents are documented in a banner written at the top of the output file. In addition, accessor procedures for the output structure are defined in the program Util/process-stats.sch.
Stats-dump-on does not perform an initial dump when the file is first opened; only at the first collection is the first set of statistics dumped. The user might therefore want to initiate a minor collection just after turning on dumping in order to have a baseline set of data.
(stats-dump-off ) => unspecified
Stats-dump-off turns off garbage collection statistics dumping (which was turned on with stats-dump-on). It does not dump a final set of statistics before closing the file; therefore, the user may wish to initiate a minor collection before calling this procedure.
System-features returns an association lists of system features. Most entries are self-explanatory. The following are a more subtle:
(display-memstats vector) => unspecified
(display-memstats vector minimal) => unspecified
(display-memstats vector minimal full) => unspecified
Display-memstats takes as its argument a vector as returned by memstats and displays the contents of the vector in human-readable form on the current output port. By default, not all of the values in the vector are displayed.
If the symbol minimal is passed as the second argument, then only a small number of statistics generally relevant to running benchmarks are displayed.
If the symbol full is passed as the second argument, then all statistics are displayed.
Memstats returns a freshly allocated vector containing run-time-system resource usage statistics. Many of these will make no sense whatsoever to you unless you also study the RTS sources. A listing of the contents of the vector is available here.
Run-with-stats evaluates thunk, then prints a short summary of run-time statistics, as with
(display-memstats ... 'minimal),
and then returns the result of evaluating thunk.
(run-benchmark name k thunk ok?) => obj
Run-benchmark prints a short banner (including the identifying name) to identify the benchmark, then runs thunk k times, and finally tests the value returned from the last call to thunk by applying the predicate ok? to it. If the predicate returns true, then run-benchmark prints summary statistics, as with
([display-memstats][5] ... 'minimal).
If the predicate returns false, an error is signalled.
The SRFIs (Scheme Requests For Implementations) is an Internet-based collection of Scheme code designed and provided by Scheme programmers. The SRFI effort is open to anyone, and is described at http://srfi.schemers.org.
The fundamental SRFI is SRFI-0, "Feature-based conditional expansion construct", which allows a program to query the underlying implementation about the available SRFIs (and potentially about other implementation features) at macro expansion time. The design documents for this and other SRFIs are available at the web site shown above.
Larceny currently supports many SRFIs, but not as many as it should.
Some SRFIs are built into Larceny, but most must be loaded dynamically
using Larceny's require
procedure:
> (require 'srfi-0)
Larceny provides the following nonstandard SRFI keys for use in SRFI 0:
larceny
SLIB is a large collection of useful libraries that have been written or collected by Aubrey Jaffer.
Larceny supports SLIB via
SRFI 96,
but SLIB itself is not shipped with Larceny;
it must be downloaded separately and then installed.
For the most up-to-date information on installing and using
SLIB with Larceny, see doc/HOWTO-SLIB
.
Larceny provides a general foreign-function interface (FFI) substrate on which other FFIs can be built; see Larceny Note #7. The FFI described in this manual section is a simple example of a derived FFI. It is not yet fully evolved, but it is useful.
This section has undergone signficant revision, but not all of the material has been properly vetted. Some of the information in this section may be out of date.
Some of the text below is adapted from the 2008 Scheme Workshop paper, “The Layers of Larceny's Foreign Function Interface,” by Felix S Klock II. That paper may provide additional insight for those searching for implementation details and motivations.
There are a number of different potential ways to use the FFI. One client may want to develop code in C and load it into Larceny. Another client may want to load native libraries provided by the host operating system, enabling invocation of foreign code from Scheme expressions without developing any C code or even running a C compiler. Larceny's FFI can be used for both of these cases, but many of its facilities target a third client in between the two extremes: a client with a C compiler and the header files and object code for the foreign libraries, but who wishes to avoid writing glue code in C to interface with the libraries.
There are four main steps to interacting with foreign code:
Step 1 is conceptual, while steps 2 through 4 yield artifacts in Scheme source code.
At the machine code level, foreign values are uninterpreted sequences of bits. Often foreign object code is oriented around manipulating word-sized bit-sequences (words) or arrays and tuples of words.
Many libraries are written with a particular
interpretation of such values. In C code, explicit types are
often used hints to guide such interpretation; for example,
a 0
of type bool
is usually interpreted as false,
while a 1
(or other non-zero value) of type bool
is
usually interpreted as true.
Another example are C enumerations (or enums).
An enum declaration defines a set of named
integral constants. After the C declaration:
enum months { JAN = 1, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC };
a JAN
in C code now denotes 1
, FEB
is 2
, and so on.
Furthermore, tools like debuggers may render a variable x
dynamically assigned the value 2
(and of static type enum months
)
as FEB
. Thus the enum declaration
intoduces a new interpretation for a finite set of integers.
This leads to questions for a client of an FFI; we explore some below.
#f
and #t
for the bool
type, or the Scheme symbols
{JAN
, FEB
, MAR
, APR
, MAY
, JUN
,
JUL
, AUG
, SEP
, OCT
, NOV
, DEC
}
for the enum months
type?
A foreign library might leave the mapping
of names like FEB
to words like 2
unspecified
in the library interface.
That is, while the C compiler will know FEB
maps to 2
according to a particular version of the library's header file,
the library designer may intend to change this mapping
in the future, and clients writing C code should only use
the names to refer to a enum months
value, and not integer
expressions.
Foreign libraries often manipulate mutable entities, like arrays of words where modifications can be observed (often by design).
Will the foreign code hold references to heap-allocated objects? Heap-allocated objects that leak out to foreign memory must be treated with care; garbage collection presents two main problems.
cons-nonrelocatable
, make-nonrelocatable-bytevector
,
and make-nonrelocatable-vector
.
Answering these questions may require deep knowledge of the intended usage of the foreign library.
The Larceny FFI attempts to ease interfacing with foreign code in the presence of the above concerns, but the nature of the header files included with most foreign libraries means that the FFI cannot infer the answers unassisted.
Foreign C code developed to work in concert with Larceny could hypothetically be written to cope with holding handles for objects managed by the the garbage collector, but there is currently no significant support for this use-case.
One class of foreign values is not addressed
by the Larceny FFI: structures passed by value (as
opposed to by reference, ie pointers to structures).
There is no way to describe the interface to a
foreign procedure that accepts or produces a
C struct
(at least not properly nor portably).
This tends to not matter for many foreign libraries (since many C programmers eschew passing structures by value), but it can arise.
If the foreign library of interest has procedures that
accept or produce a C struct
, we currently recommend
either avoiding such procedures, or writing
adapter code in C that marshals between values handled
by the FFI and the C struct
.
The conclusion is: when designing an interface to a foreign library, you should analyze the values manipulated on the foreign side and identify their relationship with values on the Scheme side. After you have identified the domains of interest, you then describe how the values will be marshaled back and forth between the two domains.
This section describes the marshalling protocol defined in
lib/Base/std-ffi.sch
.
Foreign functions automatically marshal their inputs and outputs according to type-descriptors attached to each foreign function.
Type-descriptors are S-expressons formed according to the following grammar:
TypeDesc ::= CoreAttr | ArrowT | MaybeT | OneOfT CoreAttr ::= PrimAttr | VoidStar | --- PrimAttr ::= CurrentPrimAttr | DeprecatedPrimAttr CurrentPrimAttr ::= int | uint | byte | short | ushort | char | uchar | long | ulong | longlong | ulonglong | size_t | float | double | bool | string | void DeprecatedPrimAttr ::= unsigned | boxed VoidStar ::= void* | --- ArrowT ::= (-> (TypeDesc ...) TypeDesc) MaybeT ::= (maybe TypeDesc) OneOfT ::= (oneof (Any Fixnum) ... TypeDesc)
where ---
represents a user-extensible part of the grammar
(see below),
Any
represents any Scheme value, and Fixnum
represents
any word-sized integer.
A central registry maps CoreAttr
's to a foreign
representation and two conversion routines:
one to convert a Scheme value to a foreign argument, and
another to convert a foreign result back back to a Scheme value.
The denoted components are collectively referred to as a type
within the FFI documentation.
The registry is extensible; the ffi-add-attribute-core-entry!
procedure adds new CoreAttr's
to the registry, and
one can alternatively add short-hands for
type-descriptors via the ffi-add-alias-of-attribute-entry!
procedure.
Finally, one can add new VoidStar
productions
(subtypes of the void*
type-descriptor)
via the ffi-install-void*-subtype
procedure
(defined in the lib/Standard/foreign-stdlib.sch
library).
The following is a list of the accepted types and their conversions at the boundary between Scheme and foreign code:
int
int
".
uint
unsigned int
".
byte
int
in the current implementation.
short
int
in the current implementation.
ushort
unsigned
in the current implementation.
char
char
".
uchar
unsigned char
".
long
int
in the current implementation.
ulong
unsigned
in the current implementation.
longlong
long long
".
ulonglong
unsigned long long
".
size_t
uint
in the current implementation.
float
float
".
The conversion to float
is performed via
a C (float)
cast from a C double
.
double
bool
int
";
#f
is converted to 0, and all other objects to 1.
In the reverse direction, 0 is converted to #f
and
all other integers to #t
.
string
#f
is converted to a C "(char*)0
" value.
In the reverse direction, a pointer to a NUL-terminated sequence
of bytes interpreted as ASCII characters is
copied into a freshly allocated Scheme string; a NULL pointer is
converted to #f
.
void
unsigned
uint
; deprecated.
boxed
void*
" to the first element of the structure. The
value #f
is also acceptable. It is converted to a C "(void*)0
"
value.
(Only used in argument position for foreign functions; foreign
functions are not expected to return direct references
to heap-allocated values.)
The public interface to many foreign libraries is written in terms of types defined within that foreign library. One can introduce new types to the Larceny FFI by extending the core attribute entry table.
Procedure ffi-add-attribute-core-entry!
(ffi-add-attribute-core-entry! entry-name rep-sym marshal unmarshal) => unspecified
ffi-add-attribute-core-entry! extends the internal registry with the new entry specified by its arguments.
signed32
, unsigned32
, signed64
, unsigned64
(representing varieties of fixed width integers),
ieee32
(representing “floats”),
ieee64
(representing “doubles”), or
pointer
(representing “(void*)
” in C).
#f
or an unmarshalling function that
accepts an instance of the low-level representation
and produces a corresponding Scheme object.
Core attributes suffice for linking to simple functions. Constructured FFI attributes express more complex marshaling protocols
Arrow Type Constructors. A structured FFI attribute
of the form (-> (s_1 … s_n) s_r)
(called an arrow type)
allows passing functions from Scheme to C
and back again. Each of the s_1, …, s_n, s_r
is an FFI attribute.
When an arrow type describes an input to a foreign
function, it marshals a Scheme procedure to a
C function pointer by generating glue code to hook the two together
and marshal values as described by the FFI attributes
within the arrow type.
Likewise, when an arrow type describes an output from a
foreign function, it marshals a C function pointer
to a Scheme procedure, again by generating glue code.
These two mappings naturally generalize to arbitrary nesting
of arrow types, so one can create callbacks that consume
callouts, return callouts that consume callbacks, and so on.
The current implementation of arrow types introduces an unnecessary space leak, because none of Larceny's current garbage collectors attempt to reclaim some of the structure allocated (in particular, the so-called trampolines) when functions are marshaled via arrow types.
The FFI could be revised to reduce the leak (e.g. it could keep a cache of generated trampolines and reuse them, but currently do not do so).
Many foreign libraries have a structure where one only sets up a fixed set of callbacks, and then all further computation does not require arrow type marshaling. This is one reason why fixing this problem has been a low priority item for the Larceny development team.
Maybe Type Constructor. (maybe t)
captures the
pattern of passing NULL
in C and #f
in Scheme
to represent the absence of information.
The FFI attribute t within the maybe type
describes the typical information passed;
the constructed maybe type
marshals #f
to the foreign null pointer or 0
(as appropriate),
and otherwise applies the marshaling of t.
Likewise, it unmarshals the foreign
null pointer and 0
to #f
, and otherwise applies the
unmarshaling of t.
(There are a few other built-in type constructors, such as
the oneof
type constructor, but they
are not as fully-developed as the two above, and are intended
for use only for internal development for now.)
Using the void*
attribute
wraps foreign addresses up in a Larceny record,
so that standard numeric
operations cannot be directly applied by accident.
The FFI uses two features of Larceny's record system:
the record type descriptor is a first class
value with an inspectable name, and
record types are extensible via single-inheritance.
Basic Operations on void*
. The FFI provides void*-rt
, a record type
descriptor with a single field (a wrapped address).
There is also a family of functions for dereferencing the
pointer within a void*-rt
and manipulating the
state it references.
void*
.
Distinquishes void*
's from other Scheme values.
(void*-byte-ref x idx) => number
(void*-byte-set! x idx val) => unspecified
(void*-word-ref x idx) => number
(void*-word-set! x idx val) => unspecified
(void*-void*-ref x idx) => void*
void*
) at offset from address within x.
(void*-void*-set! x idx val) => unspecified
(void*-double-ref x idx) => number
(void*-double-set! x idx val) => unspecified
Type Hierarchies. Procedures for establishing type hierarchies are provided by the
lib/Standard/foreign-stdlib.sch
library; see
ffi-install-void*-subtype and establish-void*-subhierarchy!.
You must first compile your C code and create one or more loadable object modules. These object modules may then be loaded into Larceny, and Scheme foreign functions may link to specific functions in the loaded module. Defining foreign functions in Scheme is covered in a later section.
The method for creating a loadable object module varies from platform to platform. In the following, assume you have to C source files file1.c and file2.c that define functions that you want to make available as foreign functions in Larceny.
Compile your source files and create a shared library. Using GCC, the command line might look like this:
gcc -fPIC -shared file1.c file2.c -o my-library.so
The command creates my-library.so in the current directory. This library can now be loaded into Larceny using foreign-file. Any other shared libraries used by your library files should also be loaded into Larceny using foreign-file before any procedures are linked using foreign-procedure.
By default, /lib/libc.so is made available to the dynamic linker and to the foreign function interface, so there is no need for you to load that library explicitly.
Compile your source files and create a shared library, linking with all the necessary libraries. Using GCC, the command line might look like this:
gcc -fPIC -shared file1.c file2.c -lc -lm -lsocket -o my-library.so
Now you can use foreign-file to load my-library.so into Larceny.
By default, /lib/libc.so is made available to the foreign function interface, so there is no need for you to load that library explicitly.
(foreign-file filename) => unspecified
foreign-file loads the named object file into Larceny and makes it available for dynamic linking.
Larceny uses the operating system provided dynamic linker to do dynamic linking. The operation of the dynamic linker varies from platform to platform:
LD_LIBRARY_PATH
environment variable) for the file. Hence, a foreign file in the current directory should be "./file.so", not "file.so".
(foreign-procedure name (arg-type …) return-type) => unspecified
FIXME: The interface to this function has been extended to support hooking into Windows procedures that use the Pascal calling convention instead of the C one. The way to select which convention to use should be documented.
Returns a Scheme procedure p that calls the foreign procedure whose name is name. When p is called, it will convert its parameters to representations indicated by the arg-types and invoke the foreign procedure, passing the converted values as parameters. When the foreign procedure returns, its return value is converted to a Scheme value according to return-type.
Types are described below.
The address of the foreign procedure is obtained by searching for name in the symbol tables of the foreign files that have been loaded with foreign-file.
Procedure foreign-null-pointer
(foreign-null-pointer ) => integer
Returns a foreign null pointer.
Procedure foreign-null-pointer?
(foreign-null-pointer? integer) => boolean
Tests whether its argument is a foreign null pointer.
The two primitives peek-bytes and poke-bytes are provided for reading and writing memory at specific addresses. These procedures are typically used for copying data from foreign data structures into Scheme bytevectors for subsequent decoding.
(The use of peek-bytes and poke-bytes can often be avoided by keeping foreign data in a Scheme bytevector and passing the bytevector to a call-out using the boxed parameter type. However, this technique is inappropriate if the foreign code retains a pointer to the Scheme datum, which may be moved by the garbage collector.)
(peek-bytes addr bytevector count) => unspecified
Addr must be an exact nonnegative integer. Count must be a fixnum. The bytes in the range from addr through addr+count-1 are copied into bytevector, which must be long enough to hold that many bytes.
If any address in the range is not an address accessible to the process, unpredictable things may happen. Typically, you'll get a segmentation fault. Larceny does not yet catch segmentation faults.
(poke-bytes addr bytevector count) => unspecified
Addr must be an exact nonnegative integer. Count must be a fixnum. The count first bytes from bytevector are copied into memory in the range from addr through addr+count-1.
If any address in the range is not an address accessible to the process, unpredictable things may happen. Typically, you'll get a segmentation fault. Larceny does not yet catch segmentation faults.
Also, it's possible to corrupt memory with poke-bytes. Don't do that.
The following variables constants define the sizes of basic C data types:
Foreign data is visible to a Scheme program either as an object pointed to by a memory address (which is itself represented as an integer), or as a bytevector that contains the bytes of the foreign datum.
A number of utility procedures that make reading and writing data of common C primitive types have been written for both these kinds of foreign objects.
Bytevector accessor procedures
(%get16 bv i) => integer
(%get16u bv i) => integer
(%get32 bv i) => integer
(%get32u bv i) => integer
(%get-int bv i) => integer
(%get-unsigned bv i) => integer
(%get-short bv i) => integer
(%get-ushort bv i) => integer
(%get-long bv i) => integer
(%get-ulong bv i) => integer
(%get-pointer bv i) => integer
These procedures decode bytevectors that contain the bytes of foreign objects. In each case, bv is a bytevector and i is the offset of the first byte of a field in that bytevector. The field is fetched and returned as an integer (signed or unsigned as appropriate).
Bytevector updater procedures
(%set16 bv i val) => unspecified
(%set16u bv i val) => unspecified
(%set32 bv i val) => unspecified
(%set32u bv i val) => unspecified
(%set-int bv i val) => unspecified
(%set-unsigned bv i val) => unspecified
(%set-short bv i val) => unspecified
(%set-ushort bv i val) => unspecified
(%set-long bv i val) => unspecified
(%set-ulong bv i val) => unspecified
(%set-pointer bv i val) => unspecified
These procedures update bytevectors that contain the bytes of foreign objects. In each case, bv is a bytevector, i is an offset of the first byte of a field in that bytevector, and val is a value to be stored in that field. The values must be exact integers in a range implied by the data type.
Foreign-pointer accessor procedures
(%peek8 addr) => integer
(%peek8u addr) => integer
(%peek16 addr) => integer
(%peek16u addr) => integer
(%peek32 addr) => integer
(%peek32u addr) => integer
(%peek-int addr) => integer
(%peek-long addr) => integer
(%peek-unsigned addr) => integer
(%peek-ulong addr) => integer
(%peek-short addr) => integer
(%peek-ushort addr) => integer
(%peek-pointer addr) => integer
(%peek-string addr) => integer
These procedures read raw memory. In each case, addr is an address, and the value stored at that address (the size of which is indicated by the name of the procedure) is fetched and returned as an integer.
%Peek-string expects to find a NUL-terminated string of 8-bit bytes at the given address. It is returned as a Scheme string.
Foreign-pointer updater procedures
(%poke8 addr val) => unspecified
(%poke8u addr val) => unspecified
(%poke16 addr val) => unspecified
(%poke16u addr val) => unspecified
(%poke32 addr val) => unspecified
(%poke32u addr val) => unspecified
(%poke-int addr val) => unspecified
(%poke-long addr val) => unspecified
(%poke-unsigned addr val) => unspecified
(%poke-ulong addr val) => unspecified
(%poke-short addr val) => unspecified
(%poke-ushort addr val) => unspecified
(%poke-pointer addr val) => unspecified
These procedures update raw memory. In each case, addr is an address, and val is a value to be stored at that address.
If foreign functions are linked into Larceny using the FFI, and a Larceny heap image is subsequently dumped (with dump-interactive-heap or dump-heap), then the foreign functions are not saved as part of the heap image. When the heap image is subsequently loaded into Larceny at startup, the FFI will attempt to re-link all the foreign functions in the heap image.
During the relinking phase, foreign files will again be loaded into Larceny, and Larceny's FFI will use the file names as they were originally given to the FFI when it tries to load the files. In particular, if relative pathnames were used, Larceny will not have converted them to absolute pathnames.
An error during relinking will result in Larceny aborting with an error message and returning to the operating system. This is considered a feature.
This procedure uses the chdir() system call to set the process's current working directory. The string parameter type is used to pass a Scheme string to the C procedure.
(define cd (let ((chdir (foreign-procedure "chdir" '(string) 'int))) (lambda (newdir) (if (not (zero? (chdir newdir))) (error "cd: " newdir " is not a valid directory name.")) (unspecified))))
This procedure uses the getcwd() (get current working directory) system call to retrieve the name of the process's current working directory. A bytevector is created and passed in as a buffer in which to store the return value — a 0-terminated ASCII string. Then the FFI utility function ffi/asciiz->string is called to convert the bytevector to a string.
(define pwd (let ((getcwd (foreign-procedure "getcwd" '(boxed int) 'int))) (lambda () (let ((s (make-bytevector 1024))) (getcwd s 1024) (ffi/asciiz->string s)))))
this example is bogus. It is not safe to pass a collectable object into a C procedure when the callback invocation might cause a garbage collection, thus moving the object and invalidating the address stored in the C machine context.
This demonstrates how to use a callback such as the comparator argument to qsort. It is specified in the type signature using -> as a type constructor. (Note that one should probably use the built-in sort routines rather than call out like this; this example is for demonstrating callbacks, not how to sort.)
(define qsort! (foreign-procedure "qsort" '(boxed ushort ushort (-> (void* void*) int)) 'void))
(let ((bv (list->vector '(40 10 30 20 1 2 3 4)))) (qsort! bv 8 4 (lambda (x y) (let ((x (/ (void*-word-ref x 0) 4)) (y (/ (void*-word-ref y 0) 4))) (- x y)))) bv)
(let ((bv (list->bytevector '(40 10 30 20 1 2 3 4)))) (qsort! bv 8 1 (lambda (x y) (let ((x (void*-byte-ref x 0)) (y (void*-byte-ref y 0))) (- x y)))) bv)
The general foreign-function interface functionality described above is powerful but awkward to use in practice. A user might be tempted to hard code values of offsets or constants that are compiler dependent. Also, the FFI will marshall some low-level values such as strings or integers, but other values such as enumerations which could be naturally mapped to sets of symbols are not marshalled since the host environment does not provide the necessary type information to the FFI.
This section documents a collection of libraries to mitigate these and other problems.
Foreign data access is performed by peeking at manually calculated addresses, but in practice one often needs to inspect fields of C structures, whose offsets are dependant on the application binary interface (ABI) of the host environment. Similarly, C programs often use refer to values via constant macro definitions; since the values of such names are not provided by the object code and Scheme programs do not have a C preprocessor run on them prior to execution, it is difficult to refer to the same value without encoding "magic numbers" into the Scheme source code.
The foreign-ctools library is meant to mitigate problems like the two described above. It provides special forms for introducing global definitions of values typically available at compile-time for a C program. The library assumes the presence of a C compiler (such as cc on Unix systems or cl.exe on Windows systems). The special forms work by dynamically generating, compiling, and running C code at expansion time to determine the desired values of structure offsets or macro constants.
Here is a grammar for the define-c-info
form provided by
the foreign-ctools
library.
<exp> ::= (define-c-info <c-decl> ... <c-defn> ...) <c-decl> ::= (compiler <cc-spec>) | (path <include-path>) | (include <header>) | (include<> <header>) <cc-spec> ::= cc | cl <c-defn> ::= (const <id> <c-type> <c-expr>) | (sizeof <id> <c-type-expr>) | (struct <c-name> <field-clause> ...) | (fields <c-name> <field-clause> ...) | (ifdefconst <id> <c-type> <c-name>) <c-type> ::= int | uint | long | ulong <include-path> ::= <string-literal> <header> ::= <string-literal> <field-clause> ::= (<offset-id> <c-field>) | (<offset-id> <c-field> <size-id>) <c-expr> ::= <string-literal> <c-type-expr> ::= <string-literal> <c-name> ::= <string-literal> <c-field> ::= <string-literal>
Syntax define-c-info
(define-c-info <c-decl> … <c-defn> …)
The <c-decl>
clauses of define-c-info
control how header files are processed.
The compiler
clause selects between cc
(the default UNIX system compiler) and cl
(the compiler included with Microsoft's Windows SDK).
The path
clause adds a directory to search when
looking for header files.
The include
and include<>
clauses indicate
header files to include when executing the
<c-defn>
clauses;
the two variants correspond to the quoted and bracketed
forms of the C preprocessor's #include
directive.
The <c-defn>
clauses bind identifiers.
A (const x t "ae")
clause binds x to
the integer value of ae according to the C language;
ae can be any C arithmetic expression that evaluates
to a value of type t.
(The expected usage is for ae to be an
expression that the C preprocessor expands to an arithmetic expression.)
The remaining clauses provide similar functionality:
(sizeof x "te")
binds x to the size occupied by values
of type te, where te is any C type expression.
(struct "cn" … (x "cf" y) …)
binds x to the offset from the start of a
structure of type struct cn
to its
cf field, and binds y, if present, to the field's size.
A fields
clause is similar, but it applies
to structures of type cn
rather than struct cn
.
(ifdefconst x t "cn")
binds x to the value of cn
if cn
is defined;
x is otherwise bound to Larceny's unspecified value.
The foreign-procedure function is sufficient to link in dynamically loaded C procedures, but it can be annoying to use when there are many procedures to define that all follow a regular pattern where one could infer a mapping between Scheme identifiers and C function names.
For example, some libraries follow a naming convention where a words within a name are separated by underscores; such functions could be immediately mapped to Scheme names where the underscores have been replaced by dashes.
The foreign-sugar library provides a special form, define-foreign
,
which gives the user a syntax for defining foreign functions using
a syntax where one provides only the Scheme name, the argument types,
and the return type. The define-foreign
form then attempts to
infer what C function the name was meant to refer to.
Syntax define-foreign
(define-foreign (name arg-type …) result-type)
There is other functionality provided allowing the user to introduce new rules for inferring C function names, but they are undocumented because they will probably have to change when we switch to an R6RS macro expander.
(stdlib/malloc rtd [ctor]) => procedure
Given a record extension of void*-rt, returns an allocator that uses
the C malloc
procedure to allocate instances of such an object.
Note that the client is responsible for eventually freeing such
objects with stdlib/free.
Frees objects produced by allocators returned from stdlib/malloc.
Procedure ffi-install-void*-subtype
(ffi-install-void*-subtype rtd) => rtd
(ffi-install-void*-subtype string [parent-rtd]) => rtd
(ffi-install-void*-subtype symbol [parent-rtd]) => rtd
ffi-install-void*-subtype
extends the core attribute registry with a new primitive
entry for subtype.
The parent-rtd argument should be a subtype of void*-rt
and defaults to void*-rt
.
In the case of the symbol or string inputs, the
procedure constructs a new record type subtyping the parent argument.
In the case of the rtd input, the rtd record type
must extend void*-rt
.
ffi-install-void*-subtype returns the subtype record type.
The returned record type represents a tagged wrapped C pointer, allowing one to encode type hierarchies.
Procedure establish-void*-subhierarchy!
(establish-void*-subhierarchy! symbol-tree) => unspecified
establish-void*-subhierarchy! is a convenience function
for constructing large object hierarchies.
It descends the symbol-tree,
creates a record type descriptor for each symbol
(where the root of the tree has the parent void*-rt
),
and invokes ffi-install-void*-subtype on all
of the introduced types.
Type char* extends void* Procedure string->char*
(string->char* string) => char*
(char*-strlen char*) => fixnum
(char*->string char*) => string
(char*->string char* len) => string
(call-with-char* string string-function) => value
(call-with-char** string-vector function) => value
(call-with-int* fixnum-vector function) => value
(call-with-short* fixnum-vector function) => value
(call-with-double* num-vector function) => value
FIXME: (There are other functions, but I want to test and document the ones above first…)
The foreign-cstructs
library provides a
more direct interface to C structures.
It provides the define-c-struct
special form.
This form is layered on top of define-c-info
;
the latter provides the structure field offsets
and sizes used to generate constructors
(which produce appropriately sized bytevectors,
not record instances).
The define-c-struct
form combines these
with marshaling and unmarshaling procedures to
provide high-level access to a structure.
The grammar for the define-c-struct
form is presented below.
<exp> ::= (define-c-struct (<struct-type> <ctor-id> <c-decl> ...) <field-clause> ...) <field-clause> ::= (<c-field> <getter>) | (<c-field> <getter> <setter>) <getter> ::= (<id>) | (<id> <unmarshal>) <setter> ::= (<id>) | (<id> <marshal>) <marshal> ::= <ffi-attr-symbol> | <marshal-proc-exp> <unmarshal> ::= <ffi-attr-symbol> | <unmarshal-proc-exp> <struct-type> ::= <string-literal>
This library provides the special forms
define-c-enum
and define-c-enum-set
,
which associate the identifiers of
a C enum
type declaration
with the integer values they denote.
The define-c-enum
form describes enums
encoding a discriminated sum;
define-c-enum-set
describes bitmasks,
mapping them to R6RS enum-sets in Scheme.
The (define-c-enum en (<c-decl> …) (x "cn") …)
form adds the en
FFI attribute.
The attribute marshals each symbol x
to
the integer value that cn
denotes in C;
unmarshaling does the inverse translation.
The (define-c-enum-set ens (<c-decl> …) (x "cn") …)
form binds ens to an R6RS enum-set constructor
with universe resulting from
(make-enumeration '(x …))
; it also adds the ens
FFI attribute. The attribute marshals an
enum-set s constructed by ens
to the corresponding bitmask in C (that is,
the integer one would get by logically or'ing
all cn such that the corresponding x is in s).
Unmarshaling attempts to do the inverse translation.
The grammar for the two forms is presented below.
<exp> ::= (define-c-enum <enum-id> (<c-decl> ...) (<id> <c-name>) ...) <exp> ::= (define-c-enum-set <enum-id> (<c-decl> ...) (<id> <c-name>) ...) <enum-id> ::= <id>
Larceny's debugging functionality is implemented in Scheme, using some of Larceny's extensions for catching exceptions and inspecting the continuation structure.
When Larceny detects an error or a keyboard interrupt, or when it hits a breakpoint, it signals the condition by printing a message on the console. Larceny then enters the debugger, which signals its presence with a short banner and the debugger prompt:
Entering debugger; type "?" for help. debug>
You can also re-enter the debugger by evaluating (debug).
The debugger is still in an immature state. The following commands are available (commands can be typed in upper or lower case):
B Print backtrace of continuation.
C Print source code of procedure, if available.
D Move down to previous (earlier) activation record.
E n expr Expr is evaluated in the current interaction environment and must evaluate to a procedure. It is passed the contents of slot n from the current activation record, and the result, if not unspecified, is printed.
E (n1 … nk) expr Expr is evaluated in the current interaction environment and must evaluate to a procedure. It is passed the contents of slots n1 through nk from the current activation record, and the result, if not unspecified, is printed.
I n Inspect the procedure in slot n of the current activation record.
I @_ Inspect the active procedure.
Q Quit the debugger and abort the computation.
R Return from the debugger and continue the computation.
S Summarize the contents of the current activation record.
U Up to the next (later) activation record.
X Examine the contents of the current activation record.
The B, D, and U commands can be prefixed with a count, for example, 5 U moves up five activation records, and 10 B displays the next 10 activation records. The default for B is to display all the activations; the default count for D and U is 1.
You can set breakpoints either in program text with the break primitive or interactively at the start of a procedure with the break-entry procedure. When Larceny reaches a breakpoint during execution, the program is suspended and the debugger is entered to allow you to inspect the program.
Invokes the breakpoint handler.
Set a breakpoint at the start of the procedure.
(unbreak )
In the first form, remove any breakpoint set by break-entry at the start of the procedure_s. In the second form, remove all breakpoints set by _break-entry.
Set a trace point on entry to the procedure, removing any other trace points on the procedure. When the procedure is entered, information about the call is printed on the console: the name of the procedure and the actual arguments.
Set a trace point on exit from the procedure, removing any other trace points on the procedure. When the procedure returns, information about the return is printed on the console: the name of the procedure and the returned values.
Note that trace-exit destroys the tail recursion properties of the instrumented procedure. Where the procedure would normally "return" by tail-calling another procedure, the instrumented procedure will call the other procedure by a non-tail call and then return, at which point the procedure name and return values will be printed. Thus use of trace-exit may destroy the space properties of the program.
Set trace points on procedure both at entry and exit.
(untrace )
The first form removes any trace points from the specified procedures. The second form removes all untrace points.
The value of break-handler is a procedure that is called when a breakpoint or tracepoint is encountered. The procedure takes two arguments: the procedure in which the breakpoint was set, and the byte offset within the procedure's code vector of the breakpoint.
IEEE Standard 1178-1990, "IEEE Standard for the Scheme Programming Language", IEEE, 1991. ISBN 1-55937-125-0. May be ordered from IEEE by calling 1-800-678-IEEE or 908-981-1393 or writing IEEE Service Center, 445 Hoes Lane, P.O. Box 1331, Piscataway, NJ 08855-1331, and using order number SH14209.
Richard Kelsey, William Clinger, and Jonathan Rees [editors]. Revised^5 Report on the Algorithmic Language Scheme. Journal of Higher Order and Symbolic Computation, 11(1), 1998, pages 7-105. Also appears in ACM SIGPLAN Notices 33(9), September 1998. Available online in various formats.
Michael Sperber, R Kent Dybvig, Matthew Flatt, and Anton van Straaten [editors]. Revised^6 Report on the Algorithmic Language Scheme.
Various SchemePunks [editors]. ERR5RS is under construction.
IEEE Standard 754-1985, "IEEE Standard for Binary Floating-Point Arithmetic", IEEE, 1985.
A revision of IEEE Std 754-1985 has been underway since 2000. The IEEE Microprocessor Standards Committee (MSC) accepted a candidate draft on 9 October 2006. The candidate draft 1.2.6 was approved by 79% of 70 votes, which exceeded the required supermajority of 75%. Because there were negative votes and several hundred comments, however, a revised draft 1.3.0 was prepared and approved by 84% of 73 votes. Since there were over a hundred comments on the second candidate draft as well, a third candidate draft 1.4.0 was prepared and another vote taken in April 2007. After a total of eight ballots, with the last four being approved by more than 90% of the voters, the Ballot Review Committee decided in May 2008 that maximum possible timely consensus has been obtained, and the consensus draft was submitted to IEEE-SA RevCom. IEEE-754r was approved on 12 June 2008.
The Unicode Consortium. The Unicode 5.0 Standard. Addison-Wesley Professional, 2006.