The procedures described in this chapter are nonstandard. Some are deprecated after being rendered obsolete by ERR5RS or R6RS standard libraries. Others still provide useful capabilities that the standard libraries don't.
Larceny provides Unicode strings with R6RS semantics.
The string-downcase and string-upcase procedures perform Unicode-compatible case folding, which can result in a string whose length is different from that of the original.
Larceny may still provide string-downcase! and string-upcase! procedures, but they are deprecated.
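As a small illustration of the length change, the following sketch upcases a string containing the German sharp s. It assumes a Larceny session in which string-upcase is already bound; the variable names are made up for the example.

```scheme
;; The sharp s has no single-character uppercase form, so the
;; uppercase string is one character longer than the original.
(define original "Straße")
(define shouted (string-upcase original))

(display (string-length original)) (newline)   ; 6
(display shouted) (newline)                    ; STRASSE
(display (string-length shouted)) (newline)    ; 7
```

This is one reason the in-place variants cannot be made fully Unicode-aware: a string cannot grow in place.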
A bytevector is a data structure that stores bytes — exact 8-bit unsigned integers. Bytevectors are useful in constructing system interfaces and other low-level programming. In Larceny, many bytevector-like structures — bignums, for example — are implemented in terms of a lower-level bytevector-like data type. The operations on generic bytevector-like structures are particularly fast but useful largely in code that manipulates Larceny's data representations.
The (rnrs bytevectors) library now provides a large set of procedures that, in Larceny, are defined using the procedures described below.
Integrable procedure make-bytevector
(make-bytevector length) => bytevector
(make-bytevector length fill) => bytevector
Returns a bytevector of the desired length. If no second argument is given, then the bytevector has not been initialized and most likely contains garbage.
Operations on bytevector structures
(bytevector? obj) => boolean
(bytevector-length bytevector) => integer
(bytevector-ref bytevector offset) => byte
(bytevector-set! bytevector offset byte) => unspecified
(bytevector-equal? bytevector1 bytevector2) => boolean
(bytevector-fill! bytevector byte) => unspecified
(bytevector-copy bytevector) => bytevector
These procedures do what you expect. All are integrable, except bytevector-equal? and bytevector-copy.
The bytevector-equal? name is deprecated, since the R6RS calls it bytevector=?.
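A short sketch of the basic bytevector operations listed above, assuming a Larceny session; the variable names are illustrative only.

```scheme
(define bv (make-bytevector 4 0))     ; four bytes, all initialized to 0
(bytevector-set! bv 0 255)
(bytevector-set! bv 3 16)
(define copy (bytevector-copy bv))    ; independent copy of the contents
(bytevector-fill! bv 7)               ; overwrite every byte of the original

(bytevector-length bv)                ; 4
(bytevector-ref bv 0)                 ; 7
(bytevector-ref copy 0)               ; 255
(bytevector-ref copy 3)               ; 16
```

Passing a fill value to make-bytevector avoids reading the uninitialized contents mentioned earlier.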
Operations on bytevector-like structures
(bytevector-like? obj) => boolean
(bytevector-like-length bytevector) => integer
(bytevector-like-ref bytevector offset) => byte
(bytevector-like-set! bytevector offset byte) => unspecified
(bytevector-like-equal? bytevector1 bytevector2) => boolean
(bytevector-like-copy bytevector) => bytevector
A bytevector-like structure is a low-level representation for indexed arrays of uninterpreted bytes. Bytevector-like structures are used to represent types such as bignums and flonums.
There is no way to construct a "generic" bytevector-like structure; use the constructors for specific bytevector-like types.
The bytevector-like operations operate on all bytevector-like structures. All are integrable, except bytevector-like-equal? and bytevector-like-copy. All are deprecated because they violate abstraction barriers and make your code representation-dependent; they are useful mainly to Larceny developers, who might otherwise be tempted to write some low-level operations in C or assembly language.
(vector-copy vector) => vector
Returns a shallow copy of its argument.
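To make "shallow" concrete, the following sketch copies a vector whose first element is a mutable list. It assumes a Larceny session; the variable names are illustrative.

```scheme
(define inner (list 1 2))
(define v (vector inner 'b))
(define w (vector-copy v))

(vector-set! w 1 'changed)     ; affects only the copy
(set-car! inner 99)            ; visible through both vectors

(vector-ref v 1)                          ; b
(eq? (vector-ref v 0) (vector-ref w 0))   ; #t: the elements are shared
(car (vector-ref w 0))                    ; 99
```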
Operations on vector-like structures
(vector-like? object) => boolean
(vector-like-length vector-like) => fixnum
(vector-like-ref vector-like k) => object
(vector-like-set! vector-like k object) => unspecified
A vector-like structure is a low-level representation for indexed arrays of Scheme objects. Vector-like structures are used to represent types such as vectors, records, symbols, and ports.
There is no way to construct a "generic" vector-like structure; use the constructors for specific data types.
The vector-like operations operate on all vector-like structures. All are integrable. All are deprecated because they violate abstraction barriers and make your code representation-dependent; they are useful mainly to Larceny developers, who might otherwise be tempted to write some low-level operations in C or assembly language.
Operations on procedures
(make-procedure length) => procedure
(procedure-length procedure) => fixnum
(procedure-ref procedure offset) => object
(procedure-set! procedure offset object) => unspecified
These procedures operate on the representations of procedures and allow user programs to construct, inspect, and alter procedures.
(procedure-copy procedure) => procedure
Returns a shallow copy of the procedure.
The procedures above are deprecated because they violate abstraction barriers and make your code representation-dependent; they are useful mainly to Larceny developers, who might otherwise be tempted to write some low-level operations in C or assembly language.
The rest of this section describes some procedures that reach through abstraction barriers in a more controlled way to extract heuristic information from procedures for debugging purposes.
The following text is copied from a straw proposal authored by Will Clinger and sent to rrr-authors on 09 May 1996. The text has been edited lightly. See the end for notes about the Larceny implementation.
The procedures that extract heuristic information from procedures are permitted to return any result whatsoever. If the type of a result is not among those listed below, then the result represents an implementation-dependent extension to this interface, which may safely be interpreted as though no information were available from the procedure. Otherwise the result is to be interpreted as described below.
Returns information about the arity of proc. If the result is #f, then no information is available. If the result is an exact non-negative integer k, then proc requires exactly k arguments. If the result is an inexact non-negative integer n, then proc requires n or more arguments. If the result is a pair, then it is a list of non-negative integers, each of which indicates a number of arguments that will be accepted by proc; the list is not necessarily exhaustive.
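The decoding rules for an arity result can be sketched in portable Scheme. The helper interpret-arity below is hypothetical and not part of Larceny; it takes a result value that has already been obtained and classifies it, treating any unlisted type as "no information", as the interface requires.

```scheme
;; result is whatever the arity inquiry returned for some procedure.
(define (interpret-arity result)
  (cond ((not result) 'unknown)
        ((and (integer? result) (>= result 0))
         (if (exact? result)
             (list 'exactly result)
             (list 'at-least (inexact->exact result))))
        ((pair? result) (cons 'one-of result))
        (else 'unknown)))   ; implementation-dependent extension

(interpret-arity 2)         ; (exactly 2)
(interpret-arity 2.0)       ; (at-least 2)
(interpret-arity '(1 3))    ; (one-of 1 3)
(interpret-arity "other")   ; unknown
```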
Procedure procedure-documentation-string
(procedure-documentation-string proc)
Returns general information about proc. If the result is #f, then no information is available. If the result is a string, then it is to be interpreted as a "documentation string" (see Common Lisp).
Returns information about the name of proc. If the result is #f, then no information is available. If the result is a symbol or string, then it represents a name. If the result is a pair, then it is a list of symbols and/or strings representing a path of names; the first element represents an outer name and the last element represents an inner name.
Procedure procedure-source-file
Returns information about the name of a file that contains the source code for proc. If the result is #f, then no information is available. If the result is a string, then the string is the name of a file.
Procedure procedure-source-position
(procedure-source-position proc)
Returns information about the position of the source code for proc within the source file specified by procedure-source-file. If the result is #f, then no information is available. If the result is an exact integer k, then k characters precede the opening parenthesis of the source code for proc within that source file.
Procedure procedure-expression
Returns information about the source code for proc. If the result is #f, then no information is available. If the result is a pair, then it is a lambda expression in the traditional representation of a list.
Procedure procedure-environment
Returns information about the environment of proc. If the result is #f, then no information is available. In any case the result may be passed to any of the environment inquiry functions.
Notes on the Larceny implementation
Twobit does not yet produce data for all of these functions, so some of them always return #f.
The (rnrs lists) library now provides a set of procedures that may supersede some of the procedures described below. If one of Larceny's procedures duplicates the semantics of an R6RS procedure whose name is different, then Larceny's name is deprecated.
(append! list1 list2 … obj) => object
append! destructively appends its arguments, which must be lists, and returns the resulting list. The last argument can be any object. The argument lists are appended by changing the cdr of the last pair of each argument except the last to point to the next argument.
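The sharing that append! creates can be seen in a small sketch, assuming a Larceny session. The lists are built with list rather than quoted, since literal constants should not be mutated.

```scheme
(define front (list 1 2))
(define back (list 3 4))
(define whole (append! front back))

whole                    ; (1 2 3 4)
front                    ; (1 2 3 4): its last pair now points at back
(eq? (cddr front) back)  ; #t: no pairs were copied
```

Always use the returned value rather than relying on the first argument, which cannot be updated in place when it is the empty list.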
(every? procedure list1 list2 …) => object
every? applies procedure to each element tuple of the lists in first-to-last order, and returns #f as soon as procedure returns #f. If procedure does not return #f for any element tuple of the lists, then the value returned by procedure for the last element tuple of the lists is returned.
(last-pair list-structure) => pair
last-pair returns the last pair of the list structure, which must be a sequence of pairs linked through the cdr fields.
list-copy makes a shallow copy of the list and returns that copy.
Each of these procedures returns a new list which contains all the elements of list in the original order, except that those elements of the original list that were equal to key (or that satisfy pred?) are not in the new list. remove uses equal? as the equivalence predicate; remq uses eq?, and remv uses eqv?.
These procedures are like remove, remq, remv, and remp, except they modify list instead of returning a fresh list.
reverse! destructively reverses its argument and returns the reversed list.
(some? procedure list1 list2 …) => object
some? applies procedure to each element tuple of the lists in first-to-last order, and returns the first non-false value returned by procedure. If procedure does not return a true value for any element tuple of the lists, then some? returns #f.
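The return-value conventions of every? and some? are easiest to see side by side. The sketch below assumes a Larceny session in which both are bound and follows the descriptions given in this section.

```scheme
(every? even? '(2 4 6))                   ; #t
(every? even? '(2 3 6))                   ; #f, stops at 3
(every? (lambda (x y) (and (< x y) y))
        '(1 2) '(3 4))                    ; 4, the value for the last tuple
(some? (lambda (x) (and (even? x) (* x x)))
       '(1 3 4 6))                        ; 16, the first non-false value
(some? even? '(1 3 5))                    ; #f
```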
The (rnrs sorting) library now provides a small set of procedures that supersede most of the procedures described below. All of the procedures described below are therefore deprecated.
Procedures sort and sort!
(sort list less?) => list
(sort vector less?) => vector
(sort! list less?) => list
(sort! vector less?) => vector
These procedures sort their argument (a list or a vector) according to the predicate less?, which must implement a total order on the elements in the data structures that are sorted.
sort returns a fresh data structure containing the sorted data; sort! sorts the data structure in-place.
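A brief sketch of the difference between the two, using the argument order documented here (data first, predicate second) and assuming a Larceny session. Note that the R6RS list-sort and vector-sort take the predicate first.

```scheme
(define scores (list 42 7 19))
(define sorted (sort scores <))   ; fresh list; scores is unchanged
sorted                            ; (7 19 42)
scores                            ; (42 7 19)

(define v (vector 3 1 2))
(sort! v <)                       ; sorts v in place
v                                 ; #(1 2 3)
```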
Larceny's records have been extended to implement all ERR5RS and R6RS procedures from
(err5rs records procedural) (err5rs records inspection) (rnrs records procedural) (rnrs records inspection)
We recommend that Larceny programmers use the ERR5RS APIs instead of the R6RS APIs. This should entail no loss of portability, since the standard reference implementation of ERR5RS records should run efficiently in any implementation of the R6RS that permits new libraries to be defined at all.
Larceny now has two kinds of records: old-style and ERR5RS/R6RS. Old-style records cannot be created in R6RS-conforming mode, so our extension of R6RS procedures to accept old-style records does not affect R6RS conformance.
The following specification describes Larceny's old-style record API, which is now deprecated. It is based on a proposal posted by Pavel Curtis to rrrs-authors on 10 Sep 1989, and later re-posted by Norman Adams to comp.lang.scheme on 5 Feb 1992. The authorship and copyright status of the original text are unknown to me.
This document differs from the original proposal in that its record types are extensible, and that it specifies the type of record-type descriptors.
(make-record-type type-name field-names)
Returns a "record-type descriptor", a value representing a new data type, disjoint from all others. The type-name argument must be a string, but is only used for debugging purposes (such as the printed representation of a record of the new type). The field-names argument is a list of symbols naming the "fields" of a record of the new type. It is an error if the list contains any duplicates.
If the parent-rtd argument is provided, then the new type will be a subtype of the type represented by parent-rtd, and the field names of the new type will include all the field names of the parent type. It is an error if the complete list of field names contains any duplicates.
Record-type descriptors are themselves records. In particular, record-type descriptors have a field printer that is either #f or a procedure. If the value of the field is a procedure, then the procedure will be called to print records of the type represented by the record-type descriptor. The procedure must accept two arguments: the record object to be printed and an output port.
Returns a procedure for constructing new members of the type represented by rtd. The returned procedure accepts exactly as many arguments as there are symbols in the given list, field-names; these are used, in order, as the initial values of those fields in a new record, which is returned by the constructor procedure. The values of any fields not named in that list are unspecified. The field-names argument defaults to the list of field-names in the call to make-record-type that created the type represented by rtd; if the field-names argument is provided, it is an error if it contains any duplicates or any symbols not in the default list.
Returns a procedure for testing membership in the type represented by rtd. The returned procedure accepts exactly one argument and returns a true value if the argument is a member of the indicated record type or one of its subtypes; it returns a false value otherwise.
(record-accessor rtd field-name)
Returns a procedure for reading the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly one argument which must be a record of the appropriate type; it returns the current value of the field named by the symbol field-name in that record. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd, or a member of the field-names of the parent type of the type represented by rtd.
(record-updater rtd field-name)
Returns a procedure for writing the value of a particular field of a member of the type represented by rtd. The returned procedure accepts exactly two arguments: first, a record of the appropriate type, and second, an arbitrary Scheme value; it modifies the field named by the symbol field-name in that record to contain the given value. The returned value of the updater procedure is unspecified. The symbol field-name must be a member of the list of field-names in the call to make-record-type that created the type represented by rtd, or a member of the field-names of the parent type of the type represented by rtd.
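The pieces of the old-style record API fit together as in the following sketch. It assumes a Larceny mode in which old-style records can be created (not R6RS-conforming mode), and it assumes the one-argument call forms (record-constructor rtd) and (record-predicate rtd) implied by the descriptions above. All of the names defined here are illustrative.

```scheme
(define point-rtd (make-record-type "point" '(x y)))
(define make-point (record-constructor point-rtd))    ; takes x and y, in order
(define point? (record-predicate point-rtd))
(define point-x (record-accessor point-rtd 'x))
(define set-point-x! (record-updater point-rtd 'x))

(define p (make-point 3 4))
(point? p)              ; true
(point-x p)             ; 3
(set-point-x! p 10)
(point-x p)             ; 10
(point? 'not-a-point)   ; false
```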
(record? obj)
Returns a true value if obj is a record of any type and a false value otherwise. Note that record? may be true of any Scheme value; of course, if it returns true for some particular value, then record-type-descriptor is applicable to that value and returns an appropriate descriptor.
Procedure record-type-descriptor
(record-type-descriptor record)
Returns a record-type descriptor representing the type of the given record. That is, for example, if the returned descriptor were passed to record-predicate, the resulting predicate would return a true value when passed the given record. Note that it is not necessarily the case that the returned descriptor is the one that was passed to record-constructor in the call that created the constructor procedure that created the given record.
Returns the type-name associated with the type represented by rtd. The returned value is eqv? to the type-name argument given in the call to make-record-type that created the type represented by rtd.
Procedure record-type-field-names
Returns a list of the symbols naming the fields in members of the type represented by rtd.
Returns a record-type descriptor for the parent type of the type represented by rtd, if that type has a parent type, or a false value otherwise. The type represented by rtd has a parent type if the call to make-record-type that created rtd provided the parent-rtd argument.
Procedure record-type-extends?
(record-type-extends? rtd1 rtd2)
Returns a true value if the type represented by rtd1 is a subtype of the type represented by rtd2 and a false value otherwise. A type s is a subtype of a type t if s=t or if the parent type of s, if it exists, is a subtype of t.
The R6RS spouts some tendentious nonsense about procedural records being slower than syntactic records, but this is not true of Larceny's records, and is unlikely to be true of other implementations either. Larceny's procedural records are fairly efficient already, and will become even more efficient in future versions as interlibrary optimizations are added.
The (rnrs io ports) and (rnrs files) libraries now provide a set of procedures that may supersede some of the procedures described below. If one of Larceny's procedures duplicates the semantics of an R6RS procedure whose name is different, then Larceny's name is deprecated.
(close-open-files ) => unspecified
Closes all open files.
(console-input-port ) => input-port
Returns a character input port such that no read from the port has signalled an error or returned the end-of-file object.
Rationale: console-input-port and console-output-port are artifacts of Unix interactive I/O conventions, where an interactive end-of-file does not mean "quit" but rather "done here". Under these conventions the console port should be reset following an end-of-file. Resetting conflicts with the semantics of ports in Scheme, so console-input-port and console-output-port return a new port if the current port is already at end-of-file.
Since it is convenient to handle errors in the same manner as end-of-file, these procedures also return a new port if an error has been signalled during an I/O operation on the port.
Console-input-port and console-output-port simply call the port generators installed in the parameters console-input-port-factory and console-output-port-factory, which allow user programs to install their own console port generators.
(console-output-port ) => output-port
Returns a character output port such that no write to the port has signalled an error.
See console-input-port for a full explanation.
Parameter console-input-port-factory
The value of this parameter is a procedure that returns a character input port such that no read from the port has signalled an error or returned the end-of-file object.
See console-input-port for a full explanation.
Parameter console-output-port-factory
The value of this parameter is a procedure that returns a character output port such that no write to the port has signalled an error.
See console-input-port for a full explanation.
The value of this parameter is a character input port.
The value of this parameter is a character output port.
(delete-file filename) => unspecified
Deletes the named file. No error is signalled if the file does not exist.
(eof-object ) => end-of-file object
Eof-object returns an end-of-file object.
(file-exists? filename) => boolean
File-exists? returns #t if the named file exists at the time the procedure is called.
Procedure file-modification-time
(file-modification-time filename) => vector or #f
File-modification-time returns the time of last modification of the file as a vector, or #f if the file does not exist. The vector has six elements: year, month, day, hour, minute, second, all of which are exact nonnegative integers. The time returned is relative to the local timezone.
(file-modification-time "larceny") => #(1997 2 6 12 51 13)
(file-modification-time "geekdom") => #f
(flush-output-port ) => unspecified
(flush-output-port port) => unspecified
Write any buffered data in the port to the underlying output medium.
(get-output-string string-output-port) => string
Retrieve the output string from the given string output port.
(open-input-string string) => input-port
Creates an input port that reads from string. The string may be shared with the caller. A string input port does not need to be closed, although closing it will prevent further reads from it.
(open-output-string ) => output-port
Creates an output port where any output is written to a string. The accumulated string can be retrieved with get-output-string at any time.
Tests whether its argument is a port.
Returns the name associated with the port; for file ports, this is the file name.
(port-position port) => fixnum
Returns the number of characters that have been read from or written to the port.
(rename-file from to) => unspecified
Renames the file from and gives it the name to. No error is signalled if from does not exist or to exists.
(reset-output-string port) => unspecified
Given a port created with open-output-string, deletes from the port all the characters that have been output so far.
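The string port procedures described in this section can be combined as in this sketch, which assumes a Larceny session; the port names are illustrative.

```scheme
(define in (open-input-string "(1 2) foo"))
(read in)                       ; (1 2)
(read in)                       ; foo

(define out (open-output-string))
(write 'hello out)
(display " world" out)
(get-output-string out)         ; "hello world"
(reset-output-string out)       ; discard what has accumulated so far
(display "again" out)
(get-output-string out)         ; "again"
```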
Procedure with-input-from-port
(with-input-from-port input-port thunk) => object
Calls thunk with current input bound to input-port in the dynamic extent of thunk. Returns whatever value was returned from thunk.
(with-output-to-port output-port thunk) => object
Calls thunk with current output bound to output-port in the dynamic extent of thunk. Returns whatever value was returned from thunk.
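One common use of with-output-to-port is capturing what a thunk prints. The following sketch pairs it with a string output port and assumes a Larceny session; the names capture and result are illustrative.

```scheme
(define capture (open-output-string))
(define result
  (with-output-to-port capture
    (lambda ()
      (display "logged")   ; goes to capture, not to the console
      42)))

result                        ; 42
(get-output-string capture)   ; "logged"
```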
Procedure command-line-arguments
(command-line-arguments ) => vector
Returns a vector of strings: the arguments supplied to the program by the user or the operating system.
(dump-heap filename procedure) => unspecified
Dump a heap image to the named file that will start up with the supplied procedure. Before procedure is called, command line arguments will be parsed and any init procedures registered with add-init-procedure! will be called.
Note: Currently, heap dumping is only available with the stop-and-copy collector (-stopcopy command line option), although the heap image can be used with all the other collectors.
Procedure dump-interactive-heap
(dump-interactive-heap filename) => unspecified
Dump a heap image to the named file that will start up with the standard read-eval-print loop. Before the read-eval-print loop is called, command line arguments will be parsed and any init procedures registered with add-init-procedure! will be called.
Note: Currently, heap dumping is only available with the stop-and-copy collector (-stopcopy command line option), although the heap image can be used with all the other collectors.
Returns the operating system environment mapping for the string key, or #f if there is no mapping for key.
Send the command to the operating system's command processor and return the command's exit status, if any. On Unix, command is a string and status is an exact integer.
Fixnums are small exact integers that are likely to be represented without heap allocation. Larceny never represents a number that can be represented as a fixnum any other way, so programs that can use fixnums will do so automatically. However, operations that work only on fixnums can sometimes be substantially faster than generic operations, and the following primitives are provided for use in those programs that need especially good performance.
The (rnrs arithmetic fixnums) library now provides a large set of procedures that, in Larceny, are defined using the procedures described below. If one of Larceny's procedures duplicates the semantics of an R6RS procedure whose name is different, then Larceny's name is deprecated.
All arguments to the following procedures must be fixnums.
Returns #t if its argument is a fixnum, and #f otherwise.
Returns the fixnum sum of its arguments. If the result is not representable as a fixnum, then an error is signalled (unless error checking has been disabled).
Returns the fixnum difference of its arguments. If the result is not representable as a fixnum, then an error is signalled.
Returns the fixnum negative of its argument. If the result is not representable as a fixnum, then an error is signalled.
Returns the fixnum product of its arguments. If the result is not representable as a fixnum, then an error is signalled.
Returns #t if its arguments are equal, and #f otherwise.
Returns #t if fix1 is less than fix2, and #f otherwise.
Returns #t if fix1 is less than or equal to fix2, and #f otherwise.
Returns #t if fix1 is greater than fix2, and #f otherwise.
Returns #t if fix1 is greater than or equal to fix2, and #f otherwise.
Returns #t if its argument is less than zero, and #f otherwise.
Returns #t if its argument is greater than zero, and #f otherwise.
Returns #t if its argument is zero, and #f otherwise.
(fxlogand fix1 fix2) => fixnum
Returns the bitwise and of its arguments.
(fxlogior fix1 fix2) => fixnum
Returns the bitwise inclusive or of its arguments.
Returns the bitwise not of its argument.
(fxlogxor fix1 fix2) => fixnum
Returns the bitwise exclusive or of its arguments.
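A worked example of the two-argument bitwise operations, assuming a Larceny session. The binary forms in the comments show why each result follows.

```scheme
;; 12 is #b1100 and 10 is #b1010.
(fxlogand 12 10)   ; 8   (#b1000)
(fxlogior 12 10)   ; 14  (#b1110)
(fxlogxor 12 10)   ; 6   (#b0110)
```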
Returns fix1 shifted left fix2 places, shifting in zero bits at the low end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Procedure most-positive-fixnum
(most-positive-fixnum ) => fixnum
Returns the largest representable positive fixnum.
Procedure most-negative-fixnum
(most-negative-fixnum ) => fixnum
Returns the smallest representable negative fixnum.
Returns fix1 shifted right fix2 places, shifting in a copy of the sign bit at the left end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Returns fix1 shifted right fix2 places, shifting in zero bits at the high end. If the shift count exceeds the number of bits in the machine's word size, then the results are machine-dependent.
Larceny has six representations for numbers: fixnums are small, exact integers; bignums are unlimited-precision exact integers; ratnums are exact rationals; flonums are inexact rationals; rectnums are exact complexes; and compnums are inexact complexes.
Number-representation predicates
(fixnum? obj) => boolean
(bignum? obj) => boolean
(ratnum? obj) => boolean
(flonum? obj) => boolean
(rectnum? obj) => boolean
(compnum? obj) => boolean
These predicates test whether an object is a number of a particular representation and return #t if so, #f if not.
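The following sketch pairs each predicate with a literal of the matching representation, assuming a Larceny session. The bignum example relies on 2^100 being too large for a fixnum on both 32-bit and 64-bit systems.

```scheme
(fixnum? 42)              ; #t
(bignum? (expt 2 100))    ; #t: too large for a fixnum
(ratnum? 1/3)             ; #t
(flonum? 3.14)            ; #t
(rectnum? 1+2i)           ; #t: exact complex
(compnum? 1.0+2.0i)       ; #t: inexact complex
(fixnum? 42.0)            ; #f: inexact, so a flonum
```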
(random limit) => exact integer
Returns a pseudorandom nonnegative exact integer in the range 0 through limit-1.
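Because the result lies in the range 0 through limit-1, a one-based range needs an offset. A minimal sketch, assuming a Larceny session; roll-die is a hypothetical helper defined only for this example.

```scheme
;; (random 6) yields 0 through 5, so adding 1 gives a die roll in 1 through 6.
(define (roll-die) (+ 1 (random 6)))

(define roll (roll-die))
(and (exact? roll) (<= 1 roll 6))   ; #t
```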
Hashtables represent finite mappings from keys to values. If the hash function is a good one, then the value associated with a key may be looked up in constant time (on the average).
The R6RS hashtables library is a big improvement over Larceny's traditional hash tables, and should be used instead of the API described below.
To resolve a clash of names and semantics with the R6RS make-hashtable procedure, Larceny's traditional make-hashtable procedure has been renamed to make-oldstyle-hashtable.
Procedure make-oldstyle-hashtable
(make-oldstyle-hashtable hash-function bucket-searcher size) => hashtable
Returns a newly allocated mutable hash table that uses hash-function as the hash function and bucket-searcher (e.g. assq, assv, or assoc) to search a bucket. The table starts with size buckets and expands the number of buckets as needed. The hash-function must accept a key and return a non-negative exact integer.
(make-oldstyle-hashtable hash-function bucket-searcher) => hashtable
Equivalent to (make-oldstyle-hashtable hash-function bucket-searcher n) for some value of n chosen by the implementation.
(make-oldstyle-hashtable hash-function) => hashtable
Equivalent to (make-oldstyle-hashtable hash-function assv).
(make-oldstyle-hashtable ) => hashtable
Equivalent to (make-oldstyle-hashtable object-hash assv).
(hashtable-contains? hashtable key) => bool
Returns true iff the hashtable contains an entry for key.
(hashtable-fetch hashtable key flag) => object
Returns the value associated with key in the hashtable if the hashtable contains key; otherwise returns flag.
(hashtable-get hashtable key) => object
Equivalent to (hashtable-fetch hashtable key #f).
(hashtable-put! hashtable key value) => unspecified
Changes the hashtable to associate key with value, replacing any existing association for key.
(hashtable-remove! hashtable key) => unspecified
Removes any association for key within the hashtable.
(hashtable-clear! hashtable) => unspecified
Removes all associations from the hashtable.
(hashtable-size hashtable) => integer
Returns the number of keys contained within the hashtable.
(hashtable-for-each procedure hashtable) => unspecified
The procedure must accept two arguments, a key and the value associated with that key. Calls the procedure once for each key-value association in hashtable. The order of these calls is indeterminate.
(hashtable-map procedure hashtable)
The procedure must accept two arguments, a key and the value associated with that key. Calls the procedure once for each key-value association in hashtable, and returns a list of the results. The order of the calls is indeterminate.
(hashtable-copy hashtable) => hashtable
Returns a copy of the hashtable.
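The old-style hashtable operations described above can be exercised as in the following sketch, which assumes a Larceny session. It pairs symbol-hash with assq, a natural combination for symbol keys; the table name ages is illustrative.

```scheme
(define ages (make-oldstyle-hashtable symbol-hash assq))
(hashtable-put! ages 'alice 30)
(hashtable-put! ages 'bob 25)
(hashtable-put! ages 'alice 31)        ; replaces the earlier association

(hashtable-get ages 'alice)            ; 31
(hashtable-get ages 'carol)            ; #f
(hashtable-fetch ages 'carol 'absent)  ; absent
(hashtable-contains? ages 'bob)        ; true
(hashtable-size ages)                  ; 2
(hashtable-remove! ages 'bob)
(hashtable-size ages)                  ; 1
(hashtable-map cons ages)              ; ((alice . 31))
```

hashtable-fetch is the safer lookup when #f is itself a legitimate value, since hashtable-get cannot distinguish a stored #f from a missing key.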
The hash values returned by these functions are nonnegative exact integers suitable as hash values for the hashtable functions.
(equal-hash object) => integer
Returns a hash value for object based on its contents.
(object-hash object) => integer
Returns a hash value for object based on its identity.
This hash function performs extremely poorly on pairs, vectors, strings, and bytevectors, which are the objects with which it is most likely to be used. For efficient hashing on object identity, create the hashtable with make-eq-hashtable or make-eqv-hashtable of the (rnrs hashtables) library.
(string-hash string) => fixnum
Returns a hash value for string based on its content.
(symbol-hash symbol) => fixnum
Returns a hash value for symbol based on its print name.
The symbol-hash procedure is very fast, because the hash code is cached in the symbol data structure.
Parameters are procedures that serve as containers for values; parts of the system that do not operate in the same namespace can still share parameters and thereby read and write shared state.
A parameter takes zero or one arguments. If called with no arguments, it returns the current value of the parameter. If called with one argument, it sets the parameter's value to that of the argument and returns the new value.
(make-parameter name value [predicate]) => procedure
Create a parameter with name name, initial value value, and optional setter predicate predicate. When the parameter is set, the new value is first passed to predicate, and if it returns #f then an error is signalled. Name can be a symbol or a string.
Syntax parameterize
(parameterize ((parameter0 value0) …) expr0 expr1 …)
Parameterize overrides the values of a set of parameters in a dynamic scope — it is like fluid-let for parameters.
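The calling conventions for parameters can be sketched as follows, using the three-argument make-parameter form documented in this section and assuming a Larceny session that accepts it. The parameter indent-width is made up for the example.

```scheme
(define indent-width (make-parameter "indent-width" 2 integer?))

(indent-width)                    ; 2
(indent-width 4)                  ; sets the value and returns 4
(parameterize ((indent-width 8))
  (indent-width))                 ; 8 inside the dynamic scope
(indent-width)                    ; 4 again afterwards
```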
The following list of parameters does not yet include the reader or compiler switches, which are also parameters.
Parameter console-input-port-factory
Parameter console-output-port-factory
Parameter interaction-environment
Parameter keyboard-interrupt-handler
The property list of a symbol is an association list that is attached to that symbol. The association list maps properties, which are themselves symbols, to arbitrary values.
(putprop symbol property obj) => unspecified
If an association exists for property on the property list of symbol, then its value is replaced by the new value obj. Otherwise, a new association is added to the property list of symbol that associates property with obj.
(getprop symbol property) => obj
If an association exists for property on the property list of symbol, then its value is returned. Otherwise, #f is returned.
(remprop symbol property) => unspecified
If an association exists for property on the property list of symbol, then that association is removed. Otherwise, this is a no-op.
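A short sketch of the three property-list operations working together, assuming a Larceny session; the symbol and property names are illustrative.

```scheme
(putprop 'apple 'color 'red)
(putprop 'apple 'color 'green)   ; replaces the value for color
(getprop 'apple 'color)          ; green
(getprop 'apple 'weight)         ; #f: no such property
(remprop 'apple 'color)
(getprop 'apple 'color)          ; #f
```

Because getprop returns #f for a missing property, a stored #f cannot be told apart from an absent one.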
Gensym returns a new uninterned symbol, the name of which contains the given string.
Oblist returns the list of interned symbols.
(oblist-set! list) => unspecified
(oblist-set! list table-size) => unspecified
oblist-set! sets the list of interned symbols to those in the given list by clearing the symbol hash table and storing the symbols in list in the hash table. If the optional table-size is given, it is taken to be the desired size of the new symbol table.
See also: symbol-hash.
(collect generation) => unspecified
(collect generation method) => unspecified
Collect initiates a garbage collection. If the system has multiple generations, then the optional arguments are interpreted as follows. The generation is the generation to collect, where 0 is the youngest generation. The method determines how the collection is performed. If method is the symbol collect, then a full collection is performed in that generation, whatever that means — in a normal multi-generational copying collector, it means that all live objects in the generation's current semispace and all live objects from all younger generations are copied into the generation's other semispace. If method is the symbol promote, then live objects are promoted from younger generations into the target generation — in our example collector, that means that the objects are copied into the target generation's current semispace.
The default value for generation is 0, and the default value for method is collect.
Note that the collector's internal policy settings may cause it to perform a more major type of collection than the one requested; for example, an attempt to collect generation 2 could cause the collector to promote all live data into generation 3.
gc-counter returns the number of garbage collections performed since startup. On a 32-bit system, the counter wraps around every 1,073,741,824 collections.
gc-counter is a primitive and compiles to a single load instruction on the SPARC.
major-gc-counter returns the number of major garbage collections performed since startup, where a major collection is defined as a collection that may change the address of objects that have already survived a previous collection. On a 32-bit system, the counter wraps around every 1,073,741,824 collections.
major-gc-counter is a primitive and compiles to a single load instruction on the SPARC. Its primary use is to implement efficient hashtables that hash on object identity (make-eq-hashtable and make-eqv-hashtable).
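The collection requests and counters described above might be combined as in the following sketch. It assumes that gc-counter and major-gc-counter are called with no arguments and that collect may be called with no arguments to get the documented defaults; the variable names are illustrative only.

```scheme
;; Record the counters, request some collections, and compare.
(define minor-before (gc-counter))
(define major-before (major-gc-counter))

(collect)               ; defaults: generation 0, method collect
(collect 1 'promote)    ; promote live objects from generation 0 into generation 1

;; Number of collections that ran in between (ignoring wraparound).
;; The collector's own policy may have run more than were requested.
(define collections-run (- (gc-counter) minor-before))
(define major-collections-run (- (major-gc-counter) major-before))
```

A table keyed on object addresses could compare a saved major-gc-counter value with the current one to decide whether it needs rehashing.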
(gcctl heap-number operation operand) => unspecified
[GCCTL is largely obsolete in the new garbage collector but may be resurrected in the future. It can still be used to control the non-predictive collector.]
gcctl controls garbage collection policy on a heap-wise basis. The heap-number is the heap to operate on, numbered as for the command line switches: heap 1 is the youngest. If the given heap number does not correspond to a heap, gcctl fails silently.
The operation is a symbol that selects the operation to perform, and the operand is the operand to that operation, always a number. For the non-predictive garbage collector, the following operator/operand pairs are meaningful:
Example: if the non-predictive heap is heap number 2, then the expressions
(gcctl 2 'j-fixed 0) (gcctl 2 'incr-fixed 1)
make the non-predictive collector simulate a normal stop-and-copy collector (because j is always set to 0) and grow the heap only one step at a time as necessary. This may be useful for certain kinds of experiments.
Example: under the same assumption, the expressions
(gcctl 2 'j-percent 50) (gcctl 2 'incr-percent 20)
select the default policy settings.
Note: The gcctl facility is experimental. A more developed facility will allow controlling heap contraction policy, as well as setting all the watermarks. Certainly one can envision other uses, too. Finally, it needs to be possible to get current values.
Note: Currently the non-predictive heap (np-sc-heap.c) and the standard stop-and-copy "old" heap (old-heap.c) are supported, but not the standard "young" heap (young-heap.c), nor the stop-and-copy collector (sc-heap.c).
(sro pointer-tag type-tag limit) => vector
SRO ("standing room only") is a system primitive that traverses the entire heap and returns a vector that contains all live objects in the heap that satisfy the constraints imposed by its parameters:
For example, (sro -1 -1 -1) returns a vector that contains all live objects (not including the vector), and (sro 5 2 3) returns a vector containing all live flonums (bytevector-like, with typetag 2) that are referred to in no more than 3 places.
(stats-dump-on filename) => unspecified
Stats-dump-on turns on garbage collection statistics dumping. After each collection, a complete RTS statistics dump is appended to the file named by filename.
The file format and contents are documented in a banner written at the top of the output file. In addition, accessor procedures for the output structure are defined in the program Util/process-stats.sch.
Stats-dump-on does not perform an initial dump when the file is first opened; only at the first collection is the first set of statistics dumped. The user might therefore want to initiate a minor collection just after turning on dumping in order to have a baseline set of data.
(stats-dump-off ) => unspecified
Stats-dump-off turns off garbage collection statistics dumping (which was turned on with stats-dump-on). It does not dump a final set of statistics before closing the file; therefore, the user may wish to initiate a minor collection before calling this procedure.
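The baseline and final collections suggested above could be arranged as in the following sketch. The file name gc-stats.txt and the workload procedure are hypothetical; (collect 0) is used here as the minor collection.

```scheme
;; A hypothetical workload that allocates a list of n elements.
(define (workload n)
  (let loop ((i 0) (acc '()))
    (if (< i n)
        (loop (+ i 1) (cons i acc))
        (length acc))))

(stats-dump-on "gc-stats.txt")   ; hypothetical output file name
(collect 0)                      ; forces a first dump to serve as a baseline
(workload 100000)
(collect 0)                      ; forces a last dump before the file is closed
(stats-dump-off)
```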
System-features returns an association list of system features. Most entries are self-explanatory. The following are more subtle:
(display-memstats vector) => unspecified
(display-memstats vector minimal) => unspecified
(display-memstats vector minimal full) => unspecified
Display-memstats takes as its argument a vector as returned by memstats and displays the contents of the vector in human-readable form on the current output port. By default, not all of the values in the vector are displayed.
If the symbol minimal is passed as the second argument, then only a small number of statistics generally relevant to running benchmarks are displayed.
If the symbol full is passed as the second argument, then all statistics are displayed.
Memstats returns a freshly allocated vector containing run-time-system resource usage statistics. Many of these will make no sense whatsoever to you unless you also study the RTS sources. A listing of the contents of the vector is available here.
Run-with-stats evaluates thunk, then prints a short summary of run-time statistics, as with (display-memstats ... 'minimal), and then returns the result of evaluating thunk.
(run-benchmark name k thunk ok?) => obj
Run-benchmark prints a short banner (including the identifying name) to identify the benchmark, then runs thunk k times, and finally tests the value returned from the last call to thunk by applying the predicate ok? to it. If the predicate returns true, then run-benchmark prints summary statistics, as with
(display-memstats ... 'minimal).
If the predicate returns false, an error is signalled.
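As an illustration of the two timing procedures, the following sketch times a small recursive function. The fib procedure and the benchmark name are examples only, and passing the name as a string is an assumption.

```scheme
;; A small example workload.
(define (fib n)
  (if (< n 2)
      n
      (+ (fib (- n 1)) (fib (- n 2)))))

;; Evaluates the thunk once, prints a short statistics summary,
;; and returns the thunk's result.
(run-with-stats (lambda () (fib 20)))

;; Runs the thunk 10 times and checks the result of the last run
;; with the predicate before printing summary statistics.
(run-benchmark "fib20"
               10
               (lambda () (fib 20))
               (lambda (result) (= result 6765)))
```

Supplying a predicate that checks the expected answer guards against timing a benchmark that has silently computed the wrong result.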
The SRFIs (Scheme Requests for Implementation) are an Internet-based collection of Scheme code designed and provided by Scheme programmers. The SRFI effort is open to anyone, and is described at http://srfi.schemers.org.
The fundamental SRFI is SRFI-0, "Feature-based conditional expansion construct", which allows a program to query the underlying implementation about the available SRFIs (and potentially about other implementation features) at macro expansion time. The design documents for this and other SRFIs are available at the web site shown above.
Larceny currently supports many SRFIs, but not as many as it should. Some SRFIs are built into Larceny, but most must be loaded dynamically using Larceny's require procedure:
> (require 'srfi-0)
Larceny provides the following nonstandard SRFI keys for use in SRFI 0:
larceny
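A feature key such as larceny can be tested with the cond-expand form defined by SRFI 0. The sketch below is illustrative; the variable host-implementation is hypothetical, and the require call may be unnecessary on systems where cond-expand is already built in.

```scheme
(require 'srfi-0)   ; may be unnecessary if cond-expand is built in

;; Select a definition at expansion time based on the feature key.
(cond-expand
  (larceny
   (define host-implementation 'larceny))
  (else
   (define host-implementation 'unknown)))
```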
SLIB is a large collection of useful libraries that have been written or collected by Aubrey Jaffer.
Larceny supports SLIB via SRFI 96, but SLIB itself is not shipped with Larceny; it must be downloaded separately and then installed. For the most up-to-date information on installing and using SLIB with Larceny, see doc/HOWTO-SLIB.
Larceny provides a general foreign-function interface (FFI) substrate on which other FFIs can be built; see Larceny Note #7. The FFI described in this manual section is a simple example of a derived FFI. It is not yet fully evolved, but it is useful.
This section has undergone significant revision, but not all of the material has been properly vetted. Some of the information in this section may be out of date.
Some of the text below is adapted from the 2008 Scheme Workshop paper, “The Layers of Larceny's Foreign Function Interface,” by Felix S Klock II. That paper may provide additional insight for those searching for implementation details and motivations.
There are a number of different potential ways to use the FFI. One client may want to develop code in C and load it into Larceny. Another client may want to load native libraries provided by the host operating system, enabling invocation of foreign code from Scheme expressions without developing any C code or even running a C compiler. Larceny's FFI can be used for both of these cases, but many of its facilities target a third client in between the two extremes: a client with a C compiler and the header files and object code for the foreign libraries, but who wishes to avoid writing glue code in C to interface with the libraries.
There are four main steps to interacting with foreign code:
Step 1 is conceptual, while steps 2 through 4 yield artifacts in Scheme source code.
At the machine code level, foreign values are uninterpreted sequences of bits. Often foreign object code is oriented around manipulating word-sized bit-sequences (words) or arrays and tuples of words.
Many libraries are written with a particular interpretation of such values. In C code, explicit types are often used as hints to guide such interpretation; for example, a 0 of type bool is usually interpreted as false, while a 1 (or other non-zero value) of type bool is usually interpreted as true.
Another example is C enumerations (or enums). An enum declaration defines a set of named integral constants. After the C declaration:
enum months { JAN = 1, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC };
a JAN in C code now denotes 1, FEB is 2, and so on. Furthermore, tools like debuggers may render a variable x dynamically assigned the value 2 (and of static type enum months) as FEB. Thus the enum declaration introduces a new interpretation for a finite set of integers.
This leads to questions for a client of an FFI; we explore some below.
#f and #t for the bool type, or the Scheme symbols {JAN, FEB, MAR, APR, MAY, JUN, JUL, AUG, SEP, OCT, NOV, DEC} for the enum months type?
A foreign library might leave the mapping of names like FEB to words like 2 unspecified in the library interface. That is, while the C compiler will know FEB maps to 2 according to a particular version of the library's header file, the library designer may intend to change this mapping in the future, and clients writing C code should use only the names to refer to an enum months value, not integer expressions.
Foreign libraries often manipulate mutable entities, like arrays of words where modifications can be observed (often by design).
Will the foreign code hold references to heap-allocated objects? Heap-allocated objects that leak out to foreign memory must be treated with care; garbage collection presents two main problems.
cons-nonrelocatable, make-nonrelocatable-bytevector, and make-nonrelocatable-vector.
Answering these questions may require deep knowledge of the intended usage of the foreign library.
The Larceny FFI attempts to ease interfacing with foreign code in the presence of the above concerns, but the nature of the header files included with most foreign libraries means that the FFI cannot infer the answers unassisted.
Foreign C code developed to work in concert with Larceny could hypothetically be written to cope with holding handles for objects managed by the garbage collector, but there is currently no significant support for this use-case.
One class of foreign values is not addressed by the Larceny FFI: structures passed by value (as opposed to by reference, i.e., pointers to structures). There is no way to describe the interface to a foreign procedure that accepts or produces a C struct (at least not properly or portably).
This tends to not matter for many foreign libraries (since many C programmers eschew passing structures by value), but it can arise.
If the foreign library of interest has procedures that accept or produce a C struct, we currently recommend either avoiding such procedures, or writing adapter code in C that marshals between values handled by the FFI and the C struct.
The conclusion is: when designing an interface to a foreign library, you should analyze the values manipulated on the foreign side and identify their relationship with values on the Scheme side. After you have identified the domains of interest, you then describe how the values will be marshaled back and forth between the two domains.
This section describes the marshalling protocol defined in lib/Base/std-ffi.sch.
Foreign functions automatically marshal their inputs and outputs according to type-descriptors attached to each foreign function.
Type-descriptors are S-expressions formed according to the following grammar:
TypeDesc ::= CoreAttr | ArrowT | MaybeT | OneOfT
CoreAttr ::= PrimAttr | VoidStar | ---
PrimAttr ::= CurrentPrimAttr | DeprecatedPrimAttr
CurrentPrimAttr ::= int | uint | byte | short | ushort | char | uchar
                  | long | ulong | longlong | ulonglong | size_t
                  | float | double | bool | string | void
DeprecatedPrimAttr ::= unsigned | boxed
VoidStar ::= void* | ---
ArrowT ::= (-> (TypeDesc ...) TypeDesc)
MaybeT ::= (maybe TypeDesc)
OneOfT ::= (oneof (Any Fixnum) ... TypeDesc)
where --- represents a user-extensible part of the grammar (see below), Any represents any Scheme value, and Fixnum represents any word-sized integer.
A central registry maps CoreAttr's to a foreign representation and two conversion routines: one to convert a Scheme value to a foreign argument, and another to convert a foreign result back to a Scheme value. The denoted components are collectively referred to as a type within the FFI documentation.
The registry is extensible; the ffi-add-attribute-core-entry! procedure adds new CoreAttr's to the registry, and one can alternatively add short-hands for type-descriptors via the ffi-add-alias-of-attribute-entry! procedure. Finally, one can add new VoidStar productions (subtypes of the void* type-descriptor) via the ffi-install-void*-subtype procedure (defined in the lib/Standard/foreign-stdlib.sch library).
The following is a list of the accepted types and their conversions at the boundary between Scheme and foreign code:
int
int
".
uint
unsigned int
".
byte
int
in the current implementation.
short
int
in the current implementation.
ushort
unsigned
in the current implementation.
char
char
".
uchar
unsigned char
".
long
int
in the current implementation.
ulong
unsigned
in the current implementation.
longlong
long long
".
ulonglong
unsigned long long
".
size_t
uint
in the current implementation.
float
float
".
The conversion to float is performed via a C (float) cast from a C double.
double
bool
int
";
#f is converted to 0, and all other objects to 1. In the reverse direction, 0 is converted to #f and all other integers to #t.
string
#f is converted to a C "(char*)0" value. In the reverse direction, a pointer to a NUL-terminated sequence of bytes interpreted as ASCII characters is copied into a freshly allocated Scheme string; a NULL pointer is converted to #f.
void
unsigned
uint
; deprecated.
boxed
void*
" to the first element of the structure. The
value #f is also acceptable. It is converted to a C "(void*)0" value. (Only used in argument position for foreign functions; foreign functions are not expected to return direct references to heap-allocated values.)
The public interface to many foreign libraries is written in terms of types defined within that foreign library. One can introduce new types to the Larceny FFI by extending the core attribute entry table.
Procedure ffi-add-attribute-core-entry!
(ffi-add-attribute-core-entry! entry-name rep-sym marshal unmarshal) => unspecified
ffi-add-attribute-core-entry! extends the internal registry with the new entry specified by its arguments.
signed32, unsigned32, signed64, unsigned64 (representing varieties of fixed width integers), ieee32 (representing “floats”), ieee64 (representing “doubles”), or pointer (representing “(void*)” in C).
#f or an unmarshalling function that accepts an instance of the low-level representation and produces a corresponding Scheme object.
Core attributes suffice for linking to simple functions. Structured FFI attributes express more complex marshaling protocols.
Arrow Type Constructors. A structured FFI attribute of the form (-> (s_1 … s_n) s_r) (called an arrow type) allows passing functions from Scheme to C and back again. Each of the s_1, …, s_n, s_r is an FFI attribute.
When an arrow type describes an input to a foreign function, it marshals a Scheme procedure to a C function pointer by generating glue code to hook the two together and marshal values as described by the FFI attributes within the arrow type. Likewise, when an arrow type describes an output from a foreign function, it marshals a C function pointer to a Scheme procedure, again by generating glue code.
These two mappings naturally generalize to arbitrary nesting of arrow types, so one can create callbacks that consume callouts, return callouts that consume callbacks, and so on.
The current implementation of arrow types introduces an unnecessary space leak, because none of Larceny's current garbage collectors attempt to reclaim some of the structure allocated (in particular, the so-called trampolines) when functions are marshaled via arrow types.
The FFI could be revised to reduce the leak (e.g., it could keep a cache of generated trampolines and reuse them), but it currently does not do so.
Many foreign libraries have a structure where one only sets up a fixed set of callbacks, and then all further computation does not require arrow type marshaling. This is one reason why fixing this problem has been a low priority item for the Larceny development team.
Maybe Type Constructor. (maybe t) captures the pattern of passing NULL in C and #f in Scheme to represent the absence of information. The FFI attribute t within the maybe type describes the typical information passed; the constructed maybe type marshals #f to the foreign null pointer or 0 (as appropriate), and otherwise applies the marshaling of t. Likewise, it unmarshals the foreign null pointer and 0 to #f, and otherwise applies the unmarshaling of t.
(There are a few other built-in type constructors, such as the oneof type constructor, but they are not as fully developed as the two above and are intended only for internal development for now.)
Using the void* attribute wraps foreign addresses up in a Larceny record, so that standard numeric operations cannot be directly applied by accident. The FFI uses two features of Larceny's record system: the record type descriptor is a first class value with an inspectable name, and record types are extensible via single-inheritance.
Basic Operations on void*. The FFI provides void*-rt, a record type descriptor with a single field (a wrapped address). There is also a family of functions for dereferencing the pointer within a void*-rt and manipulating the state it references.
void*
.
Distinguishes void*'s from other Scheme values.
(void*-byte-ref x idx) => number
(void*-byte-set! x idx val) => unspecified
(void*-word-ref x idx) => number
(void*-word-set! x idx val) => unspecified
(void*-void*-ref x idx) => void*
void*
) at offset from address within x.
(void*-void*-set! x idx val) => unspecified
(void*-double-ref x idx) => number
(void*-double-set! x idx val) => unspecified
Type Hierarchies. Procedures for establishing type hierarchies are provided by the lib/Standard/foreign-stdlib.sch library; see ffi-install-void*-subtype and establish-void*-subhierarchy!.
You must first compile your C code and create one or more loadable object modules. These object modules may then be loaded into Larceny, and Scheme foreign functions may link to specific functions in the loaded module. Defining foreign functions in Scheme is covered in a later section.
The method for creating a loadable object module varies from platform to platform. In the following, assume you have two C source files, file1.c and file2.c, that define functions that you want to make available as foreign functions in Larceny.
Compile your source files and create a shared library. Using GCC, the command line might look like this:
gcc -fPIC -shared file1.c file2.c -o my-library.so
The command creates my-library.so in the current directory. This library can now be loaded into Larceny using foreign-file. Any other shared libraries used by your library files should also be loaded into Larceny using foreign-file before any procedures are linked using foreign-procedure.
By default, /lib/libc.so is made available to the dynamic linker and to the foreign function interface, so there is no need for you to load that library explicitly.
Compile your source files and create a shared library, linking with all the necessary libraries. Using GCC, the command line might look like this:
gcc -fPIC -shared file1.c file2.c -lc -lm -lsocket -o my-library.so
Now you can use foreign-file to load my-library.so into Larceny.
By default, /lib/libc.so is made available to the foreign function interface, so there is no need for you to load that library explicitly.
(foreign-file filename) => unspecified
foreign-file loads the named object file into Larceny and makes it available for dynamic linking.
Larceny uses the operating system provided dynamic linker to do dynamic linking. The operation of the dynamic linker varies from platform to platform:
LD_LIBRARY_PATH
environment variable) for the file. Hence, a foreign file in the current directory should be "./file.so", not "file.so".
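Loading and linking might then look like the following sketch. It assumes that the FFI is loaded with (require 'std-ffi), that my-library.so was built as shown earlier, and that file1.c defines a hypothetical C function int add_one(int); none of these names come from Larceny itself.

```scheme
(require 'std-ffi)                 ; assumption: loads the FFI described here

;; An explicit "./" keeps the dynamic linker from searching its
;; library path for the file.
(foreign-file "./my-library.so")

;; add_one is a hypothetical function assumed to be defined in file1.c
;; with the C signature: int add_one(int).
(define add-one
  (foreign-procedure "add_one" '(int) 'int))

(add-one 41)
```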
(foreign-procedure name (arg-type …) return-type) => procedure
FIXME: The interface to this function has been extended to support hooking into Windows procedures that use the Pascal calling convention instead of the C one. The way to select which convention to use should be documented.
Returns a Scheme procedure p that calls the foreign procedure whose name is name. When p is called, it will convert its parameters to representations indicated by the arg-types and invoke the foreign procedure, passing the converted values as parameters. When the foreign procedure returns, its return value is converted to a Scheme value according to return-type.
Types are described below.
The address of the foreign procedure is obtained by searching for name in the symbol tables of the foreign files that have been loaded with foreign-file.
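As a small illustration of argument and result conversion, the sketch below links two C library functions. It assumes that the FFI has been loaded with (require 'std-ffi) and that the C library is already available to the FFI, as described for some platforms above; the Scheme names c-strlen and c-getenv are arbitrary.

```scheme
(require 'std-ffi)   ; assumption: loads the FFI described here

;; size_t strlen(const char *s): a Scheme string goes in, an integer
;; comes out.
(define c-strlen
  (foreign-procedure "strlen" '(string) 'uint))

;; char *getenv(const char *name): with return type string, a NULL
;; result is converted to #f.
(define c-getenv
  (foreign-procedure "getenv" '(string) 'string))

(c-strlen "hello")                       ; length of the C string
(c-getenv "HOME")                        ; a fresh Scheme string, or #f
(c-getenv "NO_SUCH_VARIABLE_FOR_SKETCH") ; #f if the variable is unset
```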
Procedure foreign-null-pointer
(foreign-null-pointer ) => integer
Returns a foreign null pointer.
Procedure foreign-null-pointer?
(foreign-null-pointer? integer) => boolean
Tests whether its argument is a foreign null pointer.
The two primitives peek-bytes and poke-bytes are provided for reading and writing memory at specific addresses. These procedures are typically used for copying data from foreign data structures into Scheme bytevectors for subsequent decoding.
(The use of peek-bytes and poke-bytes can often be avoided by keeping foreign data in a Scheme bytevector and passing the bytevector to a call-out using the boxed parameter type. However, this technique is inappropriate if the foreign code retains a pointer to the Scheme datum, which may be moved by the garbage collector.)
(peek-bytes addr bytevector count) => unspecified
Addr must be an exact nonnegative integer. Count must be a fixnum. The bytes in the range from addr through addr+count-1 are copied into bytevector, which must be long enough to hold that many bytes.
If any address in the range is not an address accessible to the process, unpredictable things may happen. Typically, you'll get a segmentation fault. Larceny does not yet catch segmentation faults.
(poke-bytes addr bytevector count) => unspecified
Addr must be an exact nonnegative integer. Count must be a fixnum. The count first bytes from bytevector are copied into memory in the range from addr through addr+count-1.
If any address in the range is not an address accessible to the process, unpredictable things may happen. Typically, you'll get a segmentation fault. Larceny does not yet catch segmentation faults.
Also, it's possible to corrupt memory with poke-bytes. Don't do that.
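The following sketch copies four bytes into foreign memory and back again. It assumes that the FFI has been loaded with (require 'std-ffi), that malloc and free from the C library are available to the FFI, and that addresses fit in a uint, as on 32-bit platforms; the Scheme names are illustrative.

```scheme
(require 'std-ffi)   ; assumption: loads the FFI described here

(define c-malloc (foreign-procedure "malloc" '(uint) 'uint))
(define c-free   (foreign-procedure "free" '(uint) 'void))

;; Writes four bytes to freshly allocated foreign memory, reads them
;; back into a second bytevector, and frees the memory.
(define (roundtrip-bytes)
  (let ((addr (c-malloc 4))
        (src  (make-bytevector 4 0))
        (dst  (make-bytevector 4 0)))
    (if (foreign-null-pointer? addr)
        (error "roundtrip-bytes: malloc failed"))
    (bytevector-set! src 0 10)
    (bytevector-set! src 1 20)
    (bytevector-set! src 2 30)
    (bytevector-set! src 3 40)
    (poke-bytes addr src 4)     ; Scheme bytevector -> foreign memory
    (peek-bytes addr dst 4)     ; foreign memory -> Scheme bytevector
    (c-free addr)
    dst))
```

The address is valid only between the calls to c-malloc and c-free; peeking or poking outside that window is exactly the kind of access that can crash the process.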
The following constants define the sizes of basic C data types:
Foreign data is visible to a Scheme program either as an object pointed to by a memory address (which is itself represented as an integer), or as a bytevector that contains the bytes of the foreign datum.
A number of utility procedures that make it easier to read and write data of common C primitive types have been written for both of these kinds of foreign objects.
Bytevector accessor procedures
(%get16 bv i) => integer
(%get16u bv i) => integer
(%get32 bv i) => integer
(%get32u bv i) => integer
(%get-int bv i) => integer
(%get-unsigned bv i) => integer
(%get-short bv i) => integer
(%get-ushort bv i) => integer
(%get-long bv i) => integer
(%get-ulong bv i) => integer
(%get-pointer bv i) => integer
These procedures decode bytevectors that contain the bytes of foreign objects. In each case, bv is a bytevector and i is the offset of the first byte of a field in that bytevector. The field is fetched and returned as an integer (signed or unsigned as appropriate).
Bytevector updater procedures
(%set16 bv i val) => unspecified
(%set16u bv i val) => unspecified
(%set32 bv i val) => unspecified
(%set32u bv i val) => unspecified
(%set-int bv i val) => unspecified
(%set-unsigned bv i val) => unspecified
(%set-short bv i val) => unspecified
(%set-ushort bv i val) => unspecified
(%set-long bv i val) => unspecified
(%set-ulong bv i val) => unspecified
(%set-pointer bv i val) => unspecified
These procedures update bytevectors that contain the bytes of foreign objects. In each case, bv is a bytevector, i is an offset of the first byte of a field in that bytevector, and val is a value to be stored in that field. The values must be exact integers in a range implied by the data type.
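The accessors and updaters can be paired as in this sketch, which stores two fields in a bytevector and reads them back. It assumes the procedures become available once the FFI is loaded with (require 'std-ffi); the buffer layout (a 32-bit field at offset 0 and an unsigned 16-bit field at offset 4) is invented for illustration.

```scheme
(require 'std-ffi)   ; assumption: loads the FFI utility procedures

;; An 8-byte buffer standing in for a small foreign record.
(define buf (make-bytevector 8 0))

(%set32  buf 0 1000)     ; signed 32-bit field at offset 0
(%set16u buf 4 65535)    ; unsigned 16-bit field at offset 4

(%get32  buf 0)          ; reads back the value stored at offset 0
(%get16u buf 4)          ; reads back the value stored at offset 4
```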
Foreign-pointer accessor procedures
(%peek8 addr) => integer
(%peek8u addr) => integer
(%peek16 addr) => integer
(%peek16u addr) => integer
(%peek32 addr) => integer
(%peek32u addr) => integer
(%peek-int addr) => integer
(%peek-long addr) => integer
(%peek-unsigned addr) => integer
(%peek-ulong addr) => integer
(%peek-short addr) => integer
(%peek-ushort addr) => integer
(%peek-pointer addr) => integer
(%peek-string addr) => string
These procedures read raw memory. In each case, addr is an address, and the value stored at that address (the size of which is indicated by the name of the procedure) is fetched and returned as an integer.
%Peek-string expects to find a NUL-terminated string of 8-bit bytes at the given address. It is returned as a Scheme string.
Foreign-pointer updater procedures
(%poke8 addr val) => unspecified
(%poke8u addr val) => unspecified
(%poke16 addr val) => unspecified
(%poke16u addr val) => unspecified
(%poke32 addr val) => unspecified
(%poke32u addr val) => unspecified
(%poke-int addr val) => unspecified
(%poke-long addr val) => unspecified
(%poke-unsigned addr val) => unspecified
(%poke-ulong addr val) => unspecified
(%poke-short addr val) => unspecified
(%poke-ushort addr val) => unspecified
(%poke-pointer addr val) => unspecified
These procedures update raw memory. In each case, addr is an address, and val is a value to be stored at that address.
If foreign functions are linked into Larceny using the FFI, and a Larceny heap image is subsequently dumped (with dump-interactive-heap or dump-heap), then the foreign functions are not saved as part of the heap image. When the heap image is subsequently loaded into Larceny at startup, the FFI will attempt to re-link all the foreign functions in the heap image.
During the relinking phase, foreign files will again be loaded into Larceny, and Larceny's FFI will use the file names as they were originally given to the FFI when it tries to load the files. In particular, if relative pathnames were used, Larceny will not have converted them to absolute pathnames.
An error during relinking will result in Larceny aborting with an error message and returning to the operating system. This is considered a feature.
This procedure uses the chdir() system call to set the process's current working directory. The string parameter type is used to pass a Scheme string to the C procedure.
(define cd
  (let ((chdir (foreign-procedure "chdir" '(string) 'int)))
    (lambda (newdir)
      (if (not (zero? (chdir newdir)))
          (error "cd: " newdir " is not a valid directory name."))
      (unspecified))))
This procedure uses the getcwd() (get current working directory) system call to retrieve the name of the process's current working directory. A bytevector is created and passed in as a buffer in which to store the return value — a 0-terminated ASCII string. Then the FFI utility function ffi/asciiz->string is called to convert the bytevector to a string.
(define pwd
  (let ((getcwd (foreign-procedure "getcwd" '(boxed int) 'int)))
    (lambda ()
      (let ((s (make-bytevector 1024)))
        (getcwd s 1024)
        (ffi/asciiz->string s)))))
This example is bogus. It is not safe to pass a collectable object into a C procedure when the callback invocation might cause a garbage collection, thus moving the object and invalidating the address stored in the C machine context.
This demonstrates how to use a callback such as the comparator argument to qsort. It is specified in the type signature using -> as a type constructor. (Note that one should probably use the built-in sort routines rather than call out like this; this example is for demonstrating callbacks, not how to sort.)
(define qsort!
  (foreign-procedure "qsort"
                     '(boxed ushort ushort (-> (void* void*) int))
                     'void))
(let ((bv (list->vector '(40 10 30 20 1 2 3 4))))
  (qsort! bv 8 4
          (lambda (x y)
            (let ((x (/ (void*-word-ref x 0) 4))
                  (y (/ (void*-word-ref y 0) 4)))
              (- x y))))
  bv)
(let ((bv (list->bytevector '(40 10 30 20 1 2 3 4))))
  (qsort! bv 8 1
          (lambda (x y)
            (let ((x (void*-byte-ref x 0))
                  (y (void*-byte-ref y 0)))
              (- x y))))
  bv)
The general foreign-function interface functionality described above is powerful but awkward to use in practice. A user might be tempted to hard-code offsets or constants that are compiler dependent. Also, the FFI will marshal some low-level values such as strings or integers, but other values, such as enumerations that could naturally be mapped to sets of symbols, are not marshaled, since the host environment does not provide the necessary type information to the FFI.
This section documents a collection of libraries to mitigate these and other problems.
Foreign data access is performed by peeking at manually calculated addresses, but in practice one often needs to inspect fields of C structures, whose offsets are dependent on the application binary interface (ABI) of the host environment. Similarly, C programs often refer to values via constant macro definitions; since the values of such names are not provided by the object code and Scheme programs do not have a C preprocessor run on them prior to execution, it is difficult to refer to the same value without encoding "magic numbers" into the Scheme source code.
The foreign-ctools library is meant to mitigate problems like the two described above. It provides special forms for introducing global definitions of values typically available at compile-time for a C program. The library assumes the presence of a C compiler (such as cc on Unix systems or cl.exe on Windows systems). The special forms work by dynamically generating, compiling, and running C code at expansion time to determine the desired values of structure offsets or macro constants.
Here is a grammar for the define-c-info form provided by the foreign-ctools library.
<exp>          ::= (define-c-info <c-decl> ... <c-defn> ...)
<c-decl>       ::= (compiler <cc-spec>)
                 | (path <include-path>)
                 | (include <header>)
                 | (include<> <header>)
<cc-spec>      ::= cc | cl
<c-defn>       ::= (const <id> <c-type> <c-expr>)
                 | (sizeof <id> <c-type-expr>)
                 | (struct <c-name> <field-clause> ...)
                 | (fields <c-name> <field-clause> ...)
                 | (ifdefconst <id> <c-type> <c-name>)
<c-type>       ::= int | uint | long | ulong
<include-path> ::= <string-literal>
<header>       ::= <string-literal>
<field-clause> ::= (<offset-id> <c-field>)
                 | (<offset-id> <c-field> <size-id>)
<c-expr>       ::= <string-literal>
<c-type-expr>  ::= <string-literal>
<c-name>       ::= <string-literal>
<c-field>      ::= <string-literal>
Syntax define-c-info
(define-c-info <c-decl> … <c-defn> …)
The <c-decl> clauses of define-c-info control how header files are processed. The compiler clause selects between cc (the default UNIX system compiler) and cl (the compiler included with Microsoft's Windows SDK). The path clause adds a directory to search when looking for header files. The include and include<> clauses indicate header files to include when executing the <c-defn> clauses; the two variants correspond to the quoted and bracketed forms of the C preprocessor's #include directive.
The <c-defn> clauses bind identifiers. A (const x t "ae") clause binds x to the integer value of ae according to the C language; ae can be any C arithmetic expression that evaluates to a value of type t. (The expected usage is for ae to be an expression that the C preprocessor expands to an arithmetic expression.)
The remaining clauses provide similar functionality:
(sizeof x "te") binds x to the size occupied by values of type te, where te is any C type expression.
(struct "cn" … (x "cf" y) …) binds x to the offset from the start of a structure of type struct cn to its cf field, and binds y, if present, to the field's size. A fields clause is similar, but it applies to structures of type cn rather than struct cn.
(ifdefconst x t "cn") binds x to the value of cn if cn is defined; x is otherwise bound to Larceny's unspecified value.
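The clause forms described above can be combined in a single define-c-info expression. The following is a minimal sketch written against the grammar, not a tested recipe: it assumes a Unix-like host with cc on the search path, that the library is loaded with (require 'foreign-ctools), and that the headers sys/time.h and limits.h are available. All of the Scheme identifiers it binds are illustrative names chosen for this example.

```scheme
;; Sketch only. Assumptions: Larceny on a Unix-like host, a working cc,
;; and the foreign-ctools library loadable via require.
(require 'foreign-ctools)

(define-c-info (compiler cc)
               (include<> "sys/time.h")
               (include<> "limits.h")
  ;; int-max is bound to the integer value of the C expression INT_MAX.
  (const int-max int "INT_MAX")
  ;; timeval-size is bound to the size of struct timeval.
  (sizeof timeval-size "struct timeval")
  ;; tv-sec-offset and tv-usec-offset are bound to field offsets within
  ;; struct timeval; tv-sec-size is bound to the size of the tv_sec field.
  (struct "timeval"
    (tv-sec-offset  "tv_sec" tv-sec-size)
    (tv-usec-offset "tv_usec"))
  ;; path-max is bound to the value of PATH_MAX if that macro is defined,
  ;; and to the unspecified value otherwise.
  (ifdefconst path-max int "PATH_MAX"))
```

The values bound here depend on the host ABI and headers, which is the reason for computing them at expansion time instead of writing them into the Scheme source.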
The foreign-procedure function is sufficient to link in dynamically loaded C procedures, but it can be annoying to use when there are many procedures to define that all follow a regular pattern where one could infer a mapping between Scheme identifiers and C function names.
For example, some libraries follow a naming convention where the words within a name are separated by underscores; such functions could be immediately mapped to Scheme names where the underscores have been replaced by dashes.
The foreign-sugar library provides a special form, define-foreign, which lets the user define a foreign function by providing only the Scheme name, the argument types, and the return type. The define-foreign form then attempts to infer which C function the name was meant to refer to.
Syntax define-foreign
(define-foreign (name arg-type …) result-type)
The library also provides ways for the user to introduce new rules for inferring C function names, but these are undocumented because they will probably have to change when we switch to an R6RS macro expander.
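As a rough illustration of define-foreign, the sketch below defines a Scheme procedure for the POSIX function sched_yield. Several details are assumptions not confirmed by the text above: that the library is loaded with (require 'foreign-sugar), that argument and result types are written as unquoted FFI type names, that the default inference rule turns dashes into underscores, and that the C library defining sched_yield is already visible to the FFI.

```scheme
;; Sketch under the assumptions listed above; not a tested recipe.
(require 'foreign-sugar)

;; Only the Scheme name, the (empty) argument type list, and the result
;; type are given. define-foreign is expected to infer that the Scheme
;; name sched-yield refers to the C function  int sched_yield(void).
(define-foreign (sched-yield) int)

;; The intended effect is roughly that of writing by hand:
;;   (define sched-yield
;;     (foreign-procedure "sched_yield" '() 'int))
```

If the inference fails for a particular naming scheme, calling foreign-procedure directly with the C name remains available as a fallback.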
(stdlib/malloc rtd [ctor]) => procedure
Given a record extension of void*-rt, returns an allocator that uses the C malloc procedure to allocate instances of such an object. Note that the client is responsible for eventually freeing such objects with stdlib/free.
The stdlib/free procedure frees objects produced by allocators returned from stdlib/malloc.
Procedure ffi-install-void*-subtype
(ffi-install-void*-subtype rtd) => rtd
(ffi-install-void*-subtype string [parent-rtd]) => rtd
(ffi-install-void*-subtype symbol [parent-rtd]) => rtd
ffi-install-void*-subtype extends the core attribute registry with a new primitive entry for subtype. The parent-rtd argument should be a subtype of void*-rt and defaults to void*-rt. In the case of the symbol or string inputs, the procedure constructs a new record type subtyping the parent argument. In the case of the rtd input, the rtd record type must extend void*-rt. ffi-install-void*-subtype returns the subtype record type.
The returned record type represents a tagged wrapped C pointer, allowing one to encode type hierarchies.
Procedure establish-void*-subhierarchy!
(establish-void*-subhierarchy! symbol-tree) => unspecified
establish-void*-subhierarchy! is a convenience function for constructing large object hierarchies. It descends the symbol-tree, creates a record type descriptor for each symbol (where the root of the tree has the parent void*-rt), and invokes ffi-install-void*-subtype on all of the introduced types.
Type char* extends void*
Procedure string->char*
(string->char* string) => char*
(char*-strlen char*) => fixnum
(char*->string char*) => string
(char*->string char* len) => string
(call-with-char* string string-function) => value
(call-with-char** string-vector function) => value
(call-with-int* fixnum-vector function) => value
(call-with-short* fixnum-vector function) => value
(call-with-double* num-vector function) => value
FIXME: (There are other functions, but I want to test and document the ones above first…)
The foreign-cstructs library provides a more direct interface to C structures. It provides the define-c-struct special form. This form is layered on top of define-c-info; the latter provides the structure field offsets and sizes used to generate constructors (which produce appropriately sized bytevectors, not record instances). The define-c-struct form combines these with marshaling and unmarshaling procedures to provide high-level access to a structure.
The grammar for the define-c-struct form is presented below.
<exp>          ::= (define-c-struct (<struct-type> <ctor-id> <c-decl> ...)
                     <field-clause> ...)
<field-clause> ::= (<c-field> <getter>)
                 | (<c-field> <getter> <setter>)
<getter>       ::= (<id>) | (<id> <unmarshal>)
<setter>       ::= (<id>) | (<id> <marshal>)
<marshal>      ::= <ffi-attr-symbol> | <marshal-proc-exp>
<unmarshal>    ::= <ffi-attr-symbol> | <unmarshal-proc-exp>
<struct-type>  ::= <string-literal>
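To make the grammar concrete, here is a hedged sketch that describes the C type struct timeval. It assumes a Unix-like host with sys/time.h, that the library is loaded with (require 'foreign-cstructs), and that the struct-type string names the full C type including the struct keyword. The constructor, getter, and setter names are illustrative, and the calling conventions shown in the comments at the end (a constructor of no arguments, getters taking the structure, setters taking the structure and a value) are assumptions.

```scheme
;; Sketch only; assumes foreign-cstructs is loadable via require and that
;; a C compiler is available, as define-c-info requires.
(require 'foreign-cstructs)

(define-c-struct ("struct timeval" make-timeval (include<> "sys/time.h"))
  ;; Each clause names a C field, then a getter, then an optional setter.
  ;; No explicit marshaling procedures are given, so whatever default
  ;; conversion the library applies to these fields is used.
  ("tv_sec"  (timeval-sec)  (set-timeval-sec!))
  ("tv_usec" (timeval-usec) (set-timeval-usec!)))

;; Assumed usage, not confirmed by the text above:
;;   (define tv (make-timeval))   ; a bytevector sized for struct timeval
;;   (set-timeval-sec! tv 5)
;;   (timeval-sec tv)
```

Because the constructor produces a bytevector, the result can be handed to foreign procedures that expect a pointer to the structure.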
This library provides the special forms define-c-enum and define-c-enum-set, which associate the identifiers of a C enum type declaration with the integer values they denote. The define-c-enum form describes enums encoding a discriminated sum; define-c-enum-set describes bitmasks, mapping them to R6RS enum-sets in Scheme.
The (define-c-enum en (<c-decl> …) (x "cn") …) form adds the en FFI attribute. The attribute marshals each symbol x to the integer value that cn denotes in C; unmarshaling does the inverse translation.
The (define-c-enum-set ens (<c-decl> …) (x "cn") …) form binds ens to an R6RS enum-set constructor with universe resulting from (make-enumeration '(x …)); it also adds the ens FFI attribute. The attribute marshals an enum-set s constructed by ens to the corresponding bitmask in C (that is, the integer one would get by taking the bitwise or of all cn such that the corresponding x is in s). Unmarshaling attempts to do the inverse translation.
The grammar for the two forms is presented below.
<exp>     ::= (define-c-enum <enum-id> (<c-decl> ...) (<id> <c-name>) ...)
<exp>     ::= (define-c-enum-set <enum-id> (<c-decl> ...) (<id> <c-name>) ...)
<enum-id> ::= <id>
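The two forms might be used as in the sketch below, which is written against the grammar and has not been tested. It makes several assumptions: that the library is named foreign-cenums and is loaded with require, that integer-valued macros such as SEEK_SET and O_APPEND are accepted where an enum constant name is expected, and that the host provides stdio.h and fcntl.h. The Scheme names are illustrative.

```scheme
;; Sketch only. The library name foreign-cenums is an assumption.
(require 'foreign-cenums)

;; A discriminated sum: exactly one of three symbols. The seek-whence FFI
;; attribute should marshal each symbol to the integer its C name denotes.
(define-c-enum seek-whence ((include<> "stdio.h"))
  (seek-set "SEEK_SET")
  (seek-cur "SEEK_CUR")
  (seek-end "SEEK_END"))

;; A bitmask: any subset of the listed symbols. open-flags is bound to an
;; enum-set constructor, and the open-flags FFI attribute should marshal
;; an enum-set to the bitwise or of the corresponding C values.
(define-c-enum-set open-flags ((include<> "fcntl.h"))
  (o-append "O_APPEND")
  (o-creat  "O_CREAT")
  (o-trunc  "O_TRUNC"))

;; R6RS enum-set constructors take a list of symbols from the universe.
(define append-or-create (open-flags '(o-append o-creat)))
```

The attribute names seek-whence and open-flags could then appear in the argument or result types given to foreign-procedure, in place of a plain int.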