If you want to extend, fix or simply customize ECL for your own needs, you should understand how the implementation works.
cl_lispunion is the union containing all first-class ECL types.
In ECL a lisp object is represented by a type called cl_object. This type is a word which is long enough to host both an integer and a pointer. The least significant bits of this word, also called the tag bits, determine whether it is a pointer to a C structure representing a complex object, or whether it is immediate data, such as a fixnum or a character.
The topic of the immediate values and bit fiddling is nicely described in Peter Bex’s blog describing Chicken Scheme internal data representation. We could borrow some ideas from it (like improving fixnum bitness and providing more immediate values). All changes to code related to immediate values should be carefully benchmarked.
The fixnums and characters are called immediate data types, because they require no more than the cl_object datatype to store all information. All other ECL objects are non-immediate and they are represented by a pointer to a cell that is allocated on the heap. Each cell consists of several words of memory and contains all the information related to that object. By storing data in multiples of a word size, we make sure that the least significant bits of a pointer are zero, which distinguishes pointers from immediate data.
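As a rough illustration of the tagging idea described above, the following standalone C sketch models a word whose two low bits act as a tag. The names (model_word, MODEL_TAG_*, model_*) and the specific tag values are assumptions made for this example; they are not ECL's actual definitions, which live in object.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model, not ECL's real encoding: the two low bits of a
 * word are the tag. Heap cells are word-aligned, so a real pointer
 * always has those bits clear. */
typedef uintptr_t model_word;

enum {
    MODEL_TAG_BITS    = 2,
    MODEL_TAG_MASK    = 3,  /* binary 11 */
    MODEL_TAG_POINTER = 0,  /* aligned pointer to a heap cell */
    MODEL_TAG_FIXNUM  = 3   /* assumed tag for small integers */
};

/* Pack a small integer into a tagged word. No range check. */
static model_word model_make_fixnum(intptr_t n)
{
    return ((model_word)n << MODEL_TAG_BITS) | MODEL_TAG_FIXNUM;
}

static bool model_is_fixnum(model_word w)
{
    return (w & MODEL_TAG_MASK) == MODEL_TAG_FIXNUM;
}

static bool model_is_pointer(model_word w)
{
    return (w & MODEL_TAG_MASK) == MODEL_TAG_POINTER;
}

/* Unpack: clear the tag, then divide instead of shifting so that
 * negative values do not depend on how >> treats signed operands. */
static intptr_t model_fixnum_value(model_word w)
{
    assert(model_is_fixnum(w));
    return (intptr_t)(w - MODEL_TAG_FIXNUM) / (1 << MODEL_TAG_BITS);
}
```

In this model the whole value travels inside the word itself, so no heap cell is needed, while any word-aligned address passes the pointer test.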
In an immediate datatype, the tag bits determine the type of the object. In non-immediate datatypes, the first byte in the cell contains the secondary type indicator, and distinguishes between different types of non-immediate data. The use of the remaining bytes differs for each type of object. For instance, a cons cell consists of three words:
+---------+----------+
|  CONS   |          |
+---------+----------+
|    car-pointer     |
+--------------------+
|    cdr-pointer     |
+--------------------+
Note that this is one of the possible implementations of cons. The second one (currently the default) uses the immediate value for the list and consumes two words instead of three. Such an implementation is more efficient in both memory and speed (according to the comments in the source code):
/*
 * CONSES
 *
 * We implement two variants. The "small cons" type carries the type
 * information in the least significant bits of the pointer. We have
 * to do some pointer arithmetics to find out the CAR / CDR of the
 * cons but the overall result is faster and memory efficient, only
 * using two words per cons.
 *
 * The other scheme stores conses as three-words objects, the first
 * word carrying the type information. This is kept for backward
 * compatibility and also because the oldest garbage collector does
 * not yet support the smaller datatype.
 *
 * To make code portable and independent of the representation, only
 * access the objects using the common macros below (that is all
 * except ECL_CONS_PTR or ECL_PTR_CONS).
 */
cl_object is the type of a lisp object. For your C/C++ program, a cl_object can be either a fixnum, a character, or a pointer to a union of structures (see cl_lispunion in the header object.h). The actual interpretation of that object can be determined with the macro ecl_t_of.
For example, if x is of type cl_object, and it is of type fixnum, we may retrieve its value:
if (ecl_t_of(x) == t_fixnum)
    printf("Integer value: %lld\n", (long long)ecl_fixnum(x)); /* cl_fixnum may be wider than int */
If x is of type cl_object and it does not contain an immediate datatype, you may inspect the cell associated to the lisp object using x as a pointer. For example:
if (ecl_t_of(x) == t_vector)
    printf("Vector's dimension is: %lu\n", (unsigned long)x->vector.dim); /* dim is an unsigned cl_index */
You should see the following sections and the header object.h to learn how to use the different fields of a cl_object pointer.
cl_type is the enumeration type which distinguishes the different types of lisp objects. The most important values are:
t_cons, t_fixnum, t_character, t_bignum, t_ratio, t_singlefloat, t_doublefloat, t_complex, t_symbol, t_package, t_hashtable, t_array, t_vector, t_string, t_bitvector, t_stream, t_random, t_readtable, t_pathname, t_bytecodes, t_cfun, t_cclosure, t_gfun, t_instance, t_foreign and t_thread.
cl_type ecl_t_of (cl_object x)
If x is a valid lisp object, ecl_t_of(x) returns an integer denoting the type of that lisp object. That integer is one of the values of the enumeration type cl_type.
bool ECL_CHARACTERP (cl_object o)
bool ECL_BASE_CHAR_P (cl_object o)
bool ECL_BASE_CHAR_CODE_P (ecl_character o)
bool ECL_NUMBER_TYPE_P (cl_object o)
bool ECL_COMPLEXP (cl_object o)
bool ECL_REAL_TYPE_P (cl_object o)
bool ECL_FIXNUMP (cl_object o)
bool ECL_BIGNUMP (cl_object o)
bool ECL_SINGLE_FLOAT_P (cl_object o)
bool ECL_DOUBLE_FLOAT_P (cl_object o)
bool ECL_LONG_FLOAT_P (cl_object o)
bool ECL_CONSP (cl_object o)
bool ECL_LISTP (cl_object o)
bool ECL_ATOM (cl_object o)
bool ECL_SYMBOLP (cl_object o)
bool ECL_ARRAYP (cl_object o)
bool ECL_VECTORP (cl_object o)
bool ECL_BIT_VECTOR_P (cl_object o)
bool ECL_STRINGP (cl_object o)
bool ECL_HASH_TABLE_P (cl_object o)
bool ECL_RANDOM_STATE_P (cl_object o)
bool ECL_PACKAGEP (cl_object o)
bool ECL_PATHNAMEP (cl_object o)
bool ECL_READTABLEP (cl_object o)
bool ECL_FOREIGN_DATA_P (cl_object o)
bool ECL_SSE_PACK_P (cl_object o)
Different macros that check whether o belongs to the specified type. These checks have been optimized, and are preferred over several calls to ecl_t_of.
bool ECL_IMMEDIATE (cl_object o)
Tells whether o is an immediate datatype.
In each of the following sections we will document the standard interface for building objects of different types. For some objects, though, it is too difficult to make a C interface that resembles all of the functionality in the lisp environment. In those cases you need to
The first way makes use of a C or Lisp string to construct an object. The two functions you need to know are the following ones.
cl_object si_string_to_object (cl_narg narg, cl_object str, ...)
cl_object ecl_read_from_cstring (const char *s)
ecl_read_from_cstring builds a lisp object from a C string which contains a suitable representation of a lisp object. si_string_to_object performs the same task, but uses a lisp string, and therefore it is less useful.
c_string_to_object – equivalent to ecl_read_from_cstring
Using a C string
cl_object array1 = ecl_read_from_cstring("#(1 2 3 4)");
Using a Lisp string
cl_object string = make_simple_base_string("#(1 2 3 4)");
cl_object array2 = si_string_to_object(1, string); /* first argument is the argument count */
Common-Lisp distinguishes two types of integers: bignums and fixnums. A fixnum is a small integer, which ideally occupies only a word of memory and which is between the values MOST-NEGATIVE-FIXNUM and MOST-POSITIVE-FIXNUM. A bignum is any integer which is not a fixnum and it is only constrained by the amount of memory available to represent it.
In ECL a fixnum is an integer that, together with the tag bits, fits in a word of memory. The size of a word, and thus the size of a fixnum, varies from one architecture to another, and you should refer to the types and constants in the ecl.h header to make sure that your C extensions are portable. All other integers are stored as bignums; they are not immediate objects, they take up a variable amount of memory and the GNU Multiprecision Library is required to create, manipulate and calculate with them.
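The relation between word size, tag bits and fixnum range can be sketched in plain C. The helper names and the choice of two tag bits in the usage below are assumptions for illustration only; the authoritative limits are the constants defined in the ECL headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helpers: with `tag_bits` bits reserved for the tag, a
 * fixnum has (word_bits - tag_bits) bits of two's complement payload. */
static int64_t model_most_positive_fixnum(int word_bits, int tag_bits)
{
    int payload = word_bits - tag_bits;
    assert(payload >= 2 && payload <= 63);
    return (int64_t)((UINT64_C(1) << (payload - 1)) - 1);
}

static int64_t model_most_negative_fixnum(int word_bits, int tag_bits)
{
    return -model_most_positive_fixnum(word_bits, tag_bits) - 1;
}

/* True when n could be stored as an immediate; otherwise a bignum
 * would be needed. */
static bool model_fits_fixnum(int64_t n, int word_bits, int tag_bits)
{
    return n >= model_most_negative_fixnum(word_bits, tag_bits)
        && n <= model_most_positive_fixnum(word_bits, tag_bits);
}
```

Under these assumptions, a 32-bit word with two tag bits leaves 30 payload bits, so model_fits_fixnum(536870911, 32, 2) holds while 536870912 would already require a bignum.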
cl_fixnum is a C signed integer type capable of holding a whole fixnum without any loss of precision. The opposite is not true, and you may create a cl_fixnum which exceeds the limits of a fixnum and should be stored as a bignum.
cl_index is a C unsigned integer type capable of holding a non-negative fixnum without loss of precision. Typically, a cl_index is used as an index into an array, or into a proper list, etc.
These constants mark the limits of a fixnum.
bool ecl_fixnum_lower (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_greater (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_leq (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_geq (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_plusp (cl_fixnum a)
bool ecl_fixnum_minusp (cl_fixnum a)
Operations on fixnums (comparison and predicates).
cl_object ecl_make_fixnum (cl_fixnum n)
cl_fixnum ecl_fixnum (cl_object o)
ecl_make_fixnum converts from an integer to a lisp object, while ecl_fixnum does the opposite (converts a lisp fixnum object to an integer). These functions do not check their arguments.
MAKE_FIXNUM – equivalent to ecl_make_fixnum
fix – equivalent to ecl_fixnum
cl_fixnum fixint (cl_object o)
cl_index fixnint (cl_object o)
Safe conversion of a lisp fixnum to a C integer of the appropriate size. Signals an error if o is not of fixnum type. fixnint additionally ensures that o is not negative.
ECL has two types of characters: one fits in the C type char, while the other is used when ECL is built with the configure option --enable-unicode, which defaults to 32 (characters are stored in a 32-bit variable and code points have 21 bits).
Immediate type t_character. If ECL is built with Unicode support, then it may be either a base or an extended character, which may be distinguished with the predicate ECL_BASE_CHAR_P.
Additionally we have ecl_base_char for base strings, which is equivalent to the ordinary char.
if (ECL_CHARACTERP(o) && ECL_BASE_CHAR_P(o))
    printf("Base character: %c\n", ECL_CHAR_CODE(o));
Each character is assigned an integer code which ranges from 0 to (ECL_CHAR_CODE_LIMIT-1).
CHAR_CODE_LIMIT – equivalent to ECL_CHAR_CODE_LIMIT
cl_object ECL_CODE_CHAR (ecl_character o)
ecl_character ECL_CHAR_CODE (cl_object o)
ecl_character ecl_char_code (cl_object o)
ecl_base_char ecl_base_char_code (cl_object o)
ECL_CHAR_CODE, ecl_char_code and ecl_base_char_code return the integer code associated to a lisp character. ecl_char_code and ecl_base_char_code perform a safe conversion, while ECL_CHAR_CODE doesn’t check its argument.
ECL_CODE_CHAR returns the lisp character associated to an integer code. It does not check its arguments.
CHAR_CODE – equivalent to ECL_CHAR_CODE
CODE_CHAR – equivalent to ECL_CODE_CHAR
bool ecl_char_eq (cl_object x, cl_object y)
bool ecl_char_equal (cl_object x, cl_object y)
Compare two characters for equality. ecl_char_eq takes case into account and ecl_char_equal ignores it.
int ecl_char_cmp (cl_object x, cl_object y)
int ecl_char_compare (cl_object x, cl_object y)
Compare the relative order of two characters. ecl_char_cmp takes case into account, while ecl_char_compare converts all characters to uppercase before comparing them.
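The difference between the case-sensitive and the case-insensitive ordering can be modelled on plain character codes. The functions below are hypothetical stand-ins written for this example, not the ECL routines; in particular, the -1/0/1 return convention and the restriction to codes 0..255 are assumptions.

```c
#include <assert.h>
#include <ctype.h>

/* Case-sensitive ordering on raw codes: negative, zero or positive. */
static int model_char_cmp(int x, int y)
{
    return (x > y) - (x < y);
}

/* Case-insensitive ordering: fold both codes to uppercase first. */
static int model_char_compare(int x, int y)
{
    assert(x >= 0 && x <= 255 && y >= 0 && y <= 255);
    return model_char_cmp(toupper(x), toupper(y));
}
```

With these definitions 'a' and 'A' differ under model_char_cmp but compare as equal under model_char_compare.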
An array is an aggregate of data of a common type, which can be accessed with one or more non-negative indices. ECL stores arrays as a C structure with a pointer to the region of memory which contains the actual data. The cell of an array datatype varies depending on whether it is a vector, a bit-vector, a multidimensional array or a string.
bool ECL_ADJUSTABLE_ARRAY_P (cl_object x)
bool ECL_ARRAY_HAS_FILL_POINTER_P (cl_object x)
All arrays (arrays, strings and bit-vectors) may be tested for being adjustable and for having a fill pointer with these two macros. They don’t check the type of their arguments.
If x contains a vector, you can access the following fields:
x->vector.elttype
The type of the elements of the vector.
x->vector.displaced
List storing the vectors that x is displaced from and that x displaces to.
x->vector.dim
The maximum number of elements.
x->vector.fillp
Actual number of elements in the vector or fill pointer.
x->vector.self
Union of pointers of different types. You should choose the right pointer depending on x->vector.elttype.
If x contains a multidimensional array, you can access the following fields:
x->array.elttype
The type of the elements of the array.
x->array.rank
The number of array dimensions.
x->array.displaced
List storing the arrays that x is displaced from and that x displaces to.
x->array.dim
The maximum number of elements.
x->array.dims[]
Array with the dimensions of the array. The elements range from x->array.dims[0] to x->array.dims[x->array.rank-1].
x->array.fillp
Actual number of elements in the array or fill pointer.
x->array.self
Union of pointers of different types. You should choose the right pointer depending on x->array.elttype.
Each array is of a specialized type, which is the type of the elements of the array. ECL supports arrays of only a few specialized types, and for each of these types there is a C integer which is the corresponding value of x->array.elttype or x->vector.elttype. We list some of those types together with the C constant that denotes that type:
ecl_aet_object
ecl_aet_sf
ecl_aet_df
ecl_aet_lf
ecl_aet_csf
ecl_aet_cdf
ecl_aet_clf
ecl_aet_bit
ecl_aet_fix
ecl_aet_index
ecl_aet_ch
ecl_aet_bc
cl_elttype ecl_array_elttype (cl_object array)
Returns the element type of array, which can be a string, a bit-vector, a vector, or a multidimensional array.
For example:
ecl_array_elttype(ecl_read_from_cstring("\"AAA\"")); /* returns ecl_aet_ch */
ecl_array_elttype(ecl_read_from_cstring("#(A B C)")); /* returns ecl_aet_object */
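The idea of choosing a member of the self union according to the element type can be modelled in standalone C. Everything below (the model_ prefix, the reduced set of element types, the field layout) is an assumption made for illustration; the real definitions are in object.h.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, reduced set of element types. */
typedef enum { MODEL_AET_FIX, MODEL_AET_DF, MODEL_AET_BC } model_elttype;

/* Hypothetical, reduced model of a specialized vector. */
typedef struct {
    model_elttype elttype;   /* selects the valid member of `self` */
    size_t dim;              /* number of elements */
    union {
        long   *fix;         /* valid when elttype == MODEL_AET_FIX */
        double *df;          /* valid when elttype == MODEL_AET_DF  */
        char   *bc;          /* valid when elttype == MODEL_AET_BC  */
    } self;
} model_vector;

/* Read element i as a double, dispatching on the element type. */
static double model_vector_ref(const model_vector *v, size_t i)
{
    assert(i < v->dim);
    switch (v->elttype) {
    case MODEL_AET_FIX: return (double)v->self.fix[i];
    case MODEL_AET_DF:  return v->self.df[i];
    case MODEL_AET_BC:  return (double)v->self.bc[i];
    }
    return 0.0; /* not reached for valid element types */
}
```

Reading through the wrong union member would reinterpret the storage, which is why the element type has to be checked before touching self.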
cl_object ecl_aref (cl_object x, cl_index index)
cl_object ecl_aset (cl_object x, cl_index index, cl_object value)
These functions are used to retrieve and set the elements of an array. The elements are accessed with one index, index, as in the lisp function ROW-MAJOR-AREF.
cl_object array = ecl_read_from_cstring("#2A((1 2) (3 4))");
cl_object x = ecl_aref(array, 3);
cl_print(1, x); /* Outputs 4 */
ecl_aset(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #2A((1 2) (3 5)) */
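To see why index 3 addresses the last element of a 2x2 array, it helps to spell out how row-major order maps subscripts to a single index. The helper below is a hypothetical standalone function written for this explanation; it is not part of the ECL API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: convert `rank` subscripts into the single
 * row-major index. The last subscript varies fastest. */
static size_t model_row_major_index(const size_t *dims,
                                    const size_t *subscripts,
                                    size_t rank)
{
    size_t index = 0;
    for (size_t k = 0; k < rank; k++) {
        assert(subscripts[k] < dims[k]);
        index = index * dims[k] + subscripts[k];
    }
    return index;
}
```

For dimensions (2 2) and subscripts (1 1) the loop computes 0*2+1 = 1 and then 1*2+1 = 3, which matches the index used in the example above.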
cl_object ecl_aref1 (cl_object x, cl_index index)
cl_object ecl_aset1 (cl_object x, cl_index index, cl_object value)
These functions are similar to ecl_aref and ecl_aset, but they operate on vectors.
cl_object array = ecl_read_from_cstring("#(1 2 3 4)");
cl_object x = ecl_aref1(array, 3);
cl_print(1, x); /* Outputs 4 */
ecl_aset1(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #(1 2 3 5) */
A string, both in Common-Lisp and in ECL is nothing but a vector of characters. Therefore, almost everything mentioned in the section of arrays remains valid here.
The only important difference is that ECL stores the base-strings (non-Unicode version of a string) as a lisp object with a pointer to a zero terminated C string. Thus, if a string has n characters, ECL will reserve n+1 bytes for the base-string. This allows us to pass the base-string self pointer to any C routine.
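The n+1 layout can be modelled in a few lines of standalone C. The structure and function names below are assumptions made for this sketch and do not correspond to ECL's allocator or its real base_string structure.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a base-string: `fillp` characters are in use,
 * and the buffer always has one extra byte for the terminator. */
typedef struct {
    size_t dim;     /* capacity in characters */
    size_t fillp;   /* characters in use */
    char  *self;    /* dim + 1 bytes, zero terminated */
} model_base_string;

/* Copy n bytes from src into a freshly allocated model string.
 * Returns 0 on success, -1 if allocation fails. */
static int model_base_string_init(model_base_string *s,
                                  const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    s->self = malloc(n + 1);        /* n characters + terminator */
    if (s->self == NULL)
        return -1;
    memcpy(s->self, src, n);
    s->self[n] = '\0';
    s->dim = n;
    s->fillp = n;
    return 0;
}

static void model_base_string_free(model_base_string *s)
{
    free(s->self);
    s->self = NULL;
}
```

Because of the extra terminating byte, the self pointer of such a string can be handed directly to routines like strlen or printf.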
If x is a lisp object of type string or base-string, we can access the following fields:
x->string.dim x->base_string.dim
Maximum number of characters in the string.
x->string.fillp x->base_string.fillp
Actual number of characters in the string.
x->string.self x->base_string.self
Pointer to the characters (of type ecl_character and ecl_base_char, respectively).
bool ECL_EXTENDED_STRING_P (cl_object object)
bool ECL_BASE_STRING_P (cl_object object)
Verifies whether an object is an extended or a base string. If Unicode isn’t supported, then ECL_EXTENDED_STRING_P always returns 0.
Bit-vector operations are implemented in the file src/c/array.d. A bit-vector shares its structure with a vector; therefore, almost everything mentioned in the section on arrays remains valid here.
Stream implementation is a broad topic. Most of the implementation is done in the file src/c/file.d. Stream handling may have different implementations, referred to by the member pointer ops.
Additionally, on top of that we have implemented Gray Streams (in portable Common Lisp) in the file src/clos/streams.lsp, which may be somewhat slower (we need to benchmark it!). This implementation is in a separate package, GRAY. We may redefine functions in the COMMON-LISP package with the function redefine-cl-functions at run-time.
ecl_smmode mode
Stream mode (for example ecl_smm_string_input).
int closed
Whether the stream is closed or not.
ecl_file_ops *ops
Pointer to the structure containing operation implementations (dispatch table).
union file
Union of ANSI C streams (FILE *stream) and POSIX files interface (cl_fixnum descriptor).
cl_object object0, object1
Some objects (may be used for implementation-specific purposes).
cl_object byte_stack
Buffer for unread bytes.
cl_index column
File column.
cl_fixnum last_char
Last character read.
cl_fixnum last_code[2]
Actual composition of the last character.
cl_fixnum int0, int1
Some integers (may be used for implementation-specific purposes).
cl_index byte_size
Size of byte in binary streams.
cl_fixnum last_op
0: unknown, 1: reading, -1: writing
char *buffer
Buffer for FILE
cl_object format
External format
cl_eformat_encoder encoder
cl_eformat_decoder decoder
cl_object format_table
int flags
Character table, flags, etc.
ecl_character eof_character
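The role of the ops member can be illustrated with a small standalone model of a dispatch table. The types and functions below are invented for this sketch (they are not the contents of ecl_file_ops) and cover only a string-input variant with two operations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a stream with a dispatch table. */
struct model_stream;

typedef struct {
    int (*read_char)(struct model_stream *strm);  /* -1 at end of input */
    int (*close)(struct model_stream *strm);
} model_file_ops;

typedef struct model_stream {
    const model_file_ops *ops;   /* implementation in use */
    int closed;                  /* nonzero once closed */
    const char *data;            /* state of the string-input variant */
    size_t pos;
} model_stream;

static int string_input_read_char(model_stream *strm)
{
    if (strm->closed || strm->data[strm->pos] == '\0')
        return -1;
    return (unsigned char)strm->data[strm->pos++];
}

static int string_input_close(model_stream *strm)
{
    strm->closed = 1;
    return 0;
}

static const model_file_ops model_string_input_ops = {
    string_input_read_char,
    string_input_close
};

/* Generic entry point: callers never name the concrete implementation. */
static int model_read_char(model_stream *strm)
{
    assert(strm != NULL && strm->ops != NULL);
    return strm->ops->read_char(strm);
}
```

Adding another kind of stream in this model only requires a new table of functions; the generic entry point stays unchanged.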
bool ECL_ANSI_STREAM_P (cl_object o)
Predicate determining if o is a first-class stream object.
bool ECL_ANSI_STREAM_TYPE_P (cl_object o, ecl_smmode m)
Predicate determining if o is a first-class stream object of type m.
Structures and instances share the same datatype t_instance, with a few exceptions. Structure implementation details are in the file src/c/structure.d.
cl_object ECL_STRUCT_TYPE (cl_object x)
cl_object ECL_STRUCT_SLOTS (cl_object x)
cl_object ECL_STRUCT_LENGTH (cl_object x)
cl_object ECL_STRUCT_SLOT (cl_object x, cl_index i)
cl_object ECL_STRUCT_NAME (cl_object x)
Convenience functions for the structures.
cl_object ECL_CLASS_OF (cl_object x)
cl_object ECL_SPEC_FLAG (cl_object x)
cl_object ECL_SPEC_OBJECT (cl_object x)
cl_object ECL_CLASS_NAME (cl_object x)
cl_object ECL_CLASS_SUPERIORS (cl_object x)
cl_object ECL_CLASS_INFERIORS (cl_object x)
cl_object ECL_CLASS_SLOTS (cl_object x)
cl_object ECL_CLASS_CPL (cl_object x)
bool ECL_INSTANCEP (cl_object x)
Convenience functions for the instances.
A bytecodes object is a lisp object with a piece of code that can be interpreted. The objects of type t_bytecodes are implicitly constructed by a call to eval, but can also be explicitly constructed with the si_make_lambda function.
cl_object si_safe_eval (cl_narg narg, cl_object form, cl_object env, ...)
si_safe_eval evaluates form in the lexical environment env, which can be ECL_NIL. Before evaluating it, the expression form is bytecompiled. If the form signals an error, or tries to jump to an outer point, the function has two choices: by default, it will invoke a debugger, but if a third argument is supplied, then si_safe_eval will not use a debugger but rather return that value.
cl_object cl_eval (cl_object form)
cl_eval is the equivalent of si_safe_eval but without environment and with no err-value supplied. It exists only for compatibility with previous versions.
cl_object cl_safe_eval (cl_object form, cl_object env, cl_object err_value)
Equivalent of si_safe_eval.
cl_object form = ecl_read_from_cstring("(print 1)");
si_safe_eval(2, form, ECL_NIL);
si_safe_eval(3, form, ECL_NIL, ecl_make_fixnum(3)); /* on error function will return 3 */
cl_object si_make_lambda (cl_object name, cl_object def)
Builds an interpreted lisp function with name given by the symbol name and body given by def.
For instance, we would achieve the equivalent of
(funcall #'(lambda (x y) (block foo (+ x y))) 1 2)
with the following code
cl_object def = ecl_read_from_cstring("((x y) (+ x y))");
cl_object name = ecl_make_symbol("FOO", "COMMON-LISP-USER");
cl_object fun = si_make_lambda(name, def);
return cl_funcall(3, fun, ecl_make_fixnum(1), ecl_make_fixnum(2));
Notice that si_make_lambda performs a bytecodes compilation of the definition and thus it may signal some errors. Such errors are not handled by the routine itself, so you might consider using si_safe_eval instead.
Note that the env argument of si_safe_eval must be a lexical environment as used in the interpreter; see The lexical environment.