If you want to extend, fix or simply customize ECL for your own needs, you should understand how the implementation works.
cl_lispunion: Union containing all first-class ECL types.
In ECL a lisp object is represented by a type called
cl_object. This type is a word which is long enough to host both
an integer and a pointer. The least significant bits of this word, also
called the tag bits, determine whether it is a pointer to a C structure
representing a complex object, or whether it is immediate data, such
as a fixnum or a character.
Figure 4.1: Immediate types
Immediate values and bit fiddling are nicely described in Peter Bex’s
blog post on the internal data representation of Chicken Scheme. We
could borrow some ideas from it (like giving fixnums more bits and
providing more immediate values). All changes to code related to
immediate values should be carefully benchmarked.
The fixnums and characters are called immediate data types,
because they require no more than the cl_object datatype to store
all information. All other ECL objects are non-immediate and they are
represented by a pointer to a cell that is allocated on the heap. Each
cell consists of several words of memory and contains all the
information related to that object. By storing data in multiples of a
word size, we make sure that the least significant bits of a pointer are
zero, which distinguishes pointers from immediate data.
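The tagging idea described above can be sketched in plain C. This is a toy model only: the two-bit tag width, the tag values and every toy_ name are assumptions made for illustration and are not taken from the ECL headers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model: one machine word that holds either a tagged immediate
 * value or an aligned pointer. Tag width and tag values are assumed. */
typedef uintptr_t toy_object;

enum { TOY_TAG_BITS = 2, TOY_TAG_MASK = 3 };
enum { TOY_TAG_POINTER = 0, TOY_TAG_FIXNUM = 1, TOY_TAG_CHAR = 2 };

/* The low bits of the word select the representation. */
static int toy_tag(toy_object o) { return (int)(o & TOY_TAG_MASK); }

static bool toy_is_immediate(toy_object o) {
    return toy_tag(o) != TOY_TAG_POINTER;
}

/* Store the integer in the upper bits and the tag in the low bits. */
static toy_object toy_make_fixnum(intptr_t n) {
    return ((uintptr_t)n << TOY_TAG_BITS) | TOY_TAG_FIXNUM;
}

/* Clear the tag, then divide by 4 to undo the shift (sign is kept). */
static intptr_t toy_fixnum(toy_object o) {
    return (intptr_t)(o - TOY_TAG_FIXNUM) / (1 << TOY_TAG_BITS);
}

static toy_object toy_make_char(uint32_t code) {
    return ((uintptr_t)code << TOY_TAG_BITS) | TOY_TAG_CHAR;
}

static uint32_t toy_char_code(toy_object o) {
    return (uint32_t)(o >> TOY_TAG_BITS);
}

/* A pointer to word-aligned storage already has zero low bits. */
static toy_object toy_from_pointer(void *p) { return (uintptr_t)p; }
```

Because the value and the tag share one word, an immediate object needs no heap allocation, at the price of a few bits of range.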
In an immediate datatype, the tag bits determine the type of the object. In non-immediate datatypes, the first byte in the cell contains the secondary type indicator, which distinguishes between the different types of non-immediate data. The use of the remaining bytes differs for each type of object. For instance, a cons cell consists of three words:
+---------+----------+
|  CONS   |          |
+---------+----------+
|     car-pointer    |
+--------------------+
|     cdr-pointer    |
+--------------------+
Note that this is only one of the possible implementations of
cons. The second one (currently the default) carries the list type in
the tag bits of the pointer and consumes two words instead of three.
Such an implementation is more efficient in both memory and speed
(according to the comments in the source code):
/*
 * CONSES
 *
 * We implement two variants. The "small cons" type carries the type
 * information in the least significant bits of the pointer. We have
 * to do some pointer arithmetics to find out the CAR / CDR of the
 * cons but the overall result is faster and memory efficient, only
 * using two words per cons.
 *
 * The other scheme stores conses as three-words objects, the first
 * word carrying the type information. This is kept for backward
 * compatibility and also because the oldest garbage collector does
 * not yet support the smaller datatype.
 *
 * To make code portable and independent of the representation, only
 * access the objects using the common macros below (that is all
 * except ECL_CONS_PTR or ECL_PTR_CONS).
 */
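The "small cons" scheme from the comment can be illustrated with a toy model in plain C. It is a sketch under assumptions: the tag value 1, the two-bit mask and all toy_ names are hypothetical, and the real ECL macros are not reproduced here.

```c
#include <stdint.h>
#include <stdlib.h>

typedef uintptr_t toy_object;

/* Assumed tag layout for this toy model only. */
#define TOY_TAG_MASK ((uintptr_t)3)
#define TOY_TAG_LIST ((uintptr_t)1)

/* Two-word cell: there is no header word, the type lives in the pointer. */
struct toy_cons {
    toy_object car;
    toy_object cdr;
};

/* Allocate a cell and tag the pointer. malloc returns storage aligned
 * for any object, so the low bits of the address are free for the tag. */
static toy_object toy_cons_new(toy_object car, toy_object cdr) {
    struct toy_cons *cell = malloc(sizeof *cell);
    if (cell == NULL)
        abort();
    cell->car = car;
    cell->cdr = cdr;
    return (uintptr_t)cell | TOY_TAG_LIST;
}

static int toy_consp(toy_object o) {
    return (o & TOY_TAG_MASK) == TOY_TAG_LIST;
}

/* The "pointer arithmetic": subtract the tag to get the real address. */
static struct toy_cons *toy_cons_ptr(toy_object o) {
    return (struct toy_cons *)(o - TOY_TAG_LIST);
}

static toy_object toy_car(toy_object o) { return toy_cons_ptr(o)->car; }
static toy_object toy_cdr(toy_object o) { return toy_cons_ptr(o)->cdr; }
```

The type test only inspects the word itself, so checking for a cons never touches memory; only reading the car or cdr does.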
cl_object is the type of a lisp object. For your C/C++ program, a
cl_object can be either a fixnum, a character, or a pointer to a union
of structures (see cl_lispunion in the header object.h). The actual
interpretation of that object can be determined with the macro
ecl_t_of.
For example, if x is of type cl_object, and it is of type fixnum, we may retrieve its value:
if (ecl_t_of(x) == t_fixnum)
    /* cl_fixnum may be wider than int, so cast before printing */
    printf("Integer value: %lld\n", (long long)ecl_fixnum(x));
If x is of type cl_object and it does not contain an
immediate datatype, you may inspect the cell associated to the lisp
object using x as a pointer. For example:
if (ecl_t_of(x) == t_vector)
    /* dim is a cl_index (unsigned), so cast before printing */
    printf("Vector's dimension is: %lu\n", (unsigned long)x->vector.dim);
You should see the following sections and the header object.h to learn
how to use the different fields of a cl_object pointer.
cl_type: Enumeration type which distinguishes the different types of lisp objects. The most important values are:
t_cons, t_fixnum, t_character, t_bignum,
t_ratio, t_singlefloat, t_doublefloat,
t_complex, t_symbol, t_package, t_hashtable,
t_array, t_vector, t_string, t_bitvector,
t_stream, t_random, t_readtable, t_pathname,
t_bytecodes, t_cfun, t_cclosure, t_gfun,
t_instance, t_foreign and t_thread.
cl_type ecl_t_of (cl_object x)
If x is a valid lisp object, ecl_t_of(x) returns an integer
denoting the type of that lisp object. That integer is one of the values
of the enumeration type cl_type.
bool ECL_CHARACTERP (cl_object o)
bool ECL_BASE_CHAR_P (cl_object o)
bool ECL_BASE_CHAR_CODE_P (ecl_character o)
bool ECL_NUMBER_TYPE_P (cl_object o)
bool ECL_COMPLEXP (cl_object o)
bool ECL_REAL_TYPE_P (cl_object o)
bool ECL_FIXNUMP (cl_object o)
bool ECL_BIGNUMP (cl_object o)
bool ECL_SINGLE_FLOAT_P (cl_object o)
bool ECL_DOUBLE_FLOAT_P (cl_object o)
bool ECL_LONG_FLOAT_P (cl_object o)
bool ECL_CONSP (cl_object o)
bool ECL_LISTP (cl_object o)
bool ECL_ATOM (cl_object o)
bool ECL_SYMBOLP (cl_object o)
bool ECL_ARRAYP (cl_object o)
bool ECL_VECTORP (cl_object o)
bool ECL_BIT_VECTOR_P (cl_object o)
bool ECL_STRINGP (cl_object o)
bool ECL_HASH_TABLE_P (cl_object o)
bool ECL_RANDOM_STATE_P (cl_object o)
bool ECL_PACKAGEP (cl_object o)
bool ECL_PATHNAMEP (cl_object o)
bool ECL_READTABLEP (cl_object o)
bool ECL_FOREIGN_DATA_P (cl_object o)
bool ECL_SSE_PACK_P (cl_object o)
Different macros that check whether o belongs to the specified
type. These checks have been optimized, and are preferred over several
calls to ecl_t_of.
bool ECL_IMMEDIATE (cl_object o)
Tells whether o is an immediate datatype.
In each of the following sections we will document the standard interface for building objects of different types. For some objects, though, it is too difficult to make a C interface that resembles all of the functionality in the lisp environment. In those cases you need to build the object in some other way.
The first way makes use of a C or Lisp string to construct an object. The two functions you need to know are the following ones.
cl_object si_string_to_object (cl_narg narg, cl_object str, ...)
cl_object ecl_read_from_cstring (const char *s)
ecl_read_from_cstring builds a lisp object from a C string
which contains a suitable representation of a lisp
object. si_string_to_object performs the same task, but uses a
lisp string, and therefore it is less useful.
c_string_to_object – equivalent to ecl_read_from_cstring
Using a C string
cl_object array1 = ecl_read_from_cstring("#(1 2 3 4)");
Using a Lisp string
cl_object string = make_simple_base_string("#(1 2 3 4)");
cl_object array2 = si_string_to_object(1, string); /* first argument is the argument count */
Common-Lisp distinguishes two types of integers: bignums and
fixnums. A fixnum is a small integer, which ideally occupies only
a word of memory and which is between the values
MOST-NEGATIVE-FIXNUM and MOST-POSITIVE-FIXNUM. A
bignum is any integer which is not a fixnum and it is only
constrained by the amount of memory available to represent it.
In ECL a fixnum is an integer that, together with the tag bits,
fits in a word of memory. The size of a word, and thus the size of a
fixnum, varies from one architecture to another, and you should
refer to the types and constants in the ecl.h header to make sure that
your C extensions are portable. All other integers are stored as
bignums. They are not immediate objects, they take up a variable
amount of memory, and the GNU Multiprecision Library is required to
create, manipulate and calculate with them.
cl_fixnum is a C signed integer type capable of holding a whole fixnum
without any loss of precision. The opposite is not true, and you may
create a cl_fixnum which exceeds the limits of a fixnum and
should be stored as a bignum.
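The relation between word size, tag bits and the fixnum range can be sketched as follows. This is a plain C model, not ECL code: the helper names are hypothetical, and the word and tag widths are parameters rather than values read from ecl.h.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: a word of word_bits bits gives up tag_bits bits to the
 * tag, and the remaining bits hold a two's complement integer. */
static int64_t toy_most_positive_fixnum(int word_bits, int tag_bits) {
    int value_bits = word_bits - tag_bits; /* sign bit included */
    return (int64_t)((UINT64_C(1) << (value_bits - 1)) - 1);
}

static int64_t toy_most_negative_fixnum(int word_bits, int tag_bits) {
    return -toy_most_positive_fixnum(word_bits, tag_bits) - 1;
}

/* A C integer outside this range would have to become a bignum. */
static bool toy_fits_in_fixnum(int64_t n, int word_bits, int tag_bits) {
    return n >= toy_most_negative_fixnum(word_bits, tag_bits) &&
           n <= toy_most_positive_fixnum(word_bits, tag_bits);
}
```

With a 32-bit word and two tag bits, for example, the model leaves 30 bits for the value, so the C type can hold integers that do not fit in a fixnum.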
cl_index is a C unsigned integer type capable of holding a non-negative
fixnum without loss of precision. Typically, a cl_index is
used as an index into an array, or into a proper list, etc.
These constants mark the limits of a fixnum.
bool ecl_fixnum_lower (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_greater (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_leq (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_geq (cl_fixnum a, cl_fixnum b)
bool ecl_fixnum_plusp (cl_fixnum a)
bool ecl_fixnum_minusp (cl_fixnum a)
Operations on fixnums (comparison and predicates).
cl_object ecl_make_fixnum (cl_fixnum n)
cl_fixnum ecl_fixnum (cl_object o)
ecl_make_fixnum converts from an integer to a lisp object, while
ecl_fixnum does the opposite (converts a lisp fixnum object to an
integer). These functions do not check their arguments.
MAKE_FIXNUM – equivalent to ecl_make_fixnum
fix – equivalent to ecl_fixnum
cl_fixnum fixint (cl_object o)
cl_index fixnint (cl_object o)
Safe conversion of a lisp fixnum to a C integer of the
appropriate size. Signals an error if o is not of fixnum type.
fixnint additionally ensures that o is not negative.
ECL has two types of characters: one fits in the C type char, while
the other is used when ECL is built with the configure option
--enable-unicode, which defaults to 32 (characters are stored in a
32-bit variable and code points have 21 bits).
Immediate type t_character. If ECL is built with Unicode support,
a character may be either a base or an extended character, and the two
can be distinguished with the predicate ECL_BASE_CHAR_P.
Additionally we have ecl_base_char for base strings, which is
equivalent to the ordinary char.
if (ECL_CHARACTERP(o) && ECL_BASE_CHAR_P(o))
printf("Base character: %c\n", ECL_CHAR_CODE(o));
Each character is assigned an integer code which ranges from 0 to (ECL_CHAR_CODE_LIMIT-1).
CHAR_CODE_LIMIT – equivalent to ECL_CHAR_CODE_LIMIT
cl_object ECL_CODE_CHAR (ecl_character o)
ecl_character ECL_CHAR_CODE (cl_object o)
ecl_character ecl_char_code (cl_object o)
ecl_base_char ecl_base_char_code (cl_object o)
ECL_CHAR_CODE, ecl_char_code and
ecl_base_char_code return the integer code associated to a
lisp character. ecl_char_code and ecl_base_char_code
perform a safe conversion, while ECL_CHAR_CODE doesn’t check
its argument.
ECL_CODE_CHAR returns the lisp character associated to an integer
code. It does not check its arguments.
CHAR_CODE – equivalent to ECL_CHAR_CODE
CODE_CHAR – equivalent to ECL_CODE_CHAR
bool ecl_char_eq (cl_object x, cl_object y)
bool ecl_char_equal (cl_object x, cl_object y)
Compare two characters for equality. ecl_char_eq takes case into account and ecl_char_equal ignores it.
int ecl_char_cmp (cl_object x, cl_object y)
int ecl_char_compare (cl_object x, cl_object y)
Compare the relative order of two characters. ecl_char_cmp
is case-sensitive, while ecl_char_compare converts all
characters to uppercase before comparing them.
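The difference between the two orderings can be shown with a small stand-alone sketch. It works on plain C character codes rather than on cl_object values, the toy_ names are hypothetical, and the -1/0/1 return convention is an assumption of this sketch, not a statement about ECL's return values.

```c
#include <ctype.h>

/* Case-sensitive ordering of two character codes: -1, 0 or 1. */
static int toy_char_cmp(int x, int y) {
    return (x > y) - (x < y);
}

/* Case-insensitive ordering: fold both codes to uppercase first.
 * The codes must be representable as unsigned char for toupper. */
static int toy_char_compare(int x, int y) {
    return toy_char_cmp(toupper(x), toupper(y));
}
```

Folding both arguments before comparing means that 'a' and 'A' compare as equal in the second function but not in the first.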
An array is an aggregate of data of a common type, which can be accessed with one or more non-negative indices. ECL stores arrays as a C structure with a pointer to the region of memory which contains the actual data. The cell of an array datatype varies depending on whether it is a vector, a bit-vector, a multidimensional array or a string.
bool ECL_ADJUSTABLE_ARRAY_P (cl_object x)
bool ECL_ARRAY_HAS_FILL_POINTER_P (cl_object x)
All arrays (arrays, strings and bit-vectors) may be tested for being adjustable and for having a fill pointer with these two macros. They don’t check the type of their arguments.
If x contains a vector, you can access the following fields:
x->vector.elttype: The type of the elements of the vector.
x->vector.displaced: List storing the vectors that x is displaced from and that x displaces to.
x->vector.dim: The maximum number of elements.
x->vector.fillp: Actual number of elements in the vector or fill pointer.
x->vector.self: Union of pointers of different types. You should choose the right pointer depending on x->vector.elttype.
If x contains a multidimensional array, you can access the
following fields:
x->array.elttype: The type of the elements of the array.
x->array.rank: The number of array dimensions.
x->array.displaced: List storing the arrays that x is displaced from and that x displaces to.
x->array.dim: The maximum number of elements.
x->array.dims[]: Array with the dimensions of the array. The elements range from x->array.dims[0] to x->array.dims[x->array.rank-1].
x->array.fillp: Actual number of elements in the array or fill pointer.
x->array.self: Union of pointers of different types. You should choose the right pointer depending on x->array.elttype.
Each array has a specialized type, which is the type of the elements
of the array. ECL supports only a few specialized array types, and
for each of these types there is a C integer which is the corresponding
value of x->array.elttype or x->vector.elttype. The C constants
that denote some of those types are:
ecl_aet_object
ecl_aet_sf
ecl_aet_df
ecl_aet_lf
ecl_aet_csf
ecl_aet_cdf
ecl_aet_clf
ecl_aet_bit
ecl_aet_fix
ecl_aet_index
ecl_aet_ch
ecl_aet_bc
cl_elttype ecl_array_elttype (cl_object array)
Returns the element type of array, which can be a string, a
bit-vector, vector, or a multidimensional array.
For example, the code
ecl_array_elttype(ecl_read_from_cstring("\"AAA\"")); /* returns ecl_aet_ch */
ecl_array_elttype(ecl_read_from_cstring("#(A B C)")); /* returns ecl_aet_object */
cl_object ecl_aref (cl_object x, cl_index index)
cl_object ecl_aset (cl_object x, cl_index index, cl_object value)
These functions are used to retrieve and set the elements of an array. The elements are accessed with one index, index, as in the lisp function ROW-MAJOR-AREF.
cl_object array = ecl_read_from_cstring("#2A((1 2) (3 4))");
cl_object x = ecl_aref(array, 3);
cl_print(1, x); /* Outputs 4 */
ecl_aset(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #2A((1 2) (3 5)) */
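To see why index 3 addresses the element at subscripts (1 1) in the example above, the row-major index can be computed by hand. The helper below is a stand-alone sketch with a hypothetical name and is not part of the ECL API.

```c
#include <stddef.h>

/* Row-major index of the element addressed by subscripts in an array
 * with the given dimensions: the last subscript varies fastest. */
static size_t row_major_index(const size_t *dims, const size_t *subscripts,
                              size_t rank) {
    size_t index = 0;
    for (size_t i = 0; i < rank; i++) {
        /* Shift what has been accumulated by the size of this
         * dimension, then add the subscript along it. */
        index = index * dims[i] + subscripts[i];
    }
    return index;
}
```

For a 2x2 array the subscripts (1 1) give 1 * 2 + 1 = 3, which is the single index used with ecl_aref and ecl_aset in the example.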
cl_object ecl_aref1 (cl_object x, cl_index index)
cl_object ecl_aset1 (cl_object x, cl_index index, cl_object value)
These functions are similar to ecl_aref and ecl_aset, but they operate on vectors.
cl_object array = ecl_read_from_cstring("#(1 2 3 4)");
cl_object x = ecl_aref1(array, 3);
cl_print(1, x); /* Outputs 4 */
ecl_aset1(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #(1 2 3 5) */
A string, both in Common-Lisp and in ECL, is nothing but a vector of characters. Therefore, almost everything mentioned in the section on arrays remains valid here.
The only important difference is that ECL stores the base-strings (non-Unicode version of a string) as a lisp object with a pointer to a zero terminated C string. Thus, if a string has n characters, ECL will reserve n+1 bytes for the base-string. This allows us to pass the base-string self pointer to any C routine.
If x is a lisp object of type string or base-string, we can
access the following fields:
x->string.dim, x->base_string.dim: Maximum number of characters in the string.
x->string.fillp, x->base_string.fillp: Actual number of characters in the string.
x->string.self, x->base_string.self: Pointer to the characters (of type ecl_character and ecl_base_char, respectively).
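The zero-terminated layout of a base-string can be modelled in plain C. The structure below is a toy stand-in that only borrows the field names dim, fillp and self; it is not ECL's actual structure, and toy_make_base_string is a hypothetical helper.

```c
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a base-string cell. */
struct toy_base_string {
    size_t dim;   /* capacity in characters */
    size_t fillp; /* characters currently in use */
    char *self;   /* dim + 1 bytes, always zero terminated */
};

static struct toy_base_string toy_make_base_string(const char *text) {
    struct toy_base_string s;
    s.dim = strlen(text);
    s.fillp = s.dim;
    s.self = malloc(s.dim + 1); /* n characters plus the terminator */
    if (s.self == NULL)
        abort();
    memcpy(s.self, text, s.dim + 1); /* copies the trailing '\0' too */
    return s;
}
```

Since the extra byte is always present, self can be handed directly to C routines such as strlen or puts that expect a zero-terminated string.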
bool ECL_EXTENDED_STRING_P (cl_object object)
bool ECL_BASE_STRING_P (cl_object object)
Verify whether an object is an extended or a base string. If Unicode isn’t
supported, then ECL_EXTENDED_STRING_P always returns 0.
Bit-vector operations are implemented in the file
src/c/array.d. A bit-vector shares its structure with a vector,
therefore almost everything mentioned in the section on arrays remains
valid here.
Stream implementation is a broad topic. Most of the implementation is
done in the file src/c/file.d. Different kinds of streams may have
different implementations, which are selected through the member pointer ops.
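The dispatch-table idea behind the ops pointer can be sketched with a toy stream in plain C. Only the general pattern is taken from the text: the structure layout, the single read_char operation and all toy_ names are assumptions, and ECL's real ecl_file_ops has a different set of operations.

```c
#include <stddef.h>
#include <string.h>

struct toy_stream;

/* Dispatch table: one function pointer per operation. */
struct toy_stream_ops {
    int (*read_char)(struct toy_stream *s); /* returns -1 at end of input */
};

struct toy_stream {
    const struct toy_stream_ops *ops; /* selects the implementation */
    const char *data;
    size_t pos, len;
};

/* Implementation for streams that read from a C string. */
static int string_read_char(struct toy_stream *s) {
    if (s->pos >= s->len)
        return -1;
    return (unsigned char)s->data[s->pos++];
}

/* Implementation for a stream that never yields any input. */
static int empty_read_char(struct toy_stream *s) {
    (void)s;
    return -1;
}

static const struct toy_stream_ops string_input_ops = { string_read_char };
static const struct toy_stream_ops empty_ops = { empty_read_char };

/* Generic entry point: callers never need to know the stream kind. */
static int toy_read_char(struct toy_stream *s) {
    return s->ops->read_char(s);
}

static struct toy_stream toy_make_string_input(const char *text) {
    struct toy_stream s = { &string_input_ops, text, 0, strlen(text) };
    return s;
}
```

Adding a new kind of stream in this model only requires a new table of functions; the generic entry point stays unchanged.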
Additionally on top of that we have implemented Gray Streams (in
portable Common Lisp) in file src/clos/streams.lsp, which may be
somewhat slower (we need to benchmark it!). This implementation is in a
separate package GRAY. We may redefine functions in the
COMMON-LISP package with a function redefine-cl-functions
at run-time.
ecl_smmode mode: Stream mode (for example ecl_smm_string_input).
int closed: Whether the stream is closed or not.
ecl_file_ops *ops: Pointer to the structure containing operation implementations (dispatch table).
union file: Union of ANSI C streams (FILE *stream) and POSIX files interface (cl_fixnum descriptor).
cl_object object0, object1: Some objects (may be used for implementation-specific purposes).
cl_object byte_stack: Buffer for unread bytes.
cl_index column: File column.
cl_fixnum last_char: Last character read.
cl_fixnum last_code[2]: Actual composition of the last character.
cl_fixnum int0 int1: Some integers (may be used for implementation-specific purposes).
cl_index byte_size: Size of byte in binary streams.
cl_fixnum last_op: 0: unknown, 1: reading, -1: writing
char *buffer: Buffer for FILE
cl_object format: External format
cl_eformat_encoder encoder
cl_eformat_encoder decoder
cl_object format_table
int flags: Character table, flags, etc
ecl_character eof_character
bool ECL_ANSI_STREAM_P (cl_object o)
Predicate determining if o is a first-class stream object.
bool ECL_ANSI_STREAM_TYPE_P (cl_object o, ecl_smmode m)
Predicate determining if o is a first-class stream object of
type m.
Structures and instances share the same datatype t_instance (with a
few exceptions). Structure implementation details are in the file
src/c/structure.d.
cl_object ECL_STRUCT_TYPE (cl_object x)
cl_object ECL_STRUCT_SLOTS (cl_object x)
cl_object ECL_STRUCT_LENGTH (cl_object x)
cl_object ECL_STRUCT_SLOT (cl_object x, cl_index i)
cl_object ECL_STRUCT_NAME (cl_object x)
Convenience functions for the structures.
cl_object ECL_CLASS_OF (cl_object x)
cl_object ECL_SPEC_FLAG (cl_object x)
cl_object ECL_SPEC_OBJECT (cl_object x)
cl_object ECL_CLASS_NAME (cl_object x)
cl_object ECL_CLASS_SUPERIORS (cl_object x)
cl_object ECL_CLASS_INFERIORS (cl_object x)
cl_object ECL_CLASS_SLOTS (cl_object x)
cl_object ECL_CLASS_CPL (cl_object x)
bool ECL_INSTANCEP (cl_object x)
Convenience functions for the instances.
A bytecodes object is a lisp object with a piece of code that can be
interpreted. The objects of type t_bytecodes are implicitly
constructed by a call to eval, but can also be explicitly constructed
with the si_make_lambda function.
cl_object si_safe_eval (cl_narg narg, cl_object form, cl_object env, ...)
si_safe_eval evaluates form in the lexical
environment env, which can be ECL_NIL. Before
evaluating it, the expression form is bytecompiled. If the form
signals an error, or tries to jump to an outer point, the function has
two choices: by default, it will invoke a debugger, but if a third
argument is supplied, then si_safe_eval will not use a debugger
but rather return that value.
cl_object cl_eval (cl_object form) -
cl_eval is the equivalent of si_safe_eval but without
environment and with no err-value supplied. It exists only for
compatibility with previous versions.
cl_object cl_safe_eval (cl_object form, cl_object env, cl_object err_value) -
Equivalent of si_safe_eval.
cl_object form = ecl_read_from_cstring("(print 1)");
si_safe_eval(2, form, ECL_NIL);
si_safe_eval(3, form, ECL_NIL, ecl_make_fixnum(3)); /* on error function will return 3 */
cl_object si_make_lambda (cl_object name, cl_object def)
Builds an interpreted lisp function with name given by the symbol name
and body given by def.
For instance, we would achieve the equivalent of
(funcall #'(lambda (x y)
             (block foo (+ x y)))
         1 2)
with the following code
cl_object def = ecl_read_from_cstring("((x y) (+ x y))");
cl_object name = ecl_make_symbol("FOO", "COMMON-LISP-USER");
cl_object fun = si_make_lambda(name, def);
return cl_funcall(3, fun, ecl_make_fixnum(1), ecl_make_fixnum(2));
Notice that si_make_lambda performs a bytecodes compilation of
the definition and thus it may signal some errors. Such errors are not
handled by the routine itself so you might consider using
si_safe_eval instead.
Note that env must be a lexical environment as used in the
interpreter; see The lexical environment.