4.4 Manipulating Lisp objects

If you want to extend, fix or simply customize ECL for your own needs, you should understand how the implementation works.

C/C++ identifier: cl_lispunion cons big ratio SF DF longfloat gencomplex csfloat cdfloat clfloat symbol pack hash array vector base_string string stream random readtable pathname bytecodes bclosure cfun cfunfixed cclosure d instance process queue lock rwlock condition_variable semaphore barrier mailbox cblock foreign frame weak sse

Union containing all first-class ECL types.


4.4.1 Objects representation

In ECL a lisp object is represented by a type called cl_object. This type is a word which is long enough to host both an integer and a pointer. The least significant bits of this word, also called the tag bits, determine whether it is a pointer to a C structure representing a complex object, or whether it is immediate data, such as a fixnum or a character.

Figure 4.1: Immediate types

The topic of the immediate values and bit fiddling is nicely described in Peter Bex’s blog describing Chicken Scheme internal data representation. We could borrow some ideas from it (like improving fixnum bitness and providing more immediate values). All changes to code related to immediate values should be carefully benchmarked.

The fixnums and characters are called immediate data types, because they require no more than the cl_object datatype to store all information. All other ECL objects are non-immediate and they are represented by a pointer to a cell that is allocated on the heap. Each cell consists of several words of memory and contains all the information related to that object. By storing data in multiples of a word size, we make sure that the least significant bits of a pointer are zero, which distinguishes pointers from immediate data.
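The tagging scheme described above can be pictured with a small stand-alone sketch. It does not use ECL's headers: the names obj_t, TAG_BITS, make_fixnum and so on are hypothetical, and the tag values are assumptions chosen for illustration (the real definitions live in object.h and may differ).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for cl_object: one word that holds either an
 * aligned pointer or an immediate value plus its tag bits. */
typedef uintptr_t obj_t;

/* Assumed tag width and tag values, for illustration only. */
enum { TAG_BITS = 2, TAG_MASK = 3 };
enum { TAG_POINTER = 0, TAG_CHARACTER = 2, TAG_FIXNUM = 3 };

/* Pack a small signed integer next to the fixnum tag. */
static obj_t make_fixnum(intptr_t n)
{
    return ((uintptr_t)n << TAG_BITS) | TAG_FIXNUM;
}

/* Strip the tag, then divide the signed value to restore the integer. */
static intptr_t fixnum_value(obj_t o)
{
    assert((o & TAG_MASK) == TAG_FIXNUM);
    return (intptr_t)(o - TAG_FIXNUM) / (1 << TAG_BITS);
}

/* Pack a character code next to the character tag. */
static obj_t make_character(uint32_t code)
{
    return ((uintptr_t)code << TAG_BITS) | TAG_CHARACTER;
}

static uint32_t character_code(obj_t o)
{
    assert((o & TAG_MASK) == TAG_CHARACTER);
    return (uint32_t)(o >> TAG_BITS);
}

/* A word-aligned pointer has zero tag bits; anything else is immediate
 * in this simplified model. */
static int is_immediate(obj_t o)
{
    return (o & TAG_MASK) != TAG_POINTER;
}
```

In this model no memory is allocated for a fixnum or a character: the whole value travels inside the word, which is why such objects are called immediate.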

In an immediate datatype, the tag bits determine the type of the object. In non-immediate datatypes, the first byte in the cell contains the secondary type indicator, and distinguishes between different types of non-immediate data. The use of the remaining bytes differs for each type of object. For instance, a cons cell consists of three words:

+---------+----------+
| CONS    |          |
+---------+----------+
|     car-pointer    |
+--------------------+
|     cdr-pointer    |
+--------------------+

Note that this is only one of the possible implementations of cons. The second one (currently the default) stores the type information in the tag bits of the pointer and consumes two words instead of three. Such an implementation is more memory and speed efficient (according to the comments in the source code):

/*
 * CONSES
 *
 * We implement two variants. The "small cons" type carries the type
 * information in the least significant bits of the pointer. We have
 * to do some pointer arithmetics to find out the CAR / CDR of the
 * cons but the overall result is faster and memory efficient, only
 * using two words per cons.
 *
 * The other scheme stores conses as three-words objects, the first
 * word carrying the type information. This is kept for backward
 * compatibility and also because the oldest garbage collector does
 * not yet support the smaller datatype.
 *
 * To make code portable and independent of the representation, only
 * access the objects using the common macros below (that is all
 * except ECL_CONS_PTR or ECL_PTR_CONS).
 */
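A minimal sketch of the two-word variant described in the comment above, written without ECL's headers. The type small_cons, the helper names and the tag value TAG_LIST are hypothetical; only the idea (type information in the low bits of the pointer, pointer arithmetic to reach CAR and CDR) comes from the comment.

```c
#include <assert.h>
#include <stdint.h>

typedef uintptr_t obj_t;                 /* stands in for cl_object */

/* Assumed tag layout, for illustration only. */
enum { TAG_MASK = 3, TAG_LIST = 1 };

/* Two-word cell: there is no header word, the type lives in the tag
 * bits of the pointer that refers to the cell. */
struct small_cons {
    obj_t car;
    obj_t cdr;
};

/* Turn the address of an aligned cell into a tagged reference. */
static obj_t tag_cons(struct small_cons *cell)
{
    assert(((uintptr_t)cell & TAG_MASK) == 0);   /* cell must be aligned */
    return (uintptr_t)cell + TAG_LIST;
}

static int has_list_tag(obj_t o)
{
    return (o & TAG_MASK) == TAG_LIST;
}

/* The pointer arithmetic: remove the tag before dereferencing. */
static struct small_cons *untag_cons(obj_t o)
{
    assert(has_list_tag(o));
    return (struct small_cons *)(o - TAG_LIST);
}

static obj_t cons_car(obj_t o) { return untag_cons(o)->car; }
static obj_t cons_cdr(obj_t o) { return untag_cons(o)->cdr; }
```

Code written against accessor functions such as cons_car and cons_cdr keeps working if the representation changes, which is the reason the comment recommends the common macros over direct pointer manipulation.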
C/C++ identifier: cl_object

This is the type of a lisp object. For your C/C++ program, a cl_object can be either a fixnum, a character, or a pointer to a union of structures (see cl_lispunion in the header object.h). The actual type of that object can be determined with the macro ecl_t_of.

Example

For example, if x is of type cl_object, and it is of type fixnum, we may retrieve its value:

if (ecl_t_of(x) == t_fixnum)
    /* cl_fixnum may be wider than int, so cast before printing */
    printf("Integer value: %lld\n", (long long)ecl_fixnum(x));

Example

If x is of type cl_object and it does not contain an immediate datatype, you may inspect the cell associated to the lisp object using x as a pointer. For example:

if (ecl_t_of(x) == t_vector)
    /* dim is a cl_index (unsigned), so cast before printing */
    printf("Vector's dimension is: %lu\n", (unsigned long)x->vector.dim);

You should see the following sections and the header object.h to learn how to use the different fields of a cl_object pointer.

C/C++ identifier: cl_type

Enumeration type which distinguishes the different types of lisp objects. The most important values are:

t_cons, t_fixnum, t_character, t_bignum, t_ratio, t_singlefloat, t_doublefloat, t_complex, t_symbol, t_package, t_hashtable, t_array, t_vector, t_string, t_bitvector, t_stream, t_random, t_readtable, t_pathname, t_bytecodes, t_cfun, t_cclosure, t_gfun, t_instance, t_foreign and t_thread.

Function: cl_type ecl_t_of (cl_object x)

If x is a valid lisp object, ecl_t_of(x) returns an integer denoting the type of that lisp object. That integer is one of the values of the enumeration type cl_type.

Function: bool ECL_CHARACTERP (cl_object o)
Function: bool ECL_BASE_CHAR_P (cl_object o)
Function: bool ECL_BASE_CHAR_CODE_P (ecl_character o)
Function: bool ECL_NUMBER_TYPE_P (cl_object o)
Function: bool ECL_COMPLEXP (cl_object o)
Function: bool ECL_REAL_TYPE_P (cl_object o)
Function: bool ECL_FIXNUMP (cl_object o)
Function: bool ECL_BIGNUMP (cl_object o)
Function: bool ECL_SINGLE_FLOAT_P (cl_object o)
Function: bool ECL_DOUBLE_FLOAT_P (cl_object o)
Function: bool ECL_LONG_FLOAT_P (cl_object o)
Function: bool ECL_CONSP (cl_object o)
Function: bool ECL_LISTP (cl_object o)
Function: bool ECL_ATOM (cl_object o)
Function: bool ECL_SYMBOLP (cl_object o)
Function: bool ECL_ARRAYP (cl_object o)
Function: bool ECL_VECTORP (cl_object o)
Function: bool ECL_BIT_VECTOR_P (cl_object o)
Function: bool ECL_STRINGP (cl_object o)
Function: bool ECL_HASH_TABLE_P (cl_object o)
Function: bool ECL_RANDOM_STATE_P (cl_object o)
Function: bool ECL_PACKAGEP (cl_object o)
Function: bool ECL_PATHNAMEP (cl_object o)
Function: bool ECL_READTABLEP (cl_object o)
Function: bool ECL_FOREIGN_DATA_P (cl_object o)
Function: bool ECL_SSE_PACK_P (cl_object o)

Different macros that check whether o belongs to the specified type. These checks have been optimized, and are preferred over several calls to ecl_t_of.

Function: bool ECL_IMMEDIATE (cl_object o)

Tells whether o is an immediate datatype.


4.4.2 Constructing objects

In each of the following sections we will document the standard interface for building objects of different types. For some objects, though, it is too difficult to make a C interface that mirrors all of the functionality available in the lisp environment. In those cases you need to

  1. build the objects from their textual representation, or
  2. use the evaluator to build these objects.

The first way makes use of a C or Lisp string to construct an object. The two functions you need to know are the following ones.

Function: si::string-to-object string &optional (err-value nil)
Function: cl_object si_string_to_object (cl_narg narg, cl_object str, ...)
Function: cl_object ecl_read_from_cstring (const char *s)

ecl_read_from_cstring builds a lisp object from a C string which contains a suitable representation of a lisp object. si_string_to_object performs the same task, but uses a lisp string, and therefore it is less useful.

Example

Using a C string

cl_object array1 = ecl_read_from_cstring("#(1 2 3 4)");

Using a Lisp string

cl_object string = make_simple_base_string("#(1 2 3 4)");
/* the first argument is narg, the number of lisp arguments that follow */
cl_object array2 = si_string_to_object(1, string);

Integers

Common-Lisp distinguishes two types of integers: bignums and fixnums. A fixnum is a small integer, which ideally occupies only a word of memory and which is between the values MOST-NEGATIVE-FIXNUM and MOST-POSITIVE-FIXNUM. A bignum is any integer which is not a fixnum and it is only constrained by the amount of memory available to represent it.

In ECL a fixnum is an integer that, together with the tag bits, fits in a word of memory. The size of a word, and thus the size of a fixnum, varies from one architecture to another, and you should refer to the types and constants in the ecl.h header to make sure that your C extensions are portable. All other integers are stored as bignums: they are not immediate objects, they take up a variable amount of memory, and the GNU Multiprecision Library is required to create, manipulate and calculate with them.

C/C++ identifier: cl_fixnum

This is a C signed integer type capable of holding a whole fixnum without any loss of precision. The opposite is not true, and you may create a cl_fixnum which exceeds the limits of a fixnum and should be stored as a bignum.

C/C++ identifier: cl_index

This is a C unsigned integer type capable of holding a non-negative fixnum without loss of precision. Typically, a cl_index is used as an index into an array, or into a proper list, etc.

Constant: MOST_NEGATIVE_FIXNUM
Constant: MOST_POSITIVE_FIXNUM

These constants mark the limits of a fixnum.
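The relation between a word-sized C integer and the fixnum range can be sketched as follows. The block is self-contained and does not use ecl.h; the macro and function names are hypothetical, and the tag width of two bits and the tag value 3 are assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

enum { TAG_BITS = 2 };   /* assumed number of tag bits */

/* With TAG_BITS bits reserved for the tag, a word-sized signed integer
 * keeps (word size - TAG_BITS) bits for the value itself. */
#define SKETCH_MOST_POSITIVE_FIXNUM (INTPTR_MAX >> TAG_BITS)
#define SKETCH_MOST_NEGATIVE_FIXNUM (-SKETCH_MOST_POSITIVE_FIXNUM - 1)

/* A C integer of full word width may fall outside the fixnum range;
 * such a value would have to be stored as a bignum. */
static int fits_in_fixnum(intptr_t n)
{
    return n >= SKETCH_MOST_NEGATIVE_FIXNUM &&
           n <= SKETCH_MOST_POSITIVE_FIXNUM;
}

/* Tag n as a fixnum, refusing values that do not fit (3 is the assumed
 * fixnum tag). */
static uintptr_t tag_fixnum_checked(intptr_t n)
{
    assert(fits_in_fixnum(n));
    return ((uintptr_t)n << TAG_BITS) | 3u;
}
```

This is the check that an unchecked conversion skips: passing INTPTR_MAX to a plain shift-and-tag routine would silently discard the top bits.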

Function: bool ecl_fixnum_lower (cl_fixnum a, cl_fixnum b)
Function: bool ecl_fixnum_greater (cl_fixnum a, cl_fixnum b)
Function: bool ecl_fixnum_leq (cl_fixnum a, cl_fixnum b)
Function: bool ecl_fixnum_geq (cl_fixnum a, cl_fixnum b)
Function: bool ecl_fixnum_plusp (cl_fixnum a)
Function: bool ecl_fixnum_minusp (cl_fixnum a)

Operations on fixnums (comparison and predicates).

Function: cl_object ecl_make_fixnum (cl_fixnum n)
Function: cl_fixnum ecl_fixnum (cl_object o)

ecl_make_fixnum converts a C integer to a lisp object, while ecl_fixnum does the opposite (converts a lisp fixnum to a C integer). These functions do not check their arguments.

  • DEPRECATED MAKE_FIXNUM – equivalent to ecl_make_fixnum
  • DEPRECATED fix – equivalent to ecl_fixnum

Function: cl_fixnum fixint (cl_object o)
Function: cl_index fixnint (cl_object o)

Safe conversion of a lisp fixnum to a C integer of the appropriate size. Signals an error if o is not of fixnum type.

fixnint additionally ensures that o is not negative.

Characters

ECL has two types of characters: one fits in the C type char, while the other is used when ECL is built with the configure option --enable-unicode, which defaults to 32 (characters are stored in a 32-bit variable and code points have 21 bits).

C/C++ identifier: ecl_character

Immediate type t_character. If ECL is built with Unicode support, a character may be either a base or an extended character; the two may be distinguished with the predicate ECL_BASE_CHAR_P.

Additionally we have ecl_base_char for base strings, which is equivalent to the ordinary char.

Example

if (ECL_CHARACTERP(o) && ECL_BASE_CHAR_P(o))
    printf("Base character: %c\n", ECL_CHAR_CODE(o));

Constant: ECL_CHAR_CODE_LIMIT

Each character is assigned an integer code which ranges from 0 to (ECL_CHAR_CODE_LIMIT-1).

  • DEPRECATED CHAR_CODE_LIMIT – equivalent to ECL_CHAR_CODE_LIMIT

Function: cl_object ECL_CODE_CHAR (ecl_character o)
Function: ecl_character ECL_CHAR_CODE (cl_object o)
Function: ecl_character ecl_char_code (cl_object o)
Function: ecl_base_char ecl_base_char_code (cl_object o)

ECL_CHAR_CODE, ecl_char_code and ecl_base_char_code return the integer code associated to a lisp character. ecl_char_code and ecl_base_char_code perform a safe conversion, while ECL_CHAR_CODE doesn’t check its argument.

ECL_CODE_CHAR returns the lisp character associated to an integer code. It does not check its arguments.

  • DEPRECATED CHAR_CODE – equivalent to ECL_CHAR_CODE
  • DEPRECATED CODE_CHAR – equivalent to ECL_CODE_CHAR

Function: bool ecl_char_eq (cl_object x, cl_object y)
Function: bool ecl_char_equal (cl_object x, cl_object y)

Compare two characters for equality. ecl_char_eq takes case into account and ecl_char_equal ignores it.

Function: int ecl_char_cmp (cl_object x, cl_object y)
Function: int ecl_char_compare (cl_object x, cl_object y)

Compare the relative order of two characters. ecl_char_cmp is case-sensitive, while ecl_char_compare converts both characters to uppercase before comparing them.
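The difference between the two orderings can be illustrated on plain character codes. This is a stand-alone sketch restricted to ASCII: the function names are hypothetical, and the convention of returning a negative, zero or positive value is an assumption borrowed from strcmp, not a statement about ECL's return values.

```c
#include <assert.h>
#include <ctype.h>

/* Case-sensitive ordering of two character codes:
 * negative, zero or positive, in the style of strcmp. */
static int char_cmp_sketch(int x, int y)
{
    return (x > y) - (x < y);
}

/* Case-insensitive ordering: fold both codes to uppercase first. */
static int char_compare_sketch(int x, int y)
{
    /* this sketch only handles ASCII codes */
    assert(x >= 0 && x <= 127 && y >= 0 && y <= 127);
    return char_cmp_sketch(toupper(x), toupper(y));
}
```

With these definitions 'a' sorts after 'B' in the case-sensitive ordering (its code is larger) but before it once both are folded to uppercase.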

Arrays

An array is an aggregate of data of a common type, which can be accessed with one or more non-negative indices. ECL stores arrays as a C structure with a pointer to the region of memory which contains the actual data. The cell of an array datatype varies depending on whether it is a vector, a bit-vector, a multidimensional array or a string.

Function: bool ECL_ADJUSTABLE_ARRAY_P (cl_object x)
Function: bool ECL_ARRAY_HAS_FILL_POINTER_P (cl_object x)

All arrays (arrays, strings and bit-vectors) may be tested for being adjustable and for having a fill pointer with these two macros. They don’t check the type of their arguments.

C/C++ identifier: ecl_vector

If x contains a vector, you can access the following fields:

x->vector.elttype

The type of the elements of the vector.

x->vector.displaced

List storing the vectors that x is displaced from and that x displaces to.

x->vector.dim

The maximum number of elements.

x->vector.fillp

Actual number of elements in the vector or fill pointer.

x->vector.self

Union of pointers of different types. You should choose the right pointer depending on x->vector.elttype.
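The pairing of an element-type field with a union of typed pointers can be sketched in isolation. The structure below is a hypothetical miniature with only two element kinds; the field names mirror the ones listed above, but nothing here is taken from ECL's actual declarations.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical element kinds; ECL's cl_elttype has many more values. */
enum elt_kind { ELT_DOUBLE, ELT_INT32 };

struct mini_vector {
    enum elt_kind elttype;   /* says which member of self is valid */
    size_t dim;              /* maximum number of elements */
    size_t fillp;            /* number of elements in use */
    union {
        double  *df;
        int32_t *i32;
    } self;
};

/* Read element i as a double, choosing the pointer that matches elttype. */
static double mini_ref_as_double(const struct mini_vector *v, size_t i)
{
    assert(i < v->fillp);    /* stay within the active part of the vector */
    switch (v->elttype) {
    case ELT_DOUBLE:
        return v->self.df[i];
    case ELT_INT32:
        return (double)v->self.i32[i];
    }
    return 0.0;              /* not reached for a valid elttype */
}
```

Reading through the wrong member of the union would reinterpret the storage with the wrong element size, which is why the element type has to be consulted first.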

C/C++ identifier: ecl_array

If x contains a multidimensional array, you can access the following fields:

x->array.elttype

The type of the elements of the array.

x->array.rank

The number of array dimensions.

x->array.displaced

List storing the arrays that x is displaced from and that x displaces to.

x->array.dim

The maximum number of elements.

x->array.dims[]

Array with the dimensions of the array. The elements range from x->array.dims[0] to x->array.dims[x->array.rank-1].

x->array.fillp

Actual number of elements in the array or fill pointer.

x->array.self

Union of pointers of different types. You should choose the right pointer depending on x->array.elttype.

C/C++ identifier: cl_elttype ecl_aet_object ecl_aet_sf ecl_aet_df ecl_aet_lf ecl_aet_csf ecl_aet_cdf ecl_aet_clf ecl_aet_bit ecl_aet_fix ecl_aet_index ecl_aet_b8 ecl_aet_i8 ecl_aet_b16 ecl_aet_i16 ecl_aet_b32 ecl_aet_i32 ecl_aet_b64 ecl_aet_i64 ecl_aet_ch ecl_aet_bc

Each array is of a specialized type, which is the type of the elements of the array. ECL supports arrays of only a few specialized types, and for each of these types there is a C integer constant which is the corresponding value of x->array.elttype or x->vector.elttype. We list some of those types together with the C constant that denotes each type:

Lisp element type          C constant
t                          ecl_aet_object
single-float               ecl_aet_sf
double-float               ecl_aet_df
long-float                 ecl_aet_lf
(COMPLEX SINGLE-FLOAT)     ecl_aet_csf
(COMPLEX DOUBLE-FLOAT)     ecl_aet_cdf
(COMPLEX LONG-FLOAT)       ecl_aet_clf
BIT                        ecl_aet_bit
FIXNUM                     ecl_aet_fix
INDEX                      ecl_aet_index
CHARACTER                  ecl_aet_ch
BASE-CHAR                  ecl_aet_bc

Function: cl_elttype ecl_array_elttype (cl_object array)

Returns the element type of array, which can be a string, a bit-vector, a vector, or a multidimensional array.

Example

For example, the code

ecl_array_elttype(ecl_read_from_cstring("\"AAA\""));  /* returns ecl_aet_ch */
ecl_array_elttype(ecl_read_from_cstring("#(A B C)")); /* returns ecl_aet_object */

Function: cl_object ecl_aref (cl_object x, cl_index index)
Function: cl_object ecl_aset (cl_object x, cl_index index, cl_object value)

These functions are used to retrieve and set the elements of an array. The elements are accessed with one index, index, as in the lisp function ROW-MAJOR-AREF.
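How a list of subscripts collapses into the single row-major index can be sketched in a few lines of stand-alone C. The function name row_major_index is hypothetical and the code does not touch ECL's array structures; it only shows the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Row-major offset of subscripts subs[0..rank-1] in an array whose
 * dimensions are dims[0..rank-1]; the last subscript varies fastest. */
static size_t row_major_index(const size_t *dims, const size_t *subs,
                              size_t rank)
{
    size_t index = 0;
    for (size_t i = 0; i < rank; i++) {
        assert(subs[i] < dims[i]);   /* each subscript must be in bounds */
        index = index * dims[i] + subs[i];
    }
    return index;
}
```

For a 2x2 array such as the one in the example that follows, the subscripts (1, 1) map to index 3.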

Example

cl_object array = ecl_read_from_cstring("#2A((1 2) (3 4))");
cl_object x = ecl_aref(array, 3);
cl_print(1, x);	/* Outputs 4 */
ecl_aset(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #2A((1 2) (3 5)) */

Function: cl_object ecl_aref1 (cl_object x, cl_index index)
Function: cl_object ecl_aset1 (cl_object x, cl_index index, cl_object value)

These functions are similar to ecl_aref and ecl_aset, but they operate on vectors.

Example

cl_object array = ecl_read_from_cstring("#(1 2 3 4)");
cl_object x = ecl_aref1(array, 3);
cl_print(1, x);	    /* Outputs 4 */
ecl_aset1(array, 3, ecl_make_fixnum(5));
cl_print(1, array); /* Outputs #(1 2 3 5) */

Strings

A string, both in Common-Lisp and in ECL, is nothing but a vector of characters. Therefore, almost everything mentioned in the section on arrays remains valid here.

The only important difference is that ECL stores base-strings (the non-Unicode version of a string) as a lisp object with a pointer to a zero-terminated C string. Thus, if a string has n characters, ECL will reserve n+1 bytes for the base-string. This allows us to pass the base-string self pointer to any C routine.
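The n+1 layout can be sketched with a hypothetical miniature structure. The names mini_base_string and mini_base_string_init are invented for the example, and malloc stands in for whatever allocator ECL really uses; the only point illustrated is that the extra byte makes the buffer usable by ordinary C string routines.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical miniature of a base-string cell. */
struct mini_base_string {
    size_t dim;     /* capacity in characters */
    size_t fillp;   /* characters in use */
    char  *self;    /* dim + 1 bytes, always zero-terminated */
};

/* Copy the first n characters of src into a fresh string.
 * Returns 0 on success and -1 if the allocation fails. */
static int mini_base_string_init(struct mini_base_string *s,
                                 const char *src, size_t n)
{
    assert(s != NULL && src != NULL);
    s->self = malloc(n + 1);          /* one extra byte for the terminator */
    if (s->self == NULL)
        return -1;
    memcpy(s->self, src, n);
    s->self[n] = '\0';
    s->dim = n;
    s->fillp = n;
    return 0;
}
```

Because of the terminator, the self pointer can be handed directly to functions such as strlen or puts without copying the characters first.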

C/C++ identifier: ecl_string
C/C++ identifier: ecl_base_string

If x is a lisp object of type string or a base-string, we can access the following fields:

x->string.dim x->base_string.dim

Maximum number of characters in the string.

x->string.fillp x->base_string.fillp

Actual number of characters in the string.

x->string.self x->base_string.self

Pointer to the characters (of type ecl_character and ecl_base_char, respectively).

Function: bool ECL_EXTENDED_STRING_P (cl_object object)
Function: bool ECL_BASE_STRING_P (cl_object object)

Verify whether an object is an extended or a base string. If Unicode isn’t supported, then ECL_EXTENDED_STRING_P always returns 0.

Bit-vectors

Bit-vector operations are implemented in the file src/c/array.d. A bit-vector shares its structure with a vector; therefore, almost everything mentioned in the section on arrays remains valid here.

Streams

Stream implementation is a broad topic. Most of it is done in the file src/c/file.d. Stream handling may have different implementations, which are referred to by the member pointer ops.

On top of that we have implemented Gray Streams (in portable Common Lisp) in the file src/clos/streams.lsp, which may be somewhat slower (we need to benchmark it!). This implementation lives in a separate package, GRAY. The functions in the COMMON-LISP package may be redefined at run-time with the function redefine-cl-functions.
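Dispatch through an ops pointer, as used by the C stream implementation, can be pictured with a reduced stand-alone model. Everything in the block is hypothetical (mini_ops, mini_stream and the two operations shown); the real table, ecl_file_ops, has many more entries and different signatures.

```c
#include <assert.h>
#include <stddef.h>

struct mini_stream;

/* Hypothetical, much-reduced analogue of a stream dispatch table. */
struct mini_ops {
    int (*read_char)(struct mini_stream *strm);  /* -1 at end of input */
    int (*input_p)(struct mini_stream *strm);
};

struct mini_stream {
    const struct mini_ops *ops;  /* operations for this kind of stream */
    const char *data;            /* state of the string-input variant */
    size_t pos;
};

/* Implementation of read_char for a stream that reads from a C string. */
static int str_in_read_char(struct mini_stream *strm)
{
    char c = strm->data[strm->pos];
    if (c == '\0')
        return -1;
    strm->pos++;
    return (unsigned char)c;
}

static int str_in_input_p(struct mini_stream *strm)
{
    (void)strm;                  /* every string-input stream is readable */
    return 1;
}

static const struct mini_ops string_input_ops = {
    str_in_read_char,
    str_in_input_p
};

/* Generic entry point: the caller does not need to know the stream kind. */
static int mini_read_char(struct mini_stream *strm)
{
    assert(strm != NULL && strm->ops != NULL);
    return strm->ops->read_char(strm);
}
```

Adding another kind of stream in this model means filling in a second table; generic code such as mini_read_char stays unchanged.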

C/C++ identifier: ecl_file_ops write_* read_* unread_* peek_* listen clear_input clear_output finish_output force_output input_p output_p interactive_p element_type length get_position set_position column close

C/C++ identifier: ecl_stream

ecl_smmode mode

Stream mode (for example ecl_smm_string_input).

int closed

Whether the stream is closed or not.

ecl_file_ops *ops

Pointer to the structure containing operation implementations (dispatch table).

union file

Union of ANSI C streams (FILE *stream) and POSIX files interface (cl_fixnum descriptor).

cl_object object0, object1

Some objects (may be used for a specific implementation purposes).

cl_object byte_stack

Buffer for unread bytes.

cl_index column

File column.

cl_fixnum last_char

Last character read.

cl_fixnum last_code[2]

Actual composition of the last character.

cl_fixnum int0, int1

Some integers (may be used for a specific implementation purposes).

cl_index byte_size

Size of byte in binary streams.

cl_fixnum last_op

0: unknown, 1: reading, -1: writing

char *buffer

Buffer for FILE

cl_object format

external format

cl_eformat_encoder encoder
cl_eformat_decoder decoder
cl_object format_table
int flags

Character table, flags, etc

ecl_character eof_character

Function: bool ECL_ANSI_STREAM_P (cl_object o)

Predicate determining if o is a first-class stream object.

Function: bool ECL_ANSI_STREAM_TYPE_P (cl_object o, ecl_smmode m)

Predicate determining if o is a first-class stream object of type m.

Structures

Structures and instances share the same datatype t_instance (with a few exceptions). Structure implementation details are in the file src/c/structure.d.

Function: cl_object ECL_STRUCT_TYPE (cl_object x)
Function: cl_object ECL_STRUCT_SLOTS (cl_object x)
Function: cl_object ECL_STRUCT_LENGTH (cl_object x)
Function: cl_object ECL_STRUCT_SLOT (cl_object x, cl_index i)
Function: cl_object ECL_STRUCT_NAME (cl_object x)

Convenience functions for the structures.

Instances

Function: cl_object ECL_CLASS_OF (cl_object x)
Function: cl_object ECL_SPEC_FLAG (cl_object x)
Function: cl_object ECL_SPEC_OBJECT (cl_object x)
Function: cl_object ECL_CLASS_NAME (cl_object x)
Function: cl_object ECL_CLASS_SUPERIORS (cl_object x)
Function: cl_object ECL_CLASS_INFERIORS (cl_object x)
Function: cl_object ECL_CLASS_SLOTS (cl_object x)
Function: cl_object ECL_CLASS_CPL (cl_object x)
Function: bool ECL_INSTANCEP (cl_object x)

Convenience functions for the instances.

Bytecodes

A bytecodes object is a lisp object with a piece of code that can be interpreted. The objects of type t_bytecodes are implicitly constructed by a call to eval, but can also be explicitly constructed with the si_make_lambda function.

Function: si:safe-eval form env &optional err-value
Function: cl_object si_safe_eval (cl_narg narg, cl_object form, cl_object env, ...)

si_safe_eval evaluates form in the lexical environment (5) env, which can be ECL_NIL. Before evaluating it, the expression form is byte-compiled. If the form signals an error, or tries to jump to an outer point, the function has two choices: by default, it will invoke a debugger, but if a third value is supplied, then si_safe_eval will not use a debugger but rather return that value.

  • DEPRECATED cl_object cl_eval (cl_object form) - cl_eval is the equivalent of si_safe_eval but without environment and with no err-value supplied. It exists only for compatibility with previous versions.
  • DEPRECATED cl_object cl_safe_eval (cl_object form, cl_object env, cl_object err_value) - Equivalent of si_safe_eval.

Example

cl_object form = ecl_read_from_cstring("(print 1)");
si_safe_eval(2, form, ECL_NIL);
si_safe_eval(3, form, ECL_NIL, ecl_make_fixnum(3)); /* on error function will return 3 */

Function: cl_object si_make_lambda (cl_object name, cl_object def)

Builds an interpreted lisp function with name given by the symbol name and body given by def.

Example

For instance, we would achieve the equivalent of

(funcall #'(lambda (x y)
             (block foo (+ x y)))
         1 2)

with the following code

cl_object def = ecl_read_from_cstring("((x y) (+ x y))");
cl_object name = ecl_make_symbol("FOO", "COMMON-LISP-USER");
cl_object fun = si_make_lambda(name, def);
return cl_funcall(3, fun, ecl_make_fixnum(1), ecl_make_fixnum(2));

Notice that si_make_lambda performs a bytecodes compilation of the definition and thus it may signal some errors. Such errors are not handled by the routine itself so you might consider using si_safe_eval instead.


Footnotes

(5)

Note that env must be a lexical environment as used in the interpreter; see The lexical environment.