The ECL implementation of strings is ANSI Common Lisp compliant. There are four string types, shown in Table 2.7. As explained in Characters, when Unicode support is disabled, character and base-char are the same type, and the last two string types are equivalent to the first two.
Abbreviation | Expanded type | Remarks |
---|---|---|
string | (array character (*)) | 8 or 32 bits per character, adjustable. |
simple-string | (simple-array character (*)) | 8 or 32 bits per character, not adjustable nor displaced. |
base-string | (array base-char (*)) | 8 bits per character, adjustable. |
simple-base-string | (simple-array base-char (*)) | 8 bits per character, not adjustable nor displaced. |
It is important to remember that strings with Unicode characters can only be printed readably when the external format supports those characters. If this is not the case, ECL will signal a serious-condition. This condition will abort your program if not properly handled.
Building strings of C data
cl_object ecl_alloc_adjustable_base_string (cl_index length);
cl_object ecl_alloc_simple_base_string (cl_index length);
cl_object ecl_make_simple_base_string (const char* data, cl_fixnum length);
cl_object ecl_make_constant_base_string (const char* data, cl_fixnum length);
Description
These are different ways to create a base string, which is a string that holds a small subset of characters, the base-char, with codes ranging from 0 to 255.
ecl_alloc_simple_base_string creates a string of fixed length with space for that many characters. The string does not have a fill pointer and cannot be resized, and the initial data is unspecified.
ecl_alloc_adjustable_base_string is similar to the previous function, but creates an adjustable string with a fill pointer. This means that the length of the string can be changed and the string itself can be resized to accommodate more data.
The other constructors create strings from preexisting data. ecl_make_simple_base_string copies the data that the user supplies into freshly allocated memory. ecl_make_constant_base_string, on the other hand, does not allocate memory, but simply uses the supplied pointer as the buffer for the string. This last function should be used with care: the supplied buffer must not be deallocated while the string is in use. If the length argument of these two functions is -1, the length is determined by strlen.
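The difference between the copying and the non-copying constructor can be modeled in plain C. The sketch below does not call the ECL API: the type str_model and the functions model_resolve_length, model_make_copy and model_make_constant are hypothetical stand-ins that only mirror the ownership and length conventions described above (a length of -1 means "measure with strlen").

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a base string. This is NOT an ECL type. */
typedef struct {
    const char *data;   /* character buffer */
    size_t length;      /* number of characters */
    int owns_data;      /* 1 if the constructor allocated data */
} str_model;

/* Mirrors the length convention: -1 means "measure with strlen". */
static size_t model_resolve_length(const char *data, long length)
{
    assert(data != NULL);
    return (length == -1) ? strlen(data) : (size_t)length;
}

/* Like the copying constructor: the result lives in fresh memory,
   so the caller's buffer may change or go away afterwards. */
static str_model model_make_copy(const char *data, long length)
{
    str_model s;
    s.length = model_resolve_length(data, length);
    char *copy = malloc(s.length + 1);
    if (copy == NULL)
        abort();
    memcpy(copy, data, s.length);
    copy[s.length] = '\0';
    s.data = copy;
    s.owns_data = 1;
    return s;
}

/* Like the constant constructor: nothing is allocated and the
   caller's buffer is shared, so it must outlive the string. */
static str_model model_make_constant(const char *data, long length)
{
    str_model s;
    s.length = model_resolve_length(data, length);
    s.data = data;
    s.owns_data = 0;
    return s;
}
```

In this model, writing to the original buffer after construction is visible through the value returned by model_make_constant but not through the one returned by model_make_copy, which is the hazard the warning above refers to.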
Reading and writing characters into a string
ecl_character ecl_char (cl_object string, cl_index index);
ecl_character ecl_char_set (cl_object string, cl_index index, ecl_character c);
Description
Access to string information should be done using these two functions. The first one implements the equivalent of the char function from Common Lisp, returning the character at position index in the string string.
The counterpart of the previous function is ecl_char_set, which implements (setf char) and stores character c at position index in the given string.
Both functions check the type of their arguments and verify that the indices do not exceed the string boundaries. If a check fails, they signal a serious-condition.
Converting between different encodings. See External formats for a list of supported encodings (external formats).
Decode a sequence of octets (i.e. 8-bit bytes) into a string according to the given external format. octets must be a vector whose elements have a size of 8 bits. The bounding index designators start and end optionally denote a subsequence to be decoded. Signals an ext:character-decoding-error if the decoding fails.
Encode a string into a sequence of octets according to the given external format. The bounding index designators start and end optionally denote a subsequence to be encoded. If null-terminate is true, add a terminating null byte. Signals an ext:character-encoding-error if the encoding fails.
cl_object ecl_decode_from_cstring (const char *string, cl_fixnum length, cl_object external_format);
Decode a C string of the given length into a Lisp string using the specified external format. If length is -1, the length is determined by strlen. Returns NULL if the decoding fails.
cl_fixnum ecl_encode_to_cstring (char *output, cl_fixnum output_length, cl_object input, cl_object external_format);
Encode the Lisp string input into a C string of the given length using the specified external format. Returns the number of characters necessary to encode the Lisp string (including the null terminator). If this is larger than output_length, output is unchanged. Returns -1 if the encoding fails.
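The return convention of ecl_encode_to_cstring suggests a two-step pattern: try a small buffer first and, if the returned size exceeds it, allocate that many bytes and call again. The following is a sketch of that pattern in plain C. To stay self-contained it does not call ECL: encode_fn is a hypothetical function-pointer type standing for any encoder with the same return convention, stub_encode is a made-up stand-in that simply copies bytes, and encode_alloc is an illustrative helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical encoder type following the convention described above:
   returns the size needed including the null terminator, leaves output
   untouched if that exceeds output_length, and returns -1 on failure. */
typedef long (*encode_fn)(char *output, long output_length, const void *input);

/* Made-up stand-in for a real encoder: treats input as a C string. */
static long stub_encode(char *output, long output_length, const void *input)
{
    const char *s = input;
    long needed = (long)strlen(s) + 1;
    if (needed <= output_length)
        memcpy(output, s, (size_t)needed);
    return needed;
}

/* Returns a malloc'ed encoded string, or NULL on failure. Caller frees. */
static char *encode_alloc(encode_fn encode, const void *input)
{
    char probe[16];
    assert(encode != NULL);

    /* First call: small stack buffer, mainly to learn the size. */
    long needed = encode(probe, (long)sizeof probe, input);
    if (needed <= 0)
        return NULL;                      /* encoding failed */

    char *result = malloc((size_t)needed);
    if (result == NULL)
        return NULL;

    if (needed <= (long)sizeof probe) {
        /* The first call already fit; reuse its output. */
        memcpy(result, probe, (size_t)needed);
    } else {
        /* Second call with a buffer of exactly the reported size. */
        if (encode(result, needed, input) != needed) {
            free(result);
            return NULL;
        }
    }
    return result;
}
```

With ECL, a thin wrapper that forwards to ecl_encode_to_cstring with a chosen external format could play the role of encode; that wrapper is left out here because it depends on how the Lisp string and format are obtained.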
cl_object ecl_decode_from_unicode_wstring (const wchar_t *string, cl_fixnum length);
cl_fixnum ecl_encode_to_unicode_wstring (wchar_t *output, cl_fixnum output_length, cl_object input);
These functions work the same as ecl_decode_from_cstring and ecl_encode_to_cstring, except that the external format used is utf-8, utf-16 or utf-32, depending on whether sizeof(wchar_t) is 1, 2, or 4 respectively.
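The size rule can be written down as a small lookup. This is only an illustration of the mapping stated above, not ECL code: wstring_format_for_size and wstring_format are hypothetical helper names, and the returned strings are plain labels rather than ECL external format objects.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical helper, not part of ECL: names the format implied by a
   given wchar_t size, following the 1/2/4 rule stated above. */
static const char *wstring_format_for_size(size_t wchar_size)
{
    switch (wchar_size) {
    case 1:  return "utf-8";
    case 2:  return "utf-16";
    case 4:  return "utf-32";
    default: return NULL;    /* size not covered by the rule */
    }
}

/* Label of the format implied by this platform's wchar_t. */
static const char *wstring_format(void)
{
    const char *name = wstring_format_for_size(sizeof(wchar_t));
    /* Assumption: wchar_t is 1, 2 or 4 bytes wide on the target. */
    assert(name != NULL);
    return name;
}
```

On a platform where wchar_t is 4 bytes wide, wstring_format() returns the label "utf-32"; where it is 2 bytes wide, it returns "utf-16".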
Common Lisp and C equivalence
Lisp symbol | C function |
---|---|
simple-string-p | cl_object cl_simple_string_p(cl_object string) |
char | cl_object cl_char(cl_object string, cl_object index) |
(setf char) | cl_object si_char_set(cl_object string, cl_object index, cl_object char) |
schar | cl_object cl_schar(cl_object string, cl_object index) |
(setf schar) | cl_object si_char_set(cl_object string, cl_object index, cl_object char) |
string | cl_object cl_string(cl_object x) |
string-upcase | cl_object cl_string_upcase(cl_narg narg, cl_object string, ...) |
string-downcase | cl_object cl_string_downcase(cl_narg narg, cl_object string, ...) |
string-capitalize | cl_object cl_string_capitalize(cl_narg narg, cl_object string, ...) |
nstring-upcase | cl_object cl_nstring_upcase(cl_narg narg, cl_object string, ...) |
nstring-downcase | cl_object cl_nstring_downcase(cl_narg narg, cl_object string, ...) |
nstring-capitalize | cl_object cl_nstring_capitalize(cl_narg narg, cl_object string, ...) |
string-trim | cl_object cl_string_trim(cl_object character_bag, cl_object string) |
string-left-trim | cl_object cl_string_left_trim(cl_object character_bag, cl_object string) |
string-right-trim | cl_object cl_string_right_trim(cl_object character_bag, cl_object string) |
string= | cl_object cl_stringE(cl_narg narg, cl_object string1, cl_object string2, ...) |
string/= | cl_object cl_stringNE(cl_narg narg, cl_object string1, cl_object string2, ...) |
string< | cl_object cl_stringL(cl_narg narg, cl_object string1, cl_object string2, ...) |
string> | cl_object cl_stringG(cl_narg narg, cl_object string1, cl_object string2, ...) |
string<= | cl_object cl_stringLE(cl_narg narg, cl_object string1, cl_object string2, ...) |
string>= | cl_object cl_stringGE(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-equal | cl_object cl_string_equal(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-not-equal | cl_object cl_string_not_equal(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-lessp | cl_object cl_string_lessp(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-greaterp | cl_object cl_string_greaterp(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-not-greaterp | cl_object cl_string_not_greaterp(cl_narg narg, cl_object string1, cl_object string2, ...) |
string-not-lessp | cl_object cl_string_not_lessp(cl_narg narg, cl_object string1, cl_object string2, ...) |
stringp | cl_object cl_stringp(cl_object x) |
make-string | cl_object cl_make_string(cl_narg narg, cl_object size, ...) |