Philippe Choquette's universe

Data Types

PCASTL Data Types

C data types
Pointer to node
Pointer to FILE structure
Pointer to fpos_t
Raw memory
Memory address
Pointer to types above.

To display the type of an expression, call gettype.

C data types

signed char
unsigned char
short int
signed short
signed short int
unsigned short
unsigned short int
signed int
unsigned int
long int
signed long
signed long int
unsigned long
unsigned long int
long long
long long int
signed long long
signed long long int
unsigned long long
unsigned long long int
long double

For detailed information about those types, see Wikipedia.

Numbers in a PCASTL syntax tree are stored as C double. When entered in hexadecimal notation, they are read as signed integers before being cast to double. Hexadecimal notation must begin with "0x". When cast and stored in a variable or displayed, numbers can have any of the C base data types.

> 0xA3
> 0xF0000000
> 4.0
> a = (char)64
> gettype(a)
> a = (int)3.1416
> gettype(a)

Strings in PCASTL are dynamically allocated C char arrays. The length function returns the size of the allocated memory. To get the number of characters before the terminating null character, call strlen.

To get a character in a string at a given index, you can use a subscript. The first index is zero.

> a = "hello"
> b = a[1]
> gettype(b)
> "hello"[3]

In a string, an escape sequence indicates a special character. Escape sequences begin with a backslash (\) character.

Sequence Name
\a Alert (bell)
\b Backspace
\f Formfeed
\n Newline
\r Carriage return
\t Horizontal tab
\v Vertical tab
\? Literal question mark
\' Single quotation mark
\" Double quotation mark
\\ Backslash
\ddd ASCII character in octal notation
\xdd ASCII character in hex notation
\0 Null character

Note that for characters notated in hexadecimal, the interpreter ignores all leading zeros. It establishes the end of the hex-specified escape character when it encounters either the first non-hex character or more than two hex characters - not including leading zeros.

Table and following paragraph are from Microsoft Visual C++ 6.0 Docs.


Arrays are dynamically allocated sequences of variable-like elements, stored contiguously in memory. Unlike in C or R, each element can have a different type. Arrays are created with the array function. The data is accessible with subscripts.

> a = array(28, "alpha")
[0]     28
[1]     "alpha"
> a[0]
> gettype(a)

Lists are implemented as linked lists. They are created with the list function. The data is accessible with subscripts.

> b = list(3, 2, 1)
[0]     3
[1]     2
[2]     1
> b[2] = 0
> b
[0]     3
[1]     2
[2]     0

Objects are created with the names function. An object is a group of variables, accessible with the dot (.) operator. If a member of an object is a function, variables accessed inside this function are looked up in the object's context first, then outside it.

> id = names("name", "age")
[name]  undefined
[age]   undefined
> id.name = "Philippe"
> id.age = 29
> id
[name]  "Philippe"
[age]   29
> gettype(id)

Pointer to node

A function definition returns a node pointer, as does a genealogical dotted list. An explicit code segment also gives a node pointer. Nodes are the basic elements of the syntax tree of the code. The syntax tree structure is illustrated in the Tree Structure page.

> a = parent
> gettype(a)
        "node pointer"
> b = function() print("hello")
> gettype(b)
        "node pointer"

Pointer to FILE structure

A FILE structure pointer is given by the fopen function. This type is used by the stream manipulation functions. Predefined stream identifiers stdin, stdout and stderr are of the FILE pointer type.

Pointer to fpos_t

The pointer to fpos_t type is used exclusively by the functions fgetpos and fsetpos. It represents a position in a file.

Raw memory

Raw memory is a dynamically allocated buffer, obtained from the functions vartomem, memclone and memory. Each byte of raw memory can be read or written with the subscript operator.

> a = memory(4)
> gettype(a)
        "raw memory"
> a[0]
> a[0] = 0xe1

A byte is represented by an unsigned char and is the type received when applying the subscript operator in reading mode to raw memory.

> a = memory(4)
> b = a[0]
> gettype(b)
> (byte) 255

Memory address

Memory address is the type obtained when applying the address-of operator "&" to a subscripted variable of raw memory or string type.

> a = memory(4)
> &a[0]
> b = &a[1]
> gettype(b)
        "memory address"
> c = "abc"
> gettype(&c[0])
        "memory address"

Pointer to types above.

If you apply the address-of (&) operator to a variable of any of the types given above, you will get a pointer to its space in memory. If you apply the indirection (*) operator to a variable containing a pointer, you will get the value of the variable it points to.

If you apply the address-of operator to a variable holding a pointer, you will get a pointer to a pointer. With this result you can apply the indirection operator twice.

If you apply the address-of operator to a variable holding an object, you can access its contents through the pointer using the structure dereference (->) operator.

A number can also be cast to a pointer to one of the C data types. For the moment, we cannot cast to pointer to string, object, array or list.

> a = 42
> b = &a
> gettype(b)
> c = &b
> gettype(c)
> **c
> a = names("x", "y")
[x]     undefined
[y]     undefined
> b = &a
> b->x = 5
> b->y = -5
> a
[x]     5
[y]     -5
