Low level embeddable scripting language for C projects.
Warning
Only supports x86_64 architectures that use either System V or Win64 ABIs. (TODO: aarch64 System V support)
Warning
macOS untested, only tested on Linux and Windows
Simply run make with clang installed.
The language has 2 constructs: commands and expressions. A command can be an expression but an expression cannot be a command. Expressions perform operations on values, while commands usually handle control flow. Commands can only be used in codeblocks. Non-expression commands start with a specific keyword.
Types are a runtime construct (i.e. they're regular values). There are 12 primitive types:
| Width | Signed | Unsigned | Floating |
|---|---|---|---|
| 8-bit | s8 |
u8 bool |
|
| 16-bit | s16 |
u16 |
|
| 32-bit | s32 |
u32 |
f32 |
| 64-bit | s64 |
u64 |
f64 |
There's also void, which represents no type, and type, which can hold a type.
Then there are 3 kinds of composed types, which are pointers, functions and structs. Pointers are defined via the # operator after a type.
Note
The reason for this choice is simple. C uses asterisks (*), however this conflicts with the multiply operator. Since the parser doesn't know if it's parsing a type or a different value, it just knows it's parsing a value, so s32* ptr would get parsed as "multiply s32 by ptr". Semantically, this doesn't make sense, but the parser doesn't know that.
Functions are like pointers, except they hold pointers to executable code. They are declared as return_type<-().
Inside the parentheses is a comma separated list of types. They can have an identifier, although it's optional.
At the end of the list, ... can be specified. This represents variadic arguments.
Placing a dollar sign ($) between <- and ( (s32<-$()) makes the function assignable. The function must return the exact type and the expression returned must be assignable (variable reference or pointer dereference).
Struct are declared like this: struct {}. Inside the curly braces is a semicolon (;) separated list of types, however now an identifier is required. An example struct is:
struct {
s32 a;
u32 b;
}
There is no such thing as struct-by-value. Each struct is a pointer. This simplifies interpreter logic as all possible values are 8 bytes wide. It also simplifies the ABI logic.
A struct can be inlined into another struct, which acts as a by-value declaration in C. Inline structs don't need an identifier, in which case, the struct acts as an anonymous struct declaration in C.
struct {
Struct str; // valid
Struct; // invalid
inline Struct str; // valid
inline Struct; // valid
}
Pointers can be inlined as well:
struct {
s64 value;
inline u8# bytes;
}
Their offsets are the same as the previous field's offset.
Manual offset can also be specified with the at (@) symbol after the identifier:
struct {
s64 value;
u32 low @ 0;
u32 high @ 4;
}
Or you can use the +@ symbol for a relative offset from the previous field. This doesn't take the size of the field into account:
struct {
s32 field1 @ 4;
s32 field2 +@ 4; // offset 8, not 12
}
Using a colon (:), you can specify a parent. These two declarations are equivalent:
struct: Parent {}
struct { inline Parent super; }
Evaluates to an integer value. If the value is lower than 2^31, it is evaluated as s32. If it's higher than that but lower than 2^63, it is evaluated as s64. If it's outside of all of these ranges, then it's u64.
Hexadecimal, octal and binary digits can be specified with the standard 0x, 0 and 0b prefixes respectively.
Always evaluates to an f64. Examples: 1.5, .5, 1., 1e+6, 1.5e-3, 0x1.fp5.
Returns a const s8# pointing to an array of null-terminated characters. These escape codes are supported:
\\a: Alert (\x07)\\b: Backspace (\x08)\\e: Escape character (\x1B)\\f: Formfeed page break (\x0C)\\n: New line (\x0A)\\r: Carriage return (\x0D)\\t: Horizontal tab (\x09)\\v: Vertical tab (\x0B)\\\\: Backslash (\x5C)\\': Quote (\x27)\\": Double quote (\x22)\\<newline>: New line (\x0A)\\ooo: Octal number\\xnn: 2 digit hexadecimal number\\unnnn: 4 digit hexadecimal number\\Unnnnnnnn: 8 digit hexadecimal number
Evaluated to s32, value being the codepoint of the character between single quotes ('). Also supports escape sequences.
Evaluates to u8 holding the value 1.
Evaluates to u8 holding the value 0.
Evaluates to void# holding the value 0.
Searches the context for a defined variable and evaluates to it.
Evaluates to a variadic argument at the specified index.
Evaluates the expression inside and returns the result.
Evaluates to u64 holding the size of the expressions value's type. If the expression gets evaluated to a type, it returns the size of the type it holds, not the expression's type itself.
sizeof(...) evaluates to the number of variadic arguments.
Evaluates to type holding the type of the expression.
Evaluates to the scope ID of a variable.
ID of global scope is 0, each codeblock and function call increments the ID by 1.
Allocates a block of memory and returns a pointer to it.
After new, a type in brackets ([ ]) is specified: new[s32] returns an s32# pointing to the allocate 4 byte block of memory.
Arrays can be allocated by specifying the size of the array in parentheses (( )): new[s32](4) returns an s32# pointing to the allocated 16 byte block of memory.
The scoped keyword can be used to allocate in the current scope, meaning the allocation would get destroyed after the scope pops: new scoped[s32].
Curly braces can be added to initialize the allocation.
If the allocation allocates an array, the initializer is a comma separated list of expressions: new[s32](4) { 0, 1, 2, 3 }.
Otherwise, the initializer can be a scalar: new[s32]{ 5 }, a struct: new[Struct]{ .field1 = 1, .field2 = 1.5 } or a function: new[void<-()] => { ... }
Function initializer can capture variables in the local scope.
[$]{ ... }- Capture by reference: Mutation reflects outside of function[~]{ ... }- Capture by copy: Mutation reflects in the function and future calls[=]{ ... }- Capture by value: Mutation reflects only in a single call
Manually deletes an allocation allocated by new.
Moves an allocation allocated by new into a different scope, at the end of which gets freed automatically.
move(x) => [scopeof(this) - 1] guarantees that it gets moved one scope frame up.
Evaluates the expression x and if it's truthy, a gets evaluated and returned, otherwise b gets evaluated and returned.
Runs the file file.paw and evaluates to whatever that file returned.
First it searches for the file relative to the directory the current file is in, and if it isn't there, it instead uses the interpreter's current working directory.
Defines a new function in the current scope and returns a reference to it, just like a variable reference.
Declared using an identifier after an expression. The expression must return a type to be semantically valid.
If there's a codeblock after the identifier, it gets attached to the variable if it's a function.
Just like with function allocations, [$], [~] or [=] can be specified before the codeblock to enable capture of variables.
An extern can be specified at the beginning (extern s32 value). If specified, the interpreter will search all defined symbols in the host program and value will reflect its state.
Type kinds:
integer- Any integer typenumber- Any integer or floating point typepointer- Any pointer typefunction- Any function typestruct- Any struct typetype- Thetypetypeassignable- Any value that can be assignedany- Anything
| Operator | Precedence | Left-hand Side | Right-hand Side | Description |
|---|---|---|---|---|
x -> y |
14 -> |
any |
type |
Casts x to type y |
x ~> y |
14 -> |
any |
type |
Bitcasts x to type y |
x ^^ y |
13 -> |
number |
number |
Raises x to the power of y |
x * y |
12 -> |
number |
number |
Multiplies x by y |
x / y |
12 -> |
number |
number |
Divides x by y |
x % y |
12 -> |
number |
number |
Divides x by y and returns the remainder |
x + y |
11 -> |
number |
number |
Adds two numbers |
x + y |
11 -> |
pointer |
integer |
Advances the pointer x by y values |
x + y |
11 -> |
integer |
pointer |
Advances the pointer y by x values |
x - y |
11 -> |
number |
number |
Subtracts two numbers |
x - y |
11 -> |
pointer |
integer |
Rewinds the pointer x by y values |
x - y |
11 -> |
integer |
pointer |
Rewinds the pointer y by x values |
x << y |
10 -> |
integer |
integer |
Shifts x by y bits to the left |
x >> y |
10 -> |
integer |
integer |
Shifts x by y bits to the right |
x < y |
9 -> |
any |
any |
Checks if x is less than y |
x > y |
9 -> |
any |
any |
Checks if x is greater than y |
x <= y |
9 -> |
any |
any |
Checks if x is less than or equal to y |
x >= y |
9 -> |
any |
any |
Checks if x is greater than or equal to y |
x == y |
8 -> |
any |
any |
Checks if x is equal to y |
x != y |
8 -> |
any |
any |
Checks if x is not equal to y |
x & y |
7 -> |
integer |
integer |
Bitwise ANDs x and y |
x ^ y |
6 -> |
integer |
integer |
Bitwise XORs x and y |
x | y |
5 -> |
integer |
integer |
Bitwise ORs x and y |
x && y |
4 -> |
any |
any |
If x and y is truthy, 1 is returned, otherwise 0 |
x || y |
3 -> |
any |
any |
If x or y is truthy, 1 is returned, otherwise 0 |
x ? y |
2 -> |
any |
any |
If x is truthy, x is returned, otherwise y |
x = y |
1 <- |
assignable |
any |
Assigns y to x |
x += y |
1 <- |
assignable |
number |
Performs x + y and stores the result to x |
x -= y |
1 <- |
assignable |
number |
Performs x - y and stores the result to x |
x *= y |
1 <- |
assignable |
number |
Performs x * y and stores the result to x |
x /= y |
1 <- |
assignable |
number |
Performs x / y and stores the result to x |
x %= y |
1 <- |
assignable |
number |
Performs x % y and stores the result to x |
x ^^= y |
1 <- |
assignable |
number |
Performs x ^^ y and stores the result to x |
x <<= y |
1 <- |
assignable |
integer |
Performs x << y and stores the result to x |
x >>= y |
1 <- |
assignable |
integer |
Performs x >> y and stores the result to x |
x &= y |
1 <- |
assignable |
integer |
Performs x & y and stores the result to x |
x |= y |
1 <- |
assignable |
integer |
Performs x | y and stores the result to x |
x ^= y |
1 <- |
assignable |
integer |
Performs x ^ y and stores the result to x |
x ?= y |
1 <- |
assignable |
any |
Performs x ? y and stores the result to x |
++x |
assignable |
Increments x and stores it to itself |
||
--x |
assignable |
Decrements x and stores it to itself |
||
$x |
assignable |
Takes the address of x |
||
#x |
pointer |
Dereferences pointer x |
||
+x |
number |
Does nothing. Just, nothing. | ||
-x |
number |
Negates the number x |
||
!x |
any |
If x is truthy, 0 is returned, otherwise 1 |
||
~x |
integer |
Flips all the bits in x |
||
x++ |
assignable |
Increments x and stores it to itself, evaluates to the value pre-operation |
||
x-- |
assignable |
Decrements x and stores it to itself, evaluates to the value pre-operation |
||
x# |
type |
Turns the type x into a pointer |
||
x[y] |
pointer |
Performs #(x + y), however x must be a pointer |
||
x[...] |
struct |
Calls the array function in a struct | ||
x(...) |
function |
Calls the function | ||
x<-(...) |
type |
Turns the type x into a function |
||
x::scope |
pointer |
Returns the scope ID of an allocation pointed to by x |
||
x::size |
pointer |
Returns the size of an allocation pointed to by x |
||
x::length |
pointer |
Performs x::size / sizeof(#x) |
||
x.y |
struct |
Looks up the field y in struct x |
Evaluates the expression, and if it's truthy, it runs the codeblock. Otherwise, if specified, runs the else block
Repeatedly runs the codeblock as long as the expression evaluates to a truthy value
Iterates through all values in a range. Iterator type must be integer. If there's no values in the range, then nothing is executed. If inclusivity isn't specified, then on the left side, incl (inclusive) is used by default, however excl (exclusive) is used by default. Step size, if unspecified, is 1. If negative, the loop iterates backwards.
for s32 i: 0 => 10 => printf("%d ", i);
// "0 1 2 3 4 5 6 7 8 9 "
for s32 i: 0 excl => 10 incl => printf("%d ", i);
// "1 2 3 4 5 6 7 8 9 10 "
for s32 i: 0 excl => 10 incl step -1 => printf("%d ", i);
// "10 9 8 7 6 5 4 3 2 1 "
Makes a function return a value
Cancels the current loop iteration
Cancels the current loop iteration and forces the loop to exit
If a runtime error occurs inside the try block, the catch block executes. The catch block can be omitted entirely.
When a catch statement is declared, the codeblock can be omitted as long as the catch is silent and doesn't declare an identifier.
By default, catch logs the error into stderr. The specification of silently will disable this behavior.
When an identifier is declared, the error gets stored into it, only visible in the catch block. A language error will store void, but a user-defined error (throw) will store the value thrown.
Throws a value. By default, the program terminates, however when in a try block, the corresponding catch will execute
Can either be inline (=> ...) or multiline ({ ... }). Inline must have exactly one command, but multiline can have any amount
Evaluates the expression
In order to use the library, you need to have exactly one C++ file containing this:
#define PAWSCRIPT_IMPLEMENTATION
#include "pawscript.h"PawScriptContext* pawscript_create_context()returns: A new script context
PawScriptError* pawscript_run(PawScriptContext* context, const char* code)- Runs code from a string in memory
returns:NULLif there weren't any errors,PawScriptError*otherwise
PawScriptError* pawscript_run_file(PawScriptContext* context, const char* filename)- Runs code from a file
returns:NULLif there weren't any errors,PawScriptError*otherwise
bool pawscript_get(PawScriptContext* context, const char* name, void* ptr)- Copies the data from variable
nameintoptr returns:trueif the variable is found,falseotherwise
- Copies the data from variable
bool pawscript_set(PawScriptContext* context, const char* name, void* ptr)- Copies the data from
ptrinto variablename returns:trueif the variable is found,falseotherwise
- Copies the data from
bool pawscript_print_variable(PawScriptContext* context, FILE* f, const char* name)- Prints the variable
nameinto file streamf returns:trueif the variable is found,falseotherwise
- Prints the variable
void pawscript_log_error(PawScriptError* error, FILE* f)- Logs the
errorinto the file streamf.errorgets destroyed
- Logs the
void pawscript_destroy_error(PawScriptError* error)- Destroys
errorwithout logging it
- Destroys
void pawscript_destroy_context(PawScriptContext* context)- Destroys the
context
- Destroys the
void on_segfault(void(*handler)(void* addr))- The interpreter installs its own segfault handler to catch invalid memory accesses caused by scripts. This function can be used to install callbacks that get called if a segfault occurs outside of scripts
addr- The address that was tried to be accessed
The engine also has a special variable: @RESULT@ (stored in the PAWSCRIPT_RESULT macro), which contains the result of the code last run. pawscript_run and pawscript_run_file both update this variable.
In C, you can just define a non-static function:
void function() {
...
}and then in PawScript, you can use extern to use it in the script:
extern void<-() function;
Then you can use it as a regular PawScript function.
If the function is defined in the host program, you need to compile it with -rdynamic on Linux and use __declspec(dllexport) on Windows: __declspec(dllexport) void function() {}.
If the function is defined in a library, no extra compiler flags or function attributes are needed.
Alternatively, you can store the function directly into a variable:
void<-() function;
void function() {
...
}
pawscript_set(context, "function", function); // function is already a pointerIn this case, no compiler flags or function attributes are needed as no symbol lookup is performed.
For a function defined like
void<-() function { ... }
the function can be obtained in C like this:
void(*function)();
pawscript_get(context, "function", &function);
function(); // calls the script functionTODO