coolandcodes/antro

antro

This is a toy parser for an experimental programming language, the antro scripting language, which is still in development. The project is purely educational (for now), as the language cannot be used for industry work in its current form. Its sole aim is to show and teach the skills required to design and implement computer languages.

The project covers the Regular Grammar that specifies the Tokenizer and the Context-Free Grammar (written in EBNF format) whose production rules drive the Parser, as well as other details of the algorithm used to implement the parser's recursive descent strategy.

FrontEnd Design

  • Tokenizer (lexical analysis)
  • Parser (syntactic analysis)
  • Executor (a concurrent thread-safe queue for passing tokens from the Tokenizer to the Parser with a lookahead of 1)
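The hand-off between the Tokenizer and the Parser can be sketched as follows. This is a minimal illustration in Python, chosen only for brevity, and it is not taken from the antro sources: `TokenStream`, `tokenize_into`, `run_pipeline` and the `END` sentinel are hypothetical names, and the stand-in tokenizer simply splits on whitespace.

```python
import queue
import threading

END = ("EOF", "")  # hypothetical sentinel token marking the end of input


class TokenStream:
    """Parser-side view of the queue that buffers exactly one token."""

    def __init__(self, tokens):
        self._tokens = tokens
        self._lookahead = None  # the single token of lookahead

    def peek(self):
        # Block until the tokenizer thread has produced the next token.
        if self._lookahead is None:
            self._lookahead = self._tokens.get()
        return self._lookahead

    def next(self):
        token = self.peek()
        if token != END:  # once EOF is reached, keep returning it
            self._lookahead = None
        return token


def tokenize_into(source, tokens):
    # Stand-in tokenizer: each whitespace-separated word becomes a token.
    for word in source.split():
        tokens.put(("WORD", word))
    tokens.put(END)


def run_pipeline(source):
    tokens = queue.Queue(maxsize=8)  # bounded, thread-safe hand-off
    producer = threading.Thread(target=tokenize_into, args=(source, tokens))
    producer.start()
    stream = TokenStream(tokens)
    seen = []
    while stream.peek() != END:  # decide using the lookahead token only
        seen.append(stream.next()[1])
    producer.join()
    return seen
```

The bounded queue lets the tokenizer run ahead of the parser without reading the whole input first, while `peek` gives the parser the one token of lookahead it needs to choose a production.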

Backend Design

The backend is designed to use the LLVM IR Builder, alongside the LLVM Module and LLVM Context, via a Java library to generate IR (intermediate representation), and then to convert the IR to machine code using an LLVM Codegen toolchain.

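What is a Regular Grammar?

  • A regular grammar is a formal grammar whose production rules are restricted to a right-linear (or left-linear) form: each rule rewrites one non-terminal into a terminal symbol, optionally followed (or preceded) by a single non-terminal, or into the empty string. The set of all strings it generates, built from the characters of its alphabet (or other single-character symbols), is a regular language. In a tokenizer, those strings are the tokens (terminal symbols) handed to the parser.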
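Because regular grammars and regular expressions describe the same class of languages, a tokenizer can be sketched as a list of regular expressions, one per token kind. The sketch below is in Python for brevity and is an assumption-laden illustration, not antro's actual lexical specification: the token kinds, the patterns and the keyword list are guesses based on the sample program further down.

```python
import re

# Hypothetical keyword list, collected from the sample program.
KEYWORDS = {"require", "def", "begin", "end", "var", "call", "if", "else", "retn"}

# Each entry is (token kind, regular expression). Order matters:
# comments and `->` must be tried before single-character punctuation.
TOKEN_SPEC = [
    ("COMMENT", r"--#[^\n]*"),
    ("STRING", r'"[^"\n]*"'),
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_][A-Za-z0-9_]*"),
    ("ARROW", r"->"),
    ("PUNCT", r"[:;(){},=+*/<>-]"),
    ("SKIP", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{kind}>{pattern})" for kind, pattern in TOKEN_SPEC))


def tokenize(source):
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind in ("SKIP", "COMMENT"):
            continue  # whitespace and comments never reach the parser
        if kind == "IDENT" and text in KEYWORDS:
            kind = "KEYWORD"
        tokens.append((kind, text))
    return tokens
```

For example, `tokenize('def: MAX 200;')` yields a keyword, a colon, an identifier, a number and a semicolon, in that order.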

What is a Context-Free Grammar?

  • A context-free grammar (CFG) is a set of recursive rewriting rules (or productions) used to generate patterns of strings. A CFG consists of four components: a set of terminal symbols, which are the characters of the alphabet that appear in the strings generated by the grammar; a set of non-terminal symbols, which stand for patterns of terminals; a set of productions, each rewriting a single non-terminal into a sequence of terminals and non-terminals; and a start symbol.
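To make the link between productions and a recursive descent parser concrete, here is a small sketch for an arithmetic fragment such as `(x / 2) * 4` from the sample program. The EBNF grammar and every name in the code are illustrative assumptions, not antro's real grammar, and Python is used only for brevity. Each non-terminal becomes one method, and one token of lookahead (`peek`) selects the production.

```python
import re

# Assumed grammar (EBNF):
#   expr   = term   { ("+" | "-") term } ;
#   term   = factor { ("*" | "/") factor } ;
#   factor = NUMBER | "(" expr ")" ;


def lex(text):
    # Toy lexer: numbers and operators only; anything else is ignored.
    return re.findall(r"\d+|[-+*/()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self, expected=None):
        token = self.peek()
        if token is None or (expected is not None and token != expected):
            raise SyntaxError(f"expected {expected or 'a token'}, found {token!r}")
        self.pos += 1
        return token

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            node = (self.take(), node, self.term())  # left-associative
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            node = (self.take(), node, self.factor())
        return node

    def factor(self):
        if self.peek() == "(":
            self.take("(")
            node = self.expr()
            self.take(")")
            return node
        token = self.take()
        if not token.isdigit():
            raise SyntaxError(f"expected a number, found {token!r}")
        return int(token)


def parse(text):
    parser = Parser(lex(text))
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Here `parse("(8 / 2) * 4")` returns the nested tuple `("*", ("/", 8, 2), 4)`, a tiny syntax tree in which the operator comes first.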

Sample program written in antro

	require: "sys.module";
	require: "errors.module";
	require: "logging.module";
	require: "types.module";

	def: MAX 200;

	begin: (void)
	  --# A novel programming language design for error handling (antro)
	  --# This uses chained exceptions behind the scenes (within the runtime).
	  var error = call: error("Program crashed");
	  var ty = call: factorUpBy2(MAX) -> eject_on error -> use {
	    if (error) {
	      call: output(f"{error.message} - {error.context.cause}");
	    }
	
	    call: print("A fatal error occurred");
	
	    panic_on error;
	  };
	  call: print(ty);
	end;

	def: factorUpBy2(x){
	  var y, g = true;

	  if (x > 0) {
	    y = (x / 2) * 4;
	  } else {
	    g = false;
	  }

	  y = call: convertToFactor(g, x);
	  retn y
	};

	def: convertToFactor(c, d){
	  var error_message_prefix = "Argument type error: ";

	  invariants {
	    error_message_prefix += "calling `convertToFactor(..)`; "
	    var error_message_suffix = "`c` is not a number"
	    var error_message = error_message_prefix + error_message_suffix

	    --# `$$<...>` is a macro call into the runtime internal API
	    --# for auto error propagation and chaining used here as
	    --# `$$<error_message>`
	    call: type(c, "number") -> eject_on $$<error_message>;
	    error_message_suffix = "`d` is not a number"
	    error_message = error_message_prefix + error_message_suffix
	    call: type(d, "number") -> eject_on $$<error_message>;
	  }

	  --# before the `retn` statement below executes...
	  --# ... we call the invariants below 👇🏾👇🏾
	  --# Antro bakes invariants right into the...
	  --# ... programming model of the language 💯

	  defer -> invariants {
	    call: print("leaving `convertToFactor` function");
	  }

	  retn c * d;
	};

Though the above program doesn't do anything useful for now (i.e. the parser as currently written does not yet produce an Abstract Syntax Tree (AST), nor does it generate an Intermediate Representation (IR)), one can still get to understand the basics of what's going on.

About

Module/File Imports

The require keyword is used to require/import a module (i.e. a folder) or a single source (i.e. a file) as an implicit dependency. For example, require: "sys.module"; is a statement that requires/imports a folder named "sys.module". This folder must directly contain a root.antro file. The folder can also contain other files and folders.
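The folder rule above can be sketched as a small lookup routine. This is a hypothetical illustration in Python: `resolve_module` and its `search_root` parameter are invented for the example, how antro actually locates modules on disk is not specified here, and the single-file form of require is left out.

```python
from pathlib import Path


def resolve_module(name, search_root):
    """Return the entry file of a folder module such as "sys.module".

    Assumption: the module folder lives directly under `search_root`.
    """
    module_dir = Path(search_root) / name
    if not module_dir.is_dir():
        raise FileNotFoundError(f"module folder not found: {module_dir}")
    entry = module_dir / "root.antro"  # must sit directly inside the folder
    if not entry.is_file():
        raise FileNotFoundError(f"{name} has no root.antro file")
    return entry
```

A folder that exists but lacks root.antro is rejected just like a missing folder, which mirrors the requirement that the file be directly inside the module.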

Entry Point Definition

The begin keyword is used to define the entry point of an antro program. The entry point block is terminated by the end keyword.

Other Definitions

The def keyword is used to define values in the global scope (i.e. outside functions) that cannot be changed. A name created with def is always globally scoped, whether the definition appears in a global scope or a local scope. Its value cannot be changed/mutated; it can only be copied into a variable whose value can be changed/mutated. The sample program above also uses def to define functions. NOTE: Antro makes use of lexical scoping.

Variable Creation

The var keyword is used to define variables or functions within a local scope (i.e. within functions) only. Using var in a global scope (i.e. outside functions) makes the antro parser throw a parse error. Variables created with the var keyword can have their values changed/mutated.
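The def and var rules can be summarised as a tiny checker. This is a sketch under simplifying assumptions, written in Python for brevity: statements are modelled as hypothetical `(operation, name, inside_function)` tuples rather than real antro syntax, and `check_scopes` is an invented name, not part of the antro parser.

```python
def check_scopes(statements):
    """Return a list of error messages for the given statements.

    Each statement is a tuple (operation, name, inside_function), where
    operation is one of "def", "var" or "assign".
    """
    constants = set()  # names created with def are always global
    errors = []
    for operation, name, inside_function in statements:
        if operation == "def":
            constants.add(name)
        elif operation == "var" and not inside_function:
            errors.append(f"`var {name}` is not allowed in the global scope")
        elif operation == "assign" and name in constants:
            errors.append(f"`{name}` was created with `def` and cannot be changed")
    return errors
```

For instance, `check_scopes([("def", "MAX", False), ("assign", "MAX", True)])` reports one error, because MAX may only be copied, never reassigned.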

Exception Handling - Part 1

The eject_on keyword is the antro counterpart of a catch block in other scripting languages like JavaScript. Antro does not directly expose the try/catch model for error handling. Instead, a previously created error value is used to catch other errors that occur higher up on the call stack. In this way, the try/catch block is abstracted away from the source level (hidden from the programmer) and handled by the antro compiler and runtime.
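One way to picture what the compiler and runtime might do behind `call: f(x) -> eject_on error -> use { ... }` is an ordinary try/except with exception chaining. The Python sketch below is only an analogy under that assumption: `AntroError` and `call_with_eject` are hypothetical names, and the real runtime mechanism is not documented here.

```python
class AntroError(Exception):
    """Stand-in for the value created by `call: error("...")`."""


def call_with_eject(func, args, eject_error, use_block):
    """Model of `call: func(args) -> eject_on eject_error -> use {...}`."""
    try:
        return func(*args)
    except Exception as cause:
        # Chain the original failure onto the ejected error; this is the
        # same link that `raise eject_error from cause` would create.
        eject_error.__cause__ = cause
        return use_block(eject_error)


def factor_up(x):
    if x <= 0:
        raise ValueError("x must be positive")
    return (x / 2) * 4


def describe(error):
    # Plays the role of the use block: report the error and its cause.
    return f"{error} - {error.__cause__}"
```

With these definitions, `call_with_eject(factor_up, (200,), AntroError("Program crashed"), describe)` returns 400.0, while passing `(0,)` runs the handler and yields "Program crashed - x must be positive".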

Exception Handling - Part 2

The panic_on keyword is the antro equivalent of the built-in panic function in Go, which triggers abandonment.

NOTE: Antro does not support multiple return values.

NOTE: Antro does not support enums (as they're mostly useless in any language that implements them).

NOTE: Antro has a build-flag system (i.e. --build-mode) on the CLI that relaxes the enforcement of certain compilation rules:

  • Using --build-mode=dev, any declared yet unused variable does not cause a compilation error
  • Using --build-mode=prod, any variable declaration where the right-hand side is a non-standard library API/non-literal must be typed
  • Using --build-mode=prod, any function definition without an invariants block causes a compilation error
  • Using --build-mode=dev, any call to panic_on (directly or indirectly) outside of a use block does not cause a compilation error

NOTE: Antro only has 2 broad classifications for errors:

  • Recoverable Errors
  • Non-recoverable Errors

Exception Handling - Part 3

The use keyword is the antro equivalent of the finally keyword in languages like Java, C#, Python or PHP. Yet, it is used specifically to

Invariants

The invariants keyword is used to set up invariants within a local scope (i.e. within functions). For the design of antro, I believe that invariants ought to be baked into the programming model (i.e. the programming language). In the future, I plan to set up macros, like those used in Rust, to make the invariants block shorter and more compact. All function definitions MUST contain an invariants block, else the antro runtime will throw an error.
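As a rough analogy for an invariants block, the Python sketch below attaches argument checks to a function with a decorator. It rests on assumptions: the checks run on the arguments before the body produces its return value, which is one reading of the sample program, and `invariants`, `is_number` and `convert_to_factor` are illustrative names rather than antro APIs.

```python
import functools


def invariants(*checks):
    """Attach (predicate, message) pairs that must hold for the arguments."""
    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args):
            for predicate, message in checks:
                if not predicate(*args):
                    raise TypeError(message)
            return func(*args)
        return wrapper
    return decorate


def is_number(value):
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


@invariants(
    (lambda c, d: is_number(c), "`c` is not a number"),
    (lambda c, d: is_number(d), "`d` is not a number"),
)
def convert_to_factor(c, d):
    return c * d
```

Calling `convert_to_factor(3, 4)` returns 12, whereas `convert_to_factor(True, 4)` raises a TypeError carrying the message for `c`.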

Deferring Action

The defer keyword is the antro equivalent of the defer keyword in Golang.
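For readers who have not used Go, the idea of defer is that an action is registered now and runs when the enclosing function exits. A close Python analogue is `contextlib.ExitStack`, sketched below; this is an analogy only, not a description of how antro implements defer, and the `log` parameter is added purely to make the ordering observable.

```python
from contextlib import ExitStack


def convert_to_factor(c, d, log):
    with ExitStack() as deferred:
        # Registered now, executed when the `with` block is left,
        # including when it is left through `return`.
        deferred.callback(log.append, "leaving `convertToFactor` function")
        log.append("computing")
        return c * d
```

After `convert_to_factor(2, 3, log)` with an empty list `log`, the list holds "computing" followed by the deferred message, showing that the deferred action ran last.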

Function Output

The retn keyword is used to return a value from a function definition or begin block.

Limiting Scope

The static keyword (similar to static in the C programming language) is used in antro to limit the lexical scope of a function or variable to the module source file in which it is declared.

License

This is released under the MIT license.

Design Inspiration

The antro language design was inspired by a combination of C, Go, Zig, Python and TypeScript.
