My own programming language interpreter

source code and technical insights: github

This time I was asked at Uni to write my own programming language. It could be imperative, functional or logical but had to meet some general requirements. One of them was that we had to create its interpreter in Haskell. I decided to go the standard way and implemented a language based on a Latte syntax but added two fancy features in order to not copy something one to one.

Parsing

First task was to parse a source code and convert it into so-called Abstract Syntax Tree (AST) which is evaluated later on. Thankfully, there is already a ready-to-use tool that automates this job. I used BNF Converter and all I had to do was to provide grammar in a special format. BNFC would then generate the Haskell skeleton for the language logic code, parser and lexer for my language. There was a problem with obtaining line numbers for each token so I had to use a specific version which supports functors.

Semantics

Semantics is, simply speaking, the meaning of various commands in my programming language. And it is quite intuitive. With familiar C-based syntax, primitive types, logic, arithmetic, variables, ifs, loops, functions, static binding, etc. (full list below). But, as mentioned before, I added some extras.

Multi break
Typical break has an additional optional integer argument which indicates how many loops it should break out of.

int main () {
    int i = 0, n = 3;
    while (i <= n) {
        int j = 0;
        while (j <= n) {
            break 2;
            j = j + 1;
        }
        i = i + 1;
    }
    print i;
    
    return 0;
}

Output:

Cost blocks
You can enclose some part of the code with a named block and get the simple benchmark of the executed part at the end. Cost is a trivial counting of logic, arithmetic operations and allocated memory.

int nwd(int x, int y) {
	if (y == 0) return x;
	return nwd(y, x % y);
}

int main () {
    cost A {
        bool x = 1 + 2 * 3 > 3 + 2 * 1 ;
    }
    
    cost B {
        bool x = 1 + 2 * 3 > 3 + 2 * 1 ;
    }

    cost A {
        bool x = 1 + 2 * 3 > 3 + 2 * 1 ;
    }

    cost C {
        print nwd(832426, 498372);
    }

    print "Cost Blocks!";
    return 0;      
}

Output:

14
Cost Blocks!

======================
Cost of A
Arithmetic :	4 ops
Logical ops:	1 ops
Memory alloc:	1 B

Cost of B
Arithmetic :	8 ops
Logical ops:	2 ops
Memory alloc:	2 B

Cost of C
Arithmetic :	7 ops
Logical ops:	8 ops
Memory alloc:	64 B

Features list

Three primitive types (Int, Char, String)
Literals, arithmetic, comparison
Variables, assignments
Printing (built-in method)
While, If
Functions, recursion
Passing arguments by value or by reference
Variable shadowing, static binding
Handling runtime errors
Break, continue
Multi break
Counting cost of program execution in named blocks

Takeaways

Writing an interpreter with functional language turned out to be really neat, like it had been tailored to this specific task. Along the way, I was able to look at the programming languages in the bigger picture. I got used to the functional paradigm (it was a mind-opening experience), learned basic concepts of Haskell, used State Monad and understood static and dynamic binding, environment. Who knows, maybe I will create a serious language one day.

Piotr Borowski

CS student exploring Web development, with algorithmic background

My own programming language interpreter

Parsing

Semantics

Features list

Takeaways