Create a programming language
Python is a high-level language. It should be first interpreted into bytecode by interpreter written by C, and then translated into machine learning to be executed.
A powerful form of abstraction is to define a new language that is tailored to a particular type of application or problem domain.
A programming language has:
- Syntax: The legal statements and expressions in the language
- Semantics: The execution/evaluation rule for those statements and expressions
For example, if now we want to create a language called “Calculator”. We need to define:
- Syntax:
- The Calculator language has primitive expressions and call expressions. (That’s it!)
- A primitive expression is a number:
2
-4
5.6
; A call expression is a combination that begins with an operator (+, -, *, /) followed by 0 or more expressions:(+ 1 2 3)
(/ 3 (+ 4 5)) 17
- Expressions are represented as Scheme lists (Pair instances) that encode tree structures.
- Senmantics: The value of a calculator expression is defined recursively.
- Primitive: A number evaluates to itself.
- Call: A call expression evaluates to its argument values combined by an operator.
- +: Sum of the arguments
- *: Product of the arguments
- -: If one argument, negate it. If more than one, subtract the rest from the first.
- /: If one argument, invert it. If more than one, divide the rest from the first.
To create a new programming language, you either need a:
- Specification: A document describe the precise syntax and semantics of the language
- Canonical Implementation: An interpreter or compiler for the language
Interpreter
So now let’s create an interpreter for Calculator.
Structure of Interpreter
As we use eval
to evaluate expressions in python interpreter, we define a calc_eval
as below:
def calc_eval(exp):
if isinstance(exp, (int, float)):
return exp
elif isinstance(exp, Pair):
return calc_append(exp.first, exp.rest.map(calc_eval))
else:
raise TypeError
def calc_apply(operator, args):
if operator == '+':
return reduce(add, args, 0)
elif operator == '-':
return reduce(sub, args, 0)
elif operator == '*':
return reduce(mul, args, 1)
elif operator == '/':
return reduce(truediv, args, 1)
else:
raise TypeError
def reduce(fn, lst, initial):
for item in lst:
initial = fn(initial, item)
return initial
# interactive interpreter
@main
def read_eval_print_loop():
"""run a read-eval-print loop for Calculator"""
while True:
try:
src = buffer_input()
while src.more_on_line:
expression = scheme_read(src)#parsing
print(calc_eval(expression))#evaluation
except (SyntaxError, TypeError, ValueError, ZeroDivisionError) as err:
print(type(err).__name__, ":", err)
except (KeyboardInterrupt, EOFError):
print('Calculation completed.')
return
Special forms
Calculate is a simple example, which only includes a very small part of Scheme. How do interpreters process special forms in Scheme?
Special forms are identified by the first list element, like:
(if <predicate> <consequent> <alternative>)
(lambda (<formal-parameters>) <body>)
(define <name> <expression>)
Any combination that is not a known special form is a call expression.
Parsing
The task of parsing a language involves coercing a string representation of an expression. to the expression itself.