中文|Engilsh
punyGo is a minimalist programming language implemented in Go. It includes a simple lexer, parser, Abstract Syntax Tree (AST), interpreter, and a REPL (Read-Eval-Print Loop). The language currently supports integer arithmetic operations, variable assignment, and the use of parentheses to control operation precedence.
This project is inspired by the book "Writing An Interpreter In Go" and aims to provide a learning resource for building interpreters and compilers.
- Features
- Getting Started
- Usage Examples
- Project Structure
- Project Case Study
- Contribution Guidelines
- License
- Integer Arithmetic Operations: Supports
-
,*
,/
operators - Variable Assignment: Use the
let
keyword for variable declaration and assignment - Parentheses Precedence: Use parentheses to control the order of operations
- REPL: Provides an interactive programming environment
- Simple Lexer and Parser
- Abstract Syntax Tree (AST) Representation
Go programming language (version 1.17 or higher) installed. You can download and install Go from the official Go website.
-
Clone the repository:
git clone https://github.com/bistutu/punyGo.git cd punyGo
-
Build the project:
Use the provided
Makefile
to build the project.make build
This will compile the source code and generate the executable
punyGo
in thebin/
directory.
To start the punyGo REPL, run:
make run
Or execute the binary directly:
./bin/punyGo
You will see the following prompt:
Hello yourusername! This is the punyGo programming language!
Feel free to type in commands
>>
In the command line, you can input punyGo code and see the results immediately.
-
Basic Arithmetic Operations:
>> 1 2 * 3; (1 (2 * 3)); 7
-
Using Parentheses to Control Precedence:
>> (1 2) * 3; ((1 2) * 3); 9
-
Variable Assignment:
>> let x = 5; let x = 5; >> x; x; 5
-
Variable Operations:
>> let y = x 10; let y = (x 10); >> y; y; 15 >> x y; (x y); 20
-
Complex Expressions:
>> let result = (x y) * 2; let result = ((x y) * 2); >> result; result; 40
punyGo/
├── Makefile # Build and run commands
├── go.mod
├── pkg/
│ ├── ast/
│ │ └── ast.go # Defines nodes of the Abstract Syntax Tree (AST)
│ ├── evaluator/
│ │ └── evaluator.go # Interpreter for evaluating the AST
│ ├── lexer/
│ │ └── lexer.go # Lexer that tokenizes the input code
│ ├── object/
│ │ └── object.go # Defines the object system (integers, environment, etc.)
│ ├── parser/
│ │ └── parser.go # Parser that builds the AST
│ ├── repl/
│ │ └── repl.go # Implementation of the Read-Eval-Print Loop
│ └── token/
│ └── token.go # Defines tokens used by the lexer and parser
└── main.go # Main entry point of the application
Makefile
: Contains commands to build and run the project, making it easy to perform common operations.main.go
: The main entry point of the application, which starts the REPL and handles user input.pkg/
: Contains the core code of punyGo.ast/
: Definitions of the Abstract Syntax Tree (AST), describing the structure of the program.ast.go
: Defines all AST node structs and methods, used to represent and traverse the syntax structure of the program.
evaluator/
: The interpreter that evaluates the AST.evaluator.go
: Contains the core logic for evaluating expressions, handling variable assignment, and environment.
lexer/
: The lexer that converts source code into tokens.lexer.go
: Implements scanning of source code, recognizing keywords, identifiers, numbers, operators, etc.
object/
: Defines the object system, including integers, environment, etc.object.go
: Defines the object interface and concrete object types, such as integers, booleans, environment, and error objects.
parser/
: The parser that parses token sequences into the AST.parser.go
: Implements a recursive descent parser, handling expressions, prefix and infix operators, etc.
repl/
: The Read-Eval-Print Loop, implementing the interactive programming environment.repl.go
: Handles user input, invokes the lexer, parser, and interpreter, and outputs results.
token/
: Defines token types used by the lexer and parser.token.go
: Defines token types, the token struct, and the mapping from keywords to token types.
When you input a statement (e.g., let x = 5 5;
), the entire process involves multiple components such as the lexer, parser, abstract syntax tree (AST), evaluator, and object system. The lexer is responsible for breaking down the source code into tokens, the parser constructs the AST, the evaluator traverses the AST and executes the corresponding operations, ultimately achieving program execution and output. This modular design makes the interpreter highly maintainable and extensible, facilitating the support of more language features and complex syntax structures.
Below is a detailed flow description:
The statement you input, let x = 5 5;
, is passed to the interpreter as source code.
Component: Lexer
struct and related methods in the lexer
package.
Process:
-
Initialize Lexer:
- Call
lexer.New(input)
to create a newLexer
instance and initialize it with the input source code. - The
readChar
method is called to read the first character intol.ch
.
- Call
-
Tokenization:
- The
NextToken
method is repeatedly called to sequentially convert the input source code into a series of lexical tokens. - For
let x = 5 5;
, the lexer generates the following token sequence:LET
(keyword)IDENT
(identifierx
)ASSIGN
(assignment operator=
)INT
(integer5
)PLUS
(plus signINT
(integer5
)SEMICOLON
(semicolon;
)EOF
(end of file marker)
- The
Example:
lexer := lexer.New("let x = 5 5;")
tokens := []token.Token{}
for tok := lexer.NextToken(); tok.Type != token.EOF; tok = lexer.NextToken() {
tokens = append(tokens, tok)
}
Component: Parser
struct and related methods in the parser
package.
Process:
-
Initialize Parser:
- Create a new
Parser
instance using the tokens generated by the lexer:parser.New(lexer)
. - Register prefix and infix parsing functions to parse different types of expressions and operators.
- Create a new
-
Construct AST:
- Call the
ParseProgram
method to begin parsing the entire program. - The
parseStatement
method parses different types of statements based on the current token type. - For
let x = 5 5;
, the parser constructs the following AST structure:
- Call the
Program
└── LetStatement
├── Name: Identifier(x)
└── Value: InfixExpression
├── Left: IntegerLiteral(5)
├── Operator:
└── Right: IntegerLiteral(5)
Example:
parser := parser.New(lexer)
ast := parser.ParseProgram()
if len(parser.Errors()) != 0 {
// Handle parsing errors
}
Components: Evaluator
in the evaluator
package and the object system in the object
package.
Process:
-
Initialize Environment:
- Create a new environment instance to store variables and their values:
env := object.NewEnvironment()
.
- Create a new environment instance to store variables and their values:
-
Traverse and Evaluate AST:
- Call
evaluator.Eval(ast, env)
to start evaluating the AST. - The evaluation process is as follows:
- Program Node:
- Traverse its child node
LetStatement
.
- Traverse its child node
- LetStatement Node:
- Evaluate the assignment expression
5 5
:- InfixExpression Node:
- Evaluate the left operand
5
, resulting in anInteger
object. - Evaluate the right operand
5
, resulting in anInteger
object. - Perform the addition operation, resulting in an
Integer
object10
.
- Evaluate the left operand
- InfixExpression Node:
- Bind the variable
x
to theInteger
object10
, stored in the environment.
- Evaluate the assignment expression
- Program Node:
- Call
Example:
result := evaluator.Eval(ast, env)
if errObj, ok := result.(*object.Error); ok {
fmt.Println(errObj.Message)
} else {
// Handle other types of results
}
For a let
statement, there is no direct output, as it is a variable declaration and assignment operation. However, the variable x
has been stored in the environment and can be used in subsequent expressions or statements.
Example:
// Suppose there is a subsequent expression `x 10;`
lexer := lexer.New("let x = 5 5; x 10;")
parser := parser.New(lexer)
ast := parser.ParseProgram()
env := object.NewEnvironment()
result := evaluator.Eval(ast, env)
if errObj, ok := result.(*object.Error); ok {
fmt.Println(errObj.Message)
} else {
fmt.Println(result.Inspect()) // Output: 20
}
-
Input Source Code:
let x = 5 5;
-
Lexical Analysis:
Tokenize into a sequence:
LET
,IDENT(x)
,ASSIGN
,INT(5)
,PLUS
,INT(5)
,SEMICOLON
,EOF
. -
Parsing:
Construct AST:
Program └── LetStatement ├── Name: Identifier(x) └── Value: InfixExpression( ) ├── Left: IntegerLiteral(5) ├── Operator: └── Right: IntegerLiteral(5)
-
Evaluation:
- Evaluate the expression
5 5
, resulting in10
. - Bind
x
to10
.
- Evaluate the expression
-
Output:
- The
let
statement does not produce direct output, but the variablex
can be used in subsequent operations.
- The
If there are more statements or expressions in the source code, the evaluator will continue to traverse the AST, evaluating each node in sequence and updating the environment as needed. For example:
let x = 5 5;
x 10;
When evaluating x 10;
, the evaluator retrieves the value of x
(which is 10
) from the environment, then performs the addition operation, ultimately outputting 20
.
Throughout the process, if errors are encountered during lexical analysis, parsing, or evaluation (e.g., undefined variables, type mismatches, syntax errors), the corresponding error messages will be recorded and output, ensuring the robustness of the program and ease of debugging.
Contributions are welcome! Please feel free to submit pull requests or open issues.
Areas you can contribute to include but are not limited to:
- Error Handling: Improve error messages in the lexer, parser, and interpreter.
- New Features:
- Implement boolean types and operations.
- Add conditional statements (
if-else
). - Support function definitions and calls, as well as scope management.
- Extend the object system to include more data types (strings, arrays, etc.).
- Optimization: Enhance the performance of the interpreter.
- Documentation: Enrich documentation and add more examples.
This project is licensed under the MIT License. For details, please refer to the LICENSE file.