RL-Optimized Source-to-Source Compiler

Compilation Pipeline

Lexical Analysis

Breaks down source code into individual tokens (keywords, operators, identifiers, etc.)
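
As a rough illustration of this stage, the sketch below tokenizes the language constructs shown later on this page using one combined regular expression. The token categories, the `Token` type, and the `tokenize` function are assumptions made for this example, not the compiler's actual lexer.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str


# Order matters: comments are matched before the "/" operator, and
# keywords are matched before general identifiers.
TOKEN_SPEC = [
    ("SKIP",    r"\s+|//[^\n]*"),
    ("NUMBER",  r"\d+(?:\.\d+)?"),
    ("STRING",  r'"[^"\n]*"'),
    ("KEYWORD", r"\b(?:if|else|while|print|scan)\b"),
    ("IDENT",   r"[A-Za-z_]\w*"),
    ("OP",      r"\+=|-=|==|!=|>=|<=|&&|\|\||[+\-*/<>=!]"),
    ("PUNCT",   r"[(){};]"),
    ("ERROR",   r"."),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source: str) -> List[Token]:
    """Split source text into tokens, dropping whitespace and // comments."""
    tokens = []
    for match in MASTER.finditer(source):
        kind = match.lastgroup
        if kind == "SKIP":
            continue
        if kind == "ERROR":
            raise SyntaxError(
                f"unexpected character {match.group()!r} at offset {match.start()}"
            )
        tokens.append(Token(kind, match.group()))
    return tokens
```

For example, `tokenize("x += 3;  // x = x + 3")` yields four tokens (identifier, operator, number, punctuation) and discards the trailing comment.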

Abstract Syntax Tree

Builds a tree structure representing the grammatical structure of the program
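
To make the idea concrete, here is a small recursive-descent parser for arithmetic expressions. The nested-tuple node shape and the name `parse_expression` are assumptions chosen for this sketch; the compiler's real AST format is not documented here.

```python
import re


def parse_expression(source: str):
    """Parse +, -, *, / and parentheses into a nested-tuple AST."""
    tokens = re.findall(r"\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*/()]", source)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def factor():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if token[0].isdigit():
            return ("num", float(token) if "." in token else int(token))
        if token[0].isalpha() or token[0] == "_":
            return ("var", token)
        raise SyntaxError(f"unexpected token {token!r}")

    def term():
        # * and / bind tighter than + and -, so they are parsed one level down.
        node = factor()
        while peek() in ("*", "/"):
            op = advance()
            node = (op, node, factor())
        return node

    def expr():
        node = term()
        while peek() in ("+", "-"):
            op = advance()
            node = (op, node, term())
        return node

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Under these assumptions, `parse_expression("a + b * 2")` places the multiplication below the addition in the tree, which is how operator precedence is encoded.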

Semantic Analysis

Checks for semantic errors like undefined variables and type mismatches
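
A minimal sketch of one such check, detecting variables that are read before they are defined. The statement format `(kind, target, used_names)` and the function `check_undefined` are hypothetical, and the check only handles straight-line code (no branches or loops).

```python
def check_undefined(statements):
    """Return error messages for variables read before any assignment or scan."""
    defined = set()
    errors = []
    for number, (kind, target, used) in enumerate(statements, start=1):
        for name in used:
            if name not in defined:
                errors.append(f"statement {number}: undefined variable '{name}'")
        # Both assignment and scan(...) introduce a variable.
        if kind in ("assign", "scan"):
            defined.add(target)
    return errors


program = [
    ("assign", "x", []),      # x = 10;
    ("print", None, ["x"]),   # print(x);
    ("print", None, ["y"]),   # print(y);  y was never defined
]
errors = check_undefined(program)
```

Here `errors` contains a single message for `y` in the third statement. A type-mismatch check would follow the same pattern, tracking a type per variable instead of a set of names.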

Intermediate Code

Generates three-address code as an intermediate representation
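
The following sketch shows one way an expression tree could be flattened into three-address code. The instruction shape `(dest, op, arg1, arg2)`, the `t1`, `t2`, ... temporary names, and the nested-tuple input format are assumptions for illustration only.

```python
from itertools import count


def to_three_address(tree, target):
    """Flatten a nested-tuple expression into (dest, op, arg1, arg2) instructions."""
    code = []
    temps = count(1)

    def emit(node):
        kind = node[0]
        if kind in ("num", "var"):
            return node[1]              # leaves are used directly as operands
        left = emit(node[1])
        right = emit(node[2])
        temp = f"t{next(temps)}"        # each operation gets a fresh temporary
        code.append((temp, kind, left, right))
        return temp

    result = emit(tree)
    code.append((target, "=", result, None))
    return code


# x = a + b * 2
tree = ("+", ("var", "a"), ("*", ("var", "b"), ("num", 2)))
instructions = to_three_address(tree, "x")
```

Each instruction has at most one operator and two operands, which is what makes later passes such as constant folding simple to express.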

RL-Based Optimization

Uses reinforcement learning to optimize code through constant folding, dead code elimination, etc.
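
Two of the named transformations can be sketched as plain functions over three-address code. This example assumes a hypothetical `(dest, op, arg1, arg2)` instruction format in which operands are either numbers or variable names, and it is only valid for straight-line code. Division is left out of folding to avoid evaluating a division by zero at compile time.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def fold_constants(code):
    """Evaluate operations whose operands are known numbers (with simple propagation)."""
    known = {}      # variable name -> constant value it currently holds
    folded = []
    for dest, op, a, b in code:
        a = known.get(a, a) if isinstance(a, str) else a
        b = known.get(b, b) if isinstance(b, str) else b
        if op == "=" and isinstance(a, (int, float)):
            known[dest] = a
            folded.append((dest, "=", a, None))
        elif op in OPS and isinstance(a, (int, float)) and isinstance(b, (int, float)):
            value = OPS[op](a, b)
            known[dest] = value
            folded.append((dest, "=", value, None))
        else:
            known.pop(dest, None)   # dest no longer holds a known constant
            folded.append((dest, op, a, b))
    return folded


def is_temp(name):
    return name.startswith("t") and name[1:].isdigit()


def remove_dead_temps(code):
    """Drop assignments to temporaries (t1, t2, ...) whose value is never read."""
    live = set()
    kept = []
    for dest, op, a, b in reversed(code):
        if is_temp(dest) and dest not in live:
            continue
        live.discard(dest)
        live.update(x for x in (a, b) if isinstance(x, str))
        kept.append((dest, op, a, b))
    kept.reverse()
    return kept


code = [("t1", "*", 2, 3), ("t2", "+", "t1", 4), ("x", "=", "t2", None)]
optimized = remove_dead_temps(fold_constants(code))
```

In this example the three instructions collapse to a single assignment of the constant 10 to `x`. In the compiler described here, the RL agent decides which transformation to apply and when; the functions above only illustrate what an individual action might do.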

Python Code Generation

Translates optimized intermediate code into executable Python
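
For straight-line code, this translation can be nearly one instruction per line. The sketch below assumes a hypothetical `(dest, op, arg1, arg2)` instruction format with an extra `print` operation; it is not the compiler's actual code generator and does not cover control flow.

```python
def generate_python(code):
    """Translate straight-line three-address instructions into Python source."""

    def operand(value):
        # Variable names are emitted as-is; numeric constants via repr().
        return value if isinstance(value, str) else repr(value)

    lines = []
    for dest, op, a, b in code:
        if op == "=":
            lines.append(f"{dest} = {operand(a)}")
        elif op == "print":
            lines.append(f"print({operand(a)})")
        else:
            lines.append(f"{dest} = {operand(a)} {op} {operand(b)}")
    return "\n".join(lines)


# x = 5; x += 3; print(x);
source = generate_python([
    ("x", "=", 5, None),
    ("t1", "+", "x", 3),
    ("x", "=", "t1", None),
    (None, "print", "x", None),
])
```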

Execution Results

Secure execution of the generated Python code with captured output

Language & Compiler Overview

This custom programming language and its accompanying Reinforcement Learning (RL) optimized source-to-source compiler represent a novel approach to automated code optimization. The compiler translates programs written in the custom language directly into optimized Python through a multi-pass pipeline: tokenization, syntax analysis, semantic validation, intermediate code generation, and RL-driven optimization.

The compiler examines every branch of the intermediate representation (three-address code), which lets the RL agent explore code restructuring options before the final Python code is generated.

Reinforcement Learning Architecture

Unlike traditional compilers, which apply static heuristic rules (such as a fixed sequence of dead-code elimination passes), this compiler integrates a Q-learning agent that treats code optimization as a sequential decision problem over a state space.

State Representation: The RL agent reads a hash of the intermediate code at specific token indices, covering assignments, logic operations, and control branches.
Action Space: The agent can apply four distinct transformations: Constant Folding, Loop-Invariant Code Motion, Common Subexpression Elimination, and Dead Code Removal.
Epsilon-Greedy Logic: Over repeated executions, the agent balances exploiting known optimization paths (Q-table lookups) against occasionally exploring new transformation sequences to escape local optima.
Reward Function: Reductions in code size, loop iterations, and runtime yield positive scalar rewards, which are used to update the Q-table values.
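
The four components above can be sketched as a small tabular Q-learning agent. Everything named here (`QAgent`, `state_of`, `size_reward`, the hyperparameter defaults, and the action identifiers) is an assumption made for illustration; the compiler's actual state hashing and reward weighting are not specified on this page.

```python
import random
from collections import defaultdict

ACTIONS = [
    "constant_folding",
    "loop_invariant_motion",
    "common_subexpression_elimination",
    "dead_code_removal",
]


def state_of(code):
    """Hash a list of instruction tuples into a Q-table key."""
    return hash(tuple(code))


def size_reward(before, after):
    """Positive reward when a transformation shrinks the code."""
    return float(len(before) - len(after))


class QAgent:
    def __init__(self, alpha=0.1, gamma=0.9, epsilon=0.2, seed=None):
        self.alpha = alpha        # learning rate
        self.gamma = gamma        # discount factor
        self.epsilon = epsilon    # exploration probability
        self.q = defaultdict(float)   # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def choose(self, state):
        # Explore with probability epsilon, otherwise exploit the Q-table.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(ACTIONS)
        return max(ACTIONS, key=lambda action: self.q[(state, action)])

    def update(self, state, action, reward, next_state):
        # Standard Q-learning update toward reward + gamma * max_a' Q(s', a').
        best_next = max(self.q[(next_state, a)] for a in ACTIONS)
        td_target = reward + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (td_target - self.q[(state, action)])
```

A typical loop would call `choose` on the current state, apply the selected transformation, compute a reward such as `size_reward`, and then call `update` with the new state.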

Variables and Assignment

Basic Assignment

x = 10;
name = "John";
print(x);
print(name);

Output:

10
John

Compound Assignment

x = 5;
x += 3;  // x = x + 3
print(x);

Output:

8

Arithmetic Operations

a = 10;
b = 3;
print(a + b);
print(a - b);
print(a * b);
print(a / b);

Output:

13
7
30
3.3333333333333335

Control Flow

If Statements

age = 20;
if (age >= 18) {
    print("Adult");
} else {
    print("Minor");
}

Output:

Adult

While Loops

counter = 1;
while (counter <= 3) {
    print(counter);
    counter += 1;
}

Output:

1
2
3

Input and Output

print("Enter name:");
scan(name);
print("Hello");
print(name);

Output (with input "Alice"):

Enter name:
Hello
Alice

Operators

Arithmetic Operators

+ Addition
- Subtraction
* Multiplication
/ Division

Comparison Operators

> Greater than
< Less than
>= Greater than or equal
<= Less than or equal
== Equal to
!= Not equal to

Logical Operators

&& Logical AND
|| Logical OR
! Logical NOT

Assignment Operators

= Assign
+= Add and assign
-= Subtract and assign

System Security & Sandboxing

To safely handle hazardous programs and deeply nested adversarial AST structures, the compiler uses a multi-process execution sandbox rather than standard multi-threading.

Multi-Process Isolation: Generated Python runs in a separate process, so infinite loops (`while(1)`) can be forcefully terminated on timeout, after which the operating system reclaims the process's resources.
AST Depth Protection: System recursion limits are configured to protect against the deeply nested trees produced by aggressive fuzz testing.
Zero-Division Safety: The Python translation wraps each division in a conditional expression (`left / right if right != 0 else 0`), so a division by zero evaluates to 0 instead of raising an exception at run time.
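
The process isolation and the division guard can be sketched together as follows. This example uses the standard library's `subprocess.run` with a timeout as a stand-in for whatever process mechanism the compiler actually uses; `safe_division`, `run_sandboxed`, and the result dictionary are hypothetical names. Note that this provides timeout enforcement and process separation only, not file-system or network restrictions.

```python
import subprocess
import sys


def safe_division(left: str, right: str) -> str:
    """Build the guarded Python expression for left / right."""
    # The right operand appears twice, so this assumes it has no side effects.
    return f"({left} / {right} if {right} != 0 else 0)"


def run_sandboxed(python_source: str, stdin_text: str = "", timeout: float = 2.0) -> dict:
    """Run generated code in a child interpreter, killing it if it exceeds the timeout."""
    try:
        completed = subprocess.run(
            [sys.executable, "-I", "-c", python_source],   # -I: isolated mode
            input=stdin_text,
            capture_output=True,
            text=True,
            timeout=timeout,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising TimeoutExpired.
        return {"status": "timeout", "output": "", "errors": f"terminated after {timeout}s"}
    status = "ok" if completed.returncode == 0 else "error"
    return {"status": status, "output": completed.stdout, "errors": completed.stderr}
```

With this sketch, `run_sandboxed("while True: pass", timeout=1.0)` returns a result whose status is `"timeout"` rather than hanging the caller, and a thread-based design could not offer the same guarantee because Python threads cannot be forcibly stopped.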
