
nessi 0da224325a chore: initialize NexaCore compiler workspace with basic frontend and CLI

Add initial project structure for NexaCore programming language compiler:
- Create Cargo workspace with 4 crates (cli, driver, frontend, runtime)
- Add lexer with indentation-based tokenization and keyword support
- Add parser for modules, functions, structs, and basic expressions
- Implement CLI with build command and placeholder subcommands
- Add driver crate to orchestrate compilation pipeline
- Include .gitignore for Rust build

2026-04-06 16:57:54 +02:00


NexaCore Foundation

1. Language Vision

What NexaCore is

NexaCore is a compiled backend language designed for APIs, database-heavy services, internal platforms, and system daemons. The language aims to keep code visually simple and readable while enforcing stronger correctness guarantees than Python. It prioritizes predictable performance, structured concurrency, explicit error handling, and batteries-included backend tooling.

Target users

backend engineers building REST APIs and service layers
platform teams building internal tools and service orchestration
companies replacing Python microservices that have grown too dynamic or too slow
teams that want a simpler language than Rust for application-level backend work

Why it is better suited for backend systems than Python

compiled deployment artifact instead of shipping source trees and interpreter environments
static typing with local inference catches type errors at compile time instead of in production
explicit error model improves reliability for service code
structured async runtime designed around network and database workloads
first-class PostgreSQL and HTTP support as standard capabilities, not bolted-on frameworks
smaller operational surface for packaging, startup, and deployment
stronger encapsulation: shipped binaries are harder to read than shipped source code

Design goals

keep syntax approachable and easy to scan
optimize for backend productivity, not language cleverness
compile to efficient deployable artifacts
make async IO, HTTP routing, and PostgreSQL first-class
provide strong type safety with low annotation burden
support Linux first with a path to Windows later
keep tooling simple: new, build, run, test, fmt, add, doc

Non-goals

replacing C or Rust for kernel, driver, or embedded programming
full zero-cost manual memory control in the MVP
metaprogramming-heavy language features in version one
multiple inheritance, operator overloading, or macros in the MVP
universal frontend or browser runtime in the MVP

2. Language Features

MVP feature set

Variables: immutable by default with let, mutable with var
Functions: named functions with return types and local type inference
Structs: product types with methods and visibility control
Modules/imports: file-based modules with package namespaces
If/else: expression-friendly branching
Match: exhaustive matching on enums and literals, with guards
Loops: for, while, and iterator-based traversal
Error handling: Result<T, E> and ? propagation, with defer planned for later
Async/await: structured async for network and database operations
Database access: typed query APIs and row-to-struct mapping
HTTP API: built-in routing and request/response abstractions in stdlib
Package management: first-party package manifest and lockfile

Typing model

Static typing with local inference is the right MVP choice. NexaCore should infer obvious local types while requiring type signatures on public functions, struct fields, and externally visible module boundaries. This gives Python-like authoring speed without Python’s runtime ambiguity.
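
The local inference described above can be sketched in Rust, the language the compiler is bootstrapped in. This is a minimal illustration, not the project's type checker: the `Type` and `Expr` enums and the `infer` function are hypothetical names, and only literals, variables, and integer addition are covered.

```rust
use std::collections::HashMap;

/// Hypothetical subset of NexaCore types, for illustration only.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Int,
    Bool,
    Str,
}

/// Hypothetical expression nodes; a real AST would also carry spans.
pub enum Expr {
    Int(i64),
    Bool(bool),
    Str(String),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Infers the type of a local expression from literals and already-typed names.
/// `env` holds the names whose types are known, e.g. annotated parameters.
pub fn infer(expr: &Expr, env: &HashMap<String, Type>) -> Result<Type, String> {
    match expr {
        Expr::Int(_) => Ok(Type::Int),
        Expr::Bool(_) => Ok(Type::Bool),
        Expr::Str(_) => Ok(Type::Str),
        Expr::Var(name) => env
            .get(name)
            .cloned()
            .ok_or_else(|| format!("unknown name '{}'", name)),
        Expr::Add(lhs, rhs) => match (infer(lhs, env)?, infer(rhs, env)?) {
            (Type::Int, Type::Int) => Ok(Type::Int),
            (a, b) => Err(format!("cannot add {:?} and {:?}", a, b)),
        },
    }
}
```

With `port` recorded as `Int` in `env`, the expression `port + 1` infers to `Int` without an annotation, while `1 + true` is rejected before any code is generated.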

Memory model

The MVP should use automatic memory management through reference-counted heap objects, with arena ownership inside the compiler and runtime internals. This is simpler than a full tracing GC and easier to implement safely than Rust-like borrow checking in a new language. Long term, the language can evolve toward region-based optimization and escape analysis.
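
To make the reference-counting behavior concrete, here is a small Rust sketch that uses `std::rc::Rc` as a stand-in. It only illustrates the general technique (an object is freed when its last reference goes away); the NexaCore runtime representation is not specified here, and `Tracked` and `refcount_demo` are hypothetical names.

```rust
use std::cell::Cell;
use std::rc::Rc;

/// A heap object that records when it is freed (illustrative only).
pub struct Tracked {
    pub freed: Rc<Cell<bool>>,
}

impl Drop for Tracked {
    fn drop(&mut self) {
        // Runs exactly once, when the last reference is released.
        self.freed.set(true);
    }
}

/// Returns (count with one owner, count with two owners,
/// freed after first drop, freed after last drop).
pub fn refcount_demo() -> (usize, usize, bool, bool) {
    let freed = Rc::new(Cell::new(false));
    let a = Rc::new(Tracked { freed: Rc::clone(&freed) });
    let one_owner = Rc::strong_count(&a);

    let b = Rc::clone(&a); // second reference to the same heap object
    let two_owners = Rc::strong_count(&a);

    drop(a); // one reference remains, so the object stays alive
    let freed_after_first = freed.get();

    drop(b); // last reference released, so the object is freed
    let freed_after_last = freed.get();

    (one_owner, two_owners, freed_after_first, freed_after_last)
}
```

Calling `refcount_demo()` returns `(1, 2, false, true)`: the object survives the first drop and is released on the last one.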

Concurrency model

Structured async concurrency for IO-bound backend work is the default. The language runtime owns the async scheduler and task model. Shared-state threads are not part of the first language surface; background workers and task spawning go through runtime primitives.

3. Syntax Design

NexaCore should be indentation-aware for readability but use explicit block starters so the parser remains simple and code stays visually structured. A colon starts a block, and indentation ends it.
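
The "colon starts a block, indentation ends it" rule can be handled with an indentation stack in the lexer. The following Rust sketch is a simplified illustration under stated assumptions: `Tok` and `layout_tokens` are hypothetical names, indentation uses spaces only, blank lines are skipped, and bracketed expressions that span several lines (as in the `Response.json({ ... })` example later in this section) are not handled.

```rust
/// Layout tokens only; a real lexer would also emit keywords, identifiers, etc.
#[derive(Debug, PartialEq, Clone)]
pub enum Tok {
    Line(String), // trimmed content of one source line
    Indent,       // a new block opened
    Dedent,       // a block closed
}

/// Converts source text into Line/Indent/Dedent tokens.
pub fn layout_tokens(src: &str) -> Result<Vec<Tok>, String> {
    let mut stack = vec![0usize]; // indentation widths of the open blocks
    let mut out = Vec::new();
    let mut prev_opens_block = false; // did the previous line end with ':'?

    for (idx, raw) in src.lines().enumerate() {
        let text = raw.trim_end();
        if text.trim().is_empty() {
            continue; // blank lines do not affect layout
        }
        let content = text.trim_start();
        let width = text.len() - content.len();

        if width > *stack.last().unwrap() {
            // Deeper indentation is only legal right after a block starter.
            if !prev_opens_block {
                return Err(format!("line {}: unexpected indent", idx + 1));
            }
            stack.push(width);
            out.push(Tok::Indent);
        } else {
            if prev_opens_block {
                return Err(format!("line {}: expected an indented block", idx + 1));
            }
            // Close every block that is deeper than this line.
            while width < *stack.last().unwrap() {
                stack.pop();
                out.push(Tok::Dedent);
            }
            if width != *stack.last().unwrap() {
                return Err(format!("line {}: inconsistent dedent", idx + 1));
            }
        }
        prev_opens_block = content.ends_with(':');
        out.push(Tok::Line(content.to_string()));
    }

    if prev_opens_block {
        return Err("end of input: expected an indented block".to_string());
    }
    // Close any blocks still open at end of input.
    while stack.len() > 1 {
        stack.pop();
        out.push(Tok::Dedent);
    }
    Ok(out)
}
```

For a two-line function such as the hello world example below, this yields the header line, an `Indent`, the body line, and a closing `Dedent`. The parser can then treat `Indent` and `Dedent` like opening and closing braces.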

Hello world

fn main() -> Int:
    print("Hello, NexaCore")
    0

Variables

let host = "127.0.0.1"
var port: Int = 8080
let debug = true

Functions

fn add(a: Int, b: Int) -> Int:
    a + b

Structs and methods

pub struct User:
    id: Int
    email: String

impl User:
    fn display(self) -> String:
        "{self.id}:{self.email}"

REST API endpoint

use web.http.{App, Request, Response}

fn health(_req: Request) -> Response:
    Response.json({
        "status": "ok"
    })

PostgreSQL query

use db.postgres.{Pool, query}

async fn load_user(pool: Pool, id: Int) -> Result<User, DbError>:
    let row = await query(pool,
        "select id, email from users where id = $1",
        [id]
    )?.one()

    row.into<User>()

Async function

async fn fetch_profile(user_id: Int) -> Result<Profile, AppError>:
    let profile = await profiles.load(user_id)?
    profile

Error handling

fn parse_port(raw: String) -> Result<Int, ConfigError>:
    match raw.to_int():
        ok(value) => value
        err(_) => fail ConfigError.invalid("PORT must be numeric")

Imports and modules

use core.env
use web.http.{App, Response}
use db.postgres.Pool

4. Technical Architecture

Compiler pipeline

Lexer: converts UTF-8 source into tokens, including indentation-sensitive block tokens.
Parser: builds an AST from tokens using recursive descent.
AST: stores module declarations, items, statements, expressions, types, and spans.
Semantic analyzer: resolves names, module symbols, scopes, and visibility.
Type checker: infers local types, validates function signatures, and resolves generic instantiations.
HIR and MIR: HIR for resolved source-level structure, MIR for lowered control flow and typed operations.
Backend code generation: emits portable C in the MVP, which is then compiled with a system C compiler.
Binary output: native executable or shared object linked with the NexaCore runtime.
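
As an illustration of the recursive descent technique named in the parser stage, here is a minimal Rust sketch for arithmetic expressions. It is not the parser in this repository: `Expr`, `Parser`, and `parse` are hypothetical names, the tokenizer is deliberately crude, and spans and diagnostics are omitted. Each grammar rule becomes one method, and operator precedence falls out of which rule calls which.

```rust
#[derive(Debug, PartialEq)]
pub enum Expr {
    Int(i64),
    Name(String),
    Binary(Box<Expr>, char, Box<Expr>),
}

pub struct Parser {
    toks: Vec<String>,
    pos: usize,
}

fn is_name(t: &str) -> bool {
    let mut chars = t.chars();
    matches!(chars.next(), Some(c) if c.is_alphabetic() || c == '_')
        && chars.all(|c| c.is_alphanumeric() || c == '_')
}

impl Parser {
    pub fn new(src: &str) -> Self {
        // Crude tokenizer: pad operators and parentheses with spaces, then split.
        let mut spaced = String::new();
        for ch in src.chars() {
            if "+-*/()".contains(ch) {
                spaced.push(' ');
                spaced.push(ch);
                spaced.push(' ');
            } else {
                spaced.push(ch);
            }
        }
        let toks = spaced.split_whitespace().map(String::from).collect();
        Parser { toks, pos: 0 }
    }

    /// Returns the next token as an operator if it is one of `ops`.
    fn peek_op(&self, ops: &str) -> Option<char> {
        let t = self.toks.get(self.pos)?;
        let mut chars = t.chars();
        match (chars.next(), chars.next()) {
            (Some(c), None) if ops.contains(c) => Some(c),
            _ => None,
        }
    }

    fn bump(&mut self) -> Option<String> {
        let t = self.toks.get(self.pos).cloned();
        if t.is_some() {
            self.pos += 1;
        }
        t
    }

    /// expr := term (('+' | '-') term)*
    pub fn expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.term()?;
        while let Some(op) = self.peek_op("+-") {
            self.pos += 1;
            let rhs = self.term()?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }

    /// term := atom (('*' | '/') atom)*
    fn term(&mut self) -> Result<Expr, String> {
        let mut lhs = self.atom()?;
        while let Some(op) = self.peek_op("*/") {
            self.pos += 1;
            let rhs = self.atom()?;
            lhs = Expr::Binary(Box::new(lhs), op, Box::new(rhs));
        }
        Ok(lhs)
    }

    /// atom := INT | NAME | '(' expr ')'
    fn atom(&mut self) -> Result<Expr, String> {
        match self.bump() {
            Some(t) if t == "(" => {
                let inner = self.expr()?;
                match self.bump() {
                    Some(close) if close == ")" => Ok(inner),
                    _ => Err("expected ')'".to_string()),
                }
            }
            Some(t) => match t.parse::<i64>() {
                Ok(n) => Ok(Expr::Int(n)),
                Err(_) if is_name(&t) => Ok(Expr::Name(t)),
                Err(_) => Err(format!("unexpected token '{}'", t)),
            },
            None => Err("unexpected end of input".to_string()),
        }
    }
}

/// Parses a whole expression and rejects trailing tokens.
pub fn parse(src: &str) -> Result<Expr, String> {
    let mut p = Parser::new(src);
    let e = p.expr()?;
    if p.pos != p.toks.len() {
        return Err(format!("unexpected token '{}'", p.toks[p.pos]));
    }
    Ok(e)
}
```

Parsing `a + b * 2` produces a `+` node whose right child is the `*` node, because `expr` delegates to `term` for each operand.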

Module system

each package contains one nexa.toml manifest
source files live in src/
src/main.nx builds an application binary
src/lib.nx builds a library package
the use keyword imports symbols by path
local path dependencies come first; resolution through a first-party registry follows later
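
One possible reading of these rules, sketched in Rust: a grouped `use` path expands into individual symbol paths, and a dotted module path maps onto a file under src/. Both helpers (`expand_use`, `module_file`) are hypothetical, and the mapping of `api.routes` to `src/api/routes.nx` is an assumption based on the example project layout later in this document, not a settled rule.

```rust
use std::path::PathBuf;

/// Expands `web.http.{App, Response}` into one path per imported symbol.
/// Paths without a `{...}` group are returned unchanged.
pub fn expand_use(path: &str) -> Vec<String> {
    match path.split_once(".{") {
        Some((prefix, rest)) => rest
            .trim_end_matches('}')
            .split(',')
            .map(|symbol| format!("{}.{}", prefix, symbol.trim()))
            .collect(),
        None => vec![path.to_string()],
    }
}

/// Maps a dotted module path to a source file inside a package (assumed layout).
pub fn module_file(package_root: &str, module_path: &str) -> PathBuf {
    let mut file = PathBuf::from(package_root);
    file.push("src");
    for segment in module_path.split('.') {
        file.push(segment);
    }
    file.set_extension("nx");
    file
}
```

Under these assumptions, `expand_use("web.http.{App, Response}")` yields `web.http.App` and `web.http.Response`, and `module_file("orders-api", "api.routes")` points at `orders-api/src/api/routes.nx`.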

Package manager

The nexacore CLI owns package management:

nexacore new api-service
nexacore add postgres
nexacore build
nexacore test

Manifest:

[package]
name = "orders-api"
version = "0.1.0"
edition = "2026"

[dependencies]
postgres = "0.1"
http = "0.1"
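
A manifest of this shape can be read with very little code during compiler bring-up. The Rust sketch below handles only the subset shown above (`[section]` headers and `key = "value"` string pairs) and is not a full TOML parser; `Manifest` and `parse_manifest` are hypothetical names.

```rust
use std::collections::BTreeMap;

/// section name -> (key -> string value)
pub type Manifest = BTreeMap<String, BTreeMap<String, String>>;

/// Parses `[section]` headers and `key = "value"` pairs; anything else is an error.
pub fn parse_manifest(src: &str) -> Result<Manifest, String> {
    let mut manifest = Manifest::new();
    let mut section: Option<String> = None;

    for (idx, raw) in src.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        if let Some(name) = line.strip_prefix('[').and_then(|s| s.strip_suffix(']')) {
            let name = name.trim().to_string();
            manifest.entry(name.clone()).or_default();
            section = Some(name);
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected `key = \"value\"`", idx + 1))?;
        let value = value
            .trim()
            .strip_prefix('"')
            .and_then(|v| v.strip_suffix('"'))
            .ok_or_else(|| format!("line {}: value must be a quoted string", idx + 1))?;
        let current = section
            .as_ref()
            .ok_or_else(|| format!("line {}: key outside of a section", idx + 1))?;
        manifest
            .get_mut(current)
            .expect("section was inserted when its header was read")
            .insert(key.trim().to_string(), value.to_string());
    }
    Ok(manifest)
}
```

Feeding it the manifest above gives a `package` section with `name`, `version`, and `edition`, plus a `dependencies` section with one entry per dependency. Keys that appear before any section header are rejected.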

Standard library layout

core: strings, collections, io, env, time, result, option
async: tasks, channels, timers
web: http server, routing, requests, responses, middleware
db: postgres client, pooling, migrations later
json: encode, decode, schema helpers
auth: jwt and password utilities
log: structured logging

Best MVP implementation path

The best MVP path is NexaCore -> AST/HIR/MIR -> C -> native binary.

Why C is the best first backend

easier to implement than a full LLVM backend
produces native binaries immediately
lets the team focus first on language semantics, standard library shape, and runtime
easier to debug generated output during compiler bring-up
keeps a clean migration path to LLVM or a direct machine-code backend later
avoids the operational and implementation overhead of designing a serious VM before validating the language
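
To show how small a first C emitter can be, here is a hedged Rust sketch that lowers a one-expression function, like the `add` example from the syntax section, into C source text. The node types, the `nx_` symbol prefix, and the mapping of `Int` to `int64_t` are all assumptions made for this illustration; the emitted code needs `#include <stdint.h>` in the surrounding translation unit.

```rust
/// Hypothetical, heavily reduced expression nodes.
pub enum Expr {
    Int(i64),
    Var(String),
    Add(Box<Expr>, Box<Expr>),
}

/// A function whose body is a single expression (its return value).
pub struct Function {
    pub name: String,
    pub params: Vec<String>, // every parameter is an Int in this sketch
    pub body: Expr,
}

fn emit_expr(expr: &Expr) -> String {
    match expr {
        Expr::Int(n) => n.to_string(),
        Expr::Var(name) => name.clone(),
        // Parenthesize so the C output never depends on C operator precedence.
        Expr::Add(lhs, rhs) => format!("({} + {})", emit_expr(lhs), emit_expr(rhs)),
    }
}

/// Emits one C function definition as text.
pub fn emit_function(f: &Function) -> String {
    let params = if f.params.is_empty() {
        "void".to_string()
    } else {
        f.params
            .iter()
            .map(|p| format!("int64_t {}", p))
            .collect::<Vec<_>>()
            .join(", ")
    };
    format!(
        "int64_t nx_{}({}) {{\n    return {};\n}}\n",
        f.name,
        params,
        emit_expr(&f.body)
    )
}
```

For `fn add(a: Int, b: Int) -> Int: a + b` this produces a C function `nx_add` that takes two `int64_t` arguments and returns `(a + b)`. The generated text can be read and debugged directly, which is the main bring-up advantage listed above.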

Why not LLVM first

LLVM is powerful, but it significantly increases implementation surface area early. For an MVP language team, front-end maturity and runtime design are bigger risks than instruction selection.

Why not a bytecode VM first

A VM is attractive for portability, but it weakens the deployment and code-protection story and requires designing both a language and a production runtime execution engine at once.

5. Security and Code Protection

No compiled format is impossible to reverse engineer. NexaCore should aim for strong practical resistance, not absolute secrecy.

Realistic protection model

compile to native binaries for deployment
strip symbols in release mode
minimize embedded reflection metadata
avoid preserving source-like names unless needed for diagnostics
separate debug symbols from production artifacts
support link-time optimization and dead-code elimination
optionally obfuscate private symbol names in hardened builds
keep secrets out of binaries; load them from environment or secret managers

Tradeoffs

native binaries are materially harder to inspect than source, but still reversible with enough effort
bytecode is easier to decompile than optimized native code
aggressive obfuscation complicates debugging and incident response
encrypted assets help with packaged resources, not code secrecy after runtime decryption

Recommended release modes

debug: symbols and source maps kept
release: optimized and stripped
release-hardened: stripped, symbol-minimized, optional control-flow obfuscation hooks later
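
With a C backend, these modes could translate into flags for the system C compiler. The Rust sketch below is one possible mapping, not a decided one: `Profile`, `c_flags`, and `cc_argv` are hypothetical names, the flags are standard GCC/Clang options on Linux, and the exact selection per mode is an assumption.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Profile {
    Debug,
    Release,
    ReleaseHardened,
}

/// Compiler and linker flags per release mode (assumed mapping).
pub fn c_flags(profile: Profile) -> Vec<&'static str> {
    // Optimized, link-time optimized, unused sections dropped, symbols stripped.
    let release = vec![
        "-O2",
        "-flto",
        "-ffunction-sections",
        "-fdata-sections",
        "-Wl,--gc-sections",
        "-s",
    ];
    match profile {
        Profile::Debug => vec!["-O0", "-g"], // keep debug info
        Profile::Release => release,
        Profile::ReleaseHardened => {
            let mut flags = release;
            flags.push("-fvisibility=hidden"); // minimize exported symbols
            flags
        }
    }
}

/// Builds the argument vector for invoking the system C compiler.
pub fn cc_argv(profile: Profile, c_file: &str, output: &str) -> Vec<String> {
    let mut argv = vec!["cc".to_string()];
    argv.extend(c_flags(profile).iter().map(|flag| flag.to_string()));
    argv.push(c_file.to_string());
    argv.push("-o".to_string());
    argv.push(output.to_string());
    argv
}
```

The driver could pass the result to `std::process::Command`. Control-flow obfuscation is left out because it is listed above only as a later hook.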

6. Web Backend Standard Library

Core backend SDK modules

HTTP server: TCP listener, HTTP parser integration, request lifecycle
Routing: method/path routing, path params, nested groups
Middleware: auth, logging, recovery, tracing
JSON: serializer and parser with typed model mapping
Environment config: .env loading later, env parsing, typed config helpers
PostgreSQL driver: async client, pool, prepared statements
Logging: structured logger with JSON output option
File handling: streams, safe path utilities, upload helpers later
JWT/auth helpers: token signing, verification, password hashing later
Background jobs: runtime task spawning, scheduling, queues later
WebSocket: defer until after HTTP core is stable
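
The method/path routing with path params listed above could work roughly as in the following Rust sketch. It is an illustration only: the `:name` parameter syntax, the `Router` type, and handlers identified by name are assumptions, and nested groups and middleware are left out.

```rust
use std::collections::HashMap;

/// Matches a path against a pattern such as `/users/:id`.
/// Returns the captured parameters on success.
pub fn match_route(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let want: Vec<&str> = pattern.trim_matches('/').split('/').collect();
    let got: Vec<&str> = path.trim_matches('/').split('/').collect();
    if want.len() != got.len() {
        return None;
    }
    let mut params = HashMap::new();
    for (w, g) in want.iter().zip(got.iter()) {
        if let Some(name) = w.strip_prefix(':') {
            if g.is_empty() {
                return None; // a parameter must capture something
            }
            params.insert(name.to_string(), g.to_string());
        } else if w != g {
            return None; // literal segment mismatch
        }
    }
    Some(params)
}

/// Route table; handlers are identified by name to keep the sketch small.
pub struct Router {
    routes: Vec<(String, String, String)>, // (method, pattern, handler name)
}

impl Router {
    pub fn new() -> Self {
        Router { routes: Vec::new() }
    }

    pub fn add(&mut self, method: &str, pattern: &str, handler: &str) {
        self.routes
            .push((method.to_string(), pattern.to_string(), handler.to_string()));
    }

    /// Returns the first route whose method and pattern both match.
    pub fn resolve(&self, method: &str, path: &str) -> Option<(&str, HashMap<String, String>)> {
        self.routes.iter().find_map(|(m, pattern, handler)| {
            if m != method {
                return None;
            }
            match_route(pattern, path).map(|params| (handler.as_str(), params))
        })
    }
}
```

Registering `GET /users/:id` and resolving `GET /users/42` returns that route's handler with `id` bound to `42`; a request with a different method does not match.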

7. PostgreSQL Integration

NexaCore should treat PostgreSQL as a first-class backend primitive. The syntax stays explicit: queries are written as SQL, with less indirection than typical Python ORMs and less ceremony than many async driver stacks.

Opening a connection

use core.env
use db.postgres.Pool

let pool = Pool.connect(env.require("DATABASE_URL")?, max: 20)?

Select queries

let users = await pool.query<User>(
    "select id, email from users order by id"
)?

Inserts and updates

let inserted = await pool.exec(
    "insert into users(email) values($1)",
    ["a@example.com"]
)?

Transactions

let tx = await pool.begin()?
await tx.exec("update accounts set balance = balance - $1 where id = $2", [10, from])?
await tx.exec("update accounts set balance = balance + $1 where id = $2", [10, to])?
await tx.commit()?

Mapping rows to structs

struct User:
    id: Int
    email: String

let user = await pool.query_one<User>(
    "select id, email from users where id = $1",
    [id]
)?

Connection pooling

let pool = Pool.connect(url, max: 32, min: 4, idle_timeout_sec: 30)?

Async queries

All PostgreSQL APIs are async-first. Blocking database access is not part of the standard application path.

8. Developer Experience

CLI commands

nexacore new <name>: create a new app or library
nexacore build: compile package to binary or library
nexacore run: build and execute
nexacore test: run language and package tests
nexacore fmt: format source code
nexacore add <package>: add dependency
nexacore doc: build documentation

Example backend project layout

orders-api/
  nexa.toml
  src/
    main.nx
    api/
      routes.nx
      users.nx
    db/
      models.nx
      queries.nx
    config.nx
  tests/
    users_test.nx

9. Starter Implementation Plan

Phase 1: language spec MVP

freeze core syntax rules
define token grammar and block structure
define AST and type system MVP
define package manifest format

Phase 2: lexer/parser/AST

implement token definitions
implement indentation-aware lexer
implement parser for modules, functions, structs, statements, and expressions
snapshot parser test fixtures

Phase 3: semantic analysis

name resolution
scope tracking
visibility rules
type inference for locals
public API type validation

Phase 4: code generation

HIR and MIR lowering
C backend emitter
runtime ABI definition
compile driver invoking system C compiler

Phase 5: runtime and stdlib

string and collection runtime
result/option representations
async task scheduler
IO primitives

Phase 6: PostgreSQL and HTTP framework

async socket and HTTP runtime
routing and JSON helpers
PostgreSQL client and connection pooling
example backend service

Phase 7: package manager and tooling

manifest parser
dependency resolver
formatter
test runner
docs generator

10. Repository Structure

NexaCore/
  Cargo.toml
  README.md
  docs/
    nexacore-foundation.md
  crates/
    nxc-cli/
    nxc-driver/
    nxc-frontend/
    nxc-runtime/
  stdlib/
    core/
    db/
    http/
  packages/
  examples/
    backend-api/
  tests/
    compiler/
    integration/
  tools/

11. MVP Code Generation

Bootstrapping in Rust is the best choice:

excellent fit for compiler engineering
strong enums and pattern matching for token/AST modeling
memory safety for a long-lived systems project
good ecosystem for CLI, testing, and later LLVM/C toolchain integrations

The starter code in this repo includes:

token definitions
lexer
AST nodes
parser skeleton
compiler driver
CLI entrypoint

12. Example NexaCore Program

use core.env
use db.postgres.Pool
use web.http.{App, Response}

struct AppState:
    pool: Pool

async fn health(state: AppState) -> Result<Response, AppError>:
    let version = env.get("APP_VERSION").or("dev")
    let row = await state.pool.query_one<Map>(
        "select now() as now"
    )?

    Response.json({
        "status": "ok",
        "version": version,
        "database_time": row["now"]
    })

async fn main() -> Result<Void, AppError>:
    let database_url = env.require("DATABASE_URL")?
    let port = env.get("PORT").or("8080").to_int()?
    let pool = Pool.connect(database_url, max: 16)?

    let app = App.new()
        .state(AppState { pool: pool })
        .get("/health", health)

    await app.listen("0.0.0.0", port)?

13. Codex Execution Rules

prioritize correctness over false completeness
implement compileable starter code, not pseudocode disguised as finished work
leave clear TODOs where the compiler is intentionally incomplete
keep package boundaries aligned with the compiler pipeline
avoid inventing third-party dependencies unless they are explicitly added

14. Final Deliverable

Recommended architecture choice

Rust front-end compiler with a C backend for the MVP.

MVP scope

parser and type-checked core language
C code generation
native binary build flow on Linux
minimal runtime
first-party HTTP and PostgreSQL runtime modules

First coding step

Build the front-end pipeline end to end for a single file: lex -> parse -> AST dump -> diagnostics.
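
For the diagnostics end of that pipeline, byte spans recorded by the lexer need to be turned into line and column positions. The following Rust sketch shows one way to do that; `Span`, `line_col`, and `render` are hypothetical names, the output layout is an assumption, and offsets are assumed to fall on character boundaries.

```rust
/// Byte offsets into the source text; `end` is exclusive.
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// Converts a byte offset into a 1-based (line, column) pair.
pub fn line_col(src: &str, offset: usize) -> (usize, usize) {
    let before = &src[..offset.min(src.len())];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |nl| nl + 1);
    let col = before[line_start..].chars().count() + 1;
    (line, col)
}

/// Renders an error with the offending line and a caret marker under the span.
pub fn render(src: &str, span: Span, message: &str) -> String {
    let (line, col) = line_col(src, span.start);
    let text = src.lines().nth(line - 1).unwrap_or("");
    let width = span.end.saturating_sub(span.start).max(1);
    format!(
        "error: {}\n --> {}:{}\n  | {}\n  | {}{}\n",
        message,
        line,
        col,
        text,
        " ".repeat(col - 1),
        "^".repeat(width)
    )
}
```

Given the source `let x = 1` followed by `let y = ?` on the next line, a span covering the `?` resolves to line 2, column 9, and the rendered message places a single caret under that character.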

First files to generate

workspace manifest
nxc-frontend token/lexer/parser/AST
nxc-driver compile pipeline
nxc-cli entrypoint
example main.nx

Build order

tokens and spans
lexer
AST
parser
diagnostics
semantic resolver
type checker
HIR/MIR lowering
C backend
runtime and stdlib
