NexaCore/docs/nexacore-foundation.md
nessi 0da224325a chore: initialize NexaCore compiler workspace with basic frontend and CLI
Add initial project structure for NexaCore programming language compiler:
- Create Cargo workspace with 4 crates (cli, driver, frontend, runtime)
- Add lexer with indentation-based tokenization and keyword support
- Add parser for modules, functions, structs, and basic expressions
- Implement CLI with build command and placeholder subcommands
- Add driver crate to orchestrate compilation pipeline
- Include .gitignore for Rust build
2026-04-06 16:57:54 +02:00

NexaCore Foundation

1. Language Vision

What NexaCore is

NexaCore is a compiled backend language designed for APIs, database-heavy services, internal platforms, and system daemons. The language aims to keep code visually simple and readable while enforcing stronger correctness guarantees than Python. It prioritizes predictable performance, structured concurrency, explicit error handling, and batteries-included backend tooling.

Target users

  • backend engineers building REST APIs and service layers
  • platform teams building internal tools and service orchestration
  • companies replacing Python microservices that have grown too dynamic or too slow
  • teams that want a simpler language than Rust for application-level backend work

Why it is better suited for backend systems than Python

  • compiled deployment artifact instead of shipping source trees and interpreter environments
  • static typing with local inference catches type errors at compile time instead of in production
  • explicit error model improves reliability for service code
  • structured async runtime designed around network and database workloads
  • first-class PostgreSQL and HTTP support as standard capabilities, not bolted-on frameworks
  • smaller operational surface for packaging, startup, and deployment
  • stronger encapsulation: shipped binaries are harder to read than plain source code

Design goals

  • keep syntax approachable and easy to scan
  • optimize for backend productivity, not language cleverness
  • compile to efficient deployable artifacts
  • make async IO, HTTP routing, and PostgreSQL first-class
  • provide strong type safety with low annotation burden
  • support Linux first with a path to Windows later
  • keep tooling simple: new, build, run, test, fmt, add, doc

Non-goals

  • replacing C or Rust for kernel, driver, or embedded programming
  • full zero-cost manual memory control in the MVP
  • metaprogramming-heavy language features in version one
  • multiple inheritance, operator overloading, or macros in the MVP
  • universal frontend or browser runtime in the MVP

2. Language Features

MVP feature set

  • Variables: immutable by default with let, mutable with var
  • Functions: named functions with return types and local type inference
  • Structs: product types with methods and visibility control
  • Modules/imports: file-based modules with package namespaces
  • If/else: expression-friendly branching
  • Match: exhaustive matching on enums, literals, and guards
  • Loops: for, while, and iterator-based traversal
  • Error handling: Result<T, E> with ? propagation; defer comes later
  • Async/await: structured async for network and database operations
  • Database access: typed query APIs and row-to-struct mapping
  • HTTP API: built-in routing and request/response abstractions in stdlib
  • Package management: first-party package manifest and lockfile

Typing model

Static typing with local inference is the right MVP choice. NexaCore should infer obvious local types while requiring type signatures on public functions, struct fields, and externally visible module boundaries. This gives Python-like authoring speed without Python's runtime ambiguity.
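
To make the idea of local inference concrete, here is a minimal sketch in Rust (the bootstrapping language). The `Type` and `Expr` enums and the `infer` and `check_let` functions are hypothetical and are not taken from the nxc-frontend crate. The sketch only shows how a local's type can be derived from its initializer while an optional annotation is still checked.

```rust
// Hypothetical mini-AST for illustration; not the actual nxc-frontend types.
#[derive(Debug, Clone, PartialEq)]
pub enum Type {
    Int,
    Bool,
    Str,
}

#[derive(Debug)]
pub enum Expr {
    IntLit(i64),
    BoolLit(bool),
    StrLit(String),
    Add(Box<Expr>, Box<Expr>),
}

/// Infers the type of an initializer expression from its shape alone.
pub fn infer(expr: &Expr) -> Result<Type, String> {
    match expr {
        Expr::IntLit(_) => Ok(Type::Int),
        Expr::BoolLit(_) => Ok(Type::Bool),
        Expr::StrLit(_) => Ok(Type::Str),
        Expr::Add(lhs, rhs) => match (infer(lhs)?, infer(rhs)?) {
            (Type::Int, Type::Int) => Ok(Type::Int),
            (l, r) => Err(format!("cannot add {:?} and {:?}", l, r)),
        },
    }
}

/// Checks a `let` binding: the annotation is optional, but must agree with
/// the inferred type when it is present.
pub fn check_let(annotation: Option<Type>, init: &Expr) -> Result<Type, String> {
    let inferred = infer(init)?;
    match annotation {
        Some(declared) if declared != inferred => {
            Err(format!("expected {:?}, found {:?}", declared, inferred))
        }
        _ => Ok(inferred),
    }
}
```

Under these assumptions, `let host = "127.0.0.1"` would infer a string type with no annotation, while `var port: Int = true` would be rejected at compile time.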

Memory model

The MVP should use automatic memory management through reference-counted heap objects plus arena ownership inside the compiler and runtime internals. This is simpler than full tracing GC and easier to implement safely than Rust-like borrow checking in a new language. Long term, the language can evolve toward region-based optimization and escape analysis.
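
Rust's `std::rc::Rc` behaves like the reference-counted heap objects described here, so it can serve as a rough model of the intended semantics. The `User` struct and `refcount_demo` function below are illustrative only; the actual NexaCore runtime representation is not defined in this document.

```rust
use std::rc::Rc;

// Hypothetical stand-in for a heap-allocated NexaCore value.
#[derive(Debug)]
pub struct User {
    pub id: i64,
    pub email: String,
}

/// Returns the strong reference counts observed while a second owner is
/// alive and after that owner has been dropped.
pub fn refcount_demo() -> (usize, usize) {
    let first = Rc::new(User {
        id: 1,
        email: "a@example.com".to_string(),
    });
    // Cloning the handle shares the allocation and increments the count.
    let second = Rc::clone(&first);
    let while_shared = Rc::strong_count(&first);
    // Dropping a handle decrements the count; the object is freed at zero.
    drop(second);
    let after_drop = Rc::strong_count(&first);
    (while_shared, after_drop)
}
```

Counting alone does not reclaim reference cycles, so a runtime built this way would also need weak references or a cycle-handling strategy.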

Concurrency model

Structured async concurrency for IO-bound backend work is the default. The language runtime owns the async scheduler and task model. Shared-state threads are not part of the first language surface; background workers and task spawning go through runtime primitives.

3. Syntax Design

NexaCore should be indentation-aware for readability but use explicit block starters so the parser remains simple and code stays visually structured. A colon starts a block, and indentation ends it.
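
One common way to implement this rule is an indentation stack that turns changes in leading whitespace into explicit block tokens. The following Rust sketch assumes space-only indentation and uses a hypothetical `Token` enum and `layout` function; the real lexer's token set may differ.

```rust
// Hypothetical token set; the real lexer's tokens may differ.
#[derive(Debug, Clone, PartialEq)]
pub enum Token {
    Line(String), // trimmed content of one source line
    Indent,
    Dedent,
}

/// Converts leading-space changes into Indent/Dedent tokens.
/// Assumes spaces only; tabs are not handled in this sketch.
pub fn layout(source: &str) -> Result<Vec<Token>, String> {
    let mut stack = vec![0usize]; // widths of the currently open blocks
    let mut out = Vec::new();
    for (idx, raw) in source.lines().enumerate() {
        if raw.trim().is_empty() {
            continue; // blank lines never open or close blocks
        }
        let width = raw.len() - raw.trim_start_matches(' ').len();
        let top = *stack.last().unwrap();
        if width > top {
            stack.push(width);
            out.push(Token::Indent);
        } else {
            // Close every block that is deeper than the current line.
            while width < *stack.last().unwrap() {
                stack.pop();
                out.push(Token::Dedent);
            }
            if width != *stack.last().unwrap() {
                return Err(format!("line {}: inconsistent dedent", idx + 1));
            }
        }
        out.push(Token::Line(raw.trim().to_string()));
    }
    // Close any blocks still open at end of input.
    while stack.len() > 1 {
        stack.pop();
        out.push(Token::Dedent);
    }
    Ok(out)
}
```

With explicit Indent and Dedent tokens, the parser can treat blocks much like brace-delimited ones, which is what keeps it simple.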

Hello world

fn main() -> Int:
    print("Hello, NexaCore")
    0

Variables

let host = "127.0.0.1"
var port: Int = 8080
let debug = true

Functions

fn add(a: Int, b: Int) -> Int:
    a + b

Structs and methods

pub struct User:
    id: Int
    email: String

impl User:
    fn display(self) -> String:
        "{self.id}:{self.email}"

REST API endpoint

use web.http.{App, Request, Response}

fn health(_req: Request) -> Response:
    Response.json({
        "status": "ok"
    })

PostgreSQL query

use db.postgres.{Pool, query}

async fn load_user(pool: Pool, id: Int) -> Result<User, DbError>:
    let row = await query(pool,
        "select id, email from users where id = $1",
        [id]
    )?.one()

    row.into<User>()

Async function

async fn fetch_profile(user_id: Int) -> Result<Profile, AppError>:
    let profile = await profiles.load(user_id)?
    profile

Error handling

fn parse_port(raw: String) -> Result<Int, ConfigError>:
    match raw.to_int():
        ok(value) => value
        err(_) => fail ConfigError.invalid("PORT must be numeric")
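
For comparison, the same pattern in Rust, which the NexaCore error model resembles. This is an analogy and not NexaCore semantics: the `ConfigError` enum and `listen_addr` helper are hypothetical, and Rust requires an explicit `Ok(...)` where the NexaCore examples return the bare value.

```rust
// Hypothetical error type mirroring ConfigError.invalid(...) in the example above.
#[derive(Debug, PartialEq)]
pub enum ConfigError {
    Invalid(String),
}

/// Maps a parse failure into a domain error instead of panicking.
pub fn parse_port(raw: &str) -> Result<i64, ConfigError> {
    raw.trim()
        .parse::<i64>()
        .map_err(|_| ConfigError::Invalid("PORT must be numeric".to_string()))
}

/// `?` returns early with the error; otherwise it unwraps the value.
pub fn listen_addr(host: &str, raw_port: &str) -> Result<String, ConfigError> {
    let port = parse_port(raw_port)?;
    Ok(format!("{}:{}", host, port))
}
```

The caller never sees a partially constructed address: either both steps succeed or the first error is returned unchanged.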

Imports and modules

use core.env
use web.http.{App, Response}
use db.postgres.Pool

4. Technical Architecture

Compiler pipeline

  1. Lexer: converts UTF-8 source into tokens, including indentation-sensitive block tokens.
  2. Parser: builds an AST from tokens using a recursive descent parser.
  3. AST: stores module declarations, items, statements, expressions, types, and spans.
  4. Semantic analyzer: resolves names, module symbols, scopes, and visibility.
  5. Type checker: infers local types, validates function signatures, and resolves generic instantiations.
  6. HIR and MIR: HIR for resolved source-level structure, MIR for lowered control flow and typed operations.
  7. Backend code generation: emits portable C in the MVP, then compiles it with a system C compiler.
  8. Binary output: native executable or shared object linked with the NexaCore runtime.
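
As an illustration of the recursive descent step, the sketch below parses a tiny expression grammar with one function per grammar rule. The grammar, the `Expr` enum, and the `Parser` struct are hypothetical simplifications; whitespace-separated strings stand in for real lexer tokens, and spans are omitted.

```rust
// Hypothetical miniature grammar:
//   expr := term ('+' term)*
//   term := atom ('*' atom)*
#[derive(Debug, PartialEq)]
pub enum Expr {
    Num(i64),
    Name(String),
    Binary(Box<Expr>, char, Box<Expr>),
}

pub struct Parser {
    tokens: Vec<String>,
    pos: usize,
}

impl Parser {
    /// Whitespace-separated tokens stand in for real lexer output.
    pub fn new(src: &str) -> Self {
        Parser {
            tokens: src.split_whitespace().map(str::to_string).collect(),
            pos: 0,
        }
    }

    fn peek(&self) -> Option<&str> {
        self.tokens.get(self.pos).map(|s| s.as_str())
    }

    /// Lowest precedence level: left-associative addition.
    pub fn expr(&mut self) -> Result<Expr, String> {
        let mut lhs = self.term()?;
        while self.peek() == Some("+") {
            self.pos += 1;
            lhs = Expr::Binary(Box::new(lhs), '+', Box::new(self.term()?));
        }
        Ok(lhs)
    }

    /// Higher precedence level: left-associative multiplication.
    fn term(&mut self) -> Result<Expr, String> {
        let mut lhs = self.atom()?;
        while self.peek() == Some("*") {
            self.pos += 1;
            lhs = Expr::Binary(Box::new(lhs), '*', Box::new(self.atom()?));
        }
        Ok(lhs)
    }

    /// Leaves of the tree: integer literals and identifiers.
    fn atom(&mut self) -> Result<Expr, String> {
        let tok = self.peek().ok_or("unexpected end of input")?.to_string();
        self.pos += 1;
        match tok.parse::<i64>() {
            Ok(n) => Ok(Expr::Num(n)),
            Err(_) if tok.chars().all(|c| c.is_alphanumeric() || c == '_') => {
                Ok(Expr::Name(tok))
            }
            Err(_) => Err(format!("unexpected token `{}`", tok)),
        }
    }
}
```

Operator precedence falls out of the call structure: `expr` calls `term`, so `a + 2 * b` groups the multiplication first.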

Module system

  • one package contains a nexa.toml manifest
  • source files live in src/
  • src/main.nx builds an application binary
  • src/lib.nx builds a library package
  • the use keyword imports symbols by path
  • dependencies start as local path dependencies; resolution through a first-party registry comes later

Package manager

The nexacore CLI owns package management:

  • nexacore new api-service
  • nexacore add postgres
  • nexacore build
  • nexacore test

Manifest:

[package]
name = "orders-api"
version = "0.1.0"
edition = "2026"

[dependencies]
postgres = "0.1"
http = "0.1"
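
A manifest this small can be read without a full TOML implementation during bring-up. The Rust sketch below handles only `[section]` headers and `key = "value"` pairs, which is all the example above uses. The `Manifest` alias and `parse_manifest` function are hypothetical, and a real implementation would likely use a complete TOML parser.

```rust
use std::collections::BTreeMap;

/// Parsed manifest: section name -> (key -> string value).
pub type Manifest = BTreeMap<String, BTreeMap<String, String>>;

/// Parses only `[section]` headers and `key = "value"` pairs.
/// This is a deliberately small subset, not a general TOML parser.
pub fn parse_manifest(text: &str) -> Result<Manifest, String> {
    let mut manifest = Manifest::new();
    let mut section: Option<String> = None;
    for (idx, raw) in text.lines().enumerate() {
        let line = raw.trim();
        if line.is_empty() || line.starts_with('#') {
            continue; // skip blank lines and comments
        }
        if let Some(name) = line.strip_prefix('[').and_then(|rest| rest.strip_suffix(']')) {
            manifest.entry(name.to_string()).or_default();
            section = Some(name.to_string());
            continue;
        }
        let (key, value) = line
            .split_once('=')
            .ok_or_else(|| format!("line {}: expected key = \"value\"", idx + 1))?;
        let value = value
            .trim()
            .strip_prefix('"')
            .and_then(|v| v.strip_suffix('"'))
            .ok_or_else(|| format!("line {}: value must be a quoted string", idx + 1))?;
        let current = section
            .as_ref()
            .ok_or_else(|| format!("line {}: key outside of a section", idx + 1))?;
        manifest
            .entry(current.clone())
            .or_default()
            .insert(key.trim().to_string(), value.to_string());
    }
    Ok(manifest)
}
```

Rejecting keys that appear before any section keeps malformed manifests from being silently accepted.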

Standard library layout

  • core: strings, collections, io, env, time, result, option
  • async: tasks, channels, timers
  • web: http server, routing, requests, responses, middleware
  • db: postgres client, pooling, migrations later
  • json: encode, decode, schema helpers
  • auth: jwt and password utilities
  • log: structured logging

Best MVP implementation path

The best MVP path is NexaCore -> AST/HIR/MIR -> C -> native binary.

Why C is the best first backend

  • easier to implement than a full LLVM backend
  • produces native binaries immediately
  • lets the team focus first on language semantics, standard library shape, and runtime
  • easier to debug generated output during compiler bring-up
  • keeps a clean migration path to LLVM or a direct machine-code backend later
  • avoids the operational and implementation overhead of designing a serious VM before validating the language
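
To give a sense of how small a first C emitter can be, here is a sketch that lowers a function such as `add` from section 3 into C source text. The `Expr` and `Function` types are hypothetical stand-ins for MIR, and mapping NexaCore `Int` to `int64_t` is an assumption, not something this document specifies.

```rust
// Hypothetical lowered function; real MIR would carry types, blocks, and spans.
pub enum Expr {
    Var(String),
    Int(i64),
    Add(Box<Expr>, Box<Expr>),
}

pub struct Function {
    pub name: String,
    pub params: Vec<String>,
    pub body: Expr,
}

/// Emits a fully parenthesized C expression so precedence never matters.
fn emit_expr(expr: &Expr) -> String {
    match expr {
        Expr::Var(name) => name.clone(),
        Expr::Int(value) => value.to_string(),
        Expr::Add(lhs, rhs) => format!("({} + {})", emit_expr(lhs), emit_expr(rhs)),
    }
}

/// Emits one C function. Assumes every parameter and the return value are
/// NexaCore `Int`, mapped here to `int64_t` (the output needs <stdint.h>).
pub fn emit_function(func: &Function) -> String {
    let params: Vec<String> = func
        .params
        .iter()
        .map(|p| format!("int64_t {}", p))
        .collect();
    let params = if params.is_empty() {
        "void".to_string()
    } else {
        params.join(", ")
    };
    format!(
        "int64_t {}({}) {{\n    return {};\n}}\n",
        func.name,
        params,
        emit_expr(&func.body)
    )
}
```

Because the output is readable C, a mistake in lowering shows up directly in the generated text. A real backend would still need name mangling and a defined policy for integer overflow, since signed overflow is undefined behavior in C.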

Why not LLVM first

LLVM is powerful, but it significantly increases implementation surface area early. For an MVP language team, front-end maturity and runtime design are bigger risks than instruction selection.

Why not a bytecode VM first

A VM is attractive for portability, but it weakens the deployment and code-protection story and requires designing both a language and a production runtime execution engine at once.

5. Security and Code Protection

No compiled format is impossible to reverse engineer. NexaCore should aim for strong practical resistance, not absolute secrecy.

Realistic protection model

  • compile to native binaries for deployment
  • strip symbols in release mode
  • minimize embedded reflection metadata
  • avoid preserving source-like names unless needed for diagnostics
  • separate debug symbols from production artifacts
  • support link-time optimization and dead-code elimination
  • optionally obfuscate private symbol names in hardened builds
  • keep secrets out of binaries; load them from environment or secret managers

Tradeoffs

  • native binaries are materially harder to inspect than source, but still reversible with enough effort
  • bytecode is easier to decompile than optimized native code
  • aggressive obfuscation complicates debugging and incident response
  • encrypted assets help with packaged resources, not code secrecy after runtime decryption

Build profiles

  • debug: symbols and source maps kept
  • release: optimized and stripped
  • release-hardened: stripped, symbol-minimized, optional control-flow obfuscation hooks later
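
Since the MVP hands generated C to a system compiler, the debug, release, and release-hardened modes above could map to compiler flags roughly as follows. The `Profile` enum and the specific flag choices are assumptions for illustration. The flags themselves are standard GCC-style options: `-g` for debug info, `-s` to strip symbols at link time, `-flto` for link-time optimization, and `-fvisibility=hidden` to limit exported symbols.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Profile {
    Debug,
    Release,
    ReleaseHardened,
}

/// Flags passed to the system C compiler for each profile.
/// The exact selection is an assumption, not a documented NexaCore default.
pub fn cc_flags(profile: Profile) -> Vec<&'static str> {
    match profile {
        // Keep debug info and skip optimization for easier debugging.
        Profile::Debug => vec!["-O0", "-g"],
        // Optimize and strip the symbol table from the linked binary.
        Profile::Release => vec!["-O2", "-s"],
        // Additionally enable LTO and hide non-exported symbols.
        Profile::ReleaseHardened => vec!["-O2", "-s", "-flto", "-fvisibility=hidden"],
    }
}
```

Keeping the mapping in one function makes it straightforward to add the obfuscation hooks later without touching the rest of the driver.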

6. Web Backend Standard Library

Core backend SDK modules

  • HTTP server: TCP listener, HTTP parser integration, request lifecycle
  • Routing: method/path routing, path params, nested groups
  • Middleware: auth, logging, recovery, tracing
  • JSON: serializer and parser with typed model mapping
  • Environment config: .env loading later, env parsing, typed config helpers
  • PostgreSQL driver: async client, pool, prepared statements
  • Logging: structured logger with JSON output option
  • File handling: streams, safe path utilities, upload helpers later
  • JWT/auth helpers: token signing, verification, password hashing later
  • Background jobs: runtime task spawning, scheduling, queues later
  • WebSocket: defer until after HTTP core is stable
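
Path routing with parameters, as listed above, can be sketched as a segment-by-segment comparison. The `:name` placeholder syntax and the `match_route` function are assumptions; this document does not define NexaCore's route pattern syntax.

```rust
use std::collections::HashMap;

/// Matches a pattern such as `/users/:id` against a concrete path.
/// Segments starting with `:` capture the corresponding path segment.
/// The `:name` syntax is an assumption made for this sketch.
pub fn match_route(pattern: &str, path: &str) -> Option<HashMap<String, String>> {
    let pattern_parts: Vec<&str> = pattern.trim_matches('/').split('/').collect();
    let path_parts: Vec<&str> = path.trim_matches('/').split('/').collect();
    if pattern_parts.len() != path_parts.len() {
        return None;
    }
    let mut params = HashMap::new();
    for (expected, actual) in pattern_parts.iter().zip(path_parts.iter()) {
        if let Some(name) = expected.strip_prefix(':') {
            if actual.is_empty() {
                return None; // a parameter must capture something
            }
            params.insert(name.to_string(), actual.to_string());
        } else if expected != actual {
            return None; // literal segment mismatch
        }
    }
    Some(params)
}
```

A production router would typically precompile patterns into a tree rather than scanning them linearly, but the matching rule per route stays the same.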

7. PostgreSQL Integration

NexaCore should treat PostgreSQL as a first-class backend primitive. The syntax stays explicit, but the API should be much tighter than Python ORMs and less ceremony-heavy than many async driver stacks.

Opening a connection

use core.env
use db.postgres.Pool

let pool = Pool.connect(env.require("DATABASE_URL")?, max: 20)?

Select queries

let users = await pool.query<User>(
    "select id, email from users order by id"
)?

Inserts and updates

let inserted = await pool.exec(
    "insert into users(email) values($1)",
    ["a@example.com"]
)?

Transactions

let tx = await pool.begin()?
await tx.exec("update accounts set balance = balance - $1 where id = $2", [10, from])?
await tx.exec("update accounts set balance = balance + $1 where id = $2", [10, to])?
await tx.commit()?

Mapping rows to structs

struct User:
    id: Int
    email: String

let user = await pool.query_one<User>(
    "select id, email from users where id = $1",
    [id]
)?
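
One way the runtime could support `query_one<User>` is a per-struct conversion that the compiler generates. The Rust sketch below models that idea with a hypothetical `FromRow` trait and a `Row` type that holds column values as text; the real driver representation is not specified here.

```rust
use std::collections::HashMap;

// Hypothetical untyped row: column name -> text value.
pub type Row = HashMap<String, String>;

#[derive(Debug, PartialEq)]
pub struct User {
    pub id: i64,
    pub email: String,
}

/// Conversion a compiler could derive for each struct used as a query target.
pub trait FromRow: Sized {
    fn from_row(row: &Row) -> Result<Self, String>;
}

impl FromRow for User {
    fn from_row(row: &Row) -> Result<Self, String> {
        // Each field is looked up by column name and converted to its type.
        let id = row
            .get("id")
            .ok_or("missing column id")?
            .parse::<i64>()
            .map_err(|e| format!("column id: {}", e))?;
        let email = row.get("email").ok_or("missing column email")?.clone();
        Ok(User { id, email })
    }
}
```

Mapping failures surface as ordinary errors, which fits the `?` at the end of the query calls above.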

Connection pooling

let pool = Pool.connect(url, max: 32, min: 4, idle_timeout_sec: 30)?
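
The `max` bound can be illustrated with a small synchronous model of the pool's bookkeeping. Everything here is hypothetical: the `Conn` handle holds no socket, `min` and `idle_timeout_sec` are omitted, and a real async pool would make callers wait instead of returning `None`.

```rust
// Hypothetical connection handle; a real pool would hold sockets.
#[derive(Debug, PartialEq)]
pub struct Conn(pub u32);

pub struct Pool {
    max: usize,
    idle: Vec<Conn>,
    in_use: usize,
    next_id: u32,
}

impl Pool {
    pub fn new(max: usize) -> Self {
        Pool { max, idle: Vec::new(), in_use: 0, next_id: 0 }
    }

    /// Reuses an idle connection if one exists, opens a new one while the
    /// pool is below `max`, and otherwise refuses.
    pub fn acquire(&mut self) -> Option<Conn> {
        let conn = match self.idle.pop() {
            Some(conn) => conn,
            None if self.in_use < self.max => {
                self.next_id += 1;
                Conn(self.next_id)
            }
            None => return None,
        };
        self.in_use += 1;
        Some(conn)
    }

    /// Returns a connection to the idle list for later reuse.
    pub fn release(&mut self, conn: Conn) {
        self.in_use -= 1;
        self.idle.push(conn);
    }
}
```

New connections are only opened when the idle list is empty, so the total number of live connections never exceeds `max`.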

Async queries

All PostgreSQL APIs are async-first. Blocking database access is not part of the standard application path.

8. Developer Experience

CLI commands

  • nexacore new <name>: create a new app or library
  • nexacore build: compile package to binary or library
  • nexacore run: build and execute
  • nexacore test: run language and package tests
  • nexacore fmt: format source code
  • nexacore add <package>: add dependency
  • nexacore doc: build documentation
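
The command list above maps naturally onto an enum in the CLI crate. The sketch below parses arguments with slice patterns from the standard library only; the `Command` enum and `parse_command` function are hypothetical and are not the actual nxc-cli code.

```rust
#[derive(Debug, PartialEq)]
pub enum Command {
    New(String),
    Build,
    Run,
    Test,
    Fmt,
    Add(String),
    Doc,
}

/// Parses the arguments that follow the program name.
pub fn parse_command(args: &[&str]) -> Result<Command, String> {
    match args {
        ["new", name] => Ok(Command::New(name.to_string())),
        ["add", package] => Ok(Command::Add(package.to_string())),
        ["build"] => Ok(Command::Build),
        ["run"] => Ok(Command::Run),
        ["test"] => Ok(Command::Test),
        ["fmt"] => Ok(Command::Fmt),
        ["doc"] => Ok(Command::Doc),
        [] => Err("usage: nexacore <command>".to_string()),
        // Covers unknown commands and known commands with wrong arity.
        [other, ..] => Err(format!("unknown or malformed command `{}`", other)),
    }
}
```

Subcommands that are not implemented yet can still parse successfully and report a placeholder message, which keeps the CLI surface stable while the compiler grows.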

Example backend project layout

orders-api/
  nexa.toml
  src/
    main.nx
    api/
      routes.nx
      users.nx
    db/
      models.nx
      queries.nx
    config.nx
  tests/
    users_test.nx

9. Starter Implementation Plan

Phase 1: language spec MVP

  • freeze core syntax rules
  • define token grammar and block structure
  • define AST and type system MVP
  • define package manifest format

Phase 2: lexer/parser/AST

  • implement token definitions
  • implement indentation-aware lexer
  • implement parser for modules, functions, structs, statements, and expressions
  • snapshot parser test fixtures

Phase 3: semantic analysis

  • name resolution
  • scope tracking
  • visibility rules
  • type inference for locals
  • public API type validation

Phase 4: code generation

  • HIR and MIR lowering
  • C backend emitter
  • runtime ABI definition
  • compile driver invoking system C compiler

Phase 5: runtime and stdlib

  • string and collection runtime
  • result/option representations
  • async task scheduler
  • IO primitives

Phase 6: PostgreSQL and HTTP framework

  • async socket and HTTP runtime
  • routing and JSON helpers
  • PostgreSQL client and connection pooling
  • example backend service

Phase 7: package manager and tooling

  • manifest parser
  • dependency resolver
  • formatter
  • test runner
  • docs generator

10. Repository Structure

NexaCore/
  Cargo.toml
  README.md
  docs/
    nexacore-foundation.md
  crates/
    nxc-cli/
    nxc-driver/
    nxc-frontend/
    nxc-runtime/
  stdlib/
    core/
    db/
    http/
  packages/
  examples/
    backend-api/
  tests/
    compiler/
    integration/
  tools/

11. MVP Code Generation

Bootstrapping in Rust is the best choice:

  • excellent fit for compiler engineering
  • strong enums and pattern matching for token/AST modeling
  • memory safety for a long-lived systems project
  • good ecosystem for CLI, testing, and later LLVM/C toolchain integrations

The starter code in this repo includes:

  • token definitions
  • lexer
  • AST nodes
  • parser skeleton
  • compiler driver
  • CLI entrypoint

12. Example NexaCore Program

use core.env
use db.postgres.Pool
use web.http.{App, Response}

struct AppState:
    pool: Pool

async fn health(state: AppState) -> Result<Response, AppError>:
    let version = env.get("APP_VERSION").or("dev")
    let row = await state.pool.query_one<Map>(
        "select now() as now"
    )?

    Response.json({
        "status": "ok",
        "version": version,
        "database_time": row["now"]
    })

async fn main() -> Result<Void, AppError>:
    let database_url = env.require("DATABASE_URL")?
    let port = env.get("PORT").or("8080").to_int()?
    let pool = Pool.connect(database_url, max: 16)?

    let app = App.new()
        .state(AppState { pool: pool })
        .get("/health", health)

    await app.listen("0.0.0.0", port)?

13. Codex Execution Rules

  • prioritize correctness over false completeness
  • implement compilable starter code, not pseudocode disguised as finished work
  • leave clear TODOs where the compiler is intentionally incomplete
  • keep package boundaries aligned with the compiler pipeline
  • avoid inventing third-party dependencies unless they are explicitly added

14. Final Deliverable

Rust front-end compiler with a C backend for the MVP.

MVP scope

  • parser and type-checked core language
  • C code generation
  • native binary build flow on Linux
  • minimal runtime
  • first-party HTTP and PostgreSQL runtime modules

First coding step

Build the front-end pipeline end to end for a single file: lex -> parse -> AST dump -> diagnostics.
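
The diagnostics end of that pipeline depends on spans. A minimal sketch, assuming spans are byte offsets into the source and that offsets fall on character boundaries; the `Span`, `Diagnostic`, `line_col`, and `render` names are hypothetical.

```rust
/// Byte offsets into the source text (end is exclusive).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

pub struct Diagnostic {
    pub message: String,
    pub span: Span,
}

/// Converts a byte offset into a 1-based (line, column) pair.
/// Columns are counted in bytes, which matches characters for ASCII source.
pub fn line_col(source: &str, offset: usize) -> (usize, usize) {
    let before = &source[..offset.min(source.len())];
    let line = before.matches('\n').count() + 1;
    let line_start = before.rfind('\n').map_or(0, |i| i + 1);
    (line, before.len() - line_start + 1)
}

/// Renders a diagnostic in the common `file:line:col: error: message` form.
pub fn render(file: &str, source: &str, diag: &Diagnostic) -> String {
    let (line, col) = line_col(source, diag.span.start);
    format!("{}:{}:{}: error: {}", file, line, col, diag.message)
}
```

Storing only offsets in tokens and AST nodes keeps them small; line and column are computed on demand when an error is actually reported.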

First files to generate

  • workspace manifest
  • nxc-frontend token/lexer/parser/AST
  • nxc-driver compile pipeline
  • nxc-cli entrypoint
  • example main.nx

Build order

  1. tokens and spans
  2. lexer
  3. AST
  4. parser
  5. diagnostics
  6. semantic resolver
  7. type checker
  8. HIR/MIR lowering
  9. C backend
  10. runtime and stdlib