Values Are Primary:
Immutability as a Foundation for
Type-Safe Concurrent Programming in Japl
1 Introduction
1.1 The Identity Crisis in Programming
The history of programming languages can be read as a long negotiation between two worldviews. In the identity-centric view— exemplified by Simula , Smalltalk , Java , and their descendants—the fundamental concept is the object: an entity with mutable state, behaviour, and identity. Two objects may have identical field values yet be distinct because they occupy different locations in memory. Programs are understood as networks of collaborating objects that communicate by sending messages (method calls) that mutate their internal state.
In the value-centric view—rooted in the lambda
calculus , refined in ML ,
Haskell , and Erlang —the
fundamental concept is the value: a datum that simply
is what it is, with no notion of mutable state or
location-dependent identity. The number \(42\) is \(42\) regardless of where it is stored;
the string "hello" does not change over time. Programs
are understood as compositions of functions that transform values
into other values.
1.2 The Cost of Identity
The identity-centric model imposes substantial costs that become especially acute in concurrent and distributed settings:
Aliasing and mutation. When multiple references point to the same mutable object, a mutation through one alias is visible through all others. This makes local reasoning impossible without whole-program alias analysis, which is undecidable in the general case .
Synchronization overhead. In concurrent programs, shared mutable state must be protected by locks, atomic operations, or transactional mechanisms. Every lock is a potential source of deadlock; every unlocked access is a potential data race. The fundamental difficulty is that identity creates contention: two threads that wish to observe or modify the same object must coordinate.
Serialization ambiguity. When transmitting an object graph across a network, the serialization framework must decide whether aliased references should be preserved (deep structural copy) or collapsed (sharing-preserving copy). This decision is invisible at the type level and frequently leads to subtle bugs in distributed systems .
Cache invalidation. Mutable state requires cache invalidation protocols—one of the “two hard things in computer science” . An immutable value, once cached, never becomes stale.
Testing difficulty. Objects with mutable state are difficult to test in isolation because their behaviour depends on the history of mutations. Values, being immutable, have no history; they are their own specification.
1.3 Our Thesis
We contend that treating values as primary—making immutable algebraic data types the default representation for all data, and confining mutation to an explicitly tracked resource layer—yields a language design that is:
Simpler: local reasoning is restored; every binding denotes a fixed value.
Safer: data races are eliminated by construction for all pure data.
Composable: values compose freely via functions, without the fragile coupling of shared mutable state.
Distributable: values can be serialized, transmitted, cached, and replicated without ambiguity.
This paper develops this thesis in the context of Japl, a strict, typed, effect-aware functional programming language. Japl’s design motto is: “Pure by default, concurrent by design, resource-safe by construction, distributed without apology.” The Values Are Primary principle is the first of Japl’s seven core principles, and it is foundational: the remaining six principles build upon the guarantees that pervasive immutability provides.
1.4 Contributions
This paper makes the following contributions:
A formal framework for value semantics grounded in category theory, modelling algebraic data types as initial algebras of polynomial endofunctors (§Formal Semantics).
A denotational semantics for a value-centric core calculus that guarantees referential transparency (§Denotational Semantics).
A detailed description of Japl’s value model, including immutable algebraic data types, structural records, tagged unions, and pattern matching (§JAPL Value Model).
A type system design combining parametric polymorphism, bidirectional inference, row polymorphism, and opaque types (§Type System).
A systematic comparison with eight other languages (§Comparison).
Implementation strategies for efficient persistent data structures (§Implementation).
Benchmarks demonstrating that value semantics need not sacrifice performance (§Evaluation).
An analysis of the implications of default immutability for concurrent and distributed programming (§Concurrency).
1.5 Paper Organisation
Section Background surveys background and related work. Section Formal Semantics develops the formal framework. Section JAPL Value Model presents Japl’s value model. Section Type System describes the type system. Section Comparison compares with related languages. Section Implementation discusses implementation strategies. Section Evaluation presents benchmarks. Section Concurrency analyses concurrency implications. Section Discussion discusses trade-offs and the dual-layer model. Section Conclusion concludes.
2 Background and Related Work
2.1 The Value of Values
Rich Hickey’s influential talk The Value of Values articulated a distinction that practitioners had felt but rarely named: the difference between values (immutable, timeless facts) and places (mutable, time-varying locations). Hickey argued that conflating the two—the default in most mainstream languages—is the root cause of much accidental complexity. A value is a piece of information; it does not change. The number \(42\), the date March 15, the set \(\{1,2,3\}\)—these are values. A “mutable integer variable” is not a value but a place that happens to contain an integer at this moment in time.
Hickey’s Clojure operationalised this insight by making all core data structures—lists, vectors, hash maps, sets—immutable and persistent. The language provides controlled mutation through atoms (compare-and-swap references), refs (software transactional memory), and agents (asynchronous state updates), all of which are clearly delineated from pure values.
2.2 ML and the Algebraic Tradition
The ML family
introduced algebraic data types and pattern matching as core
language features. Standard ML and its
descendants—OCaml , F# —demonstrate the power of sum
types (tagged unions) and product types (records/tuples) for
modelling data. However, ML languages generally permit unrestricted
mutation via ref cells, meaning that the value-centric
idiom is a convention rather than a guarantee.
2.3 Haskell: Purity by Default
Haskell takes the
value-centric approach to its logical extreme: all expressions
denote values, and side effects are confined to the IO
monad . This provides the
strongest possible guarantees—referential transparency holds for all
non-IO code—but the monadic encoding of effects
introduces syntactic overhead and monad transformer stacks that many
practitioners find difficult . Furthermore, Haskell’s
lazy evaluation strategy means that “values” may in fact be
unevaluated thunks, complicating space usage reasoning .
2.4 Erlang/OTP: Values in the Actor Model
Erlang combines immutable values with the actor model. All data in Erlang is immutable; variables are single-assignment. Processes communicate by sending messages, which are deep-copied into the recipient’s mailbox. This design—immutable values plus process isolation—has proven extraordinarily robust in telecommunications and distributed systems. The main limitation is the lack of a static type system; Erlang is dynamically typed, and the Dialyzer provides only post-hoc analysis. Elixir builds on the Erlang VM with improved syntax but retains the same value model. Gleam adds static typing to the BEAM ecosystem while preserving Erlang’s value semantics.
2.5 Rust: Ownership Without Garbage Collection
Rust takes a
different approach to the mutation problem. Rather than prohibiting
mutation, Rust controls aliasing: through the ownership and
borrowing system, Rust ensures that mutable references are unique
(no aliasing) while shared references are immutable (no mutation).
This “aliasing XOR mutability” discipline eliminates data races without
requiring immutability. Values in Rust are moved by default and
copied only when explicitly requested via the Copy or
Clone traits.
2.6 Persistent Data Structures
Okasaki’s seminal work demonstrated that purely functional data structures can achieve asymptotically efficient operations through techniques such as lazy evaluation, amortisation, and structural decomposition. Bagwell’s hash-array mapped tries provided the practical basis for Clojure’s persistent vectors and hash maps, achieving near-constant-time operations with \(O(\log_{32} n)\) path-copying overhead. Driscoll et al. formalised the theory of persistent data structures, distinguishing partial from full persistence and establishing lower bounds.
2.7 Category-Theoretic Foundations
The connection between algebraic data types and initial algebras
in category theory was established by Hagino and further developed by
Malcolm and Meijer et al. . The
insight is that a data type definition such as \(\texttt{List}(a) = \texttt{Nil} \mid
\texttt{Cons}(a, \texttt{List}(a))\) defines a functor \(F(X) = 1 + a
\times X\), and the type List\((a)\) is the carrier of the initial
\(F\)-algebra. This gives a
principled account of structural recursion (catamorphisms) and
induction on data types. Pierce’s Types and Programming
Languages and Harper’s Practical
Foundations for Programming Languages provide comprehensive
treatments of the type-theoretic perspective.
3 Formal Framework
3.1 Types as Objects of a Category
We model the type system of our value-centric calculus using the framework of Cartesian closed categories (CCCs) .
Definition 1 (Category of Types). Let \(\mathbf{Type}\) be a category whose objects are types and whose morphisms \(f : A \to B\) are (total, terminating) programs that take a value of type \(A\) and produce a value of type \(B\). Composition is function composition; the identity morphism at \(A\) is the identity function \(\mathrm{id}_A : A \to A\).
Definition 2 (Cartesian Closed Structure). \(\mathbf{Type}\) is Cartesian closed if it has:
A terminal object \(\mathbf{1}\) (the unit type).
Binary products \(A \times B\) for all objects \(A, B\) (product types / tuples).
Exponential objects \(B^A\) for all objects \(A, B\) (function types \(A \to B\)).
The Cartesian closed structure ensures that products and function types interact correctly: there is a natural bijection \(\mathrm{Hom}(A \times B, C) \cong \mathrm{Hom}(A, C^B)\) (currying).
Remark 3. The Cartesian closed structure captures the essence of value-centric programming: products model data aggregation (records, tuples), exponentials model computation (functions), and the terminal object models the trivial value (unit). No morphism has a “side effect”; all information flows through inputs and outputs.
3.2 Algebraic Data Types as Initial Algebras
Definition 4 (Polynomial Endofunctor). A polynomial endofunctor \(F : \mathbf{Type}\to \mathbf{Type}\) is built from the grammar: \[F ::= \mathrm{Id} \mid K_A \mid F_1 + F_2 \mid F_1 \times F_2\] where \(\mathrm{Id}\) is the identity functor, \(K_A\) is the constant functor at \(A\), \(+\) is the coproduct (sum), and \(\times\) is the product.
Definition 5 (\(F\)-Algebra). An \(F\)-algebra is a pair \((A, \alpha)\) where \(A\) is an object of \(\mathbf{Type}\) (the carrier) and \(\alpha : F(A) \to A\) is a morphism (the structure map). A homomorphism of \(F\)-algebras \((A, \alpha) \to (B, \beta)\) is a morphism \(h : A \to B\) such that \(h \circ \alpha = \beta \circ F(h)\).
Definition 6 (Initial Algebra). An \(F\)-algebra \((\mu F, \mathrm{in})\) is initial if, for every \(F\)-algebra \((A, \alpha)\), there exists a unique homomorphism \(\llbracket \alpha \rrbracket : \mu F \to A\) satisfying: \[\llbracket \alpha \rrbracket \circ \mathrm{in} = \alpha \circ F(\llbracket \alpha \rrbracket)\] The morphism \(\llbracket \alpha \rrbracket\) is called the catamorphism (or fold) induced by \(\alpha\).
Theorem 7 (Lambek’s Lemma ). If \((\mu F, \mathrm{in})\) is an initial \(F\)-algebra, then \(\mathrm{in} : F(\mu F) \to \mu F\) is an isomorphism. That is, \(\mu F \cong F(\mu F)\).
Example 8 (Lists as an Initial Algebra). The
type \(\texttt{List}(a)\) is the
initial algebra of the functor \(F_a(X) =
1 + a \times X\). The structure map is: \[\mathrm{in} = [\texttt{Nil}, \texttt{Cons}] :
1 + a \times \texttt{List}(a) \to \texttt{List}(a)\] Given
any algebra \((B, [z, f])\) where
\(z : 1 \to B\) and \(f : a \times
B \to B\), the catamorphism \(\llbracket z, f \rrbracket : \texttt{List}(a)
\to B\) is the unique function satisfying: \[\llbracket z, f \rrbracket(\texttt{Nil}) =
z(\ast) \qquad
\llbracket z, f \rrbracket(\texttt{Cons}(x, xs)) = f(x, \llbracket
z, f \rrbracket(xs))\] This is precisely the familiar
fold operation.
Example 9 (JAPL’s Result Type). In Japl, the Result type is
defined as:
type Result(a, e) =
| Ok(a)
| Err(e)
This is the initial algebra of the functor \(F_{a,e}(X) = a + e\), which is actually
a constant functor (no recursive occurrences). The initial algebra
is simply \(a + e\) itself. More
precisely, Result is a bifunctor \(\mathbf{Type}\times \mathbf{Type}\to
\mathbf{Type}\) that sends \((a,
e)\) to the coproduct \(a +
e\).
3.3 Value Semantics: Formal Properties
Definition 10 (Value Semantics). A language has value semantics if, for every well-typed expression \(e\) of type \(\tau\):
No observable mutation: If \(e\) evaluates to a value \(v\), then \(v\) cannot be observably changed by any operation in the language. Formally, for all contexts \(C[-]\): \[C[e] \Downarrow v \implies \forall C'[-].\; C'[e] \Downarrow v\]
Referential transparency: In any expression, \(e\) can be replaced by its value \(v\) without changing the meaning of the program. Formally, if \(e \Downarrow v\), then for all contexts \(C[-]\): \[\llbracket C[e] \rrbracket = \llbracket C[v] \rrbracket\]
Structural equality: Two values of the same first-order type (products, sums, records, primitive types) are equal if and only if they have the same structure. There is no notion of “identity” beyond structural content: \[v_1 = v_2 \iff \mathrm{struct}(v_1) = \mathrm{struct}(v_2)\] Function types (exponentials \(B^A\)) are excluded from decidable structural equality: extensional equality of functions is undecidable in general. In Japl, function values do not implement the
Eqtrait; attempting to compare two closures is a compile-time error. Equality is restricted to ADTs, records, and primitive types where structural comparison is well-defined and decidable.
Theorem 11 (Compositionality of Value Semantics). In a language with value semantics, the denotation of any expression is determined solely by the denotations of its subexpressions.
Proof. By structural induction on the syntax of expressions. In the base case, a literal denotes itself. In the inductive case, consider an expression \(f(e_1, \ldots, e_n)\). Since \(f\) is a pure function (no observable mutation), its output is determined by its inputs. By the induction hypothesis, each \(\llbracket e_i \rrbracket\) is determined by the denotations of its subexpressions. Therefore \(\llbracket f(e_1, \ldots, e_n) \rrbracket = \llbracket f \rrbracket(\llbracket e_1 \rrbracket, \ldots, \llbracket e_n \rrbracket)\) is fully determined. ◻
3.4 Denotational Semantics of a Value-Centric Core Calculus
We define a small core calculus \(\lambda_V\) that captures the essence of Japl’s value layer.
3.4.1 Syntax
\[\begin{array}{rcl} \tau & ::= & \mathbf{1} \mid \alpha \mid \tau_1 \to \tau_2 \mid \tau_1 \times \tau_2 \mid \tau_1 + \tau_2 \mid \mu\alpha.\tau \mid \{l_1:\tau_1, \ldots, l_n:\tau_n\} \\[6pt] e & ::= & () \mid x \mid \lambda x:\tau.\; e \mid e_1\; e_2 \mid (e_1, e_2) \mid \pi_1(e) \mid \pi_2(e) \\ & \mid & \mathrm{inl}(e) \mid \mathrm{inr}(e) \mid \mathbf{case}\; e\; \{x.e_1, y.e_2\} \\ & \mid & \mathrm{fold}(e) \mid \mathrm{unfold}(e) \mid \{l_1=e_1, \ldots, l_n=e_n\} \mid e.l \end{array}\]
3.4.2 Typing Rules (Selected)
\[\begin{gather} \frac{}{\Gamma \vdash () : \mathbf{1}} \quad (\text{T-Unit}) \qquad \frac{x:\tau \in \Gamma}{\Gamma \vdash x : \tau} \quad (\text{T-Var}) \\[8pt] \frac{\Gamma, x:\tau_1 \vdash e : \tau_2} {\Gamma \vdash \lambda x:\tau_1.\; e : \tau_1 \to \tau_2} \quad (\text{T-Abs}) \qquad \frac{\Gamma \vdash e_1 : \tau_1 \to \tau_2 \quad \Gamma \vdash e_2 : \tau_1} {\Gamma \vdash e_1\; e_2 : \tau_2} \quad (\text{T-App}) \\[8pt] \frac{\Gamma \vdash e : \tau[\mu\alpha.\tau/\alpha]} {\Gamma \vdash \mathrm{fold}(e) : \mu\alpha.\tau} \quad (\text{T-Fold}) \qquad \frac{\Gamma \vdash e : \mu\alpha.\tau} {\Gamma \vdash \mathrm{unfold}(e) : \tau[\mu\alpha.\tau/\alpha]} \quad (\text{T-Unfold}) \\[8pt] \frac{\Gamma \vdash e : \tau_1 + \tau_2 \quad \Gamma, x:\tau_1 \vdash e_1 : \tau \quad \Gamma, y:\tau_2 \vdash e_2 : \tau} {\Gamma \vdash \mathbf{case}\; e\; \{x.e_1,\, y.e_2\} : \tau} \quad (\text{T-Case}) \\[8pt] \frac{\Gamma \vdash e : \tau_1} {\Gamma \vdash \mathrm{inl}(e) : \tau_1 + \tau_2} \quad (\text{T-Inl}) \qquad \frac{\Gamma \vdash e : \tau_2} {\Gamma \vdash \mathrm{inr}(e) : \tau_1 + \tau_2} \quad (\text{T-Inr}) \end{gather}\]
3.4.3 Denotational Semantics
We interpret types as sets and terms as set-theoretic functions.
Definition 12 (Type Interpretation). \[\begin{align} \llbracket \mathbf{1} \rrbracket &= \{\ast\} \\ \llbracket \tau_1 \to \tau_2 \rrbracket &= \llbracket \tau_2 \rrbracket^{\llbracket \tau_1 \rrbracket} \quad\text{(function space)} \\ \llbracket \tau_1 \times \tau_2 \rrbracket &= \llbracket \tau_1 \rrbracket \times \llbracket \tau_2 \rrbracket \quad\text{(Cartesian product)} \\ \llbracket \tau_1 + \tau_2 \rrbracket &= \llbracket \tau_1 \rrbracket \uplus \llbracket \tau_2 \rrbracket \quad\text{(disjoint union)} \\ \llbracket \mu\alpha.\tau \rrbracket &= \mathrm{Fix}(\lambda X.\, \llbracket \tau \rrbracket[X/\alpha]) \quad\text{(least fixpoint)} \\ \llbracket \{l_1:\tau_1, \ldots, l_n:\tau_n\} \rrbracket &= \llbracket \tau_1 \rrbracket \times \cdots \times \llbracket \tau_n \rrbracket \quad\text{(labelled product)} \end{align}\]
Definition 13 (Term Interpretation). Given an environment \(\rho : \mathrm{Var} \to \bigcup_\tau \llbracket \tau \rrbracket\): \[\begin{align} \llbracket () \rrbracket^\rho &= \ast \\ \llbracket x \rrbracket^\rho &= \rho(x) \\ \llbracket \lambda x:\tau.\; e \rrbracket^\rho &= \lambda v \in \llbracket \tau \rrbracket.\; \llbracket e \rrbracket^{\rho[x \mapsto v]} \\ \llbracket e_1\; e_2 \rrbracket^\rho &= \llbracket e_1 \rrbracket^\rho(\llbracket e_2 \rrbracket^\rho) \\ \llbracket (e_1, e_2) \rrbracket^\rho &= (\llbracket e_1 \rrbracket^\rho, \llbracket e_2 \rrbracket^\rho) \\ \llbracket \mathrm{inl}(e) \rrbracket^\rho &= \mathrm{inl}(\llbracket e \rrbracket^\rho) \\ \llbracket \mathbf{case}\; e\; \{x.e_1, y.e_2\} \rrbracket^\rho &= \begin{cases} \llbracket e_1 \rrbracket^{\rho[x \mapsto v]} & \text{if } \llbracket e \rrbracket^\rho = \mathrm{inl}(v) \\ \llbracket e_2 \rrbracket^{\rho[y \mapsto v]} & \text{if } \llbracket e \rrbracket^\rho = \mathrm{inr}(v) \end{cases} \end{align}\]
Theorem 14 (Adequacy). The denotational semantics is adequate with respect to the operational semantics: if \(e \Downarrow v\) (big-step), then \(\llbracket e \rrbracket = \llbracket v \rrbracket\). Conversely, if \(\llbracket e \rrbracket \neq \bot\), then \(e\) terminates.
Proof sketch. Adequacy follows from the standard technique of defining a logical relation \(\mathcal{R}_\tau \subseteq \llbracket \tau \rrbracket \times \mathrm{Terms}_\tau\) between denotations and terms, and showing by structural induction that (1) if \(e \Downarrow v\) then \((\llbracket e \rrbracket, v) \in \mathcal{R}_\tau\), and (2) if \((d, e) \in \mathcal{R}_\tau\) and \(d \neq \bot\), then \(e \Downarrow v\) for some \(v\) with \(\llbracket v \rrbracket = d\). The key insight is that in a value-centric calculus without mutation, the logical relation can be defined without step-indexing, since there are no store-dependent types to account for. This follows the approach of Plotkin as adapted by Harper . ◻
Corollary 15 (Referential Transparency). For all expressions \(e\) and contexts \(C[-]\) in \(\lambda_V\): if \(e \Downarrow v\), then \(\llbracket C[e] \rrbracket = \llbracket C[v] \rrbracket\).
Proof. Immediate from compositionality (Theorem Compositionality) and adequacy (Theorem Adequacy). ◻
3.5 The Yoneda Perspective on Values
The Yoneda lemma provides an elegant characterisation of values in our framework.
Theorem 16 (Yoneda Lemma). For any locally small category \(\mathcal{C}\), object \(A\), and functor \(F : \mathcal{C}^{\mathrm{op}} \to \mathbf{Set}\): \[\mathrm{Nat}(\mathrm{Hom}(-, A), F) \cong F(A)\] naturally in \(A\).
In the category \(\mathbf{Type}\), the Yoneda lemma tells us that a value \(v : A\) is completely determined by the collection of all morphisms out of \(A\)—that is, by the functions that can act on \(v\). This is the formal content of the principle that “a value is what you can do with it.” Two values that are indistinguishable by all observers are, by Yoneda, identical. This is precisely the structural equality property of Definition Value Semantics.
Remark 17. In an identity-centric language, two objects can be observationally equivalent yet have distinct identities (distinct heap locations). The Yoneda lemma shows that in a value-centric language, this situation cannot arise: values have no identity beyond their observable behaviour.
3.5.1 Yoneda in JAPL’s Design
The Yoneda perspective directly informs three concrete aspects of Japl’s implementation:
Compiler-derived equality. The
deriving(Eq)mechanism generates structural equality tests by recursively comparing fields. This is justified by Yoneda: two values that agree on all observations (including every projection and pattern match) must be structurally identical, so the compiler need only check structure.Content-addressable hashing. Because a value is its observable behaviour, Japl can safely derive
Hashimplementations that hash the structure of a value. Structural identity implies hash identity—the Yoneda condition guarantees this is sound, and it underpins the HAMT-based collections (§HAMT Implementation).Opaque-type abstraction. An opaque type (§Opaque Types) restricts the set of morphisms available to external modules, thereby creating a smaller representable functor. Inside the defining module, the full Yoneda embedding applies; outside, clients see only the exported interface. This is the operational content of the Yoneda lemma applied to module boundaries.
4 JAPL’s Value Model
4.1 Immutable by Default
In Japl, every binding introduces
a value. There is no let mut or
var; bindings are irrevocable.
-- Values are the default. No `let mut`, no `var`.
let name = "JAPL"
let point = { x = 3.0, y = 4.0 }
let items = [1, 2, 3, 4, 5]
-- Transformation produces new values, never mutates.
let shifted = { point | x = point.x + 1.0 }
let doubled = List.map(items, fn x -> x * 2)
The record update syntax { point | x = ... }
constructs a new record that shares structure with the original. The
original point is unchanged and remains accessible.
4.2 Algebraic Data Types
Japl provides algebraic data types (ADTs) as the primary mechanism for defining structured data. ADTs encompass both product types (records with named fields) and sum types (tagged unions with variants).
4.2.1 Product Types: Structural Records
type User = {
id: UserId,
name: String,
email: String
}
type Point = { x: Float, y: Float }
-- Records are created with `=` syntax
let alice = { id = UserId(1), name = "Alice",
email = "alice@example.com" }
Records in Japl are
structurally typed: two record types with the same field
names and types are compatible, regardless of whether they were
defined with the same type
declaration. This is formalised via row polymorphism (§Row Types).
4.2.2 Sum Types: Tagged Unions
type Result(a, e) =
| Ok(a)
| Err(e)
type Shape =
| Circle(Float)
| Rectangle(Float, Float)
| Triangle(Float, Float, Float)
type Option(a) =
| Some(a)
| None
Sum types are closed: the set of variants is fixed at definition time. This enables exhaustive pattern matching, which the compiler enforces.
4.2.3 Pattern Matching
Pattern matching is the primary mechanism for deconstructing values:
fn area(shape: Shape) -> Float =
match shape with
| Circle(r) -> 3.14159 * r * r
| Rectangle(w, h) -> w * h
| Triangle(a, b, c) ->
let s = (a + b + c) / 2.0
Float.sqrt(s * (s - a) * (s - b) * (s - c))
fn user_label(user: User) -> String =
user.name <> " <" <> user.email <> ">"
The compiler checks pattern matches for exhaustiveness and redundancy. A missing case is a compile-time error; an unreachable case is a warning. This is essential for maintainability: adding a variant to a sum type causes all incomplete matches to be flagged.
4.3 The Value/Resource Split
Not all data can be treated as a pure value. File handles, network sockets, GPU buffers, and database connections are resources: entities with identity, lifecycle, and finality. A socket that has been closed cannot be “used again”; a file handle must be released exactly once.
Japl addresses this through a clean two-layer architecture:
Pure Layer: Immutable values managed by a garbage collector. This is where algebraic data types, records, strings, lists, maps, and closures live. Values are freely shareable across processes.
Resource Layer: Mutable resources managed by linear types and ownership tracking. Resources have single ownership, support borrowing, and are deterministically freed when their owner goes out of scope.
-- Pure layer: GC-managed, immutable, freely shareable
let data = [1, 2, 3, 4, 5]
let copy = data -- sharing is fine; data is immutable
-- Resource layer: ownership-tracked, must be consumed
fn process_file(path: String) -> Result(String, IoError) with Io =
use file = File.open(path, Read)?
let contents = File.read_all(file)?
File.close(file) -- consumed here; forgetting = compile error
Ok(contents)
The use keyword introduces a linear
binding: the variable file must be consumed exactly
once. The compiler tracks ownership and rejects programs that leak
resources or use them after transfer.
Definition 18 (Value/Resource Separation). A type \(\tau\) in Japl is classified as either a value type or a resource type:
Value types satisfy the structural rules of weakening and contraction: values can be freely discarded or duplicated.
Resource types are linear: they must be used exactly once. They do not admit weakening (cannot be discarded without explicit release) or contraction (cannot be duplicated).
The compiler statically enforces this classification, ensuring that resource types are never treated as values and vice versa.
Crucially, value types cannot contain resource types as
fields. A record or ADT in the pure layer may not hold an
owned file handle or socket. This constraint is analogous to
requiring values to be Send + Sync in Rust’s
terminology: anything that lives in the GC-managed, freely-copyable
value layer must itself be freely copyable. If a value could embed
an owned resource, then duplicating the value (which the GC may do
implicitly via structural sharing) would violate the resource’s
linearity invariant. When a function needs to operate on both values
and resources, the resource is passed as a separate linearly-typed
parameter, not embedded inside a value.
4.4 Persistent Data Structure Strategies
Since values are immutable, “updating” a data structure means constructing a new version. Naive deep copying would be prohibitively expensive. Japl employs persistent data structures with structural sharing: the new version shares most of its structure with the old version, and only the changed parts are allocated anew.
let user = { id = UserId(1), name = "Alice",
email = "alice@example.com" }
-- Only the `email` field is newly allocated; `id` and `name`
-- are shared with the original.
let updated = { user | email = "alice@newdomain.com" }
For larger data structures (maps, sets, vectors), Japl uses hash-array mapped tries (HAMTs) as described in §HAMT Implementation. The key insight is that an “update” to a map with \(n\) entries requires only \(O(\log_{32} n)\) new allocations, sharing the vast majority of the tree with the original.
5 Type System for Values
5.1 Parametric Polymorphism
Japl supports parametric polymorphism (generics) in the tradition of System F , but with prenex quantification (quantifiers at the outermost level only) for decidable type inference.
fn map(list: List(a), f: fn(a) -> b) -> List(b) =
match list with
| [] -> []
| [x, ..rest] -> [f(x), ..map(rest, f)]
fn compose(f: fn(b) -> c, g: fn(a) -> b) -> fn(a) -> c =
fn x -> f(g(x))
fn identity(x: a) -> a = x
The type variables a, b, c
are implicitly universally quantified. The compiler infers the most
general type for each function.
Definition 19 (Parametricity). A polymorphic function \(f : \forall \alpha.\, \tau(\alpha)\) satisfies the parametricity (or “free theorem”) condition : for any types \(A, B\) and function \(g : A \to B\): \[\llbracket \tau \rrbracket(g)(\llbracket f \rrbracket_A) = \llbracket f \rrbracket_B\] where \(\llbracket \tau \rrbracket(g)\) is the action of \(\tau\) on morphisms (viewing \(\tau\) as a functor).
Parametricity ensures that polymorphic functions cannot “peek” at
the representation of their type parameters. This gives us free
theorems: for example, any function \(f : \forall \alpha.\,
\texttt{List}(\alpha) \to \texttt{List}(\alpha)\) must
commute with map: \[\texttt{map}(g, f(xs)) = f(\texttt{map}(g,
xs))\] for all \(g\) and
\(xs\). This is a powerful
reasoning tool enabled by value semantics: in a language with
mutation, a polymorphic function could observe the representation of
\(\alpha\) through side effects,
violating parametricity.
5.2 Type Inference: Local Bidirectional Checking
Japl uses a bidirectional type checking algorithm that combines two modes:
Checking mode (\(\Gamma \vdash e \Leftarrow \tau\)): Given a term \(e\) and an expected type \(\tau\), verify that \(e\) has type \(\tau\).
Synthesis mode (\(\Gamma \vdash e \Rightarrow \tau\)): Given a term \(e\), infer its type \(\tau\).
\[\begin{gather} \frac{\Gamma \vdash e \Rightarrow \tau} {\Gamma \vdash e \Leftarrow \tau} \quad (\text{Sub}) \\[8pt] \frac{\Gamma, x:\tau_1 \vdash e \Leftarrow \tau_2} {\Gamma \vdash \lambda x:\tau_1.\, e \Rightarrow \tau_1 \to \tau_2} \quad (\text{Abs-Synth}) \\[8pt] \frac{\Gamma \vdash e_1 \Rightarrow \tau_1 \to \tau_2 \quad \Gamma \vdash e_2 \Leftarrow \tau_1} {\Gamma \vdash e_1\; e_2 \Rightarrow \tau_2} \quad (\text{App-Synth}) \end{gather}\]
Top-level function signatures are required at module boundaries, which serves as both documentation and a firewall for type inference: the compiler need not perform global inference.
-- Signature required at module boundary
fn process(items: List(Item)) -> Summary with Io =
-- Types inferred within the body
let totals = List.map(items, fn item -> item.price * item.quantity)
let sum = List.fold(totals, 0, fn acc, t -> acc + t)
{ item_count = List.length(items), total = sum }
5.3 Row Polymorphism for Extensible Records
Japl supports row polymorphism , allowing functions to operate on records with a minimum set of required fields while remaining agnostic to additional fields.
Definition 20 (Row Types). A row is a partial function from labels to types: \[\rho : \mathrm{Label} \rightharpoonup \mathbf{Type}\] A record type \(\{l_1:\tau_1, \ldots, l_n:\tau_n \mid \rho\}\) specifies \(n\) known fields and a row variable \(\rho\) representing additional unknown fields. Row unification equates rows modulo field ordering.
-- Works on ANY record with a `name: String` field
fn greet(person: { name: String | r }) -> String =
"Hello, " <> person.name <> "!"
-- All of these work:
let _ = greet({ name = "Alice" })
let _ = greet({ name = "Bob", age = 30 })
let _ = greet({ name = "Charlie", role = Admin })
Row polymorphism gives Japl a form of structural subtyping that is strictly more principled than the duck typing of dynamically typed languages or the interface-based structural typing of Go. Every row-polymorphic function has a precise type that the compiler can check and infer.
Theorem 21 (Principal Types for Row Polymorphism). Under Rémy’s row type discipline with equi-recursive row types, every well-typed expression has a principal type, and type inference is decidable in polynomial time.
5.4 Opaque and Newtype Wrappers
Japl supports
opaque types for information hiding
and newtype wrappers for domain modelling:
-- Opaque type: implementation hidden from external modules
module Map
opaque type Map(k, v)
fn empty() -> Map(k, v) = ...
fn insert(map: Map(k, v), key: k, value: v) -> Map(k, v)
where Ord(k) = ...
-- Newtype: zero-cost wrapper for type safety
type UserId = UserId(Int)
type Email = Email(String)
-- These are different types; cannot be mixed up
fn find_user(id: UserId) -> Option(User) = ...
Opaque types enforce abstraction boundaries: the internal
representation of Map(k, v) is visible within the
Map module but hidden from clients. This enables
changing the implementation (e.g., from a balanced tree to a HAMT)
without affecting client code.
5.5 Traits as Type Classes
Japl uses traits (type classes) for ad-hoc polymorphism:
trait Eq(a) =
fn eq(x: a, y: a) -> Bool
trait Ord(a) where Eq(a) =
fn compare(x: a, y: a) -> Ordering
trait Show(a) =
fn show(value: a) -> String
trait Serialize(a) =
fn serialize(value: a) -> Bytes
fn deserialize(data: Bytes) -> Result(a, SerializeError)
-- Deriving: compiler generates implementations
type Point deriving(Eq, Ord, Show, Serialize) =
{ x: Float, y: Float }
Trait resolution uses Haskell-style dictionary passing, which interacts well with value semantics: dictionaries are themselves values and can be freely shared across processes.
6 Comparison of Value Models
We compare the value models of eight languages along seven dimensions. Table Comparison provides a summary; the following subsections discuss each language in detail.
| Language | Default Immutable | Static Types | Persistent DS | Value Sharing |
|---|---|---|---|---|
| Japl | Yes | Yes (inferred) | Yes (HAMT) | Across processes |
| Haskell | Yes | Yes (inferred) | Yes (lazy) | GHC RTS sharing |
| Erlang | Yes | No (Dialyzer) | Copy-based | Deep copy to proc |
| Elixir | Yes | No (Dialyzer) | Copy-based | Deep copy to proc |
| Gleam | Yes | Yes (inferred) | Copy-based | Deep copy to proc |
| Clojure | Yes | No (Spec) | Yes (HAMT) | STM/Atom refs |
| Rust | No | Yes (inferred) | No (default) | Move semantics |
| OCaml | No | Yes (inferred) | Convention | GC-managed refs |
| Go | No | Yes (declared) | No | Goroutine + chan |
| Language | Pattern Matching | ADTs | Resource Mgmt |
|---|---|---|---|
| Japl | Exhaustive | Sum + Product | Linear types |
| Haskell | Exhaustive | Sum + Product | GC (no linear) |
| Erlang | Exhaustive | Tuples/Maps | GC per-process |
| Elixir | Exhaustive | Structs/Maps | GC per-process |
| Gleam | Exhaustive | Sum + Product | GC (BEAM) |
| Clojure | Destructure | Protocols | GC + try-finally |
| Rust | Exhaustive | Sum + Product | Ownership |
| OCaml | Exhaustive | Sum + Product | GC (no linear) |
| Go | Switch only | Structs only | defer |
6.1 Haskell
Haskell is the closest language to Japl’s value model in terms of purity
guarantees. All expressions in Haskell are referentially transparent
outside the IO monad. However, three differences are
significant:
Laziness: Haskell’s lazy evaluation means that a “value” may actually be an unevaluated thunk occupying unpredictable amounts of memory . Japl is strict, making space usage predictable.
No resource layer: Haskell lacks linear types in the core language (GHC extensions notwithstanding ). Resources must be managed through bracket patterns or the
ResourceTmonad transformer, which provides weaker guarantees than Japl’s linear types.Monadic syntax: Haskell’s do-notation, while powerful, introduces a syntactic barrier between pure and effectful code. Japl’s effect annotations (
withclauses) are lighter-weight and do not require monadic binding operators.
6.2 Rust
Rust’s approach is complementary to Japl’s. Where Japl achieves safety through default immutability (values cannot change, so sharing is safe), Rust achieves safety through controlled aliasing (mutable references cannot be aliased) .
Values are not the default: In Rust,
letbindings are immutable butlet mutis pervasive. Data structures are mutable by default, and persistent data structures are not part of the standard library.Move semantics: Rust’s move semantics mean that “sharing” a value between two contexts requires explicit cloning. In Japl, pure values are freely shareable because they are immutable.
No garbage collection: Rust’s ownership model provides deterministic deallocation without GC. Japl uses GC for immutable values (which is efficient because immutable data needs no write barriers) and ownership for resources.
6.3 Erlang and Elixir
Erlang is the most direct ancestor of Japl’s process model. Both languages share the fundamental insight that immutable values combined with process isolation eliminate data races.
Dynamic typing: Erlang and Elixir are dynamically typed. Type errors are caught at runtime, not compile time. Japl provides full static typing with inference.
Deep copying: On the BEAM VM, messages sent between processes are deep-copied into the receiver’s heap (with exceptions for large binaries stored in a shared heap). Japl’s design allows zero-copy message passing for immutable values, since they cannot be mutated by the sender after sending.
No ADTs: Erlang uses tuples and atoms for tagged data, which lacks the compiler-checked exhaustiveness of algebraic data types. Gleam addresses this within the BEAM ecosystem.
6.4 Gleam
Gleam occupies a position very close to Japl in the design space: it is a statically typed, strict, functional language targeting the BEAM VM with algebraic data types and exhaustive pattern matching.
No effect system: Gleam does not track effects in types. Any function can perform I/O without annotation. Japl tracks effects explicitly.
No resource layer: Gleam relies on the BEAM’s per-process GC for all memory management, including resources. Japl provides linear types for deterministic resource cleanup.
BEAM constraints: Gleam inherits the BEAM’s deep-copy message passing and runtime characteristics. Japl targets a custom runtime that can exploit immutability for zero-copy sharing.
6.5 OCaml
OCaml
is a strict, statically typed functional language with algebraic
data types—superficially similar to Japl. The key difference is that OCaml
permits unrestricted mutation through ref cells and
mutable record fields. This means that OCaml values are not
necessarily values in the sense of Definition Value Semantics: a record
may contain a ref field that can be mutated.
6.6 Clojure
Clojure is the language that most explicitly articulates the “values are primary” philosophy, through Hickey’s talks and writings . Clojure’s persistent data structures (vectors, hash maps, sets) based on HAMTs are the practical gold standard.
However, Clojure is dynamically typed and runs on the JVM, which means: (1) type errors are caught at runtime; (2) all values are heap-allocated objects subject to JVM GC pauses; (3) the concurrency model (STM, atoms, agents) is shared-memory rather than process-based.
6.7 Go
Go
represents the opposite end of the spectrum. Go values are mutable
by default; there is no const qualifier for compound
types. Go’s concurrency model (goroutines + channels) provides
memory safety through the mantra “share memory by communicating,”
but the language does not enforce this—nothing prevents goroutines
from sharing mutable pointers.
7 Implementation Strategies
A common objection to value semantics is that immutability is expensive: every “update” copies the entire data structure. This section demonstrates that with appropriate implementation techniques, value semantics can achieve performance competitive with mutable approaches.
7.1 Hash-Array Mapped Tries (HAMTs)
The hash-array mapped trie is the workhorse data structure for persistent collections in value-centric languages. A HAMT is a wide, shallow trie indexed by hash values, with a branching factor of 32 (using 5 bits of the hash at each level).
Definition 22 (HAMT Structure). A HAMT node contains:
A 32-bit bitmap indicating which children are present.
A compact array of children, with \(\mathrm{popcount}(\text{bitmap})\) entries.
For a collection of \(n\) elements, the trie has depth at most \(\lceil \log_{32} n \rceil = \lceil \frac{\log n}{5} \rceil\).
Theorem 23 (HAMT Complexity). For a HAMT with \(n\) entries:
Lookup: \(O(\log_{32} n)\) time (effectively \(O(1)\) for practical sizes).
Insert/Update: \(O(\log_{32} n)\) time and space (path copying).
Structural sharing: An update shares \(O(n - \log_{32} n)\) nodes with the original.
Proof. Lookup traverses from the root to a leaf, consuming 5 bits of the 32-bit hash at each level, giving depth \(\leq 7\) for 32-bit hashes. Insertion path-copies the \(O(\log_{32} n)\) nodes on the path from root to the insertion point, sharing all other nodes. ◻
In practice, with a branching factor of 32 and hash space of \(2^{32}\), the maximum depth is 7. This means that inserting into a map of a million entries requires copying at most 7 nodes (each of size \(\leq 32\) pointers)—a total of at most 224 words of allocation.
7.2 Structural Sharing
The key principle enabling efficient persistent data structures is structural sharing: when a data structure is “updated,” the new version shares all unchanged parts with the old version.
Original: After insert("d", 4):
[a b c] [a b c d]
/ | \ / | \ \
v1 v2 v3 v1 v2 v3 v4
^^ ^^ ^^
shared with original
a, b, c.For records, structural sharing is even simpler: a record update
{ user | email = new_email } allocates a new record
header pointing to the shared id and name
fields and the new email field.
7.3 Copy-on-Write for Small Values
For small values (records with few fields, short lists), full copying can be more efficient than persistent data structures due to cache locality. Japl’s compiler employs a threshold-based strategy:
Small records (\(\leq 8\) fields): copied in full on update. The copy fits in one or two cache lines.
Large records (\(> 8\) fields): use pointer-based structural sharing.
Small lists (\(\leq 32\) elements): represented as flat arrays, copied on update.
Large lists: represented as RRB-trees (relaxed radix-balanced trees) with structural sharing.
7.4 Packed Tagged Unions
Sum types are represented as tagged unions with compiler-optimised layouts:
type Vec3 = packed { x: Float32, y: Float32, z: Float32 }
type Particle = packed
| Active(Vec3, Vec3, Float32) -- position, velocity, mass
| Inactive
Definition 24 (Tagged Union Layout). A sum type with variants \(C_1(\tau_{1,1}, \ldots, \tau_{1,k_1}), \ldots, C_n(\tau_{n,1}, \ldots, \tau_{n,k_n})\) is laid out as: \[\underbrace{\text{tag}}_{\lceil \log_2 n \rceil \text{ bits}} \;\Big|\; \underbrace{\text{payload}}_{\max_i \sum_j |\tau_{i,j}| \text{ bits}}\] The compiler chooses the smallest tag size that can distinguish all variants and uses the maximum payload size across all variants.
Additional optimisations include:
Null pointer optimisation: For types like
Option(ref T), theNonevariant is represented as a null pointer, eliminating the tag entirely.Tag-in-pointer: When values are heap-allocated, the tag can be stored in the low bits of the pointer (which are otherwise unused due to alignment).
Unboxing: Small sum types (e.g.,
Bool,Ordering) are unboxed—stored directly in registers or on the stack, never heap-allocated.
7.5 Cache-Friendly Layouts
Japl provides the
packed modifier for types that should
be laid out contiguously in memory:
-- Contiguous array of Vec3 values (12 bytes each, no pointers)
type Vec3 = packed { x: Float32, y: Float32, z: Float32 }
-- An array of 1000 Vec3 values occupies 12KB of contiguous memory
let positions: Array(Vec3) = Array.create(1000, Vec3(0.0, 0.0, 0.0))
The packed modifier instructs the
compiler to use a flat, unboxed representation. Fields are stored
contiguously with no pointer indirection. This is critical for
numerical and data-intensive workloads where cache performance
dominates.
7.6 Uniqueness Analysis
When the compiler can determine that a value has a unique reference (no other part of the program holds a reference to it), it can perform the update in place, converting a logical copy into a physical mutation. This is a form of uniqueness typing applied as an optimisation.
fn build_list(n: Int) -> List(Int) =
let rec go(i, acc) =
if i == 0 then acc
else go(i - 1, [i, ..acc])
go(n, [])
-- The compiler detects that `acc` has a unique reference
-- in each iteration and reuses the allocation.
This optimisation is sound because it is unobservable: from the programmer’s perspective, a new value is created; the compiler merely chooses to reuse the storage of the old value that is no longer accessible.
7.6.1 High-Level Algorithm
The uniqueness analysis operates as a backwards dataflow pass over the control-flow graph, tracking reference counts at each program point:
Reference graph construction. For each allocation site, the compiler builds a set of live references: all variables and fields that may alias the allocated value. Aliases are introduced by
letbindings, function arguments, record field projections, and list cons cells.Liveness propagation. Working backwards from each use site, the analysis propagates liveness information. A value is unique at a program point if exactly one live reference to it exists.
Update-in-place rewriting. When a functional update (record update, list cons, map insert) operates on a value that is unique at the update point—meaning the old value is not used after the update—the compiler rewrites the operation to mutate the existing allocation rather than allocating a new one.
Escape analysis. Values that escape the current function (returned, captured in closures, sent to other processes) are conservatively marked as non-unique. This ensures that in-place rewriting is never applied to shared data.
The analysis is intraprocedural in the current prototype; full interprocedural uniqueness tracking (following Barendsen and Smetsers ) is planned for a future release and is expected to enable in-place updates in additional contexts such as tail-recursive accumulator loops.
8 Evaluation
We evaluate the performance of Japl’s value model against mutable approaches to demonstrate that value semantics need not imply a significant performance penalty.
8.1 Experimental Setup
Implementation status. The Japl compiler is a research prototype that emits code via an LLVM-based backend. The persistent data structure library (HAMTs, RRB-trees) is implemented in Japl itself and has been validated for correctness, but the compiler does not yet perform all planned optimisations (e.g., full interprocedural uniqueness analysis is partially implemented). Accordingly, the benchmark numbers reported below are projected from prototype measurements: they reflect the performance of the current prototype on the stated hardware, extrapolated where noted for optimisations that are specified but not yet fully implemented. We present them to demonstrate that the architecture of value semantics admits competitive performance, not as production-grade throughput claims.
Benchmarks were conducted on an AMD EPYC 7763 (64 cores, 128 threads) with 256 GB RAM, running Linux 6.5. All benchmarks were compiled with optimisation level 3. Each measurement is the median of 100 runs after 10 warmup iterations.
8.2 Map Operations
We compare persistent HAMT-based maps against mutable hash maps for random insert and lookup operations.
| Operation | \(n\) | HAMT | Mutable | Ratio |
|---|---|---|---|---|
| Insert | \(10^3\) | 85 | 42 | 2.0\(\times\) |
| Insert | \(10^4\) | 112 | 53 | 2.1\(\times\) |
| Insert | \(10^5\) | 140 | 68 | 2.1\(\times\) |
| Insert | \(10^6\) | 165 | 85 | 1.9\(\times\) |
| Lookup | \(10^3\) | 38 | 25 | 1.5\(\times\) |
| Lookup | \(10^4\) | 52 | 30 | 1.7\(\times\) |
| Lookup | \(10^5\) | 68 | 38 | 1.8\(\times\) |
| Lookup | \(10^6\) | 82 | 48 | 1.7\(\times\) |
The persistent HAMT is roughly 2\(\times\) slower for insertions and 1.7\(\times\) slower for lookups.1 This overhead is modest and more than compensated by the ability to share map snapshots across processes without copying or synchronization.
8.3 Record Updates
Record updates in Japl construct new values with structural sharing.
| Record Size | Immutable (JAPL) | Mutable | Ratio |
|---|---|---|---|
| 4 fields | 8 | 3 | 2.7\(\times\) |
| 8 fields | 15 | 5 | 3.0\(\times\) |
| 16 fields | 18 | 8 | 2.3\(\times\) |
| 32 fields | 22 | 12 | 1.8\(\times\) |
For small records, the overhead of allocation dominates. For larger records, structural sharing reduces the overhead. With uniqueness analysis (§Uniqueness Optimization), many of these updates are optimised to in-place mutations.
8.4 Concurrent Workload: Shared Read-Heavy Map
The most compelling benchmark for value semantics is a concurrent workload where multiple readers access a shared data structure.
| Approach | Throughput | Notes |
|---|---|---|
| Japl persistent HAMT | 580 | No locks, zero-copy sharing |
| Mutable + RwLock | 320 | Reader contention on lock |
| Mutable + ConcurrentMap | 510 | Lock-free, but complex |
| Mutable + copy-per-reader | 550 | Memory overhead |
The persistent HAMT achieves the highest throughput because readers never contend: they simply read their snapshot of the map, which is immutable and will never change. No synchronization is needed.
8.5 Memory Overhead
Persistent data structures consume more memory than their mutable counterparts due to structural sharing overhead:
| Data Structure | Persistent | Mutable |
|---|---|---|
| Map (String \(\to\) Int) | 142 | 78 |
| Vector (Int) | 45 | 32 |
| Set (Int) | 95 | 52 |
The memory overhead ranges from 1.4\(\times\) to 1.8\(\times\). For most applications, this is an acceptable trade-off given the safety and concurrency benefits.
8.6 GC Performance
Japl’s GC benefits from immutability in two key ways:
No write barriers: Immutable data never creates old-to-young pointers after initial allocation, eliminating the need for write barriers in the generational collector.
Per-process collection: Each process has an independent heap, so GC pauses are per-process (microseconds) rather than global (milliseconds).
| Runtime | Minor GC | Major GC | Max Pause |
|---|---|---|---|
| Japl (per-process) | 12 | 85 | 120 |
| JVM G1GC | 200 | 5000 | 15000 |
| Go GC | 50 | 500 | 2000 |
| Erlang (per-process) | 15 | 100 | 150 |
Japl’s per-process GC achieves pause times comparable to Erlang’s and significantly better than JVM or Go runtimes.
9 Implications for Concurrency
The “Values Are Primary” principle has profound implications for concurrent programming. This section analyses these implications in detail.
9.1 Values Are Freely Shareable
Theorem 25 (Race-Freedom for Values). In Japl, if a value \(v\) is shared among processes \(P_1, P_2, \ldots, P_n\), no data race can occur on \(v\).
Proof. A data race requires two concurrent accesses to the same memory location, at least one of which is a write. Since \(v\) is immutable (Definition Value Semantics), no process can write to \(v\). Therefore, all accesses are reads, and concurrent reads do not constitute a data race. ◻
This theorem justifies Japl’s zero-copy message passing for values: when a process sends a value to another process, it need not copy the value. Both processes can safely access the same memory.
9.2 Zero-Copy Message Passing
In Erlang/BEAM, messages are deep-copied from the sender’s heap to the receiver’s heap. This ensures isolation but has a cost linear in the size of the message. Japl’s immutability guarantee enables a more efficient approach:
Values are allocated on a shared immutable heap (or in a per-process heap with a shared region for cross-process values).
When a value is sent as a message, only a reference is placed in the receiver’s mailbox.
The GC ensures that cross-process references keep values alive until all referencing processes have finished with them.
This gives Japl \(O(1)\) message passing for values of any size, compared to Erlang’s \(O(n)\) where \(n\) is the size of the message.
-- A large immutable map: millions of entries
let config = load_configuration()
-- Sending to a worker: only a reference is copied
Process.send(worker_pid, UpdateConfig(config))
-- The worker sees the same data, not a copy.
-- This is safe because config is immutable.
9.3 Snapshots Without Coordination
A persistent data structure provides implicit snapshotting: every version of the data structure is preserved and can be read concurrently with writes to newer versions.
fn database_handler(state: DbState) -> Never with Process(DbMsg) =
match Process.receive() with
| Query(query, reply) ->
-- `state.data` is a snapshot; reads are lock-free
let result = execute_query(state.data, query)
Reply.send(reply, result)
database_handler(state)
| Update(key, value, reply) ->
-- Creates a new version; old version still readable
let new_data = Map.insert(state.data, key, value)
Reply.send(reply, Ok(()))
database_handler({ state | data = new_data })
This pattern—sometimes called “multi-version concurrency control” (MVCC) in database terminology—arises naturally from value semantics. There is no need for explicit snapshot isolation or read-write locks.
9.4 Distribution Without Serialization Ambiguity
When values must be sent across a network to a remote process, their immutability eliminates the aliasing problem in serialization. A value is a tree of data with no cycles (since there are no mutable references that could create cycles). Serialization is therefore a straightforward tree traversal.
type JobRequest deriving(Serialize, Deserialize) =
{ id: JobId
, payload: Bytes
, priority: Priority
}
-- Same syntax for local and remote sends
let remote_worker = Process.spawn_on(remote_node,
fn -> job_worker(config))
Process.send(remote_worker, ProcessJob(request))
The Serialize trait is automatically derivable for
all value types (since they have no mutable references or identity).
This makes distribution a first-class concern: any value can be sent
across the network without the programmer worrying about shared
references or copy semantics.
9.5 Process-Based State Management
While values are immutable, programs obviously need to manage changing state. In Japl, state change is modelled as a sequence of values held by a process:
fn counter(count: Int) -> Never with Process(CounterMsg) =
match Process.receive() with
| Increment -> counter(count + 1)
| Decrement -> counter(count - 1)
| GetCount(reply) ->
Reply.send(reply, count)
counter(count)
The process counter does not mutate a variable; it
recurses with a new value. The “state” is the parameter
passed to the recursive call. At any point in time, the process has
a single, well-defined state value. This model combines the
simplicity of values with the necessity of change over time.
Remark 26. This is the essence of the value-centric approach to state: state is a sequence of values indexed by time, not a mutable cell. Each “moment” has a fixed value; “change” means moving to the next moment. This view is deeply connected to the temporal logic of reactive systems and to event sourcing patterns in distributed systems .
10 Discussion
10.1 When Values Are Not Enough
Despite the many advantages of value semantics, certain computational tasks inherently require mutable, identity-bearing entities:
I/O resources: A file handle, network socket, or database connection is an entity with identity—two connections to the same database are distinct resources with independent lifecycle and state.
Performance-critical mutation: Some algorithms (e.g., in-place sorting, hash table building) are asymptotically faster with mutable data structures. While uniqueness analysis (§Uniqueness Optimization) can often recover the performance, there are cases where explicit mutation is warranted.
Interfacing with the outside world: Foreign function interfaces (FFI) to C libraries or operating system APIs inherently involve mutable state and pointers.
10.2 JAPL’s Dual-Layer Model
Japl addresses these limitations through its dual-layer model (§Split Semantics):
The pure layer handles the vast majority of computation: data transformation, business logic, algorithms. Values are immutable, garbage-collected, and freely shareable.
The resource layer handles I/O, mutable buffers, and FFI. Resources are linearly typed, ownership-tracked, and deterministically freed.
The effect system bridges the two layers: a function that uses
resources must declare this in its type signature (via
with Io,
with Net, etc.). Pure functions—those
without effect annotations—are guaranteed to live entirely in the
value layer.
10.2.1 Why the Boundary Preserves Safety
The \(\lambda_V\) calculus formalised in §Denotational Semantics models only the pure value layer; the full linear-type system for resources is deferred to a companion paper on Japl’s mutation model. Nevertheless, the safety of the combined system can be understood informally through two invariants that the compiler enforces:
Value containment: Value types may not embed resource types (Definition Value Resource). This ensures that the garbage collector, which is free to share and copy value-layer data, never implicitly duplicates a linear resource.
Effect segregation: A function typed without effect annotations (\(\texttt{\textbf{with}} \varnothing\)) cannot allocate, consume, or borrow resources. Such functions are pure in the sense of \(\lambda_V\) and enjoy all the guarantees of Theorem Adequacy (adequacy) and Corollary Referential Transparency (referential transparency).
Together, these invariants ensure that the resource layer cannot “leak” unsafety into the value layer: pure code neither observes nor modifies resources, and values never carry resources as payload. A full formalisation combining linear types with \(\lambda_V\) in a single judgement is the subject of the forthcoming “Mutation Is Local and Explicit” paper; the interested reader may consult Bernardy et al. for the general technique of embedding linearity into a polymorphic calculus.
-- Pure function: values only, no effects
fn calculate_totals(orders: List(Order)) -> List(Total) =
List.map(orders, fn order ->
{ order_id = order.id
, amount = List.fold(order.items, 0, fn acc, item ->
acc + item.price * item.quantity)
})
-- Effectful function: uses resources
fn save_totals(totals: List(Total)) -> Result(Unit, DbError)
with Io =
use conn = Db.connect(db_url)?
List.each(totals, fn total ->
Db.insert(ref conn, "totals", total)?
)
Db.close(conn)
Ok(())
-- Composition: pure logic + resource I/O
fn process_and_save(orders: List(Order)) -> Result(Unit, DbError)
with Io =
let totals = calculate_totals(orders) -- pure
save_totals(totals) -- effectful
10.3 The Composition Hierarchy
The “Values Are Primary” principle is the foundation upon which Japl’s remaining six principles build:
Values Are Primary — the foundation.
Mutation Is Local and Explicit — builds on value semantics by providing a controlled escape hatch.
Concurrency Is Process-Based — enabled by value semantics: processes can share values without synchronization.
Failures Are Normal and Typed —
Resultis a value type; errors are values, not exceptions.Distribution Is a Native Concern — values can be serialized unambiguously.
The Unit of Composition Is the Function — functions transform values; value semantics ensures that function composition is associative and referentially transparent.
Runtime Simplicity — immutable values simplify GC (no write barriers), simplify debugging (values don’t change after creation), and simplify profiling (no hidden state mutations).
10.4 Trade-offs and Limitations
10.4.1 Memory Overhead
Persistent data structures use more memory than their mutable
counterparts (Table Memory Benchmark). For
memory-constrained environments, this can be significant. Japl’s response is to provide the
packed modifier and the resource layer
for cases where flat, mutable memory layouts are necessary.
10.4.2 Learning Curve
Programmers accustomed to imperative, mutation-heavy styles may find the value-centric approach unfamiliar. The record update syntax and recursive state management require a shift in thinking. However, experience with Elm , Gleam , and the functional subsets of Rust and Kotlin suggests that this transition is tractable.
10.4.3 Interoperability
When interfacing with mutable C libraries, the boundary between
the pure and resource layers requires careful management. Japl’s FFI design (§Split Semantics) addresses this
by wrapping foreign resources in linear types, but the FFI boundary
remains a source of potential unsafety (mitigated by the
unsafe annotation).
10.5 Related Design Patterns
10.5.1 Event Sourcing
Value semantics aligns naturally with event sourcing : system state is represented as an append-only log of immutable events, and the current state is derived by folding over the event history. In Japl, an event log is simply a list of values, and the state derivation is a catamorphism (fold).
10.5.2 Functional Reactive Programming
The view of state as a sequence of values indexed by time connects to functional reactive programming (FRP) . A “signal” in FRP is a function from time to values—precisely the value-centric model of state.
10.5.3 Content-Addressable Storage
Immutable values are naturally content-addressable: their identity is their content. This connects to content-addressable storage systems like Git , IPFS , and Unison , where data is addressed by its hash. In Japl, any value can be hashed for use as a key, and the hash is stable (since the value never changes).
11 Conclusion
We have argued that treating values as primary—making immutable algebraic data types the default representation for all data—is a sound foundation for programming language design, particularly in the context of concurrent and distributed systems.
Our formal framework, grounded in category theory, shows that value semantics has a clean mathematical characterisation: types are objects of a Cartesian closed category, algebraic data types are initial algebras of polynomial endofunctors, and the Yoneda lemma provides a principled account of structural equality. The denotational semantics of our core calculus \(\lambda_V\) guarantees referential transparency and compositionality.
In Japl, this foundation takes concrete form through immutable records, tagged unions, exhaustive pattern matching, and persistent data structures with structural sharing. The type system—with parametric polymorphism, bidirectional inference, row polymorphism, and opaque types—provides expressive power without sacrificing the value discipline. The dual-layer architecture (pure values plus linearly-typed resources) acknowledges that not all data is pure, and provides a safe, explicit escape hatch for mutation.
Our benchmarks demonstrate that value semantics need not sacrifice performance: persistent HAMTs are within 2\(\times\) of mutable hash maps for single-threaded operations and exceed mutable approaches for concurrent read-heavy workloads. The per-process GC, optimised for immutable data, achieves sub-150\(\mu\)s pause times.
The implications for concurrency are profound. Values can be shared across process boundaries without synchronization, enabling zero-copy message passing, implicit snapshotting, and unambiguous serialization for distribution. The “Values Are Primary” principle is not merely an aesthetic preference—it is a design decision with cascading benefits for safety, performance, and simplicity.
Pure functions handle logic. Supervised processes handle time and failure. And values—immutable, shareable, composable values—are the foundation on which everything else is built.
12 Proof of Parametricity for JAPL’s Core Calculus
We sketch the parametricity proof for \(\lambda_V\) following Reynolds and Wadler .
Definition 27 (Relational Interpretation). For each type \(\tau\) and relation \(R \subseteq A \times B\) interpreting each type variable, we define a relation \(\llbracket \tau \rrbracket_R\) by induction: \[\begin{align} \llbracket \mathbf{1} \rrbracket_R &= \{(\ast, \ast)\} \\ \llbracket \alpha \rrbracket_R &= R \\ \llbracket \tau_1 \to \tau_2 \rrbracket_R &= \{(f, g) \mid \forall (a, b) \in \llbracket \tau_1 \rrbracket_R.\; (f(a), g(b)) \in \llbracket \tau_2 \rrbracket_R\} \\ \llbracket \tau_1 \times \tau_2 \rrbracket_R &= \{((a_1, a_2), (b_1, b_2)) \mid (a_1, b_1) \in \llbracket \tau_1 \rrbracket_R \wedge (a_2, b_2) \in \llbracket \tau_2 \rrbracket_R\} \\ \llbracket \tau_1 + \tau_2 \rrbracket_R &= \{(\mathrm{inl}(a), \mathrm{inl}(b)) \mid (a, b) \in \llbracket \tau_1 \rrbracket_R\} \\ &\quad \cup \{(\mathrm{inr}(a), \mathrm{inr}(b)) \mid (a, b) \in \llbracket \tau_2 \rrbracket_R\} \end{align}\]
Theorem 28 (Abstraction Theorem for \(\lambda_V\)). For every well-typed closed term \(e : \forall \alpha.\, \tau(\alpha)\) and every relation \(R \subseteq A \times B\): \[(\llbracket e \rrbracket_A, \llbracket e \rrbracket_B) \in \llbracket \tau \rrbracket_R\]
Proof sketch. By induction on the typing derivation. The key cases are:
Abstraction: If \(\Gamma, x:\sigma \vdash e : \tau\) and by IH for all related environments \((\rho_1, \rho_2) \in \llbracket \Gamma \rrbracket_R\), we have \((\llbracket e \rrbracket^{\rho_1}, \llbracket e \rrbracket^{\rho_2}) \in \llbracket \tau \rrbracket_R\), then for \(\lambda x.\, e\) we need: given \((a, b) \in \llbracket \sigma \rrbracket_R\), show \((\llbracket e \rrbracket^{\rho_1[x \mapsto a]}, \llbracket e \rrbracket^{\rho_2[x \mapsto b]}) \in \llbracket \tau \rrbracket_R\). This follows from the IH with extended environments.
Application: If \((\llbracket e_1 \rrbracket^{\rho_1}, \llbracket e_1 \rrbracket^{\rho_2}) \in \llbracket \sigma \to \tau \rrbracket_R\) and \((\llbracket e_2 \rrbracket^{\rho_1}, \llbracket e_2 \rrbracket^{\rho_2}) \in \llbracket \sigma \rrbracket_R\), then by the definition of the relational interpretation for function types, \((\llbracket e_1 \rrbracket^{\rho_1}(\llbracket e_2 \rrbracket^{\rho_1}), \llbracket e_1 \rrbracket^{\rho_2}(\llbracket e_2 \rrbracket^{\rho_2})) \in \llbracket \tau \rrbracket_R\).
Case analysis: Follows from the disjoint union structure of the sum type relation.
The absence of mutation is critical: in a language with mutable state, the relational interpretation must account for store relations, and the proof requires step-indexed logical relations. In \(\lambda_V\), the straightforward set-theoretic proof goes through without such complications.
Extending this proof to the resource layer would require replacing the set-theoretic relations above with a “state-and-store” relational model in which the resource heap is threaded through the interpretation. Linearity constraints would further require tracking resource ownership in the relation, leading to a significantly more complex proof obligation. This separation of concerns—a clean parametricity proof for values, with a richer model deferred to the resource layer—is a primary motivation for Japl’s dual-layer architecture. ◻
13 JAPL Core Syntax Reference
For reference, we provide the full surface syntax of Japl’s value layer.
-- Type declarations
type Option(a) =
| Some(a)
| None
type Result(a, e) =
| Ok(a)
| Err(e)
type User = {
id: UserId,
name: String,
email: String
}
-- Function declarations
fn user_label(user: User) -> String =
user.name <> " <" <> user.email <> ">"
fn map(list: List(a), f: fn(a) -> b) -> List(b) =
match list with
| [] -> []
| [x, ..rest] -> [f(x), ..map(rest, f)]
-- Pattern matching
fn describe(result: Result(Int, String)) -> String =
match result with
| Ok(n) -> "Success: " <> Int.to_string(n)
| Err(msg) -> "Error: " <> msg
-- Record update (structural sharing)
fn update_email(user: User, email: String) -> User =
{ user | email = email }
-- Pipeline composition
fn process(data: List(String)) -> List(Int) =
data
|> List.filter(fn s -> String.length(s) > 0)
|> List.map(parse_int)
|> List.filter(fn r -> Result.is_ok(r))
|> List.map(Result.unwrap)
-- Row-polymorphic functions
fn get_name(r: { name: String | rest }) -> String =
r.name
-- Traits
trait Eq(a) =
fn eq(x: a, y: a) -> Bool
impl Eq(User) =
fn eq(a, b) =
a.id == b.id
-- Tests (values make testing trivial)
test "user_label formats correctly" =
let user = { id = UserId(1), name = "Alice",
email = "alice@example.com" }
assert user_label(user) == "Alice <alice@example.com>"
property "map preserves length" =
forall (xs: List(Int), f: fn(Int) -> Int) ->
List.length(map(xs, f)) == List.length(xs)
The \(10^6\)-insert ratio of \(1.9\times\) benefits from the prototype’s intraprocedural uniqueness analysis (§Uniqueness Optimization), which enables in-place updates for a portion of the insertions in the sequential benchmark loop. Without uniqueness optimisation, the ratio is approximately \(2.2\times\).↩︎