Failures Are Normal and Typed:
A Unified Theory of Error Handling Combining
Typed Recoverable Errors with Crash-and-Restart Semantics
1 Introduction
Every program encounters failure. Files go missing, networks drop packets, users submit malformed input, hardware degrades, and bugs escape testing. How a programming language handles these failures shapes the reliability, readability, and maintainability of the software written in it.
The history of error handling in programming languages is a history of trade-offs. Each approach optimizes for certain failure modes while leaving others poorly served:
1.0.0.1 Exceptions (Java, Python, C++).
Exceptions provide a mechanism for non-local control flow transfer, allowing functions deep in the call stack to signal failures to distant callers. Java’s checked exceptions attempt to bring static typing to this mechanism, requiring callers to either handle or propagate declared exception types. In practice, checked exceptions have been widely criticized : they create coupling between implementation details and interface signatures, encourage catching overly broad exception types, and scale poorly in the presence of higher-order functions and generics.
1.0.0.2 Error codes and values (C, Go).
C’s convention of returning integer error codes and Go’s explicit
error return values keep failure in the value domain,
making it visible at every call site. However, neither language
enforces that errors are inspected. In C, a forgotten
errno check is a silent bug. In Go, the
if err != nil pattern introduces substantial syntactic
noise and, lacking sum types, cannot express the type of
the error with the same precision as an algebraic data type.
1.0.0.3 Result types (Rust, Haskell).
Rust’s Result<T, E> and Haskell’s
Either a b embed failure in the type
system as sum types. The compiler enforces that callers handle both
success and failure cases. Rust’s ? operator provides
syntactic sugar for propagation. This approach is compositionally
elegant but can be onerous: converting between error types, chaining
fallible operations, and handling errors at appropriate levels of
abstraction all require careful design. More fundamentally, Rust
lacks a clean answer for truly unexpected
failures—panic! exists but is designed as a last
resort, not a systematic recovery mechanism.
1.0.0.4 Crash-and-restart semantics (Erlang/OTP).
Erlang takes a radically different approach: rather than preventing or carefully handling every failure, it embraces failure as normal . Processes are lightweight and isolated. When a process encounters an unexpected condition, it crashes. A supervisor process detects the crash and restarts the failed process with fresh state. This “let it crash” philosophy has proven remarkably effective for building fault-tolerant telecommunications and distributed systems. However, Erlang’s dynamic typing means that error types are not checked at compile time, and the boundary between errors that should be handled locally and errors that should trigger a crash is left to programmer convention.
1.0.0.5 The gap.
No existing language satisfactorily unifies these approaches. Rust handles domain errors beautifully but has no process-level crash recovery. Erlang handles unexpected failures gracefully but provides no compile-time error typing. Go has neither sum types nor supervision. Java’s checked exceptions were the right instinct—making failure visible in types—executed with the wrong mechanism.
Japl addresses this gap with a dual error model that cleanly separates two fundamentally different kinds of failure:
Domain errors (
Resulttypes +Faileffect): Expected, recoverable failures such as parse errors, validation failures, and not-found conditions. These are values, tracked by the type system and the effect system, that must be handled by the caller.Process failures (crash + supervision): Unexpected failures such as invariant violations, corrupted state, and unrecoverable resource loss. These terminate the enclosing process, and a supervisor restarts it with fresh state.
This paper formalizes the dual model, proves its key safety and liveness properties, and demonstrates its practical advantages through comparison and case study.
1.0.0.6 Contributions.
A formal framework for dual-mode error handling combining sum types, algebraic effects, and process algebra (3).
The design and specification of Japl’s dual error model, including the
Faileffect and its integration with supervision ([sec:dual,sec:boundary,sec:effects,sec:supervision]).Proofs of error safety, crash containment, and supervision liveness (11).
Comprehensive comparison with five major error handling paradigms (8).
Implementation strategies and performance analysis (9).
Three detailed case studies demonstrating the model in practice (10).
2 Background and Related Work
2.1 Exception Safety and the C++ Legacy
Stroustrup introduced exceptions to C++ as a mechanism for signaling errors across abstraction boundaries . The subsequent development of exception safety guarantees—basic, strong, and no-throw—by Abrahams and Sutter revealed the deep interaction between exceptions and resource management. The RAII (Resource Acquisition Is Initialization) pattern emerged as the C++ community’s solution to ensuring resource cleanup in the presence of exceptions, a concern that Japl addresses through ownership types and linear resource management.
2.2 Monadic Error Handling
Moggi’s seminal work on monads as a structuring mechanism for
computational effects provided the theoretical
foundation for treating errors as first-class values. Wadler popularized
the application of monads in functional programming, showing how the
exception monad \(T(A) = A +
E\) (for a fixed error type \(E\)) could structure error-handling
code. Haskell’s Either monad and its
MonadError typeclass realize this approach,
enabling compositional error handling through
do-notation.
However, the monadic approach in Haskell suffers from practical limitations. The “monad transformer” stack needed to combine multiple effects (errors, state, IO) introduces complexity and performance overhead . More fundamentally, Haskell also has a separate, dynamically-typed exception system for IO errors , creating a dualism that is acknowledged but not cleanly resolved.
2.3 Rust’s Result and Panic
Rust made the
deliberate decision to separate recoverable errors
(represented by Result<T, E>) from
unrecoverable errors (represented by panic!).
The ? operator provides ergonomic error propagation.
The From trait enables automatic error type conversion
during propagation.
Rust’s approach is the closest antecedent to Japl’s dual model. However, Rust lacks
several features that Japl provides:
(1) Rust has no built-in lightweight process model, so
panic! typically aborts the entire program or unwinds a
single thread—there is no supervision. Libraries such as
tokio (asynchronous task runtime) and
rayon (data-parallel work-stealing) provide lightweight
tasks, but these are library abstractions that lack the isolation
and crash-containment primitives of a language runtime: a panic in a
tokio task can poison shared state, and neither library
provides supervision trees or typed crash reasons; (2) Rust’s error
handling is not integrated with an effect system, so the compiler
does not track which functions can fail with which error types as a
composable annotation; (3) Rust’s panic! is designed as
a last resort, not as a deliberate recovery strategy.
2.4 Go’s Error Values
Go represents errors as
values implementing the error interface. Functions
return (T, error) tuples. This approach is explicit but
lacks algebraic structure: Go has no sum types, so error handling
cannot benefit from exhaustive pattern matching. The
if err != nil pattern is pervasive and syntactically
heavy. Go’s panic/recover mechanism exists
but is discouraged for normal error handling.
Recent additions to Go’s error handling—errors.Is,
errors.As, and error wrapping with
fmt.Errorf("%w", err)—are runtime-typed approximations
of what algebraic data types provide at compile time.
2.5 Erlang’s “Let It Crash” Philosophy
Armstrong’s doctoral thesis formalized the philosophy that has made Erlang/OTP the gold standard for fault-tolerant systems. The key insights are:
Process isolation: Processes share no memory. A bug in one process cannot corrupt another.
Crash as recovery: Rather than attempting to handle every conceivable failure, a process that encounters unexpected state simply terminates. A supervisor detects this and restarts the process with known-good initial state.
Error kernel pattern: The system is structured so that a small, well-tested “error kernel” of supervisor processes manages the lifecycle of a larger population of worker processes that may fail .
Supervision trees: Supervisors are organized hierarchically, with strategies (one-for-one, all-for-one, rest-for-one) governing how failures propagate and how restarts occur .
Erlang’s limitation is the absence of static typing for errors. A process may crash with any term as the “reason,” and supervisors make restart decisions based on runtime pattern matching against crash reasons. This works in practice but leaves significant error-handling logic invisible to static analysis.
2.6 Algebraic Effects and Handlers
Plotkin and Pretnar’s algebraic effects and handlers provide a principled framework for structuring computational effects. An algebraic effect declares a set of operations; a handler provides interpretations for those operations. Leijen demonstrated practical effect systems with row-polymorphic effect typing in the Koka language.
Japl’s Fail effect is
an instance of the exception effect in this framework. The algebraic
effects perspective is crucial because it makes error-handling
capability compositional: functions declare which error
effects they require, and handlers can interpret these effects at
appropriate boundaries.
2.7 Failure Detectors
Chandra and Toueg formalized the notion of failure detectors in distributed systems, classifying them by completeness and accuracy properties. This work is relevant to Japl’s supervision model: a supervisor acts as a local failure detector for its children, and the completeness guarantee (every crashed process is eventually detected) is essential for the supervision liveness property we prove in 11.
2.8 Process Algebras
The \(\pi\)-calculus and CSP provide formal foundations for reasoning about concurrent processes and message passing. We extend these frameworks with explicit crash and restart actions to model Japl’s supervision semantics.
3 Formal Framework
We develop a formal framework that unifies typed error handling with crash semantics. The framework has three components: (1) an algebraic treatment of error types as sum types, (2) categorical semantics of failure using Kleisli categories, and (3) a process algebra extended with crash actions.
3.1 Error Types as Sum Types
Definition 1 (Error type). An error type \(E\) is an algebraic data type (sum type) whose constructors represent distinct failure modes: \[E = C_1(\tau_1) \mid C_2(\tau_2) \mid \cdots \mid C_n(\tau_n)\] where each \(C_i\) is a constructor carrying data of type \(\tau_i\).
Definition 2 (Result type). The result type \(\texttt{Result}\langle A, E\rangle\) is a binary sum type: \[\texttt{Result}\langle A, E\rangle = \ensuremath{\mathsf{Ok}}(A) + \ensuremath{\mathsf{Err}}(E)\]
The result type forms a monad, which we call the exception monad for error type \(E\).
Definition 3 (Exception monad). For a fixed error type \(E\), the exception monad \(T_E\) is defined by: \[\begin{align*} T_E(A) &= A + E \\ \eta_A(a) &= \ensuremath{\mathsf{Ok}}(a) \\ \mu_A(\ensuremath{\mathsf{Ok}}(\ensuremath{\mathsf{Ok}}(a))) &= \ensuremath{\mathsf{Ok}}(a) \\ \mu_A(\ensuremath{\mathsf{Ok}}(\ensuremath{\mathsf{Err}}(e))) &= \ensuremath{\mathsf{Err}}(e) \\ \mu_A(\ensuremath{\mathsf{Err}}(e)) &= \ensuremath{\mathsf{Err}}(e) \end{align*}\] The monadic bind (Kleisli composition) is: \[m \mathbin{>\!\!>\!\!=} f = \begin{cases} f(a) & \text{if } m = \ensuremath{\mathsf{Ok}}(a) \\ \ensuremath{\mathsf{Err}}(e) & \text{if } m = \ensuremath{\mathsf{Err}}(e) \end{cases}\]
Proposition 4 (Monad laws). \(T_E\) satisfies the monad laws:
Left identity: \(\eta(a) \mathbin{>\!\!>\!\!=} f = f(a)\)
Right identity: \(m \mathbin{>\!\!>\!\!=} \eta = m\)
Associativity: \((m \mathbin{>\!\!>\!\!=} f) \mathbin{>\!\!>\!\!=} g = m \mathbin{>\!\!>\!\!=} (\lambda a.\, f(a) \mathbin{>\!\!>\!\!=} g)\)
Proof. By case analysis on the structure of \(m\). For (1): \(\eta(a) \mathbin{>\!\!>\!\!=} f = \ensuremath{\mathsf{Ok}}(a) \mathbin{>\!\!>\!\!=} f = f(a)\). For (2): if \(m = \ensuremath{\mathsf{Ok}}(a)\), then \(m \mathbin{>\!\!>\!\!=} \eta = \eta(a) = \ensuremath{\mathsf{Ok}}(a) = m\); if \(m = \ensuremath{\mathsf{Err}}(e)\), then \(m \mathbin{>\!\!>\!\!=} \eta = \ensuremath{\mathsf{Err}}(e) = m\). For (3): the proof proceeds by case analysis on \(m\): if \(m = \ensuremath{\mathsf{Err}}(e)\), both sides reduce to \(\ensuremath{\mathsf{Err}}(e)\); if \(m = \ensuremath{\mathsf{Ok}}(a)\), both sides reduce to \(f(a) \mathbin{>\!\!>\!\!=} g\). ◻
3.2 Kleisli Categories and Partial Functions
The categorical semantics of error handling can be understood through the Kleisli category of the exception monad.
Definition 5 (Kleisli category). The Kleisli category \(\mathbf{Kl}(T_E)\) for the exception monad \(T_E\) has:
Objects: types \(A, B, C, \ldots\)
Morphisms \(A \to B\): functions \(A \to T_E(B) = A \to (B + E)\)
Identity: \(\eta_A : A \to A + E\)
Composition: \(g \circ_K f = \lambda a.\, f(a) \mathbin{>\!\!>\!\!=} g\)
The Kleisli category \(\mathbf{Kl}(T_E)\) is isomorphic to a category of partial functions with typed partiality. A morphism \(f : A \to B + E\) in \(\mathbf{Kl}(T_E)\) is a function that either produces a \(B\) or fails with a reason from \(E\). This gives us a precise categorical characterization of Japl’s fallible functions.
Proposition 6. Kleisli composition of fallible functions is associative and preserves error propagation: if \(f : A \to B + E\) fails, then \(g \circ_K f\) fails with the same error, regardless of \(g\).
This property corresponds directly to Japl’s ? operator semantics:
error propagation is short-circuiting.
3.3 Algebraic Effects for Errors
We model Japl’s Fail
effect using the framework of algebraic effects and handlers .
Definition 7 (Fail effect). The \(\texttt{Fail}\langle E\rangle\) effect is an algebraic effect with a single operation: \[\mathsf{fail} : E \to \bot\] The operation \(\mathsf{fail}(e)\) signals a failure with reason \(e : E\) and does not return (indicated by the return type \(\bot\)).
Definition 8 (Fail handler). A handler for
\(\texttt{Fail}\langle E\rangle\)
interprets the effect, converting an effectful computation into a
Result value: \[\mathsf{catch} : (\forall \alpha.\,
(\texttt{Fail}\langle E\rangle \Rightarrow \alpha) \to
\texttt{Result}\langle \alpha, E\rangle)\] Specifically:
\[\begin{align*}
\mathsf{catch}(\mathsf{return}\;v) &=
\ensuremath{\mathsf{Ok}}(v) \\
\mathsf{catch}(\mathsf{fail}\;e\;k) &=
\ensuremath{\mathsf{Err}}(e)
\end{align*}\] where \(k\)
is the (discarded) continuation.
The algebraic effects framework provides several advantages over the monadic approach:
Composition: Multiple
Faileffects with different error types compose naturally via row-polymorphic effect rows, without the need for monad transformers.Separation of concern: The code that raises an error and the code that handles it are decoupled; the handler can be installed at any point in the call chain.
Compiler tracking: The effect system tracks which functions may fail and with which error types, enabling static analysis.
3.4 Process Algebra with Crash Actions
To model process failures and supervision, we extend a process algebra with explicit crash and restart actions.
Definition 9 (Extended process algebra). A process \(P\) is defined by the grammar: \[\begin{align*} P ::=\; & \mathbf{0} && \text{(terminated)} \\ \mid\; & \alpha.P && \text{(action prefix)} \\ \mid\; & P_1 \parallel P_2 && \text{(parallel composition)} \\ \mid\; & P_1 + P_2 && \text{(choice)} \\ \mid\; & \mathsf{crash}(r) && \text{(crash with reason } r\text{)} \\ \mid\; & \mathsf{sup}(S, [P_1, \ldots, P_n]) && \text{(supervision)} \\ \mid\; & !P && \text{(replication)} \end{align*}\] where \(\alpha\) ranges over actions (send, receive, internal), \(r\) is a crash reason, and \(S\) is a supervision strategy.
Definition 10 (Crash transition). A process may transition to a crashed state: \[\frac{}{\mathsf{crash}(r) \xrightarrow{\mathsf{crash}(r)} \mathbf{0}}\]
Definition 11 (Supervision transition). A supervisor observes a child crash and restarts it: \[\frac{P_i \xrightarrow{\mathsf{crash}(r)} \mathbf{0} \quad S(i, r) = \mathsf{restart}} {\mathsf{sup}(S, [\ldots, P_i, \ldots]) \xrightarrow{\tau} \mathsf{sup}(S, [\ldots, P_i^0, \ldots])}\] where \(P_i^0\) is the initial state of process \(i\).
The supervision strategy \(S\) determines the restart behavior:
Definition 12 (Supervision strategies). \[\begin{align*} \mathsf{OneForOne}(i, r) &= \mathsf{restart}(i) \\ \mathsf{AllForOne}(i, r) &= \mathsf{restart}(1), \ldots, \mathsf{restart}(n) \\ \mathsf{RestForOne}(i, r) &= \mathsf{restart}(i), \mathsf{restart}(i+1), \ldots, \mathsf{restart}(n) \end{align*}\]
3.5 Connecting the Two Layers
The key insight of Japl’s dual model is that the two formalisms—Kleisli categories for domain errors and process algebra for crashes—operate at different levels of abstraction with a well-defined boundary between them.
Definition 13 (Error/crash boundary). A boundary function \(\beta\) maps from the domain error level to the process level: \[\beta : \texttt{Result}\langle A, E\rangle \to \mathsf{Proc}\] such that: \[\begin{align*} \beta(\ensuremath{\mathsf{Ok}}(a)) &= \text{continue with } a \\ \beta(\ensuremath{\mathsf{Err}}(e)) &= \text{continue with error handling for } e \\ \text{(invariant violation)} &\to \mathsf{crash}(r) \end{align*}\]
The boundary is asymmetric: domain errors can always be converted to process actions (either handling the error or choosing to crash), but crashes cannot be converted back to domain errors within the crashed process. Recovery from crashes happens at the supervisor level, not within the failed process itself.
4 JAPL’s Dual Error Model
Japl provides two complementary mechanisms for handling failure, each appropriate for a different class of errors.
4.1 Recoverable Domain Errors: Result Types
Domain errors represent expected failure modes—conditions that are part of the normal operation of a system. A database query may not find a matching row. A JSON parser may encounter malformed input. A user may submit an invalid form. These are not bugs; they are anticipated outcomes that the program must handle.
In Japl, domain errors are
represented as typed sum types and propagated through the
Fail effect:
type AppError =
| NotFound
| Unauthorized
| InvalidInput(String)
| DbError(String)
fn get_user(id: UserId) -> User
with Io, Fail[AppError] =
let row = db_find_user(id)?
decode_user(row)?
Several design properties are notable:
4.1.0.1 Exhaustive handling.
Because error types are algebraic data types, pattern matching on errors is checked for exhaustiveness by the compiler:
fn handle_error(err: AppError) -> Response =
match err with
| NotFound -> response(404, "not found")
| Unauthorized -> response(401, "unauthorized")
| InvalidInput(msg) ->
response(400, "bad input: " ++ msg)
| DbError(msg) ->
log_error(msg)
response(500, "internal error")
If a new variant is added to AppError, every match
expression is flagged as incomplete until updated. This is the same
exhaustiveness checking that Rust and Haskell provide, applied to
error handling.
4.1.0.2 Compositional propagation.
The ? operator propagates errors through the call
chain:
fn process_order(req: Request) -> OrderResult
with Io, Net, Fail[OrderError] =
let user = get_user(req.user_id)?
let items = validate_items(req.items)?
let payment = charge_payment(user, items)?
create_order(user, items, payment)?
Each ? invocation is syntactic sugar for pattern
matching on the result: if \(\mathsf{Ok}\), unwrap; if \(\mathsf{Err}\), propagate immediately.
The type system ensures that the propagated error type is compatible
with the function’s declared Fail effect.
4.1.0.3 Error type composition.
When a function calls subfunctions with different error types,
conversion is required. The idiomatic Japl approach uses a handle
block, which leverages the effect system to intercept and re-raise
with a different error type:
fn get_user_profile(id: UserId) -> Profile
with Io, Fail[AppError] =
let user = get_user(id)?
let prefs = handle get_preferences(id) with
| Fail(e) -> Fail.raise(DbError(show(e)))
{ user, preferences = prefs }
The handle block intercepts the Fail
effect from get_preferences and re-raises it as an
AppError, keeping the conversion visible and explicit.
For simple cases, the pipeline form
|> map_err(fn e -> DbError(show(e)))? is also
available and resembles Rust’s From trait. In both
cases, Japl’s effect system tracks
the error type as part of the function’s signature.
4.2 Process Failures: Crash and Restart
Process failures represent unexpected conditions—situations where the process’s internal state may be corrupted, where an invariant has been violated, or where a resource has been irreparably lost. The appropriate response is not to attempt recovery within the process, but to terminate the process and let a supervisor start a fresh instance.
fn critical_worker(state: State)
-> Never with Process[WorkerMsg] =
match Process.receive() with
| ProcessTask(task) ->
-- If the invariant is violated, the
-- process crashes. The supervisor
-- restarts it with fresh state.
assert valid_invariant(state)
let new_state = handle_task(state, task)
critical_worker(new_state)
| Shutdown ->
cleanup(state)
Process.exit(Normal)
Key properties of crash semantics in Japl:
4.2.0.1 Process isolation.
Processes do not share mutable state. A crash in one process cannot corrupt the memory of another. This is enforced by the language’s type system: values sent between processes must be immutable (and therefore safely shareable) or ownership must be explicitly transferred.
4.2.0.2 Typed crash reasons.
Unlike Erlang, where a process can crash with any term, Japl provides structured crash reasons:
type CrashReason =
| Normal
| AssertionFailed(String, Location)
| ResourceExhausted(String)
| InvariantViolation(String)
| Timeout
| Custom(String)
This allows supervisors to make informed restart decisions based on the crash reason’s type, not just runtime string matching.
4.2.0.3 Deterministic cleanup.
When a process crashes, all its owned resources (file handles, network connections, etc.) are deterministically released through Japl’s ownership system. The linear type system guarantees that every resource is consumed exactly once, even on the crash path.
4.3 The Complementarity Principle
The two error modes are complementary, not overlapping. They address different questions:
| Domain Errors | Process Failures | |
|---|---|---|
| Nature | Expected | Unexpected |
| Mechanism | Result +
Fail |
crash + supervision |
| Scope | Function/call chain | Process |
| Recovery | Caller handles | Supervisor restarts |
| State | Preserved | Discarded |
| Examples | Not found, invalid | Corrupted state, |
| input, timeout | hardware fault, bug |
A well-designed Japl application uses domain errors for all anticipated failure modes and reserves crashes for conditions where the process’s state can no longer be trusted.
5 The Error/Crash Boundary
The most important design decision in Japl’s error model is determining when to
use Result types and when to crash. This section
provides principled guidance.
5.1 The Decision Criterion
Definition 14 (Error/Crash Criterion). A
failure should be handled as a domain error (Result) if
and only if:
The failure is anticipated: it represents a known failure mode of the operation.
The caller can meaningfully respond: there exists a sensible action the caller can take.
The process state remains consistent: the failure does not indicate state corruption.
If any of these conditions is not met, the failure should trigger a process crash.
The following decision table summarizes the criterion for common failure scenarios:
| Failure scenario | Anticipated? | Actionable? | State OK? | Mode |
|---|---|---|---|---|
| Not-found query | Result |
|||
| Parse error | Result |
|||
| Auth rejected | Result |
|||
| Network timeout | Result |
|||
| Invalid user input | Result |
|||
| Assertion failure | \(\times\) | \(\times\) | \(\times\) | Crash |
| Corrupted state | \(\times\) | \(\times\) | \(\times\) | Crash |
| Unrecoverable resource | \(\times\) | \(\times\) | \(\times\) | Crash |
| Index out of bounds | \(\times\) | \(\times\) | \(\times\) | Crash |
| DB conn lost mid-txn | \(\sim\) | \(\sim\) | \(\times\) | Crash |
| OOM in subsystem | \(\sim\) | \(\times\) | \(\times\) | Crash |
Decision rule (informal). Ask: “Can the caller
do something sensible with this failure, and is my state still
trustworthy?” If yes, return a Result. If no, crash and
let the supervisor restore a known-good state.
Example 15 (Database query). A “not found” result from a database query satisfies all three criteria: it is anticipated (queries may not match), the caller can respond (return a 404, use a default value, etc.), and the state is consistent (the database connection is fine). This is a domain error.
A database connection loss during a transaction may violate condition (3): if the process was tracking transaction state, that state is now inconsistent with the database. The appropriate response is to crash and let the supervisor restart with a fresh connection.
5.2 The Error Kernel Pattern
Armstrong’s error kernel pattern structures a system so that a small, well-tested core manages the lifecycle of larger, more complex subsystems:
-- The error kernel: simple, well-tested,
-- rarely crashes
fn app_supervisor() -> Never
with Process[SupervisorMsg] =
Supervisor.start(
strategy = OneForOne,
max_restarts = 10,
max_seconds = 60,
children = [
{ id = "db_pool"
, start = fn -> db_pool_supervisor()
, restart = Permanent
, shutdown = Timeout(5000)
},
{ id = "web"
, start = fn -> web_supervisor()
, restart = Permanent
, shutdown = Timeout(10000)
},
]
)
-- Worker processes: may crash, will be
-- restarted by supervisors
fn request_handler(conn: TcpConn) -> Never
with Process[HttpMsg], Io =
let req = Http.parse_request(conn)
let resp = match route(req) with
| Ok(handler) ->
match handler(req) with
| Ok(resp) -> resp
| Err(e) -> error_response(e)
| Err(_) -> response(404, "not found")
Http.send_response(conn, resp)
request_handler(conn)
In this architecture:
The
app_supervisoranddb_pool_supervisorform the error kernel. They are simple, well-tested, and their only job is to supervise children.The
request_handlerprocesses do the complex work. If one crashes (due to a bug, a malformed request that bypasses validation, or a resource issue), the supervisor restarts it.Within each request handler, domain errors are handled with
Resulttypes: routing failures return 404, handler errors return appropriate HTTP status codes.
5.3 Design Principles for the Boundary
We articulate five principles for managing the error/crash boundary:
Domain errors at API boundaries. Public-facing APIs (HTTP endpoints, library functions, message handlers) should return
Resulttypes so callers can respond appropriately.Crashes for invariant violations. If a condition is reached that “should never happen” according to the system’s design, crash. An
assertthat fails indicates a bug, not a user error.Convert at boundaries. When calling a subsystem that may crash, wrap the call at the boundary: spawn a process, send the request, and handle the possible
ProcessDownmessage as a domain error.Narrow the error kernel. Keep the supervision hierarchy simple and the supervisor code minimal. Complexity belongs in worker processes, which can crash safely.
Escalate when uncertain. If a process encounters a condition it does not know how to handle, crashing is safer than continuing with potentially corrupted state. The supervisor has a broader view and can make a better recovery decision.
-- Principle 3: Converting crashes to
-- domain errors at a boundary
fn safe_compute(input: Data) -> Result[Output, ComputeError]
with Process, Io =
let worker = Process.spawn(fn ->
let result = dangerous_computation(input)
Process.send(Process.self_parent(), Done(result))
)
Process.monitor(worker)
match Process.receive_with_timeout(5000) with
| Done(result) -> Ok(result)
| ProcessDown(^worker, reason) ->
Err(ComputeFailed(show(reason)))
| _ -> Err(ComputeTimeout)
6 Effect System Integration
Japl’s effect system is central to
its error handling model. The Fail effect makes failure
a tracked, composable property of function signatures.
6.1 The Fail Effect
The Fail[E] effect indicates that a function may
fail with an error of type \(E\).
It is one of Japl’s built-in
effects:
| Effect | Meaning |
|---|---|
Pure |
No effects (identity) |
Io |
File system, console, clock |
Net |
Network access |
State[s] |
Mutable state of type \(s\) |
Process |
Process operations |
Fail[e] |
May fail with error type \(e\) |
Async |
Asynchronous operations |
The Fail[E] effect is parameterized by the error
type \(E\), enabling the compiler
to track not just that a function can fail, but
how it can fail.
6.2 Effect Composition
Effects in Japl form a commutative, idempotent monoid under union: \[\begin{align*} \mathcal{E}_1 \cup \mathcal{E}_2 &= \mathcal{E}_2 \cup \mathcal{E}_1 && \text{(commutativity)} \\ \mathcal{E}\cup \mathcal{E}&= \mathcal{E}&& \text{(idempotency)} \\ \mathcal{E}\cup \mathtt{Pure} &= \mathcal{E}&& \text{(identity)} \end{align*}\]
When a function calls subfunctions, its effect set is the union of all callee effects:
-- Effect inference: the compiler computes
-- the union of all effects in the body
fn process_request(req: Request) -> Response
with Io, Net, Fail[AppError] =
-- read_config: Io, Fail[ConfigError]
let config = read_config("/etc/app.conf")
|> map_err(to_app_error)?
-- http_fetch: Net
let data = Http.get(config.api_url)?
-- parse_response: Fail[ParseError]
let parsed = parse_response(data)
|> map_err(to_app_error)?
to_response(parsed)
The declared effects Io, Net, Fail[AppError] must be
a superset of the inferred effects. If the programmer omits a
required effect, the compiler reports an error.
6.3 Multiple Fail Effects
A function may have multiple Fail effects with
different error types. This is supported through effect row
polymorphism:
fn validate_and_store(input: RawInput)
-> StoredData
with Io, Fail[ValidationError],
Fail[StorageError] =
let validated = validate(input)?
-- raises Fail[ValidationError]
store(validated)?
-- raises Fail[StorageError]
The caller must handle (or propagate) both error types. This is more precise than a single unified error type, enabling callers to handle different failure modes differently:
fn handle_input(input: RawInput)
-> Response with Io =
match Fail.catch(fn ->
Fail.catch(fn ->
validate_and_store(input)
)
) with
| Ok(Ok(data)) -> json(200, data)
| Ok(Err(storage_err)) ->
log_error(storage_err)
response(503, "storage unavailable")
| Err(validation_err) ->
response(400, show(validation_err))
Alternatively, the caller can unify the error types:
type InputError =
| Validation(ValidationError)
| Storage(StorageError)
fn handle_input_unified(input: RawInput)
-> Response with Io =
let result = Fail.catch(fn ->
validate_and_store(input)
|> map_fail(Validation)
|> map_fail(Storage)
)
match result with
| Ok(data) -> json(200, data)
| Err(Validation(e)) -> response(400, show(e))
| Err(Storage(e)) -> response(503, show(e))
6.4 Effect Handlers as Error Boundaries
The Fail.catch handler serves as an error boundary,
converting an effectful computation into a pure Result
value. This is the primary mechanism for discharging the
Fail effect:
-- Type: Fail.catch :
-- (fn() -> a with Fail[e]) -> Result[a, e]
fn main() -> Unit with Io =
let result = Fail.catch(fn ->
complex_fallible_computation()
)
match result with
| Ok(value) ->
Io.println("Success: " ++ show(value))
| Err(e) ->
Io.println("Error: " ++ show(e))
The handler eliminates the Fail effect from the
computation’s effect set, converting it into a Result
value in a context with strictly fewer effects. This gives precise
control over where in the call chain errors are handled.
6.5 Compiler Tracking
The Fail effect enables several static analyses:
Unhandled errors: If a function’s effect set includes
Fail[E]but no handler discharges it, the compiler requires the caller to either handle or propagate the effect.Dead error branches: If an error variant is never constructed by any reachable code path, the compiler can warn that a match branch is unreachable.
Error documentation: The effect signature serves as machine-checked documentation of all failure modes, visible in IDEs, generated documentation, and type signatures.
Error compatibility: When composing modules, the compiler checks that error types are compatible across module boundaries.
7 Supervision as Error Recovery
While domain errors are handled within function call chains, process failures are handled by the supervision hierarchy. Supervision is Japl’s mechanism for systematic recovery from unexpected failures.
7.1 Supervision Trees
A supervision tree is a hierarchical organization of processes where each non-leaf node is a supervisor that monitors and manages its children:
fn start_application() -> Pid[SupervisorMsg]
with Process =
Supervisor.start(
strategy = OneForOne,
max_restarts = 5,
max_seconds = 60,
children = [
{ id = "db_pool"
, start = fn -> DbPool.start(db_config)
, restart = Permanent
, shutdown = Timeout(5000)
},
{ id = "cache"
, start = fn -> Cache.start(cache_config)
, restart = Permanent
, shutdown = Timeout(3000)
},
{ id = "web_server"
, start = fn -> WebServer.start(web_config)
, restart = Permanent
, shutdown = Timeout(10000)
},
]
)
7.2 Restart Strategies
Japl provides three restart strategies, following Erlang/OTP:
7.2.0.1 OneForOne.
Only the failed child is restarted. This is appropriate when children are independent.
7.2.0.2 AllForOne.
All children are restarted when any one fails. This is appropriate when children are interdependent and their states must be consistent.
7.2.0.3 RestForOne.
The failed child and all children started after it are restarted. This is appropriate when later children depend on earlier ones.
7.3 Typed Crash Reasons
Japl’s typed crash reasons enable supervisors to make informed decisions:
fn custom_supervisor(children: List[ChildSpec])
-> Never with Process[SupervisorMsg] =
match Process.receive() with
| ChildCrashed(child_id, reason) ->
match reason with
| Normal ->
-- Normal exit, do not restart
custom_supervisor(
remove_child(children, child_id))
| AssertionFailed(_, _) ->
-- Bug detected, restart with
-- fresh state
restart_child(children, child_id)
custom_supervisor(children)
| ResourceExhausted(_) ->
-- Resource issue, wait before
-- restarting
Process.sleep(1000)
restart_child(children, child_id)
custom_supervisor(children)
| InvariantViolation(_) ->
-- Serious issue, escalate
log_critical(child_id, reason)
Process.exit(reason)
This is a significant improvement over Erlang, where crash reasons are untyped terms. The compiler ensures that the supervisor handles all crash reason variants.
7.4 Restart Limits
Supervisors enforce restart limits to prevent infinite restart loops:
-- If more than max_restarts occur within
-- max_seconds, the supervisor itself crashes,
-- escalating to its parent supervisor.
Supervisor.start(
strategy = OneForOne,
max_restarts = 5,
max_seconds = 60,
children = [...]
)
This creates a natural escalation mechanism: if a subsystem cannot recover by restarting, the failure propagates upward through the supervision tree until a higher-level supervisor can respond (perhaps by restarting the entire subsystem with different configuration, or by shutting down gracefully).
7.5 Child Specifications
Each child process is described by a specification that governs its lifecycle:
| Restart type | Meaning |
|---|---|
Permanent |
Always restarted |
Transient |
Restarted only on abnormal exit |
Temporary |
Never restarted |
| Shutdown type | Meaning |
|---|---|
Timeout(ms) |
Send shutdown signal, wait up to \(ms\) |
Brutal |
Terminate immediately |
The combination of restart type, shutdown type, and crash reason gives supervisors fine-grained control over the recovery process.
7.6 Supervision and the Effect System
Supervision interacts with the effect system through the
Process effect. Functions that spawn, monitor, or
communicate with processes must declare the Process
effect. This makes the presence of concurrent, potentially-failing
subcomputations visible in the type signature:
-- The Process effect signals that this
-- function participates in the process model
fn start_worker_pool(config: PoolConfig)
-> WorkerPool
with Process, Io =
let sup = Supervisor.start(
strategy = OneForOne,
max_restarts = config.max_restarts,
max_seconds = config.max_seconds,
children = List.map(
List.range(1, config.pool_size),
fn i -> {
id = "worker-" ++ Int.to_string(i),
start = fn -> worker(config.worker_config),
restart = Permanent,
shutdown = Timeout(5000),
}
)
)
{ supervisor = sup, config }
8 Comparison with Existing Approaches
We now compare Japl’s dual error model with the error handling approaches of five major languages.
8.1 Rust: Result Without Supervision
Rust’s Result<T, E> is the direct inspiration
for Japl’s domain error handling. The
comparison reveals what Rust lacks:
| Feature | Rust | Japl |
|---|---|---|
| Typed recoverable errors | ||
? propagation |
||
| Exhaustive matching | ||
| Effect tracking for errors | \(\times\) | |
| Lightweight processes | \(\times\) | |
| Supervision trees | \(\times\) | |
| Typed crash reasons | \(\times\) | |
| Crash \(\to\) restart recovery | \(\times\) |
Rust’s panic! unwinds the thread stack and either
aborts the program or is caught by
std::panic::catch_unwind, which is explicitly
documented as not a general-purpose error recovery mechanism. There
is no equivalent of supervision—a panicked thread is simply
gone.
In Japl, the equivalent of Rust’s
Result handles the same domain errors with equal type
safety. The difference is that Japl
also provides a systematic answer for Rust’s panic!
cases: crash the process, let the supervisor restart it.
8.2 Go: Error Values Without Structure
Go’s approach is characterized by simplicity and explicitness, but lacks algebraic structure:
| Feature | Go | Japl |
|---|---|---|
| Errors as values | ||
| Sum types for errors | \(\times\) | |
| Exhaustive matching | \(\times\) | |
| Propagation operator | \(\times\) | |
| Effect tracking | \(\times\) | |
| Error type composition | \(\times\) | |
| Process supervision | \(\times\) |
Go’s error handling suffers from two key weaknesses that Japl addresses: (1) the
if err != nil boilerplate, which Japl eliminates with the ?
operator, and (2) the inability to express error types precisely,
which Japl solves with algebraic data
types and the Fail effect.
Go does have goroutines, which are lightweight like
Japl processes, but Go provides no
built-in supervision mechanism. A crashed goroutine is simply lost,
and any panic in a goroutine that is not recovered will
crash the entire program.
8.3 Erlang: Crash Semantics Without Typed Errors
Erlang is the inspiration for Japl’s process model and supervision. The comparison highlights what static typing adds:
| Feature | Erlang | Japl |
|---|---|---|
| Lightweight processes | ||
| Process isolation | ||
| Supervision trees | ||
| “Let it crash” | ||
| Typed error values | \(\times\) | |
| Exhaustive error matching | \(\times\) | |
| Typed crash reasons | \(\times\) | |
| Compile-time error tracking | \(\times\) |
Erlang’s dynamic typing means that error handling relies on
runtime conventions. Common patterns like {ok, Value}
and {error, Reason} are not enforced by the type
system. Dialyzer provides some static
analysis but cannot match the guarantees of Japl’s type system.
Japl preserves Erlang’s supervision model while adding compile-time guarantees about error types, crash reasons, and error handling completeness.
8.4 Java: Checked Exceptions—Right Idea, Wrong Mechanism
Java’s checked exceptions share Japl’s goal of making failure visible in function signatures. The execution was flawed:
| Feature | Java | Japl |
|---|---|---|
| Failure in type signature | ||
| Non-local control flow | \(\times\)* | |
| Composable error types | \(\times\) | |
| No hidden exceptions | \(\times\) | |
| Works with generics | \(\times\) | |
| Works with lambdas | \(\times\) | |
| Process supervision | \(\times\) |
*Japl’s
Fail effect uses early return, not stack unwinding.
Java’s checked exceptions fail with higher-order functions
because Java’s type system cannot express “this lambda may throw
IOException.” The result is pervasive
try-catch blocks inside lambdas, or the use of
unchecked (runtime) exceptions that bypass the checking system
entirely. Japl’s effect system
integrates naturally with higher-order functions and generics: a
function parameter can carry a Fail effect, and the
compiler propagates it correctly.
8.5 Haskell: Pure but Complex
Haskell’s approach is the most theoretically principled but also the most complex:
| Feature | Haskell | Japl |
|---|---|---|
| Typed error values | ||
| Monadic composition | ||
| Effect tracking | Partial* | |
| Simple error propagation | \(\times\) | |
| No hidden IO exceptions | \(\times\) | |
| Process supervision | \(\times\)\(\dagger\) | |
| Reduced syntactic ceremony | \(\times\) |
*Haskell’s IO type bundles all effects,
including exceptions, into a single opaque monad. Libraries like
polysemy and
fused-effects provide finer-grained tracking but
are not part of the standard language.
\(\dagger\)Cloud Haskell provides Erlang-style distribution and supervision, but it is a library, not a language feature.
Haskell’s primary weakness for error handling is the coexistence
of multiple, incompatible error mechanisms: Either,
Maybe, ExceptT, IO exceptions
(throwIO/catch), pure exceptions
(error/undefined), and asynchronous
exceptions. A Haskell programmer must understand all of these and
choose appropriately. Japl provides
exactly two mechanisms—Result and crash—with clear
guidance on when to use each.
8.6 Summary
| Rust | \(\times\) | \(\times\) | \(\times\) | ||
| Go | \(\sim\) | \(\times\) | \(\times\) | \(\times\) | \(\times\) |
| Erlang | \(\times\) | \(\times\) | \(\times\) | \(\times\) | |
| Java | \(\times\) | \(\sim\) | \(\times\) | \(\times\) | |
| Haskell | \(\sim\) | \(\times\) | \(\times\) | ||
| Japl |
Japl is the only language that achieves all five properties: typed errors, exhaustive matching, effect-tracked failures, supervision trees, and typed crash reasons.
9 Implementation
Implementing the dual error model requires careful runtime design. This section discusses the key implementation considerations.
9.1 Domain Error Implementation
Domain errors (Result types) are implemented as
tagged unions in memory. A Result[A, E] value occupies
\(1 + \max(|A|, |E|)\) words: one
tag word plus space for the larger of the two payloads. This is the
standard algebraic data type representation, identical to Rust’s
approach.
The ? operator compiles to a conditional branch:
-- Source
let x = fallible_call()?
-- Desugared (conceptual)
let x = match fallible_call() with
| Ok(v) -> v
| Err(e) -> return Err(e)
This is a local transformation with no runtime overhead beyond
the branch. There is no stack unwinding, no exception table lookup,
and no dynamic dispatch. The cost of error propagation is exactly
one conditional branch per ? invocation.
9.2 Process Failure Implementation
Process crashes are implemented through a combination of mechanisms:
9.2.0.1 Assertion failures and panics.
When an assert fails or a panic is
invoked, the runtime:
Captures the crash reason (a structured
CrashReasonvalue).Unwinds the process’s call stack, running cleanup for any owned resources (similar to C++ RAII or Rust drop).
Marks the process as terminated with the given crash reason.
Notifies all monitors and linked processes.
9.2.0.2 Stack unwinding vs. process termination.
Unlike C++ or Rust, where stack unwinding is the mechanism for both resource cleanup and non-local control flow, Japl uses unwinding only for resource cleanup on the crash path. Domain errors never unwind the stack; they use normal return-value propagation. This simplifies the runtime: the unwinding mechanism need only handle the crash case, and it always terminates the process.
9.2.0.3 Resource cleanup.
Japl’s ownership system ensures that resources are cleaned up deterministically:
On the normal path: resources are consumed by explicit calls (
File.close,Buffer.freeze).On the crash path: the unwinder invokes drop handlers for all live owned resources on the stack.
The linear type system guarantees that every resource has exactly one cleanup path, whether normal or crash.
9.3 Supervision Implementation
Supervisors are implemented as ordinary Japl processes with special runtime support:
Monitoring: The runtime maintains a table of monitor relationships. When a process terminates, the runtime sends a
ProcessDownmessage to all monitors.Restart logic: The supervisor process receives
ProcessDownmessages and applies the configured restart strategy. This is ordinary message-handling code, not special runtime machinery.Restart limits: The supervisor tracks restart timestamps in a circular buffer. If the number of restarts in the configured time window exceeds the limit, the supervisor itself crashes, escalating to its parent.
9.4 Performance Characteristics
| Operation | Cost | Notes |
|---|---|---|
Result creation |
\(O(1)\) | Tag + payload |
? propagation |
\(O(1)\) | Conditional branch |
match on error |
\(O(1)\) | Tag comparison |
| Process crash | \(O(s)\) | \(s\) = stack depth |
| Resource cleanup | \(O(r)\) | \(r\) = owned resources |
| Supervisor restart | \(O(1)\) | Message send |
| Monitor notification | \(O(m)\) | \(m\) = monitor count |
The key performance insight is that domain errors (the common case) are zero-overhead beyond the branch prediction cost. Process crashes (the uncommon case) have a cost proportional to the amount of cleanup required, which is acceptable for an exceptional event. Supervisor restarts involve only message passing and process creation, both of which are designed to be lightweight.
9.5 Comparison with Exception-Based Implementations
Traditional exception implementations use either table-based or setjmp/longjmp-based unwinding:
Table-based (C++, Java): Zero cost on the non-exception path, but exceptions are extremely expensive (hundreds of microseconds) due to table lookup and stack unwinding.
setjmp/longjmp (some C): Moderate cost on both paths due to register saving and restoration.
Japl’s approach avoids both of
these costs for domain errors. The ? operator is a
conditional branch, costing nanoseconds. Only process crashes incur
unwinding cost, and since crash recovery creates a fresh process,
the total recovery time is dominated by process creation, not
unwinding.
10 Case Studies
We present three case studies demonstrating Japl’s dual error model in realistic scenarios.
10.1 Case Study 1: HTTP Server
An HTTP server encounters many failure modes: malformed requests, missing resources, authentication failures, database errors, and network issues. The dual model cleanly separates these:
type HttpError =
| BadRequest(String)
| NotFound
| Unauthorized
| Forbidden
| InternalError(String)
type AppError =
| Http(HttpError)
| Db(DbError)
| Validation(List[FieldError])
fn handle_get_user(req: Request) -> Response
with Io =
match parse_user_id(req) with
| Err(_) -> response(400, "bad user id")
| Ok(id) ->
match Fail.catch(fn -> get_user(id)) with
| Ok(user) -> json(200, user)
| Err(NotFound) ->
response(404, "not found")
| Err(Unauthorized) ->
response(401, "unauthorized")
| Err(InvalidInput(msg)) ->
response(400, msg)
| Err(DbError(msg)) ->
log_error("db error: " ++ msg)
response(500, "internal error")
fn get_user(id: UserId) -> User
with Io, Fail[AppError] =
let row = db_find_user(id)?
decode_user(row)?
Each HTTP request is handled in its own process. If a handler process crashes (due to a bug, not a domain error), the supervisor restarts the request-handling infrastructure:
fn http_listener(port: Int) -> Never
with Process, Io =
let listener = Tcp.listen(port)
accept_loop(listener)
fn accept_loop(listener: TcpListener) -> Never
with Process, Io =
let conn = Tcp.accept(listener)
-- Each connection gets its own process
let _ = Process.spawn(fn ->
handle_connection(conn)
)
accept_loop(listener)
fn handle_connection(conn: TcpConn) -> Unit
with Io =
let req = Http.parse_request(conn)
let resp = route_and_handle(req)
Http.send_response(conn, resp)
Tcp.close(conn)
If handle_connection crashes, only that one
connection is affected. The listener process continues accepting new
connections. If the listener itself crashes, its supervisor restarts
it.
10.2 Case Study 2: Database Connection Pool
A database connection pool must handle connection failures gracefully while providing a reliable interface to consumers:
type PoolError =
| NoAvailableConnection
| QueryFailed(String)
| ConnectionLost
type PoolMsg =
| Checkout(Reply[DbConn])
| Return(DbConn)
| HealthCheck
fn pool_manager(state: PoolState) -> Never
with Process[PoolMsg], Io =
match Process.receive() with
| Checkout(reply) ->
match find_available(state) with
| Some(conn) ->
Reply.send(reply, Ok(conn))
pool_manager(mark_in_use(state, conn))
| None ->
Reply.send(reply, Err(NoAvailableConnection))
pool_manager(state)
| Return(conn) ->
match validate_connection(conn) with
| Ok(_) ->
pool_manager(mark_available(state, conn))
| Err(_) ->
-- Connection is broken, create new one
let new_conn = Db.connect(state.config)
pool_manager(mark_available(state, new_conn))
| HealthCheck ->
let state = check_all_connections(state)
pool_manager(state)
The pool manager handles connection validation failures as domain errors (creating replacement connections). If the pool manager itself encounters an invariant violation (e.g., the pool state becomes inconsistent), it crashes and the supervisor restarts it:
fn db_pool_supervisor(config: DbConfig)
-> Never with Process =
Supervisor.start(
strategy = OneForOne,
max_restarts = 3,
max_seconds = 30,
children = [
{ id = "pool_manager"
, start = fn ->
let conns = initialize_connections(config)
pool_manager(initial_state(conns))
, restart = Permanent
, shutdown = Timeout(5000)
},
{ id = "health_checker"
, start = fn -> health_check_loop(config)
, restart = Permanent
, shutdown = Timeout(1000)
},
]
)
10.3 Case Study 3: Distributed Task Worker
A distributed task worker receives tasks from a coordinator, executes them, and reports results. It must handle task failures, network partitions, and coordinator unavailability:
type TaskError =
| InvalidTask(String)
| ExecutionFailed(String)
| DependencyMissing(String)
type WorkerMsg =
| ExecuteTask(Task, Reply[TaskResult])
| CancelTask(TaskId)
| UpdateConfig(WorkerConfig)
fn task_worker(state: WorkerState) -> Never
with Process[WorkerMsg], Io, Net =
match Process.receive() with
| ExecuteTask(task, reply) ->
let result = Fail.catch(fn ->
validate_task(task)?
fetch_dependencies(task)?
execute(task)?
)
match result with
| Ok(output) ->
Reply.send(reply, TaskSuccess(output))
| Err(InvalidTask(msg)) ->
Reply.send(reply, TaskRejected(msg))
| Err(ExecutionFailed(msg)) ->
Reply.send(reply, TaskFailed(msg))
| Err(DependencyMissing(dep)) ->
Reply.send(reply, TaskPending(dep))
task_worker(state)
| CancelTask(id) ->
let state = cancel_if_running(state, id)
task_worker(state)
| UpdateConfig(config) ->
task_worker({ state | config })
The worker uses Result types for task-level failures
(which are reported to the coordinator) and relies on supervision
for worker-level failures (which trigger a restart):
fn worker_supervisor(config: WorkerConfig)
-> Never with Process =
Supervisor.start(
strategy = OneForOne,
max_restarts = 10,
max_seconds = 60,
children = List.map(
List.range(1, config.worker_count),
fn i -> {
id = "worker-" ++ Int.to_string(i),
start = fn ->
let state = WorkerState.init(config)
task_worker(state),
restart = Permanent,
shutdown = Timeout(10000),
}
)
)
-- Coordinator connection with reconnection
fn coordinator_link(url: String) -> Never
with Process, Io, Net =
match Fail.catch(fn -> connect_coordinator(url)) with
| Ok(conn) ->
register_with_coordinator(conn)
message_loop(conn)
| Err(e) ->
log_warning("coordinator connection "
++ "failed: " ++ show(e))
Process.sleep(5000)
coordinator_link(url)
In this design:
Task validation, dependency resolution, and execution failures are domain errors, reported back to the coordinator via typed
TaskResultmessages.Worker crashes (due to bugs or resource exhaustion) are handled by the supervisor, which restarts the worker with fresh state.
Coordinator connectivity issues are handled by the
coordinator_linkprocess, which retries connection. If it fails too many times, it crashes and escalates to its supervisor.
11 Formal Properties
We now state and prove three key properties of Japl’s dual error model.
11.1 Error Safety Theorem
Theorem 16 (Error Safety). In a well-typed Japl program, every domain error is either:
Explicitly handled (matched and acted upon), or
Explicitly propagated (via
?or effect signature).
No domain error can be silently discarded.
Proof. The proof proceeds by structural induction on the typing derivation.
Base case: A function that calls a fallible
operation (one with a Fail[E] effect) must either:
Use
?to propagate the error, which requires the calling function’s effect signature to includeFail[E](or a supertype).Use
Fail.catchto handle the error, which eliminates theFail[E]effect and produces aResult[A, E]value.Use
matchon theResultvalue, which requires exhaustive coverage of bothOkandErrvariants.
In all three cases, the error is either propagated or handled.
The type system prevents ignoring a Result value: the
Result type is not Unit, so assigning it
to an unused binding triggers a compiler warning, and the
Fail[E] effect prevents calling the function in a
context that does not expect failure.
Inductive step: If function \(f\) calls function \(g\) with effect Fail[E],
then \(f\)’s effect signature must
include Fail[E] (propagation) or \(f\) must handle the effect (handling).
By the inductive hypothesis, \(g\)’s errors are properly handled or
propagated. The composition preserves the property.
Boundary condition: At the top level
(main or a process entry point), if
Fail[E] remains in the effect set, the compiler reports
an error: unhandled failure effect. The programmer must install a
handler before the program is accepted.
Therefore, in any well-typed program, every domain error is either handled or propagated, with no possibility of silent discard. 0◻ ◻
11.2 Crash Containment Theorem
Theorem 17 (Crash Containment). If process \(P_i\) crashes in a system of processes \(P_1 \parallel P_2 \parallel \cdots \parallel P_n\), the state of every other process \(P_j\) (\(j \neq i\)) is unaffected by the crash.
Proof. The proof relies on three properties enforced by Japl’s type system and runtime:
Property 1: Memory isolation. Processes do not share mutable state. Values passed between processes are either:
Immutable values on the shared heap (which cannot be modified by any process), or
Linearly-typed resources whose ownership is transferred (the sending process loses access).
By construction, no process can hold a mutable reference to memory owned by another process.
Property 2: Communication isolation. Processes communicate exclusively through typed message passing. Messages are values (immutable or ownership-transferred). The runtime’s message delivery mechanism copies immutable values and transfers ownership atomically.
Property 3: Crash scope. When \(P_i\) crashes:
The runtime terminates \(P_i\)’s execution.
All resources owned by \(P_i\) are released (deterministic cleanup).
Monitor notifications are sent as messages to monitoring processes.
No write is performed to any memory accessible by \(P_j\).
Since \(P_j\)’s state consists
of its local stack, its owned resources, and its mailbox, and since
the crash of \(P_i\) can only add a
ProcessDown message to \(P_j\)’s mailbox (if \(P_j\) monitors \(P_i\)), the operational state of \(P_j\) is unaffected.
The only observable effect of \(P_i\)’s crash on \(P_j\) is the arrival of a
ProcessDown message, which is a normal message that
\(P_j\) processes through its
regular message-handling logic. 0◻ ◻
11.3 Supervision Liveness Theorem
Theorem 18 (Supervision Liveness). For a
process \(P\) with restart type
Permanent supervised by a supervisor \(S\) with restart limit \((m, t)\) (at most \(m\) restarts in \(t\) seconds): if \(P\) crashes and fewer than \(m\) restarts have occurred in the last
\(t\) seconds, then \(P\) is restarted within a bounded
time.
Proof. We model the supervisor as a process \(S\) that repeatedly performs:
Wait for a
ProcessDownmessage (blocking receive).Check the restart counter against the limit.
If within limits, spawn a new instance of the child process.
If limits exceeded, crash (escalate to parent supervisor).
Crash detection: The runtime guarantees that
when \(P\) terminates, a
ProcessDown message is enqueued in \(S\)’s mailbox. This follows from the
completeness property of Japl’s
monitoring mechanism, which is implemented in the runtime
scheduler.
Message delivery: The runtime guarantees that
messages in a process’s mailbox are eventually delivered (processes
are fairly scheduled). Therefore, \(S\) will eventually receive the
ProcessDown message.
Restart execution: Upon receiving the message and determining that restart limits are not exceeded, \(S\) invokes the child’s start function, creating a new process. Process creation is an \(O(1)\) runtime operation (allocate a process structure, add to the scheduler’s run queue).
Bounded time: The total time from crash to restart is bounded by: \[T_{\mathit{restart}} \leq T_{\mathit{detect}} + T_{\mathit{schedule}} + T_{\mathit{start}}\] where \(T_{\mathit{detect}}\) is the monitor notification time (one scheduler tick), \(T_{\mathit{schedule}}\) is the time until the supervisor is scheduled (bounded by the scheduler’s fairness guarantee), and \(T_{\mathit{start}}\) is the child creation time (bounded by process allocation cost).
Under the assumption that the scheduler provides fair scheduling with bounded delay, \(T_{\mathit{restart}}\) is bounded.
Caveat (\(T_{\mathit{schedule}}\) under load). The bound on \(T_{\mathit{schedule}}\) depends on the scheduler’s fairness guarantee, which is sensitive to system load. Under heavy load, the number of runnable processes may cause \(T_{\mathit{schedule}}\) to grow, becoming effectively non-deterministic from the application’s perspective. In the worst case, a slow supervisor response can itself trigger cascading timeouts: if a child’s clients time out while waiting for the restart, those clients may themselves fail, causing their supervisors to restart them, amplifying the load. Japl’s runtime mitigates this by giving supervisor processes elevated scheduling priority (supervisors are scheduled before workers), but the liveness bound should be understood as holding under a well-provisioned scheduler, not as an absolute real-time guarantee. Deployers should account for scheduler latency when choosing restart-limit windows. 0◻ ◻
Corollary 19 (Escalation termination). The escalation chain (child \(\to\) supervisor \(\to\) parent supervisor \(\to \cdots\)) terminates: either a supervisor successfully restarts the failed subtree, or the escalation reaches the root supervisor, which terminates the application.
Proof. The supervision tree has finite depth \(d\). At each level, a supervisor either restarts (terminating the escalation) or exceeds its restart limit and crashes (escalating). Since the tree has finite depth, escalation terminates in at most \(d\) steps. ◻
11.4 Composability of Error Safety
Proposition 20 (Compositional error safety). If modules \(M_1\) and \(M_2\) independently satisfy error safety, then their composition \(M_1 \circ M_2\) (where \(M_1\) calls functions from \(M_2\)) also satisfies error safety.
Proof. By the effect system’s composition rules: if
\(M_2\) exports a function \(f\) with effect Fail[E],
then any function in \(M_1\) that
calls \(f\) must either handle
Fail[E] or include it in its own effect signature. The
type checker enforces this at the module boundary. Since both
modules independently satisfy error safety (all their internal error
handling is complete), the composition inherits this property.
0◻ ◻
12 Discussion
12.1 When the Model Breaks Down
No error model is perfect. We acknowledge several limitations:
12.1.0.1 FFI boundaries and asynchronous exceptions.
When Japl calls foreign functions
via FFI, the type system cannot track errors in the foreign code.
Foreign functions may throw C++ exceptions, segfault, or corrupt
memory in ways that violate process isolation. Japl mitigates this through
capability-gated FFI (requiring the Unsafe capability)
and by running FFI calls in dedicated processes that can crash
without affecting the rest of the system.
A subtle hazard arises when a supervisor decides to kill a child
process that is currently executing inside an FFI call. The foreign
code may hold external resources (file locks, GPU memory, database
transactions) that the Japl runtime
cannot release through its normal ownership-based cleanup. To
address this, Japl enforces the
following protocol at the unsafe FFI boundary: (1) the runtime marks
the process as “in-FFI” for the duration of the foreign call, during
which asynchronous kill signals are deferred rather than
delivered immediately; (2) when the foreign call returns, the
runtime checks for a pending kill signal and, if present, invokes a
user-supplied ffi_cleanup callback before crashing the
process; (3) if the foreign call does not return within a
configurable timeout, the runtime forcibly terminates the process
and logs a resource-leak warning—the Crash Containment theorem (17) holds for Japl-managed resources but not for
external resources held across a stuck FFI boundary. Programmers
must therefore treat long-running or resource-holding FFI calls as
inherently unsafe and design their cleanup callbacks
accordingly.
12.1.0.2 Divergence (non-termination).
The dual error model addresses failures that produce a value
(Result) and failures that terminate a process (crash),
but a third failure mode exists: divergence, where a
computation enters an infinite loop and produces neither a result
nor a crash. A divergent process bypasses both the Fail
effect and the crash mechanism, silently consuming resources while
its supervisor receives no ProcessDown notification.
Japl addresses this through two
BEAM-inspired mechanisms. First, each process has a configurable
watchdog timer: if a process does not yield (via message
receive, explicit yield, or function return) within a configurable
time bound, the runtime forces a crash with reason
Timeout. Second, the runtime employs reduction
counting: each function call and loop iteration consumes a
“reduction credit,” and when the credit is exhausted the process is
preempted, giving the scheduler an opportunity to detect runaway
computation. Together, these mechanisms convert divergence into a
detectable crash, bringing it within the scope of supervision.
However, divergence detection is inherently best-effort—a process
that performs useful work slowly is indistinguishable from one that
is divergent—and the watchdog timeout must be tuned to the
application’s expected latency profile.
12.1.0.3 Resource exhaustion.
System-level resource exhaustion (out of memory, file descriptor limits) can affect all processes simultaneously, potentially violating crash containment. Japl’s runtime attempts to reserve emergency memory for crash handling, but this is a best-effort mechanism.
12.1.0.4 Error type proliferation.
In large systems, the number of distinct error types can grow,
leading to complex error type hierarchies. Japl mitigates this through error type
composition (wrapping sub-errors in higher-level types) and the
map_err pattern, but discipline is still required.
12.2 Relationship to Algebraic Effects Research
Japl’s Fail effect is
a specific instance of the general algebraic effects framework . The
effect handlers for Fail are relatively simple (the
handler either returns the error or discards the continuation),
which means that Japl does not
require the full generality of effect handlers with resumable
continuations.
This restriction is deliberate: the full algebraic effects model with multi-shot continuations introduces complexity (both conceptual and in implementation) that is not justified by the error handling use case. Japl uses one-shot continuations for error handling, which can be implemented without heap-allocated continuation frames.
Future work may extend Japl’s
effect system with more general algebraic effects for other use
cases (generators, async/await, transactional memory), using the
Fail effect as a proving ground.
12.3 The Category-Theoretic Perspective
The Kleisli category perspective provides more than just a formal framework; it suggests design principles. In \(\mathbf{Kl}(T_E)\):
Composition of fallible functions is automatically associative.
Error propagation is a natural consequence of the monad structure.
The relationship between pure functions (in \(\mathbf{Set}\)) and fallible functions (in \(\mathbf{Kl}(T_E)\)) is captured by the adjunction between the free and forgetful functors.
The process algebra layer adds a second category: the category of processes and communication channels. The boundary between the two categories (domain errors and process failures) is not a functor—there is no structure-preserving map from process crashes to domain errors—which reinforces the design decision to keep them separate.
12.4 Educational Considerations
One of Japl’s goals is accessibility. The dual error model is designed to be explainable without category theory:
Domain errors: “Sometimes things don’t work out. Your function says so in its return type, and you have to deal with it.” This is familiar to Rust and Go programmers.
Process crashes: “Sometimes things go really wrong. Your process dies, and a supervisor starts a fresh one.” This is familiar to Erlang programmers and to anyone who has restarted a crashed application.
The effect system adds the compiler-checked documentation aspect: “The type signature tells you exactly what can go wrong and how.” This is the lesson of Java’s checked exceptions, done correctly.
13 Related Work: Extended Discussion
13.1 Algebraic Effects in Practice
Koka
demonstrates that algebraic effects can be practical in a
general-purpose language. Koka’s effect system tracks all
computational effects including exceptions, divergence, and IO.
Japl’s effect system is inspired by
Koka’s but differs in two important ways: (1) Japl adds process-level effects
(Process), and (2) Japl’s Fail effect is
specialized for error handling rather than being a general exception
effect.
The Eff programming language provides a direct implementation of algebraic effects and handlers with a clean operational semantics. Frank explores effect typing with a focus on multi-handler patterns. Helium and Effekt explore efficient implementations of effect handlers.
13.2 Process Calculi and Typed Communication
The \(\pi\)-calculus provides the foundation for reasoning about mobile processes and name-passing. Session types extend this with typed communication protocols, ensuring that processes follow agreed-upon message exchange patterns. Japl’s typed mailboxes can be viewed as a simplified form of session typing where the session protocol is determined by the mailbox type.
Singularity OS explored software-isolated processes with typed channels in a systems programming context, demonstrating that process isolation can be enforced by the type system rather than hardware memory protection.
13.3 Error Handling in Distributed Systems
The eight fallacies of distributed computing highlight that network failures are fundamentally different from local failures. Waldo et al. argue that distributed computing cannot be made transparent—the programmer must be aware of distribution boundaries.
Japl takes this critique
seriously: the Net effect explicitly marks functions
that involve network communication, and network failures are domain
errors (represented as Result types) rather than
crashes, since they are anticipated and recoverable. However, Japl allows process IDs to be
location-transparent, accepting Waldo’s critique for failure
handling while maintaining uniformity for the communication
model.
13.4 Gradual Typing and Error Handling
Gradual typing offers a middle ground between
static and dynamic typing. Similarly, one could imagine a “gradual
error handling” approach where error types are optionally specified.
Japl deliberately does not take this
path: the value of the Fail effect comes precisely from
its completeness. A “gradual” Fail effect that allows
untracked errors would undermine the error safety theorem.
14 Future Work
Several directions for future research emerge from this work:
14.0.0.1 Automatic error type inference across module boundaries.
Currently, Japl requires explicit error type annotations at module boundaries. Research into inferring error types across modules while maintaining modular compilation could reduce annotation burden.
14.0.0.2 Supervision strategy verification.
Given typed crash reasons and supervision strategies, it may be possible to statically verify that a supervision tree handles all crash scenarios appropriately—that no crash reason goes unhandled and no restart strategy is inappropriately permissive.
14.0.0.3 Performance-guided error mode selection.
Profiling data could guide the compiler in optimizing error
handling paths—for example, converting Result checks to
branch-free code when profiling shows the error path is extremely
rare.
14.0.0.4 Cross-node supervision.
Extending the supervision model to distributed settings, where supervisor and child may be on different nodes, introduces new challenges around failure detection timing and network partition handling.
14.0.0.5 Error type evolution.
As systems evolve, error types change. Research into backward-compatible error type evolution (adding variants without breaking existing handlers) could leverage Japl’s row-polymorphic type system.
14.0.0.6 Integration with property-based testing.
Japl’s built-in property testing
could be extended to automatically generate test cases that exercise
error paths, using the Fail effect annotations to guide
test generation.
15 Conclusion
Error handling is not a peripheral concern—it is central to the design of reliable software. The history of programming languages shows that no single error handling mechanism is sufficient. Exceptions conflate control flow with failure signaling. Error codes lack algebraic structure. Result types handle domain errors well but provide no answer for unexpected failures. Crash-and-restart semantics handle unexpected failures well but lack compile-time error typing.
Japl’s dual error model synthesizes the best insights from each tradition:
From Rust: typed result values for recoverable errors, the
?propagation operator, exhaustive pattern matching on error types.From Erlang: process isolation, crash-and-restart semantics, supervision trees, the error kernel pattern.
From algebraic effects: the
Faileffect for compiler-tracked, composable failure handling.From Java (corrected): the insight that failure should be visible in type signatures, implemented through effects rather than checked exceptions.
The key contribution is the principled boundary between domain errors and process failures. Domain errors are expected, handled as values, tracked by the effect system, and resolved within the call chain. Process failures are unexpected, handled by supervision, and resolved by restarting with fresh state. The two modes are complementary, not competing.
We have shown that this model satisfies three important formal properties: error safety (domain errors cannot be silently discarded), crash containment (a process crash cannot corrupt another process), and supervision liveness (crashed permanent processes are eventually restarted). These properties hold compositionally, enabling modular reasoning about error handling in large systems.
Through case studies of an HTTP server, a database connection pool, and a distributed task worker, we have demonstrated that the dual model handles real-world error scenarios with minimal ceremony and maximal type safety. The result is a language that makes failure a first-class, well-typed, well-understood aspect of program design—because failures are normal, and they should be typed.
40
D. Abrahams. Exception safety in generic components. In Generic Programming, volume 1766 of LNCS, pages 69–79. Springer, 2000.
J. Armstrong. Making Reliable Distributed Systems in the Presence of Software Errors. PhD thesis, Royal Institute of Technology, Stockholm, 2003.
J. Armstrong. A history of Erlang. In Proceedings of the Third ACM SIGPLAN Conference on History of Programming Languages (HOPL III), pages 6-1–6-26, 2007.
A. Bauer and M. Pretnar. Programming with algebraic effects and handlers. Journal of Logical and Algebraic Methods in Programming, 84(1):108–123, 2015.
D. Biernacki, M. Piróg, P. Polesiuk, and F. Sieczkowski. Abstracting algebraic effects. Proceedings of the ACM on Programming Languages, 3(POPL):6:1–6:28, 2019.
J. Bloch. Effective Java. Addison-Wesley, 2nd edition, 2008.
J. I. Brachthäuser, P. Schuster, and K. Ostermann. Effects as capabilities: Effect handlers and lightweight effect polymorphism. Proceedings of the ACM on Programming Languages, 4(OOPSLA):126:1–126:30, 2020.
T. D. Chandra and S. Toueg. Unreliable failure detectors for reliable distributed systems. Journal of the ACM, 43(2):225–267, 1996.
P. Deutsch. The eight fallacies of distributed computing. Technical report, Sun Microsystems, 1994.
A. A. Donovan and B. W. Kernighan. The Go Programming Language. Addison-Wesley, 2015.
B. Eckel. Does Java need checked exceptions? http://www.mindview.net/Etc/Discussions/CheckedExceptions, 2003.
J. Epstein, A. P. Black, and S. Peyton Jones. Towards Haskell in the cloud. In Proceedings of the 4th ACM Symposium on Haskell, pages 118–129, 2011.
J. Gosling, B. Joy, G. Steele, G. Bracha, and A. Buckley. The Java Language Specification, Java SE 8 Edition. Addison-Wesley, 2014.
C. A. R. Hoare. Communicating Sequential Processes. Prentice Hall, 1985.
K. Honda. Types for dyadic interaction. In CONCUR’93, volume 715 of LNCS, pages 509–523. Springer, 1993.
K. Honda, V. T. Vasconcelos, and M. Kubo. Language primitives and type discipline for structured communication-based programming. In ESOP’98, volume 1381 of LNCS, pages 122–138. Springer, 1998.
G. C. Hunt and J. R. Larus. Singularity: Rethinking the software stack. ACM SIGOPS Operating Systems Review, 41(2):37–49, 2007.
M. P. Jones. Functional programming with overloading and higher-order polymorphism. In Advanced Functional Programming, volume 925 of LNCS, pages 97–136. Springer, 1995.
S. King. polysemy: Higher-order, no-boilerplate, zero-cost free monads. Hackage library, 2019.
O. Kiselyov, A. Sabry, and C. Swords. Extensible effects: An alternative to monad transformers. In Proceedings of the 2013 ACM SIGPLAN Symposium on Haskell, pages 59–70, 2013.
S. Klabnik and C. Nichols. The Rust Programming Language. No Starch Press, 2019.
D. Leijen. Type directed compilation of row-typed algebraic effects. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), pages 486–499, 2017.
T. Lindahl and K. Sagonas. Practical type inference based on success typings. In Proceedings of the 8th ACM SIGPLAN International Conference on Principles and Practice of Declarative Programming, pages 167–178, 2006.
S. Lindley, C. McBride, and C. McLaughlin. Do be do be do. In Proceedings of the 44th ACM SIGPLAN Symposium on Principles of Programming Languages (POPL), pages 500–514, 2017.
M. Logan, E. Merritt, and R. Carlsson. Erlang and OTP in Action. Manning Publications, 2010.
S. Marlow, editor. Haskell 2010 Language Report. https://www.haskell.org/onlinereport/haskell2010/, 2010.
S. Marlow, S. Peyton Jones, A. Moran, and J. Reppy. Asynchronous exceptions in Haskell. In Proceedings of the ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), pages 274–285, 2001.
N. D. Matsakis and F. S. Klock II. The Rust language. In Proceedings of the 2014 ACM SIGAda Annual Conference on High Integrity Language Technology, pages 103–104, 2014.
R. Milner. Communicating and Mobile Systems: the \(\pi\)-Calculus. Cambridge University Press, 1999.
E. Moggi. Notions of computation and monads. Information and Computation, 93(1):55–92, 1991.
R. Pike. Go at Google: Language design in the service of software engineering. In Proceedings of the 3rd Annual Conference on Systems, Programming, and Applications: Software for Humanity (SPLASH), 2012.
G. D. Plotkin and M. Pretnar. Handlers of algebraic effects. In Programming Languages and Systems (ESOP), volume 5502 of LNCS, pages 80–94. Springer, 2009.
G. D. Plotkin and M. Pretnar. Handling algebraic effects. Logical Methods in Computer Science, 9(4):1–36, 2013.
J. G. Siek and W. Taha. Gradual typing for functional languages. In Scheme and Functional Programming Workshop, pages 81–92, 2006.
B. Stroustrup. The Design and Evolution of C++. Addison-Wesley, 1994.
H. Sutter. Exceptional C++: 47 Engineering Puzzles, Programming Problems, and Solutions. Addison-Wesley, 1999.
P. Wadler. The essence of functional programming. In Proceedings of the 19th ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages (POPL), pages 1–14, 1992.
P. Wadler. Monads for functional programming. In Advanced Functional Programming, volume 925 of LNCS, pages 24–52. Springer, 1995.
J. Waldo, G. Wyant, A. Wollrath, and S. Kendall. A note on distributed computing. In Mobile Object Systems: Towards the Programmable Internet, volume 1222 of LNCS, pages 49–64. Springer, 1997.
N. Wu, T. Schrijvers, and R. Hinze. Effect handlers in scope. Journal of Functional Programming, 29:e16, 2019.
16 Extended Typing Rules
We present the full typing rules for Japl’s error handling constructs.
16.1 Judgment Forms
The main typing judgment has the form: \[\Gamma \vdash e : \tau \mid \mathcal{E}\] where \(\Gamma\) is the type environment, \(e\) is an expression, \(\tau\) is the result type, and \(\mathcal{E}\) is the effect set.
16.2 Rules for Fail
16.2.0.1 Fail introduction.
\[\inferrule {\Gamma \vdash e : E} {\Gamma \vdash \mathsf{fail}(e) : \tau \mid \mathcal{E}\cup \{\texttt{Fail}[E]\}}\]
16.2.0.2 Fail propagation
(? operator).
\[\inferrule {\Gamma \vdash e : \texttt{Result}\langle A, E\rangle \mid \mathcal{E}} {\Gamma \vdash e\texttt{?} : A \mid \mathcal{E}\cup \{\texttt{Fail}[E]\}}\]
16.2.0.3 Fail handling.
\[\inferrule {\Gamma \vdash e : A \mid \mathcal{E}\cup \{\texttt{Fail}[E]\}} {\Gamma \vdash \mathsf{Fail.catch}(\lambda().\, e) : \texttt{Result}\langle A, E\rangle \mid \mathcal{E}}\]
16.2.0.4 Effect subsumption.
\[\inferrule {\Gamma \vdash e : \tau \mid \mathcal{E}_1 \quad \mathcal{E}_1 \subseteq \mathcal{E}_2} {\Gamma \vdash e : \tau \mid \mathcal{E}_2}\]
16.3 Rules for Processes and Crashes
16.3.0.1 Process spawn.
\[\inferrule {\Gamma \vdash e : \mathsf{Never} \mid \mathcal{E}\cup \{\texttt{Process}[M]\}} {\Gamma \vdash \mathsf{Process.spawn}(\lambda().\, e) : \mathsf{Pid}[M] \mid \mathcal{E}' \cup \{\texttt{Process}\}}\]
16.3.0.2 Assert (crash on failure).
\[\inferrule
{\Gamma \vdash e : \mathsf{Bool}}
{\Gamma \vdash \mathsf{assert}\; e : \mathsf{Unit} \mid
\mathcal{E}}\] Note: assert does not add an
effect—it either succeeds (producing Unit) or crashes
the process. Process crashes are not tracked in the effect system
because they are by definition unexpected.
16.3.0.3 Process monitor.
\[\inferrule {\Gamma \vdash p : \mathsf{Pid}[M]} {\Gamma \vdash \mathsf{Process.monitor}(p) : \mathsf{MonitorRef} \mid \mathcal{E}\cup \{\texttt{Process}\}}\]
16.4 Linear Resource Typing Rules
The following rules formalize the linear typing discipline that underpins deterministic resource cleanup and the Crash Containment proof (17). A linear type \(\tau^{!}\) denotes a resource that must be consumed (used or released) exactly once.
16.4.0.1 Resource acquisition.
\[\inferrule {\Gamma \vdash e_{\mathit{args}} : \tau_{\mathit{args}} \mid \mathcal{E}\cup \{\texttt{Io}\}} {\Gamma \vdash \mathsf{acquire}(e_{\mathit{args}}) : \tau^{!} \mid \mathcal{E}\cup \{\texttt{Io}\}}\] The \(\mathsf{acquire}\) operation produces a linearly-typed resource handle. The handle must appear in exactly one subsequent consumption site.
16.4.0.2 Resource release.
\[\inferrule {\Gamma, x : \tau^{!} \vdash e_{\mathit{body}} : \sigma \mid \mathcal{E}} {\Gamma \vdash \mathsf{release}(x, e_{\mathit{body}}) : \sigma \mid \mathcal{E}}\] The \(\mathsf{release}\) operation consumes the linear resource \(x\), removing it from the environment. After release, \(x\) is no longer available.
16.4.0.3 Linear variable usage.
\[\inferrule {x : \tau^{!} \in \Gamma} {\Gamma \vdash x : \tau^{!} \mid \mathcal{E}} \quad\text{(exactly once in } \Gamma\text{)}\] A linearly-typed binding must be used exactly once. The type checker rejects programs that use a linear variable zero times (resource leak) or more than once (use-after-free).
16.4.0.4 Ownership transfer (send).
\[\inferrule {\Gamma_1 \vdash p : \mathsf{Pid}[M] \quad \Gamma_2 \vdash v : \tau^{!} \quad \Gamma_1 \cap \Gamma_2 \text{ contains no linear bindings}} {\Gamma_1, \Gamma_2 \vdash \mathsf{Process.send}(p, v) : \mathsf{Unit} \mid \mathcal{E}\cup \{\texttt{Process}\}}\] Sending a linearly-typed value to another process transfers ownership: the sender’s environment no longer contains the binding, and the receiver becomes the sole owner.
16.4.0.5 Crash cleanup obligation.
When a process crashes, the runtime walks its stack and invokes the drop handler for every live linear binding. The type system guarantees that every \(\tau^{!}\) in scope has a corresponding drop handler registered at allocation time. This is the formal basis for the “deterministic cleanup” claim in the Crash Containment proof: no resource is leaked, because every live linear binding is released by the unwinder.
16.5 Result Type Rules
16.5.0.1 Ok introduction.
\[\inferrule {\Gamma \vdash e : A} {\Gamma \vdash \ensuremath{\mathsf{Ok}}(e) : \texttt{Result}\langle A, E\rangle \mid \mathcal{E}}\]
16.5.0.2 Err introduction.
\[\inferrule {\Gamma \vdash e : E} {\Gamma \vdash \ensuremath{\mathsf{Err}}(e) : \texttt{Result}\langle A, E\rangle \mid \mathcal{E}}\]
16.5.0.3 Match elimination.
\[\inferrule {\Gamma \vdash e : \texttt{Result}\langle A, E\rangle \mid \mathcal{E}\\ \Gamma, x : A \vdash e_1 : \tau \mid \mathcal{E}\\ \Gamma, y : E \vdash e_2 : \tau \mid \mathcal{E}} {\Gamma \vdash \mathsf{match}\; e\; \{\ensuremath{\mathsf{Ok}}(x) \Rightarrow e_1;\; \ensuremath{\mathsf{Err}}(y) \Rightarrow e_2\} : \tau \mid \mathcal{E}}\]
17 Supervision State Machine
A supervisor process can be modeled as a finite state machine with the following states:
Running: All children are alive and operational.
Restarting: A child has crashed and is being restarted.
Escalating: Restart limits have been exceeded; the supervisor is crashing.
ShuttingDown: The supervisor has received a shutdown signal and is terminating children.
The transitions are: \[\begin{align*} \mathsf{Running} &\xrightarrow{\mathsf{ChildCrash}(i)} \mathsf{Restarting}(i) \\ \mathsf{Restarting}(i) &\xrightarrow{\mathsf{WithinLimits}} \mathsf{Running} \\ \mathsf{Restarting}(i) &\xrightarrow{\mathsf{LimitsExceeded}} \mathsf{Escalating} \\ \mathsf{Running} &\xrightarrow{\mathsf{Shutdown}} \mathsf{ShuttingDown} \\ \mathsf{Escalating} &\xrightarrow{\tau} \mathsf{crash}(\mathsf{MaxRestartsExceeded}) \end{align*}\]
The restart limit check uses a sliding window:
fn check_restart_limit(
history: List[Timestamp],
max_restarts: Int,
max_seconds: Int,
now: Timestamp
) -> Bool =
let recent = List.filter(history,
fn t -> Timestamp.diff(now, t) < max_seconds)
List.length(recent) < max_restarts
18 Error Handling Patterns Catalog
We catalog common error handling patterns in Japl.
18.1 Pattern: Error Mapping
Convert between error types at module boundaries:
fn get_user(id: UserId) -> User
with Io, Fail[AppError] =
let row = db_query(id)
|> map_err(fn e -> DbError(show(e)))?
decode_user(row)
|> map_err(fn e -> InvalidInput(show(e)))?
18.2 Pattern: Error Accumulation
Collect multiple errors instead of failing at the first one:
fn validate_form(form: Form)
-> Result[ValidForm, List[FieldError]] =
let errors = []
let errors = match validate_name(form.name) with
| Err(e) -> [e, ..errors]
| Ok(_) -> errors
let errors = match validate_email(form.email) with
| Err(e) -> [e, ..errors]
| Ok(_) -> errors
let errors = match validate_age(form.age) with
| Err(e) -> [e, ..errors]
| Ok(_) -> errors
if List.is_empty(errors) then
Ok(to_valid_form(form))
else
Err(List.reverse(errors))
18.3 Pattern: Retry with Backoff
Retry a fallible operation with exponential backoff:
fn retry(
max_attempts: Int,
base_delay_ms: Int,
action: fn() -> a with Io, Fail[e]
) -> a with Io, Fail[e] =
fn go(attempt: Int) -> a with Io, Fail[e] =
match Fail.catch(action) with
| Ok(result) -> result
| Err(e) ->
if attempt >= max_attempts then
Fail.raise(e)
else
let delay = base_delay_ms
* Int.pow(2, attempt)
Process.sleep(delay)
go(attempt + 1)
go(0)
18.4 Pattern: Circuit Breaker
Prevent cascading failures by stopping calls to a failing service:
type CircuitState =
| Closed(failure_count: Int)
| Open(until: Timestamp)
| HalfOpen
fn circuit_breaker(
state: CircuitState,
threshold: Int,
timeout_ms: Int,
action: fn() -> a with Io, Net, Fail[e]
) -> a with Io, Net, Fail[CircuitError[e]] =
match state with
| Open(until) ->
if Timestamp.now() < until then
Fail.raise(CircuitOpen)
else
circuit_breaker(HalfOpen,
threshold, timeout_ms, action)
| HalfOpen ->
match Fail.catch(action) with
| Ok(result) -> result
| Err(e) ->
Fail.raise(ServiceError(e))
| Closed(count) ->
match Fail.catch(action) with
| Ok(result) -> result
| Err(e) ->
if count + 1 >= threshold then
Fail.raise(CircuitOpen)
else
Fail.raise(ServiceError(e))
18.5 Pattern: Crash Boundary
Convert between domain errors and process crashes at a well-defined boundary:
-- Wrap a potentially-crashing computation
-- as a domain error
fn safe_call(f: fn() -> a with Io)
-> Result[a, String] with Process, Io =
let worker = Process.spawn(fn ->
let result = f()
Process.send(Process.self_parent(),
WorkerDone(result))
)
Process.monitor(worker)
match Process.receive_with_timeout(10000) with
| WorkerDone(result) -> Ok(result)
| ProcessDown(^worker, reason) ->
Err("worker crashed: " ++ show(reason))
| _ -> Err("timeout waiting for worker")