
Developer Guide

Welcome to the developer guide for the miden node :)

This is intended to serve as a basic introduction to the codebase as well as covering relevant concepts and recording architectural decisions.

This is not intended for dApp developers or users of the node, but for development of the node itself.

It is also a good idea to familiarise yourself with the operator manual.

Living documents go stale - the code is the final arbiter of truth.

If you encounter any outdated, incorrect or misleading information, please open an issue.

Contributing to Miden Node

First off, thanks for taking the time to contribute!

We want to make contributing to this project as easy and transparent as possible.

Before you begin...

Start by commenting your interest in the issue you want to address - this lets us assign the issue to you and prevents multiple people from repeating the same work. It also lets us add any additional information or context you may need.

We use the next branch as our active development branch. This means your work should fork off the next branch (and not main).

Typos and low-effort contributions

We don't accept PRs for typo fixes, as these are frequently submitted by low-effort AI "contributors". If you find typos, please open an issue instead.

Commits

Try to keep your commit names and messages related to their content. This provides reviewers with context if they need to step through your changes by commit.

This does not need to be perfect because we generally squash merge a PR - the commit naming is therefore only relevant for the review process.

Pre-PR checklist

Before submitting a PR, ensure that you're up to date by rebasing onto next, and that tests and lints pass by running:

# Runs the various lints
make lint
# Runs the test suite
make test

Post-PR

Please don't rebase your branch once the PR has been opened. In other words - only append new commits. This lets reviewers have a consistent view of your changes for follow-up reviews. Reviewers may request a rebase once they're ready in order to merge your changes in.

Any contributions you make will be under the MIT Software License

In short, when you submit code changes, your submissions are understood to be under the same MIT License that covers the project. Feel free to contact the maintainers if that's a concern.

Navigating the codebase

The code is organised using a Rust workspace with separate crates for the node and remote prover binaries, a crate for each node component, a couple of gRPC-related codegen crates, and a catch-all utilities crate.

The primary artifacts are the node and remote prover binaries. The library crates are not intended for external usage, but instead simply serve to enforce code organisation and decoupling.

| Crate | Description |
| --- | --- |
| node | The node executable. Configure and run the node and its components. |
| remote-prover | Remote prover executables. Includes workers and proxies. |
| remote-prover-client | Remote prover client implementation. |
| block-producer | Block-producer component implementation. |
| store | Store component implementation. |
| ntx-builder | Network transaction builder component implementation. |
| rpc | RPC component implementation. |
| proto | Contains and exports all protobuf definitions. |
| rpc-proto | Contains the RPC protobuf definitions. Currently this is an awkward clone of proto because we re-use the definitions from the internal protobuf types. |
| utils | Variety of utility functionality. |
| test-macro | Provides a procedural macro to enable tracing in tests. |

> [!NOTE]
> miden-base is an important dependency which contains the core Miden protocol definitions, e.g. accounts, notes, transactions, etc.

workspace dependency tree

Monitoring

A developer-level overview of how we aim to use tracing and OpenTelemetry to provide monitoring and telemetry for the node.

Please begin by reading through the monitoring operator guide as this will provide some much needed context.

Approach and philosophy

We want to trace important information such that we can quickly recognise issues (monitoring & alerting) and identify their cause. Conventionally this has been achieved via metrics and logs respectively; a more modern approach is to emit wide events/traces and post-process these instead. We're using the OpenTelemetry standard for this, but only the trace pillar - we avoid metrics and logs.

We wish to emit these traces without compromising on code quality and readability. This is also a downside of including metrics - they are usually emitted inline with the code, causing noise and obscuring the business logic. Ideally we want to rely almost entirely on #[tracing::instrument] to create spans, as these live outside the function body.

There are of course exceptions to the rule - usually the root span itself is created manually e.g. a new root span for each block building iteration. Inner spans should ideally keep to #[instrument] where possible.
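
For reference, a typical #[instrument]-based span might look like this minimal sketch (the function, types and field names are illustrative, not actual node code):

```rust
use tracing::instrument;

// Hypothetical types, for illustration only.
struct Transaction;
#[derive(Debug)]
struct ProveError;

/// The span is declared on the function signature, keeping the body free of
/// tracing noise. `skip_all` avoids recording arguments wholesale; the fields
/// we do want are named explicitly.
#[instrument(skip_all, fields(batch.id = batch_id))]
async fn prove_batch(batch_id: u64, _txs: Vec<Transaction>) -> Result<(), ProveError> {
    // ... business logic only, no tracing calls in the body ...
    Ok(())
}
```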

Relevant crates

We've attempted to lock most of the OpenTelemetry crates behind our own abstractions in the utils crate. There are a lot of these crates and it can be difficult to keep them all separate when writing new code. We also hope this will provide a more consistent result as we build out our monitoring.

tracing is the de facto standard for logging and tracing within the Rust ecosystem. OpenTelemetry has decided to avoid fracturing the ecosystem and instead attempts to bridge between tracing and the OpenTelemetry standard insofar as is possible. All this to say that there are some rough edges where the two combine - this should improve over time.

| Crate | Description |
| --- | --- |
| tracing | Emits tracing spans and events. |
| tracing-subscriber | Provides the conventional tracing stdout logger (no interaction with OpenTelemetry). |
| tracing-forest | Logs span trees to stdout. Useful to visualize span relations, but cannot trace across RPC boundaries as it doesn't understand remote tracing context. |
| tracing-opentelemetry | Bridges the gaps between tracing and the OpenTelemetry standard. |
| opentelemetry | Defines core types and concepts for OpenTelemetry. |
| opentelemetry-otlp | gRPC exporter for OpenTelemetry traces. |
| opentelemetry_sdk | Provides the OpenTelemetry abstractions for metrics, logs and traces. |
| opentelemetry-semantic-conventions | Constants for naming conventions as per the OpenTelemetry standard. |

Important concepts

OpenTelemetry standards & documentation

https://opentelemetry.io/docs

There is a lot. You don't need all of it - look things up as and when you stumble into confusion.

It is probably worth reading through the naming conventions to get a sense of style.

Footguns and common issues

tracing requires data to be known statically, e.g. you cannot add span attributes dynamically. tracing-opentelemetry provides a span extension trait which works around this limitation - however, this dynamic information is only visible to the OpenTelemetry processing, i.e. tracing_subscriber won't see it at all.
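
As a rough illustration of that extension trait (assuming tracing-opentelemetry's OpenTelemetrySpanExt; the attribute name is made up):

```rust
use tracing::Span;
use tracing_opentelemetry::OpenTelemetrySpanExt;

fn record_block_number(block_num: u64) {
    // Attaches a dynamic attribute to the current span's OpenTelemetry data.
    // tracing_subscriber-based loggers will not see this attribute at all.
    Span::current().set_attribute("block.number", block_num.to_string());
}
```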

In general, you'll find that tracing subscribers are blind to any extensions or OpenTelemetry specific concepts. The reverse is of course not true because OpenTelemetry is integrating with tracing.

Another pain point is error stacks - or rather the lack thereof. #[tracing::instrument(err)] correctly marks the span as an error, but unfortunately the macro only uses the Display or Debug implementation of the error. This means the full error report is missing entirely. tracing_opentelemetry reuses the stringified error data provided by tracing, so currently there is no work-around for this. Using Debug via ?err at least shows some information, but one still misses the actual error messages, which is quite bad.

Manually instrumenting code (i.e. without #[instrument]) can be rather error-prone because async calls must be instrumented at every call site, and non-async code requires holding the span guard.
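
To illustrate, manual instrumentation looks roughly like this (function names are made up):

```rust
use tracing::{info_span, Instrument};

async fn do_async_work() {}
fn do_sync_work() {}

async fn manual_instrumentation() {
    // Async: the future must be explicitly instrumented at every call site.
    do_async_work().instrument(info_span!("do_async_work")).await;

    // Sync: the span guard must be held for the duration of the work
    // (and must not be held across an .await point).
    let _guard = info_span!("do_sync_work").entered();
    do_sync_work();
}
```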

Distributed context

We track traces across our components by injecting the parent span ID into the gRPC client's request metadata. The server side then extracts it and uses it as the parent span ID for its own processing.

This is an OpenTelemetry concept - conventional tracing cannot follow these relations.

Read more in the official OpenTelemetry documentation.
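
A rough sketch of the injection side, assuming the globally registered propagator and tonic request metadata (the MetadataInjector adapter here is illustrative, not the node's actual implementation):

```rust
use opentelemetry::propagation::Injector;
use tonic::metadata::{Ascii, MetadataKey, MetadataMap, MetadataValue};
use tracing_opentelemetry::OpenTelemetrySpanExt;

/// Adapter letting the OpenTelemetry propagator write headers into tonic metadata.
struct MetadataInjector<'a>(&'a mut MetadataMap);

impl Injector for MetadataInjector<'_> {
    fn set(&mut self, key: &str, value: String) {
        if let (Ok(key), Ok(value)) = (
            MetadataKey::<Ascii>::from_bytes(key.as_bytes()),
            value.parse::<MetadataValue<Ascii>>(),
        ) {
            self.0.insert(key, value);
        }
    }
}

fn inject_trace_context<T>(request: &mut tonic::Request<T>) {
    // Take the OpenTelemetry context behind the current tracing span and write
    // it into the outgoing request's metadata; the server extracts it and uses
    // it as the parent of its own spans.
    let context = tracing::Span::current().context();
    opentelemetry::global::get_text_map_propagator(|propagator| {
        propagator.inject_context(&context, &mut MetadataInjector(request.metadata_mut()))
    });
}
```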

Choosing spans

A root span should represent a set of operations that belong together. It also shouldn't live forever as span information is usually only sent once the span closes i.e. a root span around the entire node makes no sense as the operation runs forever.

A good convention to follow is creating child spans for timing information you may want when debugging a failure or slow operation. As an example, it may make sense to instrument a mutex locking function to visualize the contention on it. Or separating the database file IO from the sqlite statement creation. Essentially operations which you would otherwise consider logging the timings for should be separate spans. While you may find this changes the code you might otherwise create, we've found this actually results in fairly good structure since it follows your business logic sense.
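
For example (names invented for illustration, not the block-producer's actual code), a fresh root span per iteration plus an instrumented lock might look like:

```rust
use tracing::{info_span, instrument, Instrument};

struct Mempool;

impl Mempool {
    /// Instrumenting the lock function makes contention on the mempool show up
    /// as its own child span, with its duration visible in the trace.
    #[instrument(skip_all, name = "mempool.lock")]
    async fn lock(&self) {
        // acquire the inner lock here
    }
}

async fn block_builder_loop(mempool: Mempool) {
    loop {
        // A new root span per block-building iteration: it closes (and gets
        // exported) once the block is done, instead of living forever.
        let root = info_span!(parent: None, "block_builder.build_block");
        async {
            mempool.lock().await;
            // select batches, prove, commit ...
        }
        .instrument(root)
        .await;
    }
}
```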

Inclusions and naming conventions

Where possible, attempt to find and use the naming conventions specified by the standard, ideally via the opentelemetry-semantic-conventions crate.

Include information you'd want to see when debugging - make life easy for your future self looking at data at 3AM on a Saturday. Also consider what information may be useful when correlating data e.g. client IP.
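
For instance, assuming a recent version of opentelemetry-semantic-conventions that exposes a CLIENT_ADDRESS constant (the standard "client.address" attribute), recording a client IP might look like:

```rust
use opentelemetry_semantic_conventions as semconv;
use tracing::Span;
use tracing_opentelemetry::OpenTelemetrySpanExt;

fn record_client_address(client_ip: String) {
    // Prefer the standard attribute name over an ad-hoc one like "ip".
    Span::current().set_attribute(semconv::attribute::CLIENT_ADDRESS, client_ip);
}
```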

Node components

The node is split into three distinct components that communicate via gRPC. See the Operator guide#architecture chapter for an overview of each component.

The following sections will describe the inner architecture of each component.

node internal architecture

RPC Component

This is by far the simplest component. Essentially this is a thin gRPC server which proxies all requests to the store and block-producer components.

Its main function is to pre-validate all requests before sending them on. This means malformed or nonsensical requests get rejected before reaching the store and block-producer, reducing their load. Notably this also includes verifying the proofs of submitted transactions. This allows the block-producer to skip proof verification (it trusts the RPC component), reducing the load on this critical component.

RPC Versioning

The RPC server enforces version requirements against connecting clients that provide the HTTP ACCEPT header. When this header is provided, its corresponding value must follow this format: application/vnd.miden.0.9.0+grpc.

If there is a mismatch in version, clients will encounter an error while executing gRPC requests against the RPC server with the following details:

  • gRPC status code: 3 (Invalid Argument)
  • gRPC message: Missing required ACCEPT header

The server will reject any version that does not have the same major and minor version as its own. This behaviour will change after v1.0.0, at which point only the major version will be taken into account.
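
For example, a tonic-based client could attach the header like so (the version string is the one from above; the helper function is illustrative):

```rust
use tonic::metadata::MetadataValue;
use tonic::Request;

fn with_accept_header<T>(mut request: Request<T>) -> Request<T> {
    // The server compares the major.minor part of this version with its own.
    let accept = MetadataValue::from_static("application/vnd.miden.0.9.0+grpc");
    request.metadata_mut().insert("accept", accept);
    request
}
```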

Error Handling

The RPC component uses domain-specific error enums for structured error reporting instead of proto-generated error types. This provides better control over error codes and makes error handling more maintainable.

Error Architecture

Error handling follows this pattern:

  1. Domain Errors: Business logic errors are defined in domain-specific enums
  2. gRPC Conversion: Domain errors are converted to gRPC Status objects with structured details
  3. Error Details: Specific error codes are embedded in Status.details as single bytes

SubmitProvenTransaction Errors

Transaction submission errors are:

enum SubmitProvenTransactionGrpcError {
    Internal = 0,
    DeserializationFailed = 1,
    InvalidTransactionProof = 2,
    IncorrectAccountInitialCommitment = 3,
    InputNotesAlreadyConsumed = 4,
    UnauthenticatedNotesNotFound = 5,
    OutputNotesAlreadyExist = 6,
    TransactionExpired = 7,
}

Error codes are embedded as single bytes in Status.details
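
A sketch of what the conversion to a tonic Status could look like (a hypothetical helper, simplified to a single gRPC code; not the node's exact implementation):

```rust
use tonic::{Code, Status};

#[derive(Clone, Copy)]
enum SubmitProvenTransactionGrpcError {
    Internal = 0,
    DeserializationFailed = 1,
    InvalidTransactionProof = 2,
    // ... remaining variants as listed above ...
}

impl From<SubmitProvenTransactionGrpcError> for Status {
    fn from(err: SubmitProvenTransactionGrpcError) -> Self {
        // The specific error code travels as a single byte in `details`, while
        // the human-readable description goes into the regular status message.
        Status::with_details(
            Code::InvalidArgument,
            "transaction submission failed",
            vec![err as u8].into(),
        )
    }
}
```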

Store component

This component persists the chain state in a sqlite database. It also stores each block's raw data as a file.

Merkle data structures are kept in-memory and are rebuilt on startup. Other data, like account, note and nullifier information, is always read from disk. We will need to revisit this in the future, but for now this is performant enough.

Migrations

We have database migration support in place but don't actively use it yet. There is only the latest schema, and we reset chain state (aka nuke the existing database) on each release.

Note that the migration logic includes both a schema number and a hash based on the SQL schema. These are both checked on node startup to ensure that any existing database matches the expected schema. If you're seeing database failures on startup, it's likely that you created the database before making schema changes, resulting in different schema hashes.
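
Conceptually the startup check amounts to something like the following sketch (the user_version pragma, the sha2/hex crates and the way the stored hash is obtained are stand-ins; the real storage details differ):

```rust
use rusqlite::Connection;
use sha2::{Digest, Sha256};

// The schema the binary was compiled against (illustrative values).
const SCHEMA_VERSION: u32 = 1;
const SCHEMA_SQL: &str = "CREATE TABLE example (id INTEGER PRIMARY KEY);";

fn check_schema(conn: &Connection, stored_hash: &str) -> Result<(), String> {
    // Schema number, stored in the database via SQLite's user_version pragma.
    let stored_version: u32 = conn
        .pragma_query_value(None, "user_version", |row| row.get(0))
        .map_err(|e| e.to_string())?;
    if stored_version != SCHEMA_VERSION {
        return Err(format!("schema version mismatch: {stored_version} != {SCHEMA_VERSION}"));
    }

    // Hash of the schema SQL the binary was built with. A database created
    // from an older schema will have recorded a different hash.
    let expected_hash = hex::encode(Sha256::digest(SCHEMA_SQL.as_bytes()));
    if stored_hash != expected_hash {
        return Err("schema hash mismatch; recreate the database".into());
    }
    Ok(())
}
```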

Architecture

The store consists mainly of a gRPC server which answers requests from the RPC and block-producer components, and accepts new block submissions from the block-producer.

A lightweight background process performs database query optimisation by analysing database queries and statistics.

Block Producer Component

The block-producer is responsible for ordering transactions into batches, and batches into blocks, and creating the proofs for these. Proving is usually outsourced to a remote prover but can be done locally if throughput isn't essential, e.g. for test purposes on a local node.

It hosts a single gRPC endpoint to which the RPC component can forward new transactions.

The core of the block-producer revolves around the mempool, which forms a DAG of all in-flight transactions and batches. It also ensures all transaction invariants are upheld, e.g. that the account's current state matches the transaction's initial state, that all input notes are valid and unconsumed, and that the transaction hasn't expired.

Batch production

Transactions are selected from the mempool periodically to form batches. Each batch is then proven and submitted back to the mempool, where it can be included in a block.

Block production

Proven batches are selected from the mempool periodically to form the next block. The block is then proven and committed to the store. At this point all transactions and batches in the block are removed from the mempool as committed.

Transaction lifecycle

  1. Transaction arrives at RPC component
  2. Transaction proof is verified
  3. Transaction arrives at block-producer
  4. Transaction delta is verified
    • The account state matches
    • All input notes exist and are unconsumed
    • Output notes are unique
    • The transaction is not expired
  5. Wait until all parent transactions are in a batch
  6. Be selected as part of a batch
  7. Proven as part of a batch
  8. Wait until all parent batches are in a block
  9. Be selected as part of a block
  10. Committed

Note that it's possible for transactions to be rejected/dropped even after they've been accepted, at any point in the above lifecycle (which effectively shows the happy path). This can occur if:

  • The transaction expires before being included in a block.
  • Any parent transaction is dropped (which will revert the state, invalidating child transactions).
  • It causes proving or any part of block/batch creation to fail. This is a fail-safe against unforeseen bugs, removing problematic (but potentially valid) transactions from the mempool to prevent outages.

Network Transaction Builder Component

The network transaction builder (NTB) is responsible for driving the state of network accounts.

What is a network account

Network accounts are a special type of fully public account which contains no authentication and whose state can therefore be updated by anyone (in theory). Such accounts are required when publicly mutable state is needed.

The issue with publicly mutable state is that transactions against an account must be sequential and require the previous account commitment in order to create the transaction proof. This conflicts with Miden's client-side proving and concurrency model, since users would race each other to submit transactions against such an account.

Instead the solution is to have the network be responsible for driving the account state forward, and users can interact with the account using notes. Notes don't require a specific ordering and can be created concurrently without worrying about conflicts. We call these network notes and they always target a specific network account.

A network transaction is a transaction which consumes and applies a set of network notes to a network account. There is nothing special about the transaction itself - it can only be identified by the fact that it updates the state of a network account.

Limitations

At present, we artificially limit this such that only this component may create transactions against network accounts. This is enforced at the RPC layer by disallowing network transactions entirely in that component. The NTB skirts around this by submitting its transactions directly to the block-producer.

This limitation is there to avoid complicating the NTB's implementation while the protocol and definitions of network accounts, notes and transactions mature.

Implementation

On startup the NTB loads all unconsumed network notes from the store. From there it monitors the mempool for events which would impact network account state. This communication takes the form of an event stream via gRPC.

The NTB periodically selects an arbitrary network account with available network notes and creates a network transaction for it.

The block-producer remains blissfully unaware of network transactions. From its perspective a network transaction is simply the same as any other.

Oddities and FAQs

Common questions and head scratchers.

Chain MMR

The chain MMR always lags behind the blockchain by one block because otherwise there would be a cyclic dependency between the chain MMR and the block hash:

  • chain MMR contains each block's hash as a leaf
  • block hash calculation includes the chain MMR's root

To work around this, the inclusion of a block hash in the chain MMR is delayed by one block. Or put differently, block N is responsible for inserting block N-1 into the chain MMR. This does not break blockchain linkage because the block header (and therefore its hash) still includes the previous block's hash.
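
A toy sketch of the one-block lag (u64 "hashes" and a fake MMR, nothing like the real miden types):

```rust
/// Toy stand-in: a "hash" is just a u64 here.
#[derive(Clone, Copy)]
struct BlockHeader {
    number: u64,
    prev_hash: u64,
    chain_root: u64,
}

impl BlockHeader {
    fn hash(&self) -> u64 {
        // placeholder "hash"
        self.number ^ self.prev_hash ^ self.chain_root
    }
}

/// Toy "MMR": a flat list of leaves with a fake root.
#[derive(Default)]
struct ChainMmr {
    leaves: Vec<u64>,
}

impl ChainMmr {
    fn add(&mut self, leaf: u64) {
        self.leaves.push(leaf);
    }
    fn root(&self) -> u64 {
        self.leaves.iter().fold(0, |acc, leaf| acc ^ leaf)
    }
}

/// Building block N: block N-1's hash is inserted into the chain MMR *now*, so
/// the MMR root committed by block N covers blocks 0..=N-1 but not block N
/// itself. Linkage is preserved because block N's header still stores block
/// N-1's hash directly.
fn build_next_block(chain_mmr: &mut ChainMmr, prev: &BlockHeader) -> BlockHeader {
    chain_mmr.add(prev.hash());
    BlockHeader {
        number: prev.number + 1,
        prev_hash: prev.hash(),
        chain_root: chain_mmr.root(),
    }
}
```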