Configuration Reference

Marmot v2 uses a TOML configuration file (default: config.toml). All settings have sensible defaults.

Core Configuration

node_id = 0  # 0 = auto-generate
data_dir = "./marmot-data"

Transaction Manager

[transaction]
heartbeat_timeout_seconds = 10  # Transaction timeout without heartbeat
conflict_window_seconds = 10    # Conflict resolution window
lock_wait_timeout_seconds = 50  # Lock wait timeout (MySQL: innodb_lock_wait_timeout)

Note: Transaction log garbage collection is managed by the replication configuration to coordinate with anti-entropy. See replication.gc_min_retention_hours and replication.gc_max_retention_hours.

Connection Pool

[connection_pool]
pool_size = 4              # Number of SQLite connections
max_idle_time_seconds = 10 # Max idle time before closing
max_lifetime_seconds = 300 # Max connection lifetime (0 = unlimited)

gRPC Client

[grpc_client]
keepalive_time_seconds = 10    # Keepalive ping interval
keepalive_timeout_seconds = 3  # Keepalive ping timeout
max_retries = 3                # Max retry attempts
retry_backoff_ms = 100         # Retry backoff duration
compression_level = 1          # zstd compression (0=disabled, 1-4)

Compression Levels:

Level   Speed       Ratio   Use Case
0       -           -       Disabled
1       ~318 MB/s   2.88x   Default - best for most deployments
2       ~134 MB/s   3.0x    Balanced speed/compression
3       ~67 MB/s    3.2x    Better compression, slower
4       ~12 MB/s    3.5x    Bandwidth-constrained networks

Decompression is always fast (~1600 MB/s) regardless of compression level. Uses zstd via klauspost/compress.
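As an illustration of how the table might be applied, the sketch below assumes a deployment whose replication traffic crosses a slow or metered link, where the higher ratio of level 4 could outweigh its lower compression speed. The scenario is an assumption; measure on your own hardware before adopting it.

```toml
[grpc_client]
# Assumption: peers communicate over a bandwidth-constrained network,
# so a ~3.5x ratio is preferred even at ~12 MB/s compression speed.
compression_level = 4
```

Because decompression speed is reported as independent of level, the extra cost in this sketch falls only on the sending side.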

Coordinator

[coordinator]
prepare_timeout_ms = 2000 # Prepare phase timeout
commit_timeout_ms = 2000  # Commit phase timeout
abort_timeout_ms = 2000   # Abort phase timeout

Cluster

[cluster]
grpc_bind_address = "0.0.0.0"
grpc_port = 8080
seed_nodes = []                # List of seed node addresses
cluster_secret = ""            # PSK for cluster authentication (see Security section)
gossip_interval_ms = 1000      # Gossip interval
gossip_fanout = 3              # Number of peers to gossip to
suspect_timeout_ms = 5000      # Suspect timeout
dead_timeout_ms = 10000        # Dead timeout

Replication

[replication]
default_write_consistency = "QUORUM"      # Write consistency level: ONE, QUORUM, ALL
default_read_consistency = "LOCAL_ONE"    # Read consistency level
write_timeout_ms = 5000                   # Write operation timeout
read_timeout_ms = 2000                    # Read operation timeout
 
# Anti-Entropy: Background healing for eventual consistency
# - Detects and repairs divergence between replicas
# - Uses delta sync for small lags, snapshot for large lags
# - Includes gap detection to prevent incomplete data after GC
enable_anti_entropy = true                 # Enable automatic catch-up for lagging nodes
anti_entropy_interval_seconds = 30         # How often to check for lag (default: 30s)
gc_interval_seconds = 60                   # GC interval (MUST be >= anti_entropy_interval)
delta_sync_threshold_transactions = 10000  # Delta sync if lag < 10K txns
delta_sync_threshold_seconds = 3600        # Snapshot if lag > 1 hour
 
# Garbage Collection: Reclaim disk space by deleting old transaction records
# - gc_interval must be >= anti_entropy_interval (validated at startup)
# - gc_min must be >= delta_sync_threshold (validated at startup)
# - gc_max should be >= 2x delta_sync_threshold (recommended)
# - Set gc_max = 0 for unlimited retention
gc_min_retention_hours = 2   # Keep at least 2 hours (>= 1 hour delta threshold)
gc_max_retention_hours = 24  # Force delete after 24 hours

Anti-Entropy Tuning:

  • Small clusters (2-3 nodes): Use default settings (30s AE, 60s GC)
  • Large clusters (5+ nodes): Consider increasing AE interval to 60-120s and GC to 2x that value
  • High write throughput: Increase delta_sync_threshold_transactions to 50000+
  • Long-running clusters: Keep gc_max_retention_hours at 24+ to handle extended outages
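The tuning suggestions above can be combined into a single fragment. The following is a sketch for a hypothetical large, write-heavy cluster; the specific values are illustrative picks from the suggested ranges, not tested recommendations.

```toml
[replication]
enable_anti_entropy = true
anti_entropy_interval_seconds = 120        # Large cluster: upper end of the suggested 60-120s range
gc_interval_seconds = 240                  # 2x the AE interval (must be >= anti_entropy_interval_seconds)
delta_sync_threshold_transactions = 50000  # High write throughput: allow more lag before falling back to a snapshot
gc_max_retention_hours = 24                # Keep at 24+ to ride out extended outages
```

Keys not shown (for example delta_sync_threshold_seconds and gc_min_retention_hours) keep their defaults, which already satisfy the startup validation rules.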

GC Configuration Rules (Validated at Startup):

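  • gc_min_retention_hours must be >= delta_sync_threshold_seconds converted to hours (default: 2 hours >= 1 hour)
  • gc_max_retention_hours should be >= 2x delta_sync_threshold_seconds converted to hours (recommended; 0 = unlimited retention)
  • gc_interval_seconds must be >= anti_entropy_interval_seconds
  • Violating a "must" rule causes startup to fail with a descriptive error message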
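To make the unit conversion concrete, here is a sketch that raises the delta sync window from the default 1 hour to 3 hours and adjusts the retention settings to match. The values are illustrative assumptions chosen only to satisfy the rules.

```toml
[replication]
delta_sync_threshold_seconds = 10800  # 3 hours (10800 / 3600)
gc_min_retention_hours = 3            # Must be >= 3 hours to pass startup validation
gc_max_retention_hours = 6            # Recommended: >= 2 x 3 hours
```

Leaving gc_min_retention_hours at its default of 2 while using this 3-hour threshold would break the minimum-retention rule, so the two settings should be changed together.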

Query Pipeline

[query_pipeline]
transpiler_cache_size = 10000  # LRU cache for MySQL→SQLite transpilation

Vector Index

[vector_index]
enabled = false
data_dir = ""     # Reserved for a future explicit local vector-state root override

Settings:

  • enabled: Enables vector-index DDL, background workers, and query rewrite support.
  • data_dir: Reserved for a future explicit local vector-state root override. Current builds colocate .vecseg state next to the database file.

Runtime note:

  • Current builds store vector-derived state in local .vecseg files next to the database. ANN reads use the local segment store plus an overlay journal, followed by an exact rerank of the resulting shortlist against the base table.

Batch Commit

Controls SQLite write batching for improved throughput. Batches multiple transactions into single SQLite commits, reducing fsync overhead.

[batch_commit]
enabled = true                       # Enable batch committing
max_batch_size = 128                 # Max transactions per batch
max_wait_ms = 10                     # Max wait before flush (ms)
 
# WAL Checkpoint: Manages WAL file size by checkpointing after batches
checkpoint_enabled = true            # Enable automatic checkpointing
checkpoint_passive_thresh_mb = 4.0   # PASSIVE checkpoint when WAL > 4MB
checkpoint_restart_thresh_mb = 16.0  # RESTART checkpoint when WAL > 16MB
 
# Incremental Vacuum: Reclaims freelist pages in time-limited background task
# Requires sqlite_vacuum_incr build tag (included in default build)
incremental_vacuum_enabled = true    # Enable incremental vacuum after checkpoint
incremental_vacuum_pages = 100       # Pages to vacuum per iteration
incremental_vacuum_time_limit_ms = 10 # Max time budget for vacuum (ms)

How Incremental Vacuum Works:

  1. After a successful WAL checkpoint, vacuum is triggered in a background goroutine
  2. It runs PRAGMA incremental_vacuum(N), where N is incremental_vacuum_pages, in a loop until the time limit is reached or the freelist is empty
  3. Non-blocking: it does not delay commit responses to clients
  4. An atomic flag prevents goroutine pileup (only one vacuum runs at a time)

Tuning Tips:

  • High write workload: Keep incremental_vacuum_time_limit_ms low, in the 5-10 ms range (default: 10)
  • Large deletes: Increase incremental_vacuum_pages to 500+ for faster reclamation
  • Disable: Set incremental_vacuum_enabled = false if using external vacuum scheduling
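As a sketch of the "large deletes" tip, the fragment below assumes a workload that periodically purges many rows and wants freelist pages reclaimed faster. The page count is an illustrative value, not a measured optimum.

```toml
[batch_commit]
enabled = true
checkpoint_enabled = true               # Vacuum is triggered after a successful checkpoint
incremental_vacuum_enabled = true
incremental_vacuum_pages = 500          # Large deletes: reclaim more pages per iteration
incremental_vacuum_time_limit_ms = 10   # Default budget; vacuum stays time-limited
```

If vacuuming is scheduled externally instead, the same section would set incremental_vacuum_enabled = false and the two vacuum tuning keys would no longer matter.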

MySQL Protocol Server

[mysql]
enabled = true
bind_address = "0.0.0.0"
port = 3306
max_connections = 1000
local_infile_enabled = true         # Enable LOAD DATA LOCAL INFILE
auto_id_mode = "compact"  # "compact" (53-bit, default) or "extended" (64-bit)

Auto ID Mode:

  • "compact" (default): Generates 53-bit IDs that are safe for all clients, including JavaScript (they fit within Number.MAX_SAFE_INTEGER). Format: 41 bits timestamp + 6 bits node + 6 bits sequence. Supports ~69 years from the epoch (Jan 2, 2025) with up to 64 IDs per millisecond (64K IDs/sec) per node.
  • "extended": Generates full 64-bit HLC-based IDs for maximum uniqueness and ordering guarantees. Use when all clients support 64-bit integers natively.

LOAD DATA LOCAL INFILE Notes:

  • Supported variant is LOAD DATA LOCAL INFILE (client-uploaded bytes).
  • LOAD DATA INFILE (server-side file read) is not supported.
  • Client capability must be enabled (for example, mysql --local-infile=1 or driver-specific local infile flags).

Logging

[logging]
verbose = false          # Enable verbose logging (debug level)
format = "json"          # Log format: "console" or "json" (json is 21% faster)
file = ""                # Log file path (empty = stdout only)
max_size_mb = 100        # Max size in MB before rotation (default: 100)
max_backups = 5          # Number of old log files to keep (default: 5)
compress = true          # Compress rotated files with gzip (default: true)

File Logging with Rotation:

When file is set, logs are written to both stdout and the specified file. The file is automatically rotated when it reaches max_size_mb, keeping the last max_backups files.

[logging]
file = "/var/log/marmot/marmot.log"
max_size_mb = 100
max_backups = 5
compress = true

CLI Flags

Marmot supports the following command-line flags that override config file settings:

Flag                Description
-config             Path to configuration file (default: config.toml)
-data-dir           Data directory (overrides data_dir in config)
-node-id            Node ID (overrides node_id in config, 0=auto)
-grpc-port          gRPC port (overrides cluster.grpc_port in config)
-mysql-port         MySQL port (overrides mysql.port in config)
-follow-addresses   Comma-separated addresses for replica mode
-daemon             Run as background daemon (detached from terminal)
-pid-file           PID file path (used with -daemon)

Example:

./marmot -config=/etc/marmot/config.toml -data-dir=/var/lib/marmot -grpc-port=8081 -mysql-port=3307

Running as Daemon:

# Start as daemon with PID file
./marmot -config=/etc/marmot/config.toml -daemon -pid-file=/var/run/marmot.pid
 
# Check if running
cat /var/run/marmot.pid
 
# Stop daemon
kill $(cat /var/run/marmot.pid)

Note: When running as a daemon, set logging.file in the config to capture logs, since stdout/stderr are detached.