This document provides a practical guide for operating StableNet nodes, including metric collection, logging, disk space management, database maintenance, and common troubleshooting scenarios.
The tools and procedures described here are intended to maintain node stability and to diagnose issues that arise in production environments.
For validator-specific operations, refer to Validator Operations. For initial network deployment and configuration, refer to Network Deployment.

Metrics Collection

StableNet nodes expose a wide range of internal metrics to observe performance and operational status.
The metrics system is implemented using the go-metrics library and provides real-time indicators for consensus processing, transaction handling, network status, and database behavior.
In production environments, metric collection enables operators to monitor:
  • Block production and consensus latency
  • Transaction pool congestion
  • Peer connection stability
  • Database and state processing performance
  • Resource bottlenecks (CPU, memory, disk I/O)

Enabling Metrics

Metrics collection is controlled via command-line flags.
Flag                  | Description                                              | Default
--metrics             | Enable metric collection and exposure                    | Disabled
--metrics.expensive   | Enable expensive metrics (not recommended in production) | Disabled
--metrics.addr        | Metrics HTTP server bind address                         | None
--metrics.port        | Metrics HTTP server port                                 | 6060
Example: running the default metrics server
gstable --metrics --metrics.addr 0.0.0.0 --metrics.port 6060
The metrics HTTP endpoint exposes Prometheus-compatible data at http://<addr>:<port>/debug/metrics.
In production environments, do not expose the metrics server directly to the public network; restrict access to an internal network or place it behind a proxy.

Exporting to InfluxDB

Metrics can be exported to InfluxDB for long-term storage and time-series analysis.
StableNet supports both InfluxDB v1 and v2.

InfluxDB v1 Configuration

Flag                        | Description
--metrics.influxdb          | Enable InfluxDB v1 export
--metrics.influxdb.endpoint | InfluxDB API endpoint
--metrics.influxdb.database | Database name
--metrics.influxdb.username | Authentication username
--metrics.influxdb.password | Authentication password
--metrics.influxdb.tags     | Comma-separated key/value tags

InfluxDB v2 Configuration

Flag                            | Description
--metrics.influxdbv2            | Enable InfluxDB v2 export
--metrics.influxdb.token        | Authentication token
--metrics.influxdb.bucket       | Bucket name
--metrics.influxdb.organization | Organization name
Examples:
# InfluxDB v1
gstable --metrics --metrics.influxdb \
  --metrics.influxdb.endpoint "http://localhost:8086" \
  --metrics.influxdb.database "gstable_metrics" \
  --metrics.influxdb.username "admin" \
  --metrics.influxdb.password "secret" \
  --metrics.influxdb.tags "host=node01,network=mainnet"

# InfluxDB v2
gstable --metrics --metrics.influxdbv2 \
  --metrics.influxdb.endpoint "http://localhost:8086" \
  --metrics.influxdb.token "my-token" \
  --metrics.influxdb.bucket "gstable" \
  --metrics.influxdb.organization "my-org"

Available Metrics

Key Metrics by Category

Category       | Metric Name                    | Type    | Description                                   | Source
WBFT Consensus | consensus/wbft/core/commitwork | Timer   | Block commit processing time                  | miner/worker.go
Worker         | miner.newTxs                   | Counter | Number of incoming transactions               | miner/worker.go
Worker         | miner.running                  | Bool    | Whether the block production worker is active | miner/worker.go
Worker         | miner.syncing                  | Bool    | Whether the node is syncing                   | miner/worker.go
Chain          | Block insertion rate           | Meter   | Block insertion throughput                    | core/blockchain.go
Chain          | Reorg depth                    | Counter | Chain reorganization depth                    | core/blockchain.go
TxPool         | Pending transactions           | Gauge   | Executable transactions                       | core/txpool/legacypool
TxPool         | Queued transactions            | Gauge   | Queued (non-executable) transactions          | core/txpool/legacypool
P2P            | Peer count                     | Gauge   | Number of connected peers                     | p2p/server.go
P2P            | Ingress / Egress               | Meter   | Network traffic                               | p2p/server.go
State          | Trie cache hits                | Counter | State cache hit count                         | core/state/statedb.go
State          | Commit time                    | Timer   | State commit duration                         | core/state/statedb.go

StableNet / Anzeon-Specific Metrics

In Anzeon (WBFT)-based networks, additional consensus-specific metrics are available:
  • Gas tip change events via governance
  • Epoch-based validator set changes
  • BLS signature verification latency
  • Round changes and timeout occurrences

Worker State Monitoring

The worker structure exposes several atomic variables that reflect block production and consensus processing state.
These are critical for diagnosing why block production may have stalled on validator nodes.

Logging

StableNet uses a structured logging system with configurable verbosity levels to control output detail.

Log Levels

Level    | Value | Description
Critical | 1     | Fatal errors requiring immediate action
Error    | 2     | Errors that may cause functional failure
Warn     | 3     | Warning conditions
Info     | 4     | General operational information (default)
Debug    | 5     | Detailed debugging logs
Trace    | 6     | Very detailed trace logs

Log Output Examples

Node startup logs:
INFO Starting Gstable on StableNet
INFO Maximum peer count total=50
INFO Set global gas cap cap=50,000,000
Disk space warning logs:
WARN Disk space is running low available=50GiB
ERROR Low disk space. Shutting down to prevent database corruption

Disk Space Management

StableNet includes automatic disk space monitoring to prevent database corruption caused by insufficient disk space.

Disk Space Thresholds

The effective threshold is determined by one of the following:
  • Default: 2 * TrieDirtyCache
  • Cache-based: 2 * --cache * --cache.gc / 100
  • Explicit configuration: --datadir.minfreedisk

Behavior

  1. Free space ≥ 2× threshold: normal operation
  2. Between 1× and 2×: periodic warning logs
  3. < 1×: node shutdown is triggered

Platform-Specific Implementations

Platform     | Implementation                 | System Call
Linux / Unix | cmd/utils/diskusage.go         | syscall.Statfs()
Windows      | cmd/utils/diskusage_windows.go | GetDiskFreeSpaceEx()
OpenBSD      | cmd/utils/diskusage_openbsd.go | syscall.Statfs()

Database Maintenance

Database Backends

StableNet supports the following database backends:
  • LevelDB
  • Pebble

Compaction

  • Automatic compaction runs in the background during normal operation.
  • Completion is logged.
INFO Database compaction finished elapsed=5m30s

State Pruning

  • Offline pruning is supported only for the hash-based state schema
  • Requires the node to be stopped
  • If interrupted, recovery is attempted on the next startup

Ancient Data (Freezer)

  • Data older than a certain block height is moved to ancient storage
  • Read-only with high compression efficiency
  • Grows continuously as the chain advances

Node Health Monitoring

StableNet tracks abnormal shutdowns.
  • A clean marker is written on normal shutdown
  • A warning log is emitted on restart after an abnormal shutdown

Monitoring Checklist

Daily
  • Synchronization completed
  • Peer count ≥ 10
  • Available disk space check
Weekly
  • Database size growth trend
  • Occurrence of abnormal shutdowns
Monthly
  • Review log retention policies
  • Manage ancient data growth
  • Evaluate version upgrades

Troubleshooting

Node Synchronization Failure

  • Verify peer connectivity
  • Check firewall configuration
  • Confirm network ID and bootnode settings

High Memory Usage

  • Review cache configuration
  • Check whether archive mode is enabled
  • Inspect system OOM logs

Slow Block Processing

  • Check disk I/O bottlenecks
  • Inspect the commitwork metric
  • Review cache settings and CPU core availability

Database Corruption

  • Stop the node immediately
  • Restore from backup or resynchronize
  • Consider switching to the Pebble backend for long-term stability
By following these monitoring and maintenance practices, StableNet nodes can be operated reliably, and production issues can be diagnosed and resolved efficiently.