# Persistence

Configure durable storage for patterns, execution history, and audit logs.

## Overview

Parallax Enterprise supports persistent storage backends for production deployments where data durability is required.
## Storage Backends

### PostgreSQL (Recommended)

A production-grade relational database:
```yaml
# parallax.config.yaml
storage:
  type: postgresql
  postgresql:
    host: postgres.example.com
    port: 5432
    database: parallax
    username: parallax
    password: ${POSTGRES_PASSWORD}

    # Connection pool
    pool:
      minConnections: 5
      maxConnections: 20
      idleTimeout: 60000

    # SSL
    ssl:
      enabled: true
      mode: verify-full
      ca: /etc/ssl/certs/ca.crt
```
Features:
- ACID transactions
- Concurrent access
- Full-text search
- Point-in-time recovery
- Replication support
### SQLite

An embedded database for simple deployments:
```yaml
storage:
  type: sqlite
  sqlite:
    path: /data/parallax.db

    # Write-ahead logging for better concurrency
    walMode: true

    # Busy timeout (ms)
    busyTimeout: 5000
```
Features:
- Zero configuration
- Single file storage
- Good for development
- Limited concurrent writes
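
If you need to confirm that WAL mode is actually in effect on an existing database file, the stock `sqlite3` CLI can report the journal mode (the path matches the config above):

```bash
# Prints "wal" once write-ahead logging is active
sqlite3 /data/parallax.db 'PRAGMA journal_mode;'
```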
### File System

Simple file-based storage:
```yaml
storage:
  type: file
  file:
    basePath: /data/parallax

    # File format
    format: yaml # or json

    # Compression
    compress: true
```
Features:
- Human-readable files
- Easy backup
- Limited querying
- Not recommended for production
## What's Stored

### Patterns

Pattern definitions and versions:
```sql
-- PostgreSQL schema
CREATE TABLE patterns (
    id UUID PRIMARY KEY,
    name VARCHAR(255) NOT NULL,
    version VARCHAR(50) NOT NULL,
    description TEXT,
    definition JSONB NOT NULL,
    created_at TIMESTAMP WITH TIME ZONE,
    created_by VARCHAR(255),
    UNIQUE(name, version)
);

CREATE INDEX idx_patterns_name ON patterns(name);
CREATE INDEX idx_patterns_created_at ON patterns(created_at);
```
Stored data:
- Pattern YAML/JSON definition
- Version history
- Metadata (author, description)
- Creation timestamp
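
As a sketch of querying this schema directly, the following fetches the most recently created version of a pattern (the pattern name is a placeholder):

```sql
-- Latest version of a pattern by creation time ('data-pipeline' is a placeholder)
SELECT version, definition
FROM patterns
WHERE name = 'data-pipeline'
ORDER BY created_at DESC
LIMIT 1;
```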
### Executions

Execution records and results:
```sql
CREATE TABLE executions (
    id UUID PRIMARY KEY,
    pattern_name VARCHAR(255) NOT NULL,
    pattern_version VARCHAR(50) NOT NULL,
    status VARCHAR(50) NOT NULL,
    input JSONB,
    output JSONB,
    error JSONB,
    started_at TIMESTAMP WITH TIME ZONE,
    completed_at TIMESTAMP WITH TIME ZONE,
    duration_ms INTEGER,
    metadata JSONB
);

CREATE INDEX idx_executions_pattern ON executions(pattern_name);
CREATE INDEX idx_executions_status ON executions(status);
CREATE INDEX idx_executions_started_at ON executions(started_at);
```
Stored data:
- Input/output data
- Execution status and errors
- Timing information
- Agent assignments
- Step-by-step results
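
The same schema supports ad hoc reporting; for example, per-pattern failure counts and average duration over the last day (a sketch assuming a `failed` status value; adjust to your deployment's status set):

```sql
-- Failures and mean duration per pattern over the last 24 hours
SELECT pattern_name,
       COUNT(*) FILTER (WHERE status = 'failed') AS failures,
       AVG(duration_ms) AS avg_duration_ms
FROM executions
WHERE started_at > now() - INTERVAL '1 day'
GROUP BY pattern_name;
```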
### Audit Logs

Security and compliance logging:
```sql
CREATE TABLE audit_logs (
    id UUID PRIMARY KEY,
    timestamp TIMESTAMP WITH TIME ZONE NOT NULL,
    action VARCHAR(100) NOT NULL,
    actor VARCHAR(255),
    resource_type VARCHAR(100),
    resource_id VARCHAR(255),
    details JSONB,
    ip_address INET,
    user_agent TEXT
);

CREATE INDEX idx_audit_timestamp ON audit_logs(timestamp);
CREATE INDEX idx_audit_action ON audit_logs(action);
CREATE INDEX idx_audit_actor ON audit_logs(actor);
```
Stored data:
- All API actions
- User/system identification
- Request details
- IP addresses
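
A typical compliance lookup walks one actor's recent activity, which `idx_audit_actor` serves; a minimal sketch against the schema above (the actor value is a placeholder):

```sql
-- Recent actions by a single user, newest first
SELECT timestamp, action, resource_type, resource_id, ip_address
FROM audit_logs
WHERE actor = 'alice@example.com'
ORDER BY timestamp DESC
LIMIT 50;
```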
## Configuration

### Environment Variables

| Variable | Description |
|---|---|
| PARALLAX_STORAGE_TYPE | Storage backend (postgresql, sqlite, file) |
| PARALLAX_DATABASE_URL | Full database connection URL |
| PARALLAX_DATABASE_HOST | Database host |
| PARALLAX_DATABASE_PORT | Database port |
| PARALLAX_DATABASE_NAME | Database name |
| PARALLAX_DATABASE_USER | Database user |
| PARALLAX_DATABASE_PASSWORD | Database password |
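
Assuming the usual precedence of environment variables over file configuration, the PostgreSQL backend from the earlier example can be selected without editing `parallax.config.yaml`; the URL follows standard PostgreSQL connection-string syntax, with placeholder host and credentials:

```bash
export PARALLAX_STORAGE_TYPE=postgresql
export PARALLAX_DATABASE_URL="postgresql://parallax:${POSTGRES_PASSWORD}@postgres.example.com:5432/parallax"
```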
### Helm Values

```yaml
# values.yaml
persistence:
  enabled: true

postgresql:
  enabled: true
  auth:
    username: parallax
    password: ${POSTGRES_PASSWORD}
    database: parallax
  primary:
    persistence:
      enabled: true
      size: 50Gi
      storageClass: fast-ssd
  readReplicas:
    replicaCount: 2
```
### External PostgreSQL

Connect to an existing PostgreSQL instance:

```yaml
persistence:
  enabled: true

postgresql:
  enabled: false # Don't deploy PostgreSQL

externalDatabase:
  host: postgres.example.com
  port: 5432
  database: parallax
  username: parallax
  existingSecret: postgres-credentials
  existingSecretPasswordKey: password
```
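
The referenced secret has to exist before the chart is installed; a minimal sketch of creating it with kubectl (replace the literal password):

```bash
# Creates the secret the chart reads the database password from
kubectl create secret generic postgres-credentials \
  --from-literal=password='REPLACE_ME'
```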
## Data Retention

### Configure Retention

```yaml
storage:
  retention:
    # Execution history
    executions:
      enabled: true
      maxAge: 30d
      maxCount: 100000

    # Audit logs
    auditLogs:
      enabled: true
      maxAge: 90d

    # Cleanup schedule
    cleanupSchedule: "0 2 * * *" # 2 AM daily
```
### Manual Cleanup

```bash
# Delete old executions
parallax storage cleanup executions --older-than 30d

# Delete old audit logs
parallax storage cleanup audit-logs --older-than 90d

# Dry run
parallax storage cleanup executions --older-than 30d --dry-run
```
### Archive Before Delete

```bash
# Archive to S3 before deletion
parallax storage archive executions \
  --older-than 30d \
  --destination s3://backups/parallax/executions/
```
## Backup and Restore

### PostgreSQL Backup

```bash
# Full backup
pg_dump -h localhost -U parallax -d parallax > backup.sql

# Compressed backup
pg_dump -h localhost -U parallax -d parallax | gzip > backup.sql.gz

# Specific tables
pg_dump -h localhost -U parallax -d parallax \
  -t patterns -t executions > backup.sql
```
### PostgreSQL Restore

```bash
# Restore full backup
psql -h localhost -U parallax -d parallax < backup.sql

# Restore compressed
gunzip -c backup.sql.gz | psql -h localhost -U parallax -d parallax
```
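
Plain SQL dumps restore through psql as shown above. If you want selective or parallel restores, PostgreSQL's custom dump format is an option; this is standard pg_dump/pg_restore behavior, not Parallax-specific tooling:

```bash
# Custom-format dump enables pg_restore with table selection and -j parallelism
pg_dump -h localhost -U parallax -d parallax -Fc > backup.dump
pg_restore -h localhost -U parallax -d parallax -t executions backup.dump
```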
### Kubernetes CronJob

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: parallax-backup
spec:
  schedule: "0 3 * * *" # 3 AM daily
  jobTemplate:
    spec:
      template:
        spec:
          containers:
            - name: backup
              # Note: the stock postgres image does not include the aws CLI;
              # use an image that provides both pg_dump and aws.
              image: postgres:15
              command:
                - /bin/sh
                - -c
                - |
                  pg_dump -h "$PGHOST" -U "$PGUSER" -d "$PGDATABASE" | \
                    gzip | \
                    aws s3 cp - s3://backups/parallax/$(date +%Y%m%d).sql.gz
              env:
                - name: PGHOST
                  value: parallax-postgresql
                - name: PGUSER
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: username
                - name: PGPASSWORD
                  valueFrom:
                    secretKeyRef:
                      name: postgres-credentials
                      key: password
                - name: PGDATABASE
                  value: parallax
          restartPolicy: OnFailure
```
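
Before trusting the schedule, a one-off run can be triggered from the CronJob to verify credentials and bucket access (standard kubectl):

```bash
# Run the backup once, outside the schedule, and follow its logs
kubectl create job --from=cronjob/parallax-backup parallax-backup-manual
kubectl logs -f job/parallax-backup-manual
```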
### Point-in-Time Recovery

Configure PostgreSQL for PITR:

```yaml
postgresql:
  primary:
    configuration: |
      wal_level = replica
      archive_mode = on
      archive_command = 'aws s3 cp %p s3://wal-archive/%f'
```
Restore to a specific time by recovering a base backup with a recovery target. Note that `pg_restore` has no point-in-time option; the target is set in `postgresql.conf`:

```bash
# After restoring a base backup into $PGDATA, set the recovery target
cat >> "$PGDATA/postgresql.conf" <<'EOF'
restore_command = 'aws s3 cp s3://wal-archive/%f %p'
recovery_target_time = '2024-01-15 10:30:00'
EOF

# Request recovery mode (PostgreSQL 12+), then start the server
touch "$PGDATA/recovery.signal"
pg_ctl -D "$PGDATA" start
```
## Schema Migrations

### Automatic Migrations

Migrations run automatically on startup:

```yaml
storage:
  migrations:
    enabled: true
    autoRun: true
```
### Manual Migrations

```bash
# Check current version
parallax storage migrate status

# Run pending migrations
parallax storage migrate up

# Rollback last migration
parallax storage migrate down

# Migrate to specific version
parallax storage migrate to 20240115
```
### Migration Files

```sql
-- migrations/20240115_add_execution_tags.sql

-- +migrate Up
ALTER TABLE executions ADD COLUMN tags JSONB;
CREATE INDEX idx_executions_tags ON executions USING GIN (tags);

-- +migrate Down
DROP INDEX idx_executions_tags;
ALTER TABLE executions DROP COLUMN tags;
```
## Performance Optimization

### Connection Pooling

```yaml
storage:
  postgresql:
    pool:
      minConnections: 10
      maxConnections: 50
      maxIdleTime: 300000
      acquireTimeout: 30000
```
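
Size the pool against the server, not just the client: each Parallax replica can hold up to maxConnections, and the total across replicas must stay below PostgreSQL's own limit, which you can check directly:

```sql
-- Server-side ceiling; maxConnections * replica count should stay below this
SHOW max_connections;
```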
### Indexing

Important indexes for query performance:

```sql
-- Execution queries
CREATE INDEX idx_executions_pattern_status ON executions(pattern_name, status);
CREATE INDEX idx_executions_created_at_desc ON executions(started_at DESC);

-- Pattern queries
CREATE INDEX idx_patterns_name_version ON patterns(name, version);

-- Audit queries
CREATE INDEX idx_audit_actor_timestamp ON audit_logs(actor, timestamp);
```
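
To confirm a query actually hits one of these indexes rather than falling back to a sequential scan, standard EXPLAIN output is enough (placeholder pattern name):

```sql
-- Plan should show an Index Scan on idx_executions_pattern_status
EXPLAIN ANALYZE
SELECT *
FROM executions
WHERE pattern_name = 'data-pipeline' AND status = 'failed';
```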
### Partitioning

For large execution tables:

```sql
-- Create partitioned table
CREATE TABLE executions (
    id UUID,
    started_at TIMESTAMP WITH TIME ZONE,
    -- ... other columns
) PARTITION BY RANGE (started_at);

-- Create monthly partitions
CREATE TABLE executions_2024_01 PARTITION OF executions
    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
CREATE TABLE executions_2024_02 PARTITION OF executions
    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
```
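
Partitioning also makes retention cheap: an expired month can be detached and dropped as a unit instead of deleted row by row (standard PostgreSQL partition maintenance):

```sql
-- Retire a whole month without bulk DELETEs or the resulting vacuum debt
ALTER TABLE executions DETACH PARTITION executions_2024_01;
DROP TABLE executions_2024_01;
```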
### Read Replicas

Route read queries to replicas:

```yaml
storage:
  postgresql:
    primary:
      host: postgres-primary.example.com
    replicas:
      - host: postgres-replica-1.example.com
      - host: postgres-replica-2.example.com
    readPreference: replica # primary, replica, any
```
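
Queries routed to a replica can observe slightly stale data. How far a replica lags can be checked on the replica itself with a built-in PostgreSQL function:

```sql
-- Approximate replication delay; run on a replica, not the primary
SELECT now() - pg_last_xact_replay_timestamp() AS replication_lag;
```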
## Monitoring

### Metrics

```
# Storage operation latency
parallax_storage_operation_duration_seconds{operation="write",table="executions"}

# Connection pool
parallax_storage_pool_connections_active
parallax_storage_pool_connections_idle
parallax_storage_pool_waiting_requests

# Storage size
parallax_storage_table_rows{table="executions"}
parallax_storage_table_size_bytes{table="executions"}
```
### Alerts

```yaml
groups:
  - name: parallax-storage
    rules:
      - alert: StorageConnectionPoolExhausted
        expr: parallax_storage_pool_waiting_requests > 10
        for: 1m
        labels:
          severity: warning

      - alert: StorageHighLatency
        expr: parallax_storage_operation_duration_seconds{quantile="0.99"} > 1
        for: 5m
        labels:
          severity: warning

      - alert: StorageDiskSpaceLow
        expr: parallax_storage_disk_free_bytes < 10737418240 # 10GB
        for: 5m
        labels:
          severity: critical
```
## Troubleshooting

### Connection Issues

```bash
# Test connection
parallax storage test-connection

# Check from a pod
kubectl exec -it parallax-control-plane-0 -- \
  psql -h postgres -U parallax -d parallax -c "SELECT 1"
```
### Migration Failures

```bash
# Check migration status
parallax storage migrate status

# View migration logs
kubectl logs parallax-control-plane-0 | grep migration

# Connect for a manual fix
psql -h postgres -U parallax -d parallax
```

```sql
-- Fix the issue, then mark the migration as complete
INSERT INTO schema_migrations (version) VALUES ('20240115');
```
### Performance Issues

```sql
-- Find slow queries (column names per pg_stat_statements in PostgreSQL 13+)
SELECT query, calls, mean_exec_time, total_exec_time
FROM pg_stat_statements
ORDER BY total_exec_time DESC
LIMIT 10;

-- Check table bloat
SELECT relname, n_dead_tup, last_vacuum
FROM pg_stat_user_tables
ORDER BY n_dead_tup DESC;

-- Run vacuum
VACUUM ANALYZE executions;
```
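
Note that pg_stat_statements is not available by default: the module must be preloaded at server start and the extension created once per database (standard PostgreSQL setup):

```sql
-- Requires shared_preload_libraries = 'pg_stat_statements' in postgresql.conf
-- (server restart needed), then:
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;
```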
## Best Practices

- Use PostgreSQL for production - SQLite and file storage are for development only
- Enable connection pooling - Prevents connection exhaustion under load
- Set up automated backups - Test restore procedures regularly
- Configure retention policies - Don't let data grow unbounded
- Monitor storage metrics - Alert on latency and disk space
- Use read replicas - Scale read-heavy workloads
- Partition large tables - Improve query performance and maintenance
## Next Steps
- High Availability - HA configuration
- Multi-Region - Geographic distribution
- Security - Secure storage access