
Scalability Architecture

A comprehensive guide to how FireFly Analytics scales to handle growing workloads, from application tier auto-scaling to Databricks Serverless SQL and isolated compute environments.

Overview

FireFly Analytics is designed to scale seamlessly from a handful of users to thousands, leveraging cloud-native patterns and Databricks' elastic compute capabilities. The platform scales at multiple levels: application tier, database tier, and compute tier.

This document covers the scalability architecture, including auto-scaling patterns, serverless compute, workspace isolation, and Databricks Apps that enable high performance at any scale.

Scalability Highlights

  • Application auto-scaling: Next.js and Go proxy scale horizontally based on demand
  • Serverless SQL: Databricks warehouses scale from 0 to N clusters automatically (nodes per cluster are fixed)
  • Isolated compute: Databricks Apps run in containerized, isolated environments (2 vCPU, 6GB RAM)
  • Workspace per organization: Each organization is configured with its own Databricks workspace by default
  • Pay-per-use: Serverless architecture means you only pay for what you use

Scaling Architecture Overview

FireFly scales across all tiers, from user traffic through application processing to Databricks compute:

Application Tier

Next.js and Go proxy auto-scale based on CPU, memory, and request rate metrics.

Data Tier

PostgreSQL scales horizontally with read replicas for session and configuration data.

Databricks Tier

Serverless SQL warehouses scale horizontally by adding clusters (the node count per cluster is fixed). Containerized apps provide isolation between users for the code and notebook editors.

Application Tier Scaling

The application tier consists of Next.js API routes and Go proxy servers, both designed for horizontal scaling with zero shared state.

Scaling Architecture

Next.js Application Scaling

Next.js can be deployed in multiple modes, each with different scaling characteristics:

Serverless Mode

Deploy as serverless functions (Vercel, AWS Lambda, Cloud Functions)

  • Auto-scales to zero when idle
  • Instant scaling on traffic spikes
  • Pay-per-invocation pricing
  • Cold start latency (~100-500ms)

Container Mode

Deploy as containers (Kubernetes, ECS, Cloud Run)

  • Horizontal Pod Autoscaler (HPA)
  • Predictable performance
  • No cold starts with warm pods
  • More control over resources

Stateless Design

Next.js instances are completely stateless, enabling seamless horizontal scaling:

  • No in-memory sessions - all sessions stored in PostgreSQL
  • No shared state between instances
  • Any instance can handle any request
  • Load balancer distributes traffic evenly
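Because sessions live in a shared store rather than in instance memory, any replica can resolve any request. The sketch below illustrates the pattern; SQLite stands in for PostgreSQL, and the `sessions` table and its columns are illustrative names, not FireFly's actual schema:

```python
import sqlite3

# SQLite stands in for the shared PostgreSQL session store here.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sessions (token TEXT PRIMARY KEY, user_id TEXT)")
db.execute("INSERT INTO sessions VALUES ('tok-abc', 'user-42')")
db.commit()

def handle_request(token: str):
    """Any instance can run this: no in-memory session state is consulted."""
    row = db.execute(
        "SELECT user_id FROM sessions WHERE token = ?", (token,)
    ).fetchone()
    return row[0] if row else None

print(handle_request("tok-abc"))  # user-42
```

Since every lookup goes through the shared store, adding or removing instances never strands a user's session on a particular replica.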

Go Proxy Scaling

The Go proxy is optimized for high concurrency and low resource usage:

Resource Efficiency

  • Binary size: ~15MB
  • Memory: ~50MB per instance
  • Startup time: <1 second

Concurrency

  • 10,000+ concurrent connections
  • Goroutines for parallelism
  • Efficient WebSocket handling

Performance

  • Token decrypt: <1ms
  • Request latency: <5ms overhead
  • Low GC pause times (<1ms)

Kubernetes HPA Configuration

apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: go-proxy-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: go-proxy
  minReplicas: 2
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 30
      policies:
      - type: Pods
        value: 4
        periodSeconds: 60
    scaleDown:
      stabilizationWindowSeconds: 300
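The HPA's scaling decision follows the standard Kubernetes formula: desired replicas = ceil(currentReplicas × currentMetric / targetMetric), clamped to the configured bounds. A small sketch of that calculation, using the min/max values from the manifest above:

```python
import math

def desired_replicas(current_replicas: int, current_util: float,
                     target_util: float, min_r: int = 2, max_r: int = 20) -> int:
    """Kubernetes HPA core formula:
    desired = ceil(currentReplicas * currentMetric / targetMetric),
    clamped to [minReplicas, maxReplicas]."""
    desired = math.ceil(current_replicas * current_util / target_util)
    return max(min_r, min(max_r, desired))

# 4 pods running at 90% CPU against the 70% target -> scale up to 6 pods.
print(desired_replicas(4, 90, 70))  # 6
```

The `behavior` section then rate-limits how fast that desired count is applied: scale-up is evaluated over a short 30-second window, while scale-down waits 300 seconds to avoid thrashing.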

Databricks Serverless SQL

Databricks Serverless SQL Warehouses provide elastic compute that automatically scales based on query workload. This is the recommended compute option for FireFly Analytics.

Serverless SQL Architecture

Key Features

Cluster vs Node Scaling

Serverless SQL scales by adding more clusters, not by adding nodes to existing clusters. Each warehouse size has a fixed number of nodes per cluster. When query load increases, Databricks spins up additional clusters to handle concurrent queries.
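The cluster-count decision can be sketched as a simple ceiling division over concurrent queries. The figure of roughly 10 concurrent queries per cluster is an assumption for illustration (the real threshold depends on workload and Databricks' internal scheduling), as is the `max_clusters` cap:

```python
import math

QUERIES_PER_CLUSTER = 10  # illustrative assumption; actual concurrency varies

def clusters_needed(concurrent_queries: int, max_clusters: int) -> int:
    """Scale out by whole clusters; the node count per cluster stays fixed."""
    if concurrent_queries == 0:
        return 0  # scale to zero when idle
    return min(max_clusters, math.ceil(concurrent_queries / QUERIES_PER_CLUSTER))

print(clusters_needed(0, 4))   # 0 -> idle, no charges
print(clusters_needed(7, 4))   # 1
print(clusters_needed(35, 4))  # 4 -> capped at max_clusters
```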

Cluster Auto-Scaling

Warehouses scale clusters automatically based on query load:

  • Scale to zero: No charges when idle (0 clusters)
  • Instant scale-up: ~5 second cold start for a new cluster
  • Parallel queries: Multiple clusters for concurrent users
  • Workload isolation: Heavy queries get dedicated clusters

Cost Optimization

Pay only for the compute you actually use:

  • Per-second billing: Charges stop when queries complete
  • No idle costs: Zero charges when scaled to zero
  • Shared infrastructure: Databricks manages underlying clusters
  • Predictable performance: SLA-backed query latency
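With per-second billing, cost is proportional to active seconds rather than provisioned hours. The arithmetic is simple; the DBU rates and per-size DBU/hour figures below are purely illustrative placeholders, not actual Databricks pricing:

```python
# Illustrative numbers only: real DBU rates come from your Databricks
# pricing tier and region, not from this example.
DBU_PER_HOUR = {"2X-Small": 4, "Small": 8, "Medium": 16}
DOLLARS_PER_DBU = 0.70

def query_cost(size: str, active_seconds: float) -> float:
    """Per-second billing: charges accrue only while clusters are running."""
    dbus = DBU_PER_HOUR[size] * active_seconds / 3600
    return round(dbus * DOLLARS_PER_DBU, 4)

# A 90-second burst on a Small warehouse, then scale-to-zero: no idle cost.
print(query_cost("Small", 90))  # 0.14
```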

Warehouse Sizing

Choose the right warehouse size based on your workload:

  Size        Best for          Example workloads
  2X-Small    Light queries     Metadata, LIMIT queries
  Small       Standard BI       Dashboards, reports
  Medium      Analytics         Complex joins, aggregations
  Large+      Heavy workloads   Full table scans, ML prep
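A sizing policy along these lines can be encoded as a simple lookup. The workload categories below mirror the sizing guidance above; the mapping itself is an illustrative sketch, not an official Databricks or FireFly API:

```python
# Illustrative workload-to-size mapping based on the sizing table above.
SIZE_FOR_WORKLOAD = {
    "metadata": "2X-Small",
    "preview": "2X-Small",
    "dashboard": "Small",
    "report": "Small",
    "analytics": "Medium",
    "full_scan": "Large",
    "ml_prep": "Large",
}

def pick_warehouse(workload: str) -> str:
    """Default to Small (standard BI) for unrecognized workload types."""
    return SIZE_FOR_WORKLOAD.get(workload, "Small")

print(pick_warehouse("dashboard"))  # Small
print(pick_warehouse("full_scan"))  # Large
```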

Query Performance Tips

Use LIMIT for Previews

When previewing data, always use LIMIT to avoid scanning entire tables. FireFly automatically applies LIMIT 1000 for data previews.
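One simple way to guarantee a preview limit without parsing the user's SQL is to wrap the statement in a subquery. This is a sketch of the general technique, not FireFly's actual implementation:

```python
def apply_preview_limit(sql: str, limit: int = 1000) -> str:
    """Wrap the statement so a LIMIT always applies, even when the user's
    query has none. Wrapping avoids having to parse the query for an
    existing LIMIT clause; an inner LIMIT (if any) still applies first."""
    return f"SELECT * FROM ({sql.rstrip().rstrip(';')}) AS preview LIMIT {limit}"

print(apply_preview_limit("SELECT * FROM sales.orders;"))
# SELECT * FROM (SELECT * FROM sales.orders) AS preview LIMIT 1000
```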

Leverage Delta Lake Caching

Delta Lake automatically caches frequently accessed data. Repeated queries on the same tables benefit from cached results.

Filter Early

Apply WHERE clauses as early as possible in your queries. Predicates on partition columns are especially efficient.

Select Only Needed Columns

Avoid SELECT * when possible. Selecting only needed columns reduces data scanned and improves query performance.

Workspace Scaling & Isolation

By default, FireFly is designed to configure each organization with its own dedicated Databricks workspace. This provides the strongest isolation guarantees and simplifies access control management.

Default: One Workspace Per Organization

FireFly's default architecture maps each organization to its own Databricks workspace. This design provides:

  • Complete isolation: No risk of data leakage between organizations
  • Simple auditing: All activity in a workspace belongs to one org
  • Independent scaling: Each org's compute is fully separate
  • Clear billing: Costs are naturally separated by workspace

Multi-Org Per Workspace (Advanced)

FireFly can be modified to support multiple organizations sharing the same workspace, but this requires significant additional safeguards:

  • Rigorous Unity Catalog permissions: Catalog-level grants must be carefully managed per SPN
  • Enhanced auditing: Additional logging to track which org accessed what data
  • Code review practices: All changes must be reviewed for potential cross-org data leakage
  • SPN isolation: Each org must still have its own SPN with strict permission boundaries
  • Regular security audits: Periodic reviews to ensure no permission drift

This configuration is not recommended unless you have specific requirements that necessitate shared workspace infrastructure.

Multi-Workspace Architecture

Common Workspace Patterns

Environment Separation

Separate workspaces for different environments:

  • Production: Business-critical data access
  • Staging: Pre-production testing
  • Development: Experimentation and feature dev

Geographic Distribution

Workspaces in different regions for:

  • Data residency: GDPR, data sovereignty
  • Latency: Users closer to data
  • Disaster recovery: Cross-region redundancy

Team Isolation

Separate workspaces for different teams:

  • Cost allocation: Chargeback by workspace
  • Access control: Team-specific permissions
  • Resource limits: Per-workspace quotas

Workload Isolation

Separate workspaces for different workloads:

  • ETL: Heavy batch processing
  • BI/Analytics: Interactive queries
  • ML/AI: GPU-intensive workloads

Workspace Isolation Guarantees

Each Databricks workspace provides strong isolation:

  • Network isolation: Separate VPC/VNet per workspace (optional)
  • Compute isolation: Separate clusters and warehouses
  • Storage isolation: Separate managed storage
  • Identity isolation: Separate user/SPN namespaces

Databricks Apps Isolation

Databricks Apps (like the VSCode code editor) run in isolated containers, providing secure compute for interactive workloads. FireFly embeds these apps using the Go proxy for transparent authentication.

Fixed Container Resources

All Databricks Apps run with fixed, standardized resources:

  • 2 vCPU: processing power
  • 6 GB RAM: memory allocation

Custom resource configurations are not currently supported by Databricks Apps.

Container Isolation Architecture

Isolation Features

Process Isolation

Each app instance runs in its own container with:

  • Separate PID namespace (processes isolated)
  • Fixed resource limits (2 vCPU, 6GB RAM)
  • No access to other containers or host system

Network Isolation

Network access is strictly controlled:

  • Sandboxed network namespace
  • Outbound access only to approved endpoints
  • No inbound connections except through proxy

Ephemeral Storage

Storage is ephemeral - all data is lost on container restart:

  • Filesystem cleared on every restart or timeout
  • No persistent storage available
  • Files read from Unity Catalog volumes (read-only)
  • Local changes exist only during active session

Data Loss Warning

Important: Any files created or modified within a Databricks App are lost when the container restarts. Users should save important work to Unity Catalog volumes or external storage before ending their session.

Future Improvement: Volume Sync

A planned enhancement is to implement automatic file synchronization with Databricks Volumes:

  • Auto-backup workspace files to user's Unity Catalog volume
  • Restore files on container startup
  • Periodic sync during active sessions
  • Versioned backups for recovery
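In rough outline, the backup half of that sync loop could look like the sketch below. This is purely illustrative of the proposed behavior: the directory names are placeholders, a local copy stands in for a Unity Catalog volume, and real code would write through the Databricks Files API:

```python
import shutil
import tempfile
from pathlib import Path

def sync_workspace(workspace: Path, backup: Path) -> int:
    """Copy workspace files to a backup location (standing in for a Unity
    Catalog volume here). Returns the number of files backed up."""
    count = 0
    for f in workspace.rglob("*"):
        if f.is_file():
            dest = backup / f.relative_to(workspace)
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(f, dest)
            count += 1
    return count

# Demo: temp dirs stand in for the container filesystem and the volume.
ws = Path(tempfile.mkdtemp())
vol = Path(tempfile.mkdtemp())
(ws / "notebook.py").write_text("print('hello')")
print(sync_workspace(ws, vol))            # 1
print((vol / "notebook.py").read_text())  # print('hello')
```

The restore path would run the same copy in the opposite direction on container startup, before the editor becomes available.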

Embedded Apps Use Cases

Code Editor

VSCode-based editor for notebooks, Python, SQL with full IDE features (IntelliSense, debugging, Git).

Notebook Viewer

Read-only notebook rendering for viewing outputs and visualizations without execution capability.

Custom Apps

Build custom Databricks Apps for specialized workflows (data quality tools, ML experiments, dashboards).

Performance Monitoring

Effective scaling requires visibility into system performance. FireFly recommends monitoring these key metrics:

Application Metrics

  • Request rate (req/sec)
  • Response time (P50, P95, P99)
  • Error rate (4xx, 5xx)
  • Instance count (scaling)
  • CPU/memory utilization
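The P50/P95/P99 response-time figures above are percentiles over raw latency samples. A minimal, dependency-free way to compute them is the nearest-rank method (monitoring systems typically use streaming estimates instead; this sketch just shows what the numbers mean):

```python
import math

def percentile(samples: list, p: float) -> float:
    """Nearest-rank percentile: the value at rank ceil(p/100 * n)
    in the sorted sample list."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(p / 100 * len(ordered)))
    return ordered[rank - 1]

# Response times in ms; a few slow outliers dominate the tail.
latencies_ms = [12, 15, 11, 240, 14, 13, 16, 12, 18, 900]
print(percentile(latencies_ms, 50))  # 14
print(percentile(latencies_ms, 95))  # 900
```

Tracking the tail (P95/P99) rather than the average is what surfaces the outliers that drive scaling decisions.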

Database Metrics

  • Query duration
  • Connection pool usage
  • Active connections
  • Rows read/written
  • Replication lag

Databricks Metrics

  • Warehouse uptime/utilization
  • Query queue depth
  • Query duration by type
  • SPN token refresh rate
  • API error rate

User Experience

  • Time to first byte (TTFB)
  • Largest contentful paint (LCP)
  • Page load time
  • Query completion time
  • Error page views

Conclusion

FireFly Analytics is designed for scalability at every layer. By combining auto-scaling application infrastructure, Databricks Serverless SQL, and workspace-per-organization isolation, the platform can grow seamlessly from small teams to enterprise deployments.

Horizontal Scaling

Stateless Next.js and Go proxy instances scale horizontally based on demand with zero manual intervention.

Elastic Compute

Databricks Serverless SQL scales from zero to handle any query workload with automatic resource management.

Cost Efficiency

Pay-per-use pricing and scale-to-zero capabilities ensure you only pay for what you actually use.
