FireFly Analytics Architecture
An overview of FireFly Analytics and its SSO-SPN authentication model, in which users authenticate via your identity provider (Okta, Azure AD, Auth0) and all Databricks API calls use organization-specific Service Principals. End users do not need Databricks accounts.
What is FireFly Analytics?
FireFly Analytics is an analytics platform built on top of Databricks using the SSO-SPN (Single Sign-On to Service Principal) architecture. Users authenticate via your existing identity provider, while all Databricks operations are performed using organization-specific Service Principals.
This architecture allows organizations to provide their end users with a modern, customizable data experience while leveraging the full power of the Databricks Lakehouse - without requiring users to have individual Databricks accounts.
SSO-SPN Architecture Benefits
- No Databricks accounts required: Users authenticate via your existing identity provider (Okta, Azure AD, Auth0, etc.) - not Databricks
- Centralized access control: Each organization has a dedicated Service Principal with specific Unity Catalog permissions
- Simplified onboarding: Add users to your identity provider and they have access immediately, with no Databricks provisioning needed
- Clear audit trail: All API calls are traced to the organization's SPN, with user identity preserved in application logs
- Multi-tenant isolation: Each organization has its own SPN, workspace mappings, and Unity Catalog permissions
10,000-Foot View
At the highest level, FireFly Analytics sits between your end users and Databricks, acting as an intelligent proxy that handles authentication, authorization, and request orchestration.
The Big Picture
Users interact with FireFly's modern web interface. FireFly translates their actions into Databricks API calls using organization-specific Service Principals. Users never need to know that Databricks exists under the hood.
End Users
Data analysts, business users, and data scientists who need to access and analyze data without learning Databricks.
FireFly Analytics
The platform that provides a beautiful, customizable interface for data exploration, SQL queries, and file management.
Databricks
The powerful Lakehouse platform that stores data, executes queries, and provides Unity Catalog governance.
1,000-Foot View
Zooming in, we can see the main components that make up the FireFly platform and how they interact with external services.
Core Components
Next.js Frontend
A modern React application with server-side rendering, TanStack Query for data fetching, and shadcn/ui components.
- Catalog browser with tree navigation
- SQL editor with syntax highlighting
- File explorer for volumes and DBFS
- Organization management UI
Next.js API Routes
Server-side endpoints that handle all Databricks communication, ensuring credentials never reach the browser.
- Session validation middleware
- SPN token management
- Request proxying to Databricks
- Response caching
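The response caching mentioned above can be sketched as a small in-memory cache. This is an illustration only, not FireFly's actual implementation: the class name `OrgResponseCache`, the TTL policy, and the key layout are all assumptions. The point it shows is that in a multi-tenant proxy the cache key should include the organization, so one tenant's cached response is never served to another.

```typescript
// Hypothetical per-organization response cache with a fixed TTL.
// Names and policy are assumptions made for illustration.
type CacheEntry<T> = { value: T; expiresAt: number };

class OrgResponseCache<T> {
  private entries = new Map<string, CacheEntry<T>>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // The key combines organization ID and request path so tenants never share entries.
  private key(orgId: string, path: string): string {
    return JSON.stringify([orgId, path]);
  }

  // Returns the cached value, or undefined when missing or expired.
  get(orgId: string, path: string): T | undefined {
    const k = this.key(orgId, path);
    const entry = this.entries.get(k);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(k);
      return undefined;
    }
    return entry.value;
  }

  // Stores a value that stays valid for ttlMs milliseconds.
  set(orgId: string, path: string, value: T): void {
    this.entries.set(this.key(orgId, path), {
      value,
      expiresAt: this.now() + this.ttlMs,
    });
  }
}
```

An API route would call `get` before proxying to Databricks and `set` after a successful response. A production deployment would likely also need size limits and invalidation on writes, which this sketch omits.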
Better-Auth
A flexible authentication framework that integrates with any OAuth 2.0 or OIDC provider.
- Session management
- Token validation
- Organization context
Lakebase (PostgreSQL)
Persistent storage for all platform data, with encrypted credentials and proper isolation.
- User sessions
- Organizations & members
- Encrypted SPN credentials
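The documentation states only that SPN credentials are stored encrypted and does not name the algorithm. As one plausible approach, the sketch below uses AES-256-GCM from Node's built-in `crypto` module. The function names `encryptSecret` and `decryptSecret`, the storage encoding, and the choice of cipher are assumptions. Key management (where the 32-byte key lives and how it is rotated) is out of scope here.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Assumption: AES-256-GCM with a 32-byte key supplied by the deployment.
// Output format (also an assumption): base64(iv).base64(authTag).base64(ciphertext)
function encryptSecret(plaintext: string, key: Buffer): string {
  const iv = randomBytes(12); // 96-bit nonce, fresh for every encryption
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([
    cipher.update(plaintext, "utf8"),
    cipher.final(),
  ]);
  const tag = cipher.getAuthTag();
  return [iv, tag, ciphertext].map((b) => b.toString("base64")).join(".");
}

// Reverses encryptSecret; throws if the key is wrong or the data was modified.
function decryptSecret(encoded: string, key: Buffer): string {
  const [iv, tag, ciphertext] = encoded
    .split(".")
    .map((part) => Buffer.from(part, "base64"));
  const decipher = createDecipheriv("aes-256-gcm", key, iv);
  decipher.setAuthTag(tag);
  return Buffer.concat([
    decipher.update(ciphertext),
    decipher.final(),
  ]).toString("utf8");
}
```

An authenticated mode such as GCM is used in this sketch so that a tampered database row fails to decrypt instead of silently yielding a corrupted secret.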
SSO-SPN Authentication Model
Two-Layer Authentication (SSO-SPN)
The SSO-SPN model uses two-layer authentication that completely separates user identity from Databricks access:
- Layer 1 - User SSO: Users authenticate via your OAuth 2.0/OIDC provider (Okta, Azure AD, Auth0, etc.). They never interact with Databricks directly.
- Layer 2 - Service Principal: All Databricks API calls use the organization's Service Principal. The SPN credentials are stored encrypted in Lakebase (PostgreSQL) and tokens are managed server-side.
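The two layers can be sketched as a single resolution step that runs on the server for every request. The shapes `UserSession` and `OrgSpn` and the function `resolveDatabricksIdentity` are hypothetical and do not reflect Better-Auth's actual session type or FireFly's schema. The sketch only shows that the user's identity decides which organization applies, and the organization decides which Service Principal is used.

```typescript
// Layer 1 result: what the SSO session is assumed to carry (hypothetical shape).
interface UserSession {
  userId: string;
  organizationId: string;
  expiresAt: number; // epoch milliseconds
}

// Layer 2 input: the organization's Service Principal record (hypothetical shape).
interface OrgSpn {
  organizationId: string;
  clientId: string;
  workspaceHost: string;
}

type AuthResult =
  | { ok: true; userId: string; spn: OrgSpn }
  | { ok: false; status: 401 | 403; reason: string };

function resolveDatabricksIdentity(
  session: UserSession | null,
  spnByOrg: Map<string, OrgSpn>,
  now: number,
): AuthResult {
  // Layer 1: the request must carry a valid, unexpired SSO session.
  if (!session || session.expiresAt <= now) {
    return { ok: false, status: 401, reason: "no valid SSO session" };
  }
  // Layer 2: the session's organization selects the Service Principal.
  const spn = spnByOrg.get(session.organizationId);
  if (!spn) {
    return { ok: false, status: 403, reason: "organization has no Service Principal" };
  }
  // The user ID is returned alongside the SPN so it can be written to application logs.
  return { ok: true, userId: session.userId, spn };
}
```

Note that the user never supplies anything that is forwarded to Databricks as a credential. Only the server-side lookup links a session to an SPN.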
User Authentication (SSO)
- OAuth 2.0 / OIDC flow with your IDP
- Session managed by Better-Auth
- Organization context stored in session
- No Databricks credentials exposed
Databricks Access (SPN)
- One Service Principal per organization
- OAuth client_credentials grant
- Token caching with auto-refresh
- All API calls use SPN bearer token
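The client_credentials grant and token caching listed above can be sketched as follows. The names `SpnTokenCache` and `buildTokenRequest`, the 60-second refresh margin, and the per-organization cache key are assumptions. The `/oidc/v1/token` path and `all-apis` scope follow Databricks' published OAuth machine-to-machine flow for workspace-level tokens, but confirm both against your own workspace before relying on them.

```typescript
// Fields of the OAuth token response that this sketch relies on.
interface TokenResponse {
  access_token: string;
  expires_in: number; // seconds
}

interface SpnCredentials {
  workspaceHost: string;
  clientId: string;
  clientSecret: string;
}

interface TokenRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Builds the client_credentials request; the caller passes url and init to fetch().
function buildTokenRequest(spn: SpnCredentials): TokenRequest {
  const basic = Buffer.from(`${spn.clientId}:${spn.clientSecret}`).toString("base64");
  return {
    url: `https://${spn.workspaceHost}/oidc/v1/token`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Basic ${basic}`,
        "Content-Type": "application/x-www-form-urlencoded",
      },
      body: new URLSearchParams({
        grant_type: "client_credentials",
        scope: "all-apis",
      }).toString(),
    },
  };
}

// Hypothetical per-organization token cache that refreshes shortly before expiry.
class SpnTokenCache {
  private tokens = new Map<string, { token: string; expiresAt: number }>();
  private refreshSkewMs: number;
  private now: () => number;

  constructor(refreshSkewMs = 60_000, now: () => number = () => Date.now()) {
    this.refreshSkewMs = refreshSkewMs;
    this.now = now;
  }

  // Returns a token only while it is comfortably inside its lifetime.
  getCached(orgId: string): string | undefined {
    const entry = this.tokens.get(orgId);
    if (!entry) return undefined;
    return entry.expiresAt - this.refreshSkewMs > this.now() ? entry.token : undefined;
  }

  // Records a freshly issued token and returns it.
  store(orgId: string, res: TokenResponse): string {
    this.tokens.set(orgId, {
      token: res.access_token,
      expiresAt: this.now() + res.expires_in * 1000,
    });
    return res.access_token;
  }

  // Uses the cached token when fresh; otherwise calls fetchToken and caches the result.
  async getToken(orgId: string, fetchToken: () => Promise<TokenResponse>): Promise<string> {
    return this.getCached(orgId) ?? this.store(orgId, await fetchToken());
  }
}
```

The fetch step is injected as a callback so the caching logic can be exercised without network access. Refreshing slightly early avoids sending a token that expires while the Databricks request is in flight.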
Detailed Architecture
This diagram shows the complete architecture with all layers and their interactions. Understanding this flow is essential for customizing and extending FireFly.
Layer Breakdown
Client Layer
The browser-based frontend built with Next.js and React. Uses TanStack Query for efficient data fetching with caching and automatic background refetching.
Authentication Layer
Handles user authentication via OAuth 2.0/OIDC. Better-Auth manages sessions, validates tokens, and maintains organization context for each user.
Backend Layer
Next.js API routes with middleware for session validation and SPN token management. All Databricks communication happens here, keeping credentials server-side.
Data Layer
Lakebase (PostgreSQL) stores users, organizations, sessions, and encrypted SPN credentials. Uses Drizzle ORM for type-safe database operations.
Databricks Platform
The Databricks Lakehouse providing Unity Catalog, SQL Warehouses, DBFS, and more. FireFly accesses these via REST APIs using Service Principal tokens.
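As a rough illustration of what "REST APIs using Service Principal tokens" means in practice, the sketch below builds two request descriptors: one that lists Unity Catalog catalogs and one that submits a SQL statement to a warehouse. The helper names and the `DatabricksRequest` type are hypothetical. The endpoint paths are the public Databricks REST paths for Unity Catalog and SQL Statement Execution, and FireFly's own routes may wrap them differently.

```typescript
// Plain description of an outgoing call; a server route would hand this to fetch().
interface DatabricksRequest {
  method: "GET" | "POST";
  url: string;
  headers: Record<string, string>;
  body?: string;
}

// Lists catalogs visible to the Service Principal that owns the token.
function listCatalogsRequest(host: string, spnToken: string): DatabricksRequest {
  return {
    method: "GET",
    url: `https://${host}/api/2.1/unity-catalog/catalogs`,
    headers: { Authorization: `Bearer ${spnToken}` },
  };
}

// Submits a SQL statement to a SQL Warehouse via the Statement Execution API.
function executeStatementRequest(
  host: string,
  spnToken: string,
  warehouseId: string,
  statement: string,
): DatabricksRequest {
  return {
    method: "POST",
    url: `https://${host}/api/2.0/sql/statements`,
    headers: {
      Authorization: `Bearer ${spnToken}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ warehouse_id: warehouseId, statement }),
  };
}
```

Because the bearer token belongs to the organization's Service Principal, the results reflect the Unity Catalog permissions granted to that SPN, not to any individual user.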
Technology Stack
FireFly is built on modern, battle-tested technologies that enable rapid development and reliable operation.
Frontend Technologies
Documentation Sections
Dive deeper into specific aspects of the SSO-SPN architecture with these detailed documentation sections.
Request Flow
Follow a request from user SSO authentication through SPN token retrieval, Databricks API call, and response caching.
IAM & Organizations
Understand how organizations, Service Principals, and permissions work together to provide multi-tenant isolation.
Security
Deep dive into authentication, encryption, multi-tenant isolation, access control, and comprehensive audit trails.
Scalability
Learn how FireFly scales with auto-scaling apps, Serverless SQL, workspace isolation, and intelligent caching.
Apps Proxy
Learn how to embed Databricks Apps without exposing Databricks login flows to your end users.
Ready to Dive Deeper?
Start with the Request Flow documentation to understand how SSO authentication and SPN token management work together, or explore the IAM docs to set up organizations and Service Principals.