Architecture Overview

Datalinx AI is a data integration platform designed with security, multi-tenancy, and scalability as core principles. This document provides a high-level overview of the system architecture.

Design Principles

Security First - Multi-tenancy and isolation built into every layer
Separation of Concerns - Clear boundaries between control and data planes
Scalability - Horizontal scaling of data processing workers
Maintainability - Three-layer architecture for clear code organization
Testability - Comprehensive test framework with isolation
Observability - Detailed logging and monitoring throughout

System Architecture

Core Components

Control Plane

The Control Plane handles all interactive operations:

Component	Responsibility
Web Interface	User interactions, visual mapping, monitoring
REST API	Programmatic access, integrations
Authentication	User identity, session management, RBAC
Configuration	Workspace, pipeline, and connection settings
Command Queue	Queuing and dispatching work to Data Plane

The Control Plane runs as a FastAPI server that serves both the API and the React frontend.

Data Plane

The Data Plane executes actual data operations in isolation:

Component	Responsibility
Data Workers	Execute transformation and movement jobs
dbt Processing	Run dbt models against configured databases
Dagster Pipelines	Orchestrate complex data workflows
Resource Isolation	Each task runs with minimal privileges

Data Plane workers poll the Control Plane for tasks, execute them, and report results back.
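The poll, execute, and report cycle described above can be sketched as follows. This is a minimal illustration, not the actual Datalinx AI worker: `InMemoryControlPlane`, `Task`, and `run_worker` are hypothetical names, and an in-memory queue stands in for the real Control Plane API.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Task:
    task_id: int
    payload: str


@dataclass
class InMemoryControlPlane:
    """Hypothetical stand-in for the Control Plane's command queue."""
    pending: deque = field(default_factory=deque)
    results: dict = field(default_factory=dict)

    def next_task(self):
        # Return the next queued task, or None when the queue is empty.
        return self.pending.popleft() if self.pending else None

    def report(self, task_id, status, output):
        self.results[task_id] = (status, output)


def run_worker(control_plane, handler, max_polls):
    """Poll for tasks, execute each one, and report the outcome."""
    for _ in range(max_polls):
        task = control_plane.next_task()
        if task is None:
            continue  # a real worker would sleep or back off here
        try:
            output = handler(task.payload)
            control_plane.report(task.task_id, "succeeded", output)
        except Exception as exc:
            # Report the failure instead of letting it crash the worker.
            control_plane.report(task.task_id, "failed", str(exc))


def upper_or_fail(payload):
    # Toy job: uppercase the payload, reject empty input.
    if not payload:
        raise ValueError("empty payload")
    return payload.upper()


cp = InMemoryControlPlane()
cp.pending.extend([Task(1, "abc"), Task(2, "")])
run_worker(cp, upper_or_fail, max_polls=3)
```

Because the worker pulls work rather than receiving it, the Control Plane never needs to open a connection into the Data Plane, which fits the isolation goal stated above.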

Storage Layer

Store	Purpose
System Database	Users, organizations, configurations, audit logs
Customer Data	Source and transformed data in customer warehouses
File Storage	Schema definitions, mapping configurations, logs

Communication Flow

Application Layers

Datalinx AI follows a strict three-layer architecture:

Layer 1: Interface Layer (Thin)

External interfaces to the system:

src/datalinx_demo/api/routes/    # FastAPI endpoints
src/datalinx_demo/api/tools/     # MCP tool interfaces
src/datalinx_demo/agents/        # AI agent implementations

Interface layer responsibilities:

Input validation using Pydantic models
Permission checking via decorators
Delegation to business logic layer
Response formatting

Warning: The interface layer should never contain business logic, only validation and delegation.
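A thin interface-layer handler might look like the sketch below. To keep it dependency-free, a standard-library dataclass stands in for a Pydantic model and a plain function stands in for a FastAPI endpoint. Every name here (`CreateMappingRequest`, `requires_permission`, `MappingManager.create`, the `mapping:write` permission) is an assumption made for illustration.

```python
from dataclasses import dataclass
from functools import wraps


class PermissionDenied(Exception):
    pass


@dataclass(frozen=True)
class CreateMappingRequest:
    """Stand-in for a Pydantic model: validates input on construction."""
    workspace_id: int
    name: str

    def __post_init__(self):
        if self.workspace_id <= 0:
            raise ValueError("workspace_id must be positive")
        if not self.name.strip():
            raise ValueError("name must not be empty")


def requires_permission(permission):
    """Hypothetical permission-checking decorator."""
    def decorator(func):
        @wraps(func)
        def wrapper(user, *args, **kwargs):
            if permission not in user["permissions"]:
                raise PermissionDenied(permission)
            return func(user, *args, **kwargs)
        return wrapper
    return decorator


class MappingManager:
    """Business logic lives in the manager, not in the handler."""
    def create(self, workspace_id, name):
        return {"id": 1, "workspace_id": workspace_id, "name": name.strip()}


@requires_permission("mapping:write")
def create_mapping_endpoint(user, raw_body, manager):
    request = CreateMappingRequest(**raw_body)                     # 1. validate
    mapping = manager.create(request.workspace_id, request.name)   # 2. delegate
    return {"status": "created", "mapping": mapping}               # 3. format


editor = {"permissions": {"mapping:write"}}
response = create_mapping_endpoint(
    editor, {"workspace_id": 7, "name": " orders "}, MappingManager()
)
```

The handler body is three lines: validate, delegate, format. Anything longer is usually a sign that logic belongs in a manager.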

Layer 2: Business Logic Layer (Core)

All business logic and orchestration:

src/datalinx_demo/core/
├── mapping/          # Mapping manager
├── workspace/        # Workspace manager
├── schema/           # Schema manager
├── auth/             # Authentication manager
├── monitoring/       # Monitoring manager
└── ...

Managers coordinate between DAOs, implement workflows, and handle state management.
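As a rough sketch of that coordination, the manager below enforces the rules and the ordering while two DAOs only store data. The class and method names, the `draft` state, and the file path layout are all hypothetical. In-memory DAOs are used so the example runs on its own.

```python
class InMemoryMappingDAO:
    """Hypothetical system-database DAO: CRUD only, returns raw dicts."""
    def __init__(self):
        self.rows = {}

    def insert(self, row):
        self.rows[row["name"]] = row
        return row

    def get(self, name):
        return self.rows.get(name)


class InMemoryFileDAO:
    """Hypothetical file DAO that keeps written content in a dict."""
    def __init__(self):
        self.files = {}

    def write(self, path, content):
        self.files[path] = content


class MappingManager:
    """Owns the workflow: rule checks, ordering, and state transitions."""
    def __init__(self, mapping_dao, file_dao):
        self.mapping_dao = mapping_dao
        self.file_dao = file_dao

    def create_mapping(self, name, columns):
        # Business rules are checked here, never inside a DAO.
        if self.mapping_dao.get(name) is not None:
            raise ValueError(f"mapping {name!r} already exists")
        if not columns:
            raise ValueError("a mapping needs at least one column")
        # Coordinate two DAOs: write the config file, then record the row.
        path = f"mappings/{name}.txt"
        self.file_dao.write(path, ",".join(sorted(columns)))
        return self.mapping_dao.insert(
            {"name": name, "config_path": path, "state": "draft"}
        )


file_dao = InMemoryFileDAO()
manager = MappingManager(InMemoryMappingDAO(), file_dao)
created = manager.create_mapping("orders", ["total", "id"])
```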

Layer 3: Data Access Layer (DAO)

Direct data operations:

src/datalinx_demo/core/dao/
├── system/           # System database DAOs
├── customer/         # Customer database DAOs
└── file_dao.py       # File system operations

DAOs provide simple CRUD operations and return raw data.
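A DAO in this style could look like the following sketch, which uses an in-memory SQLite database so it runs anywhere. The `WorkspaceDAO` class, its table, and its columns are assumptions for illustration. The source does not specify the system database engine or schema.

```python
import sqlite3


class WorkspaceDAO:
    """Hypothetical DAO: plain CRUD, no business rules, raw rows out."""

    def __init__(self, conn):
        self.conn = conn
        self.conn.execute(
            "CREATE TABLE IF NOT EXISTS workspaces ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def create(self, name):
        cur = self.conn.execute(
            "INSERT INTO workspaces (name) VALUES (?)", (name,)
        )
        self.conn.commit()
        return cur.lastrowid

    def get(self, workspace_id):
        # Returns a raw (id, name) tuple, or None if the row does not exist.
        return self.conn.execute(
            "SELECT id, name FROM workspaces WHERE id = ?", (workspace_id,)
        ).fetchone()

    def rename(self, workspace_id, name):
        self.conn.execute(
            "UPDATE workspaces SET name = ? WHERE id = ?", (name, workspace_id)
        )
        self.conn.commit()

    def delete(self, workspace_id):
        self.conn.execute("DELETE FROM workspaces WHERE id = ?", (workspace_id,))
        self.conn.commit()


dao = WorkspaceDAO(sqlite3.connect(":memory:"))
ws_id = dao.create("analytics")
dao.rename(ws_id, "analytics-prod")
row = dao.get(ws_id)
```

Note that the DAO neither validates names nor checks permissions. Those concerns stay in the layers above it.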

Security Model

Authentication

Users: Email/password with JWT tokens
Service Accounts: API key authentication
Sessions: Secure cookie-based sessions
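For service accounts, one common way to handle API keys is to store only a hash of each key and compare in constant time. The sketch below shows that pattern with the Python standard library. It is an assumption about how such a check could be built, not a description of how Datalinx AI stores or verifies keys. The function names and the dict-based store are hypothetical.

```python
import hashlib
import hmac
import secrets


def hash_api_key(api_key: str) -> str:
    # High-entropy random keys can be hashed with plain SHA-256;
    # user-chosen passwords would need a slow, salted password hash instead.
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def issue_api_key(store: dict, account: str) -> str:
    """Generate a key, keep only its hash, and return the key once."""
    api_key = secrets.token_urlsafe(32)
    store[account] = hash_api_key(api_key)
    return api_key


def verify_api_key(store: dict, account: str, presented: str) -> bool:
    expected = store.get(account)
    if expected is None:
        return False
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(expected, hash_api_key(presented))


key_store = {}
issued_key = issue_api_key(key_store, "svc-loader")
```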

Authorization

Data Protection

At Rest: Database encryption, encrypted credentials
In Transit: TLS 1.2+ for all connections
Secrets: Encrypted using master key, no plaintext storage

Deployment Architecture

Scalability

Horizontal Scaling

Control Plane: Add more server instances behind load balancer
Data Plane: Add more workers to handle increased load
Database: Use read replicas for read-heavy workloads

Resource Isolation

Each workspace can have isolated:

Compute resources (warehouses)
Storage quotas
Rate limits
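Per-workspace rate limits are often implemented with one token bucket per tenant, so that a busy workspace cannot consume another's budget. The sketch below illustrates that idea. The source does not say which algorithm Datalinx AI uses, and `TokenBucket` and `WorkspaceRateLimiter` are hypothetical names. Time is passed in explicitly to keep the example deterministic.

```python
from dataclasses import dataclass


@dataclass
class TokenBucket:
    capacity: float
    refill_per_second: float
    tokens: float
    updated_at: float

    def allow(self, now: float) -> bool:
        # Refill based on elapsed time, capped at capacity.
        elapsed = now - self.updated_at
        self.tokens = min(
            self.capacity, self.tokens + elapsed * self.refill_per_second
        )
        self.updated_at = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class WorkspaceRateLimiter:
    """Keeps an independent bucket for each workspace."""

    def __init__(self, capacity, refill_per_second):
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self.buckets = {}

    def allow(self, workspace_id, now):
        bucket = self.buckets.get(workspace_id)
        if bucket is None:
            # New workspaces start with a full bucket.
            bucket = TokenBucket(
                self.capacity, self.refill_per_second, self.capacity, now
            )
            self.buckets[workspace_id] = bucket
        return bucket.allow(now)


limiter = WorkspaceRateLimiter(capacity=2, refill_per_second=1.0)
decisions = [
    limiter.allow("ws-a", 0.0),  # first request for ws-a
    limiter.allow("ws-a", 0.0),  # second request for ws-a
    limiter.allow("ws-a", 0.0),  # third request, bucket is empty
    limiter.allow("ws-b", 0.0),  # ws-b has its own bucket
    limiter.allow("ws-a", 1.0),  # one second later, one token refilled
]
```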

Next Steps

Control vs Data Plane - Deep dive into the split architecture
Multi-Tenancy - Understand tenant isolation
Security - Security model details
