Quick Start Guide
This guide walks you through setting up Datalinx AI and creating your first data pipeline in about 15 minutes.
Prerequisites
Before you begin, ensure you have:
- Access credentials for your Datalinx AI instance
- Connection details for at least one data source (database, API, or file storage)
- Target schema requirements (what data structure you want to create)
Step 1: Log In and Create a Workspace
- Navigate to your Datalinx AI instance URL
- Log in with your credentials
- Click Create Workspace in the top navigation
- Enter a workspace name (e.g., customer-analytics)
- Click Create
Workspace: customer-analytics
Status: Active
Created: Just now
Workspaces provide isolated environments for different projects. You can have multiple workspaces for development, staging, and production.
Step 2: Connect a Data Source
- Navigate to Data Sources in the sidebar
- Click Add Source
- Select your source type:
- Database: PostgreSQL, Snowflake, Databricks
- API: REST endpoints with OpenAPI schemas
- Files: CSV, JSON, Parquet from cloud storage
Example: Connecting PostgreSQL
Source Name: production-db
Type: PostgreSQL
Host: db.example.com
Port: 5432
Database: customers
Schema: public
Username: readonly_user
# Password stored securely
- Click Test Connection to verify
- Click Save Source
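If Test Connection fails, it can help to confirm outside Datalinx AI that readonly_user can actually see your tables. A minimal check in standard PostgreSQL, run from any SQL client connected to the customers database:
-- Run as readonly_user. Lists the public-schema tables the user can see;
-- an empty result usually points to missing GRANT USAGE/SELECT privileges.
SELECT table_name
FROM information_schema.tables
WHERE table_schema = 'public'
ORDER BY table_name;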
Step 3: Select a Target Schema
Target schemas define the structure of your output data. Datalinx AI includes pre-built templates for common use cases.
- Navigate to Schema Configuration
- Click Select Template
- Choose from available templates:
- E-Commerce: Orders, customers, products, transactions
- Marketing: Campaigns, audiences, touchpoints
- Identity: Unified customer profiles
- Review the schema structure and click Apply
{
"target_schema": "Commerce",
"tables": [
"customers",
"orders",
"products",
"order_items"
]
}
Step 4: Map Your Data
The visual mapper helps you connect source fields to target schema fields.
- Navigate to Mapping
- You'll see source tables on the left and target tables on the right
- Drag and drop fields to create mappings
- Use the AI Assist button for automatic mapping suggestions
Mapping Options
| Option | Description |
|---|---|
| Direct Map | 1:1 field mapping |
| Expression | SQL expression for transformation |
| CTE | Common table expression: build derived tables from complex queries |
| Decorator | Apply standardization functions |
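To make the CTE option concrete, here is a sketch of the kind of SQL a CTE mapping might contain. The raw_orders table and the id and customer_id columns are assumptions for illustration; they are not defined elsewhere in this guide:
-- Hypothetical CTE mapping: derive a per-customer order count and join it
-- back so a target field can carry the aggregate.
WITH order_counts AS (
    SELECT customer_id, COUNT(*) AS order_count
    FROM raw_orders
    GROUP BY customer_id
)
SELECT c.email_address,
       COALESCE(oc.order_count, 0) AS order_count
FROM raw_customers AS c
LEFT JOIN order_counts AS oc ON oc.customer_id = c.id;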
Example Mapping
-- Source: raw_customers.email_address
-- Target: customers.email
-- Direct mapping with lowercase decorator
LOWER(source.email_address) AS email
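An Expression mapping works the same way but can combine or transform several fields at once. A sketch, assuming hypothetical first_name and last_name source columns:
-- Hypothetical Expression mapping: the first_name and last_name source
-- columns are assumed, not documented in this guide.
CONCAT(source.first_name, ' ', source.last_name) AS name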
Step 5: Validate and Preview
Before running your pipeline, validate the mappings:
- Click Validate Mappings
- Review any warnings or errors
- Click Preview Data to see sample output
Validation Results:
✅ All required fields mapped
✅ Data types compatible
⚠️ 2 optional fields unmapped (can be ignored)
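It can also help to spot-check the source for values that would trip validation. For example, using the raw_customers.email_address field from the mapping example above:
-- Count source rows that would arrive as NULL or blank in the
-- required email field.
SELECT COUNT(*) AS missing_emails
FROM raw_customers
WHERE email_address IS NULL OR TRIM(email_address) = '';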
Step 6: Run Your First Pipeline
- Navigate to Operations
- Click Run Pipeline
- Select options:
- Full Refresh: Process all data
- Incremental: Process only new/changed records (see the sketch below)
- Click Execute
Monitor progress in real-time:
Pipeline: customer-analytics
Status: Running
Progress: ████████░░ 80%
Tables processed: 3/4
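Incremental mode typically means filtering the source down to records created or changed since the last successful run. Datalinx AI manages this bookkeeping for you, but conceptually the filter looks something like this (the updated_at column and the timestamp are illustrative assumptions):
-- Conceptual incremental filter: pick up only rows changed since the
-- last successful run.
SELECT *
FROM raw_customers
WHERE updated_at > TIMESTAMP '2024-01-15 00:00:00';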
Step 7: Verify Results
Once the pipeline completes:
- Navigate to Data Preview
- Select a target table
- Review the transformed data
SELECT * FROM customers LIMIT 10;
-- Results:
-- | id | email | name | created_at |
-- |-----|---------------------|-------------|------------|
-- | 001 | john@example.com | John Smith | 2024-01-15 |
-- | 002 | jane@example.com | Jane Doe | 2024-01-16 |
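Beyond eyeballing sample rows, a couple of quick queries can confirm the run behaved. Since the mapping lowercased email addresses, collisions are worth checking:
-- Basic row count for a sanity check against the source.
SELECT COUNT(*) FROM customers;
-- Emails that collided once the LOWER() decorator was applied.
SELECT email, COUNT(*) AS occurrences
FROM customers
GROUP BY email
HAVING COUNT(*) > 1;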
What's Next?
Congratulations! You've created your first Datalinx AI pipeline. Here are some next steps:
- Configure Monitoring - Set up alerts for data quality
- Schedule Pipelines - Automate recurring runs
- Explore with AI - Query your data using natural language
- Set Up Reverse ETL - Push data to operational systems
Troubleshooting
Connection Failed
Error: Connection refused
Check:
- Network connectivity and firewall rules
- Correct hostname and port
- Valid credentials
- IP allowlisting on your database
Mapping Validation Errors
Error: Type mismatch - cannot map VARCHAR to INTEGER
Solutions:
- Add a CAST expression: CAST(source_field AS INTEGER)
- Use a decorator for type conversion
- Check source data for unexpected values
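If the source contains blank or whitespace-only strings, a plain CAST may still fail on those rows. A more defensive variant using standard SQL:
-- Convert blank or whitespace-only values to NULL before casting so
-- malformed rows don't abort the run.
CAST(NULLIF(TRIM(source_field), '') AS INTEGER)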
Pipeline Timeout
Error: Pipeline exceeded timeout (30 minutes)
Consider:
- Breaking into smaller batches
- Optimizing complex CTEs
- Increasing timeout in workspace settings
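As a sketch of the batching idea, a long backfill can be split into date-bounded runs rather than one full pass (the created_at column and the dates are illustrative):
-- Process one month per run instead of the full history.
SELECT *
FROM raw_customers
WHERE created_at >= DATE '2024-01-01'
  AND created_at <  DATE '2024-02-01';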
Need help? Check our Troubleshooting Guide or contact support.