Quick Start Guide

This guide walks you through setting up Datalinx AI and creating your first data pipeline in about 15 minutes.

Prerequisites

Before you begin, ensure you have:

  • Access credentials for your Datalinx AI instance
  • Connection details for at least one data source (database, API, or file storage)
  • Target schema requirements (what data structure you want to create)

Step 1: Log In and Create a Workspace

  1. Navigate to your Datalinx AI instance URL
  2. Log in with your credentials
  3. Click Create Workspace in the top navigation
  4. Enter a workspace name (e.g., customer-analytics)
  5. Click Create

Workspace: customer-analytics
Status: Active
Created: Just now
Tip: Workspaces provide isolated environments for different projects. You can have multiple workspaces for development, staging, and production.

Step 2: Connect a Data Source

  1. Navigate to Data Sources in the sidebar
  2. Click Add Source
  3. Select your source type:
    • Database: PostgreSQL, Snowflake, Databricks
    • API: REST endpoints with OpenAPI schemas
    • Files: CSV, JSON, Parquet from cloud storage

Example: Connecting PostgreSQL

Source Name: production-db
Type: PostgreSQL
Host: db.example.com
Port: 5432
Database: customers
Schema: public
Username: readonly_user
# Password stored securely
  1. Click Test Connection to verify
  2. Click Save Source
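
A read-only account is usually sufficient for a source connection, as in the readonly_user example above. If you need to provision one, the following is a minimal PostgreSQL sketch; the role, database, and schema names follow the example, so adjust them for your environment:

-- Sketch: minimal read-only PostgreSQL account for a Datalinx AI source
-- (names follow the example connection above; adjust as needed)
CREATE ROLE readonly_user WITH LOGIN PASSWORD 'choose-a-strong-password';
GRANT CONNECT ON DATABASE customers TO readonly_user;
GRANT USAGE ON SCHEMA public TO readonly_user;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO readonly_user;
-- Also cover tables created after this grant:
ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO readonly_user;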

Step 3: Select a Target Schema

Target schemas define the structure of your output data. Datalinx AI includes pre-built templates for common use cases.

  1. Navigate to Schema Configuration

  2. Click Select Template

  3. Choose from available templates:

    • E-Commerce: Orders, customers, products, transactions
    • Marketing: Campaigns, audiences, touchpoints
    • Identity: Unified customer profiles
  4. Review the schema structure and click Apply

{
  "target_schema": "E-Commerce",
  "tables": [
    "customers",
    "orders",
    "products",
    "order_items"
  ]
}
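
Applying a template provisions target tables with a fixed structure. As an illustrative sketch only (the real definitions come from the template you apply), the customers table could look like this, using the columns shown in the data preview later in this guide:

-- Illustrative shape of the E-Commerce template's customers table
-- (the actual definition comes from the template, not this snippet)
CREATE TABLE customers (
    id         TEXT PRIMARY KEY,
    email      TEXT NOT NULL,
    name       TEXT,
    created_at DATE
);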

Step 4: Map Your Data

The visual mapper helps you connect source fields to target schema fields.

  1. Navigate to Mapping
  2. You'll see source tables on the left and target tables on the right
  3. Drag and drop fields to create mappings
  4. Use the AI Assist button for automatic mapping suggestions

Mapping Options

Option     | Description
-----------|--------------------------------------------
Direct Map | 1:1 field mapping
Expression | SQL expression for transformation
CTE        | Create derived tables from complex queries
Decorator  | Apply standardization functions

Example Mapping

-- Source: raw_customers.email_address
-- Target: customers.email

-- Direct mapping with lowercase decorator
LOWER(source.email_address) AS email
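
For mappings that can't be expressed as a single field expression, the CTE option lets you derive an intermediate result first and map from that. A sketch of the idea, where the raw_order_items source table and its columns are assumptions for illustration:

-- Source: raw_order_items (assumed); Target: orders.total_amount
-- Derive per-order totals with a CTE, then map the result
WITH order_totals AS (
    SELECT
        order_id,
        SUM(quantity * unit_price) AS total_amount
    FROM raw_order_items
    GROUP BY order_id
)
SELECT
    order_id,
    total_amount
FROM order_totals;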

Step 5: Validate and Preview

Before running your pipeline, validate the mappings:

  1. Click Validate Mappings
  2. Review any warnings or errors
  3. Click Preview Data to see sample output

Validation Results:
✅ All required fields mapped
✅ Data types compatible
⚠️ 2 optional fields unmapped (can be ignored)

Step 6: Run Your First Pipeline

  1. Navigate to Operations

  2. Click Run Pipeline

  3. Select options:

    • Full Refresh: Process all data
    • Incremental: Process only new/changed records (see the conceptual sketch below)
  4. Click Execute

Monitor progress in real-time:

Pipeline: customer-analytics
Status: Running
Progress: ████████░░ 80%
Tables processed: 3/4
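
Conceptually, an incremental run restricts the source read to records changed since the last successful run, while a full refresh reads everything. A rough SQL analogy, not Datalinx AI's actual query; the updated_at column and the hard-coded timestamp are assumptions, and the platform tracks run state for you:

-- Conceptual analogy for an incremental run
-- Assumes the source tracks changes in an updated_at column
SELECT *
FROM raw_customers
WHERE updated_at > '2024-01-15 00:00:00';  -- timestamp of the last successful run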

Step 7: Verify Results

Once the pipeline completes:

  1. Navigate to Data Preview
  2. Select a target table
  3. Review the transformed data
SELECT * FROM customers LIMIT 10;

-- Results:
-- | id  | email            | name       | created_at |
-- |-----|------------------|------------|------------|
-- | 001 | john@example.com | John Smith | 2024-01-15 |
-- | 002 | jane@example.com | Jane Doe   | 2024-01-16 |
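
Beyond spot-checking rows, simple aggregate queries help confirm the load is complete and clean. A couple of sanity checks you might run against the target (column names follow the preview above):

-- Row count to compare against the source table
SELECT COUNT(*) AS customer_rows FROM customers;

-- Required fields should never be empty after a clean run
SELECT COUNT(*) AS missing_email
FROM customers
WHERE email IS NULL OR email = '';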

What's Next?

Congratulations! You've created your first Datalinx AI pipeline. If you hit problems along the way, the troubleshooting section below covers the most common issues.

Troubleshooting

Connection Failed

Error: Connection refused

Check:

  • Network connectivity and firewall rules
  • Correct hostname and port
  • Valid credentials
  • IP allowlisting on your database

Mapping Validation Errors

Error: Type mismatch - cannot map VARCHAR to INTEGER

Solutions:

  • Add a CAST expression: CAST(source_field AS INTEGER)
  • Use a decorator for type conversion
  • Check source data for unexpected values
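
To find the offending rows before adding the CAST, filter for values that don't parse as integers. A PostgreSQL-specific sketch (raw_customers and source_field are placeholders for your own names):

-- Values that would fail CAST(source_field AS INTEGER) -- PostgreSQL regex match
SELECT source_field, COUNT(*) AS occurrences
FROM raw_customers
WHERE source_field !~ '^\s*-?[0-9]+\s*$'
GROUP BY source_field
ORDER BY occurrences DESC
LIMIT 10;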

Pipeline Timeout

Error: Pipeline exceeded timeout (30 minutes)

Consider:

  • Breaking into smaller batches
  • Optimizing complex CTEs
  • Increasing timeout in workspace settings
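
One common way to batch is to window the source on a date column and run the pipeline once per window. A conceptual sketch, where the created_at column and the monthly window are assumptions:

-- Process one month of source data per run; advance the window each time
SELECT *
FROM raw_customers
WHERE created_at >= DATE '2024-01-01'
  AND created_at <  DATE '2024-02-01';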

Need help? Check our Troubleshooting Guide or contact support.