Key Concepts

Before diving into each module, here are the core terms you'll encounter throughout Datalinx.

Workspace

An isolated environment for a specific data project. Each workspace has its own sources, targets, mappings, and configurations. Think of it as a project folder — everything within a workspace is self-contained.

When you first log in, you'll either select an existing workspace or create a new one. You can switch between workspaces at any time from the top of the screen.

Source

A database, schema, or table where your raw data lives. This could be a Snowflake warehouse, a Databricks catalog, a PostgreSQL database, or any other system with a supported connector. You can connect multiple sources to a single workspace.

Target Schema

The clean, unified schema you're building — the structure your downstream analytics, dashboards, and tools will consume. You define what the output should look like (tables, columns, data types), and then map your source data into it.

Mapping

A connection between one or more source fields and a target field. Mappings can be simple (a direct copy) or include transformations such as:

  • Converting data types or formats
  • Combining multiple source columns
  • Applying SQL expressions
  • Filtering or aggregating data
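
As a rough illustration of the idea (not Datalinx's internal representation), a mapping can be thought of as a target column name plus a function of the source row. The `Mapping` class, the `apply_mappings` helper, and all column names below are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List

Row = Dict[str, Any]


@dataclass
class Mapping:
    """Hypothetical model of one mapping: a target column and how to derive it."""
    target: str
    transform: Callable[[Row], Any]


def apply_mappings(row: Row, mappings: List[Mapping]) -> Row:
    """Build one target row by evaluating every mapping against a source row."""
    return {m.target: m.transform(row) for m in mappings}


mappings = [
    # Simple mapping: direct copy of a source column.
    Mapping("email", lambda r: r["email_address"]),
    # Type conversion: string amount to float.
    Mapping("order_total", lambda r: float(r["amount"])),
    # Combining multiple source columns into one target column.
    Mapping("full_name", lambda r: f"{r['first_name']} {r['last_name']}"),
]

source_row = {
    "email_address": "ada@example.com",
    "amount": "19.50",
    "first_name": "Ada",
    "last_name": "Lovelace",
}
target_row = apply_mappings(source_row, mappings)
print(target_row)
```

In the product itself these transformations are expressed as SQL rather than Python functions; the sketch only shows the shape of the concept.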

CTE (Common Table Expression)

A reusable SQL building block that lets you pre-process, filter, or join source data before it flows into your target schema. If you need to combine two source tables or apply complex logic before mapping, you'd create a CTE. Datalinx lets you create and manage CTEs visually or through the AI agent.
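
To make the idea concrete, here is a minimal sketch of a CTE that filters one table before joining it to another. It runs against an in-memory SQLite database purely for demonstration; the `customers` and `orders` tables and their columns are made up, and the SQL that Datalinx produces for your warehouse may look different.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER, name TEXT);
    CREATE TABLE orders (customer_id INTEGER, amount REAL, status TEXT);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES
        (1, 20.0, 'completed'),
        (1, 5.0, 'cancelled'),
        (2, 12.5, 'completed');
""")

# The WITH clause defines the CTE: a named, pre-filtered view of orders
# that the main query can then join like any other table.
query = """
WITH completed_orders AS (
    SELECT customer_id, amount
    FROM orders
    WHERE status = 'completed'
)
SELECT c.name, co.amount
FROM customers AS c
JOIN completed_orders AS co ON co.customer_id = c.id
ORDER BY c.id
"""

rows = conn.execute(query).fetchall()
print(rows)
conn.close()
```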

Foundations (Semantic Layer)

The business-level definitions of what your data means. This includes:

  • Object Types — Business entities like "Customer", "Order", or "Campaign"
  • Relationships — How entities connect (e.g., an Order belongs to a Customer)
  • Metrics — Business calculations (e.g., "Total Revenue = sum of completed order amounts")
  • Business Logic — Governance rules about data access and quality
  • Data Types — Semantic types like Email, Phone Number, or Currency that carry meaning beyond just "string" or "number"
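
The metric example above ("Total Revenue = sum of completed order amounts") can be sketched in a few lines. This is only an illustration of what such a definition computes; the `total_revenue` function and the order fields are hypothetical, not a Datalinx API.

```python
from typing import Dict, List


def total_revenue(orders: List[Dict[str, object]]) -> float:
    """Sum the amounts of orders whose status is 'completed'."""
    return sum(float(o["amount"]) for o in orders if o["status"] == "completed")


orders = [
    {"amount": 20.0, "status": "completed"},
    {"amount": 5.0, "status": "cancelled"},
    {"amount": 12.5, "status": "completed"},
]
revenue = total_revenue(orders)
print(revenue)
```

Defining the metric once, in one place, means every dashboard that uses it applies the same "completed orders only" rule.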

Identity Resolution

The process of stitching together records that belong to the same real-world person across different data sources and identifiers. For example, matching:

  • A device ID from your website analytics
  • An email address from your CRM
  • A mobile advertising ID from your app

Datalinx builds an identity graph that links these identifiers together, so you get a single unified view of each person regardless of which device or channel they came from.
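
One common way to build such a graph is a union-find (disjoint-set) structure, where each observed link between two identifiers merges their groups. The sketch below assumes that approach for illustration only; the source does not say which algorithm Datalinx uses, and the `IdentityGraph` class and the identifier strings are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Set


class IdentityGraph:
    """Groups identifiers that have been linked, directly or transitively."""

    def __init__(self) -> None:
        self._parent: Dict[str, str] = {}

    def _find(self, x: str) -> str:
        """Return the representative identifier of x's group."""
        self._parent.setdefault(x, x)
        root = x
        while self._parent[root] != root:
            root = self._parent[root]
        # Path compression: point every node on the path at the root.
        while self._parent[x] != root:
            next_x = self._parent[x]
            self._parent[x] = root
            x = next_x
        return root

    def link(self, a: str, b: str) -> None:
        """Record that identifiers a and b belong to the same person."""
        root_a, root_b = self._find(a), self._find(b)
        if root_a != root_b:
            self._parent[root_b] = root_a

    def clusters(self) -> List[Set[str]]:
        """Return one set of identifiers per resolved person."""
        groups: Dict[str, Set[str]] = defaultdict(set)
        for ident in list(self._parent):
            groups[self._find(ident)].add(ident)
        return list(groups.values())


graph = IdentityGraph()
graph.link("device:abc123", "email:ada@example.com")   # seen together on the website
graph.link("email:ada@example.com", "maid:9f8e")       # seen together in the app
graph.link("device:zzz999", "email:grace@example.com")
people = graph.clusters()
print(people)
```

Here the device ID and the mobile advertising ID are never observed together, yet they end up in the same group because both are linked to the same email address.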

Pipeline

The end-to-end flow of data from source to target. A pipeline includes the discovery of sources, the mappings you've configured, any transformations (CTEs), and the execution that moves and transforms the data. Pipelines can be run manually or on a schedule.

dbt (Data Build Tool)

Under the hood, Datalinx generates dbt models from your mappings. dbt is an industry-standard tool for transforming data in your warehouse. You don't need to know dbt to use Datalinx — the platform handles the code generation — but if you're familiar with dbt, you can inspect and customize the generated models.
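
To give a feel for what "generating a dbt model from mappings" could mean, here is a small sketch that renders a SQL select statement from a dictionary of target columns and SQL expressions. The `render_model` function, the `crm` source, and the column names are hypothetical, and the models Datalinx actually generates may be structured differently. The only dbt feature relied on is the `{{ source(...) }}` reference syntax.

```python
from typing import Dict


def render_model(source_name: str, table: str, column_exprs: Dict[str, str]) -> str:
    """Render dbt-style model SQL: one 'expression as target' line per mapping."""
    select_list = ",\n".join(
        f"    {expr} as {target}" for target, expr in column_exprs.items()
    )
    # Quadruple braces in an f-string produce the literal '{{' and '}}'
    # that dbt's Jinja templating expects.
    return (
        "select\n"
        f"{select_list}\n"
        f"from {{{{ source('{source_name}', '{table}') }}}}\n"
    )


model_sql = render_model(
    "crm",
    "contacts",
    {
        "email": "lower(email_address)",
        "full_name": "first_name || ' ' || last_name",
    },
)
print(model_sql)
```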