Welcome to Phase 3 — Advanced Tableau. Real-world analytics rarely involve a single clean table. You'll work with data from multiple systems: sales from a SQL database, targets from Excel, web analytics from Google Sheets, customer data from a CRM. This lesson teaches you Tableau's three approaches to combining multi-source data — and critically, when to use each one.
Module 11 of 15 · Phase 3 Progress52%
📊 Section 1
Joins vs. Relationships vs. Data Blending
Tableau offers three distinct ways to combine data from multiple tables or sources. Understanding when to use each is one of the most important skills at the advanced level.
Feature
Joins
Relationships
Data Blending
How it works
Combines rows at the row level before analysis
Smart joins resolved at query time per sheet
Separate queries per source, results combined in view
Data sources
Same data source only
Same data source, different tables
Different data sources (Excel + SQL, etc.)
Granularity risk
Row duplication possible (fan-out)
Handles many-to-many automatically
Each source queried independently
When to use
Single-source, specific join logic needed
Default for multi-table analysis (Tableau 2020.2+)
Combining Google Sheets + SQL, Excel + CSV, etc.
Performance
Can be slow on large many-to-many joins
Optimized — context-aware query per sheet
Slower — multiple queries, merged in Tableau memory
🔗 Section 2
Tableau Relationships — The New Data Model
Introduced in Tableau 2020.2, Relationships are now the recommended way to combine tables in Tableau. Unlike traditional SQL joins (which merge tables into a flat row structure), Relationships define logical connections between tables that Tableau resolves at query time, depending on what fields are used in the view.
💡
The "Noodle" Interface
In the Data Source view, Tableau displays Relationships as curved lines ("noodles") connecting table boxes. Each noodle represents a logical relationship between two tables. You define the matching fields (like Customer ID) and Tableau handles the rest — no join type to choose, no row duplication worry.
Why Relationships Beat Traditional Joins
Consider a classic problem: you have an Orders table and a Targets table. Orders has one row per order line. Targets has one row per product per month. If you JOIN them, every order gets joined to every matching target row — potentially creating duplicated rows and inflated totals. Relationships solve this by querying each table at the appropriate granularity for each visualization.
1
Open Data Source View
Connect to your primary table (e.g., Orders). It appears in the canvas area. To add another table, drag it from the left panel onto the canvas next to Orders.
2
Define the Relationship
A noodle appears connecting the two tables. A dialog prompts you to select matching fields: Orders.[Customer ID] = Customers.[Customer ID]. Click OK.
3
Set Performance Options
Click the noodle to access options. Set Cardinality (Many-to-One, One-to-One, etc.) and Referential Integrity (records guaranteed to match). These hints let Tableau optimize queries.
4
Use Fields from Both Tables
In a worksheet, both tables' fields appear in the Data pane under their table names. Drag fields from either table — Tableau automatically generates the right query for the view's context.
⚠️
Context: When Joins Still Make Sense
Use Joins (not Relationships) when: (1) You need to filter one table based on the existence of records in another table (semi-join / anti-join logic). (2) You need custom join expressions using calculations, not just field equality. (3) You're using stored procedures or custom SQL. In these specific cases, click "Open" inside a table in the canvas to switch to the physical layer and add joins there.
🔀 Section 3
Data Blending — Combining Different Source Types
Data Blending is Tableau's method for combining data from fundamentally different source types in the same view — for example, sales data from SQL Server and marketing budget data from an Excel file. Unlike Relationships (which work within a single connection), Blending sends separate queries to each source and merges the results in Tableau's in-memory engine.
Primary vs. Secondary Data Sources
In a blended view, one source is designated the Primary source and the other is the Secondary source. The primary source drives the level of detail in the view. The secondary source contributes aggregated values that are joined to the primary source's dimension members.
🔗
How to Identify the Primary Source
In the Data pane, the primary data source has a blue checkmark icon. The secondary source shows an orange link chain icon. When you build a view, the first source you use fields from becomes primary. To switch primary/secondary, right-click the data source name and select "Use as Primary".
Link Fields — The Blend Join
Data Blending joins on Link Fields — fields with matching names and data types in both sources. Tableau automatically identifies these and enables them. You can manually turn link fields on/off by clicking the chain link icon next to fields in the Data pane. Active link fields create the join between the two sources at the current granularity level.
Limitations of Data Blending
Limitation
Explanation
Workaround
Left-join only
Blending always performs a LEFT join — primary source rows always appear, secondary source may have nulls
Filter nulls or use cross-database joins
No row-level detail from secondary
Secondary source fields aggregate before joining — you can't use individual secondary source rows
Use Relationships or cross-database join for row-level detail
Performance
Two separate queries plus in-memory merge — slower than single-source analysis
Build extracts of both sources for faster blending
Table calcs on secondary
Some table calculations behave differently on secondary source fields
Bring calculated fields into primary source instead
⚡ Section 4
Cross-Database Joins
Cross-database joins let you join tables from different data sources at the row level — like a SQL JOIN but across different database engines. For example, join a Salesforce CRM table with a MySQL orders table that can't be joined in SQL because they're on different servers. Tableau does the join in-memory after querying each source separately.
🔥
Why This Matters
In traditional BI, when two data systems can't be connected at the database level, analysts resort to exporting both to Excel, manually VLOOKUPing them, and then importing the result. Tableau's cross-database joins eliminate this workflow entirely — you join them visually in Tableau, and the result refreshes automatically when either source updates.
Setting Up a Cross-Database Join
1
Connect to First Source
Connect to your first data source normally (e.g., MySQL Orders table). It appears in the Data Source canvas.
2
Add Another Connection
In the Data Source view, click Add in the Connections panel on the left. Connect to your second source (e.g., an Excel file with sales targets). A new connection appears in the Connections panel.
3
Drag Table from Second Connection
In the Connections panel, switch to the second connection. Drag its table onto the canvas next to your first table. Tableau creates a JOIN (not a Relationship) between them.
4
Configure the Join Clause
A join icon appears between the tables. Click it to set the join type (Inner, Left, Right, Full Outer) and the matching fields. Tableau fetches from both sources and merges in memory.
⚠️
Use Extracts for Cross-Database Joins
Cross-database joins require Tableau to bring all data into memory before joining. This can be slow on large datasets. Always create an Extract (.hyper file) when using cross-database joins. The extract pre-joins the data once and stores the result in Tableau's fast Hyper engine — subsequent queries become very fast.
🚀 Section 5
Building a Multi-Source Analytics System
Practical exercise: you have three data sources that need to be combined — sales data in MySQL, marketing spend in Excel, and customer survey scores in Google Sheets. Here's the architecture:
1
Relationships for Same-Source Tables
Connect to MySQL. In Data Source view, add Orders, Customers, and Products tables. Create Relationships: Orders → Customers on Customer ID. Orders → Products on Product ID. These stay in one connection.
2
Cross-Database Join for Excel Targets
Add the Excel connection. Drag the Targets sheet onto the canvas. Create a cross-database join on [Product Category] = [Category]. Now each sales row has the corresponding budget target.
3
Data Blend for Survey Scores
Add Google Sheets as a second data source in the workbook. In a new sheet, build a view using the primary SQL data source. Add customer satisfaction score from Google Sheets as a secondary source field — it blends on Customer ID automatically.
4
Build the Dashboard
Create a scatter plot: X = Revenue (SQL primary), Y = Marketing Spend (Excel joined), Color = Customer Satisfaction (Google Sheets blended). This single view synthesizes three data systems simultaneously.
🧠 Knowledge Check
1. What is the key advantage of Tableau Relationships over traditional Joins for multi-table analysis?
2. In Data Blending, what type of join does Tableau always perform between the primary and secondary source?
3. What is a "Link Field" in Tableau Data Blending?
4. When should you use Data Blending instead of Relationships or Joins?
5. What is recommended when using Cross-Database Joins to improve performance?
🏆
✅ Lesson Summary
What You Learned
📋
Lesson 11 — Key Takeaways
✅ Relationships are the default multi-table approach in Tableau 2020.2+ — resolves at query time, no row duplication
✅ Traditional Joins work within a single connection — use for specific join logic or custom SQL
✅ Data Blending combines different source types (SQL + Excel + Google Sheets) in the same view
✅ Blending always uses a LEFT join and can only aggregate secondary source fields
✅ Link Fields define how blending joins the primary and secondary sources
✅ Cross-database joins combine row-level data from different database engines in-memory
✅ Always use Extracts with cross-database joins for acceptable performance
You can now combine data from any combination of sources. Next: LOD Expressions — the single most powerful feature in Tableau and the one that separates advanced analysts from everyone else.