Connecting Data Sources
Connecting to Excel & CSV Files
Excel and CSV files are the most common starting points for Tableau users. Whether you received a report from a colleague or exported data from your accounting software, Tableau connects to flat files in seconds with no configuration needed.
Connecting Step by Step
Microsoft Excel for .xlsx/.xls files, or Text File for .csv and .txt files.Orders) to the canvas area. Tableau previews the first 1,000 rows of data.Abc = String, # = Number, 📅 = Date. Click any type icon to change it if Tableau guessed incorrectly.Understanding Field Types at Connection
| Icon | Data Type | Examples | Tableau Role |
|---|---|---|---|
Abc | String (Text) | Names, Categories, Cities | Dimension |
# | Number (Float/Int) | Sales, Quantity, Profit | Measure |
📅 | Date | Order Date, Ship Date | Dimension |
🕐 | Date & Time | Timestamps, Login Times | Dimension |
T|F | Boolean | Is Returned, Is Active | Dimension |
🌍 | Geographic | Country, State, ZIP Code | Dimension |
Abc instead of 📅). Right-click the column header → Change Data Type → Date. If the format is unusual (e.g., 2024_01_15), you may need a calculated field with DATEPARSE() to convert it properly.Database Connections
For production-level analytics, data lives in relational databases — MySQL, PostgreSQL, SQL Server, Oracle, and more. Tableau connects natively to all of them. You'll need connection credentials from your database administrator (or your own if you're the admin).
Connecting to MySQL
MySQL. A connection dialog opens. If you don't see MySQL listed, click "More..." to see all connectors.localhost or db.mycompany.com). Port: 3306 (MySQL default). Database: your database name. Username & Password: your DB credentials.PostgreSQL Connection
PostgreSQL works identically to MySQL but uses port 5432 by default. One key difference: PostgreSQL is case-sensitive for schema and table names, so Orders and orders are different tables. Always confirm the exact casing with your DBA.
| Database | Default Port | Notes |
|---|---|---|
| MySQL | 3306 | Most common in web apps and startups |
| PostgreSQL | 5432 | Case-sensitive; strong JSON support |
| SQL Server | 1433 | Common in enterprises; Windows Auth available |
| Oracle | 1521 | Requires Oracle JDBC driver installation |
| Amazon Redshift | 5439 | PostgreSQL-compatible cloud data warehouse |
Cloud Data Sources
Modern data doesn't just live in local databases — it lives in the cloud. Tableau connects natively to Google Sheets, Salesforce, cloud data warehouses, and dozens of SaaS platforms. Cloud connections often require OAuth authentication rather than username/password.
Google Sheets
Google Sheets under "To a Server". A browser window opens for Google OAuth authentication — sign in with the Google account that owns the Sheet.Salesforce
Tableau integrates deeply with Salesforce CRM data. From the Connect panel, select Salesforce and authenticate via your Salesforce credentials. You can connect to standard Salesforce objects (Accounts, Opportunities, Leads, Cases) or write SOQL queries for custom objects. This is especially powerful for sales pipeline and customer analytics dashboards.
AWS Athena & Cloud Warehouses
Amazon Athena lets you query data stored in S3 buckets using standard SQL. In Tableau, connect using the "Amazon Athena" connector — you'll need your AWS Access Key, Secret Key, S3 staging directory (for query results), and the AWS region. Similarly, Google BigQuery and Snowflake are first-class connectors with full OAuth support and high-performance query pushdown.
Live vs Extract Connections
This is one of the most important architectural decisions in Tableau. Every connection is either Live (queries the source in real-time) or an Extract (a snapshot of data stored locally in a .hyper file). Understanding the trade-offs is essential for building reliable, performant dashboards.
| Aspect | Live Connection | Extract (.hyper) |
|---|---|---|
| Data Freshness | Always current | Snapshot (refresh needed) |
| Performance | Depends on database speed | Very fast (columnar storage) |
| Network Dependency | Requires DB connection | Works offline |
| Best For | Real-time ops, live KPIs | Large datasets, slow DBs, sharing |
| Sharing | Recipient needs DB access | Packaged .twbx includes data |
| Row Limit | Unlimited (server-side) | Billions of rows supported |
When to Use Live
Choose Live connections when your business requires real-time data — think operations dashboards monitoring live inventory, fraud detection, or manufacturing line metrics. Also use Live when your database already has great performance and the extra extract step creates unnecessary overhead.
When to Use Extract
Choose Extract when: (1) your database is slow or heavily loaded, (2) you need to share the workbook with someone who doesn't have database access, (3) you're working with cloud sources that charge per-query (minimize costs!), or (4) you need to work offline while traveling.
Scheduling Extract Refreshes
When you publish a workbook with an Extract to Tableau Server or Cloud, you can set an automatic refresh schedule — hourly, daily, weekly. Tableau connects to the original data source at the scheduled time and rebuilds the extract. This gives you the performance of extracts with the freshness of live connections for most business use cases.
Managing & Cleaning Data Sources
Once connected, Tableau gives you several tools to shape and clean your data before analysis. These are applied at the Data Source level, so they affect every worksheet using that connection.
Data Source Filters
A Data Source Filter permanently restricts which rows are available in the entire workbook — it runs before any other filter. Click "Add" in the Filters panel (top-right of the Data Source page) to add one. Use cases: filter to only the current year's data to reduce extract size, or restrict sensitive data so users only see their region's records.
Renaming & Hiding Fields
Database column names are often cryptic (cust_acq_dt instead of "Customer Acquisition Date"). On the Data Source page, right-click any column header to rename it. You can also hide fields you never use — they disappear from the Data pane, reducing clutter without deleting any data.
Splitting Columns
If a column contains combined values (e.g., "John Smith" in a single Name field, or "2024-Q1" in a Period field), use the Split option: right-click the column → Transform → Split. Tableau creates new calculated fields for each part. You can also use Custom Split to specify your delimiter.
Multiple Connections in One Workbook
A single Tableau workbook can connect to multiple data sources simultaneously. Click the small cylinder icon (or "Add" in the Data pane) to add a second connection. Each data source gets its own entry in the Data pane. You can then use Data Blending (covered in Lesson 11) to combine them when a direct join isn't possible because the data lives in different database systems.