Connecting Data Sources
flAPI uses DuckDB's extension ecosystem to connect to more than 20 data sources. You can combine different sources in a single flAPI instance, so one API can serve data integrated from several systems.
Basic Configuration
Connections are defined in your flapi.yaml configuration file:
connections:
  connection-name:
    init: |
      -- SQL commands to run at startup (install and load extensions)
      INSTALL extension_name;
      LOAD extension_name;
    properties:
      # Connection-specific properties
      property1: value1
      property2: value2
Supported Data Sources
Cloud Data Warehouses
- BigQuery - Google's cloud data warehouse with millisecond caching
- Snowflake - Cloud data platform with cost-optimized access
- Databricks - Unified analytics platform
Databases
- PostgreSQL - Popular open-source database
- MySQL - Widely-used relational database
- SQLite - Embedded database
- ODBC - Connect to any database with an ODBC driver (Oracle, Teradata, DB2, and more)
File Formats
- Parquet - Columnar storage format
- CSV - Comma-separated values
- JSON - JavaScript Object Notation
- Apache Iceberg - Table format for huge datasets
Enterprise & BI Systems
- SAP ERP - Enterprise resource planning (via ERPL extension)
- SAP BW - Business warehouse
- Power BI - Query Power BI data models without opening Power BI
- MSOLAP - Microsoft SQL Server Analysis Services (SSAS/OLAP cubes)
No-Code & Collaborative
- Google Sheets - Turn spreadsheets into APIs (no database needed!)
- Airtable - Collaborative database platform (via API)
AI/ML & Advanced
- Vector Search - Semantic search with Faiss/VSS for RAG applications
- Arrow Flight - Connect to ML feature stores and model servers
- Redis - In-memory data structures and pub/sub
Real-Time & Streaming
- WebSocket Streams - Query evented data sources
- Kafka - Distributed event streaming
- Redis Queues - Real-time message queues
Quick Start Examples
Local Parquet File
connections:
  my-data:
    properties:
      path: './data/customers.parquet'
-- In your SQL template
SELECT * FROM '{{{conn.path}}}'
BigQuery
connections:
  bigquery-warehouse:
    init: |
      INSTALL 'bigquery';
      LOAD 'bigquery';
    properties:
      project_id: 'my-project-id'
-- In your SQL template
SELECT * FROM bigquery_scan('project.dataset.table')
PostgreSQL
connections:
  postgres-db:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: localhost
      port: 5432
      database: mydb
      username: ${DB_USER}
      password: ${DB_PASSWORD}
-- In your SQL template
SELECT * FROM postgres_scan('dbname=mydb', 'public', 'users')
Environment Variables
Use environment variables for sensitive data:
connections:
  secure-db:
    properties:
      host: ${DB_HOST}
      username: ${DB_USER}
      password: ${DB_PASSWORD}
Environment whitelist (in main config):
template:
  environment-whitelist:
    - '^DB_.*'
    - '^GOOGLE_.*'
Multiple Connections
Connect to multiple sources in one config:
connections:
  # Production warehouse
  bigquery-prod:
    init: |
      INSTALL 'bigquery';
      LOAD 'bigquery';
    properties:
      project_id: 'prod-project'
  # Reference data
  customers-parquet:
    properties:
      path: './data/customers.parquet'
  # Operational database
  postgres-ops:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: ops.example.com
      database: operations
Using Connections in Endpoints
Specify which connection to use in your endpoint YAML:
# Single connection
url-path: /customers/
template-source: customers.sql
connection:
  - customers-parquet

# Multiple connections (join across sources!)
url-path: /enriched-orders/
template-source: enriched_orders.sql
connection:
  - bigquery-warehouse
  - customers-parquet
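The second endpoint above names two connections, but the page does not show its template. A minimal sketch of what enriched_orders.sql might contain follows. The orders table, the column names, and the literal Parquet path are illustrative assumptions. How connection properties such as path are exposed when several connections are listed is not covered here, so the sketch avoids template variables.

```sql
-- enriched_orders.sql: an illustrative sketch, not a template shipped with flAPI.
-- Assumed (hypothetical) names: the 'orders' table and every column below.
SELECT
    o.order_id,
    o.order_total,
    c.customer_name
FROM bigquery_scan('project.dataset.orders') AS o      -- warehouse side
JOIN read_parquet('./data/customers.parquet') AS c     -- local reference data
  ON o.customer_id = c.customer_id
```

Because both sources are read as DuckDB table functions, the join is ordinary SQL. Only the scan functions differ per source.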
Unconventional Data Sources: Why They Matter
flAPI's strength is making any data source accessible via REST APIs. Here's why unconventional sources are powerful:
🎯 Google Sheets as Database
Perfect for non-technical teams, prototyping, or collaborative data management:
- Marketing teams manage content without touching code
- Forms & surveys become instant APIs
- No database setup required
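As a rough illustration of the idea above, the SQL below reads a sheet through DuckDB's community gsheets extension. Whether flAPI relies on this particular extension is an assumption, the spreadsheet URL is a placeholder, and authentication setup is left out.

```sql
-- Assumption: the DuckDB community extension "gsheets" provides the Google Sheets bridge.
-- These two statements would typically sit in a connection's init block.
INSTALL gsheets FROM community;
LOAD gsheets;

-- Placeholder URL: replace <spreadsheet-id> with a sheet you can access.
SELECT *
FROM read_gsheet('https://docs.google.com/spreadsheets/d/<spreadsheet-id>/edit');
```

Private sheets additionally need credentials configured through the extension's secret mechanism. Its documentation describes the available options.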
🤖 Vector Search for AI
Build RAG (Retrieval Augmented Generation) applications:
- Semantic search over documentation
- Similar product recommendations
- AI agents with memory
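To make the semantic-search use case concrete, here is a small sketch using DuckDB's vss extension. The table, its rows, and the 3-dimensional vectors are made up for illustration. Real embeddings would come from an embedding model and have hundreds of dimensions, and the page does not say how flAPI wires this up.

```sql
-- Assumption: the DuckDB "vss" extension is used for vector search.
INSTALL vss;
LOAD vss;

-- Hypothetical table holding documents and toy 3-dimensional embeddings.
CREATE TABLE docs (id INTEGER, body VARCHAR, embedding FLOAT[3]);
INSERT INTO docs VALUES
    (1, 'install guide', [0.9, 0.1, 0.0]::FLOAT[3]),
    (2, 'billing faq',   [0.0, 0.8, 0.2]::FLOAT[3]);

-- HNSW index to speed up nearest-neighbour lookups on the embedding column.
CREATE INDEX docs_hnsw ON docs USING HNSW (embedding);

-- Return the document closest (Euclidean distance) to a query vector.
SELECT id, body
FROM docs
ORDER BY array_distance(embedding, [1.0, 0.0, 0.0]::FLOAT[3])
LIMIT 1;
```

In an endpoint template, the query vector would be supplied by the caller or computed upstream rather than hard-coded as it is here.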
📊 Power BI Integration
Reuse existing BI models without rebuilding:
- Expose dashboard data as APIs
- Mobile apps access BI logic
- Automated reporting pipelines
🏢 ODBC for Legacy Systems
Connect to enterprise databases without native drivers:
- Oracle, Teradata, DB2, Informix
- Proprietary databases
- Mainframe data sources
⚡ Arrow Flight for ML
Real-time ML model serving:
- Feature store integration
- Model prediction APIs
- High-performance data exchange
Next Steps
Popular Sources:
- Google Sheets: Turn spreadsheets into APIs
- BigQuery: Connect to Google BigQuery with caching
- PostgreSQL: Connect to PostgreSQL
- Parquet Files: Work with local/cloud files
AI/ML:
- Vector Search: Build RAG applications
Enterprise:
- Snowflake: Cloud data warehouse integration
Examples:
- Google Sheets API: Collaborative data API
- Parquet API: Local file APIs
- BigQuery Caching: Cloud warehouse optimization