Connecting Data Sources
flAPI leverages DuckDB's extension ecosystem to attach data sources. The ten sources below have first-class examples and dedicated guides in this documentation. Because connections are configured via an arbitrary init: SQL block, any additional DuckDB community extension can also be wired up — the list is not exhaustive.
Basic Configuration
Connections are defined in your flapi.yaml configuration file:
connections:
  connection-name:
    init: |
      -- SQL commands to initialize (load extensions)
      INSTALL extension_name;
      LOAD extension_name;
    properties:
      # Connection-specific properties
      property1: value1
      property2: value2
The init: block runs once per worker and can install community extensions, attach databases, or create DuckDB secrets — anything DuckDB SQL can do.
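As a rough sketch of what the body of an init: block might contain, the statements below load httpfs, register an S3 secret, and attach a DuckDB file read-only. The secret name, credentials, region, file path, and alias are hypothetical placeholders, not values flAPI requires.

```sql
-- Hypothetical init: contents; every name and value here is a placeholder.
INSTALL httpfs;
LOAD httpfs;

-- Register S3 credentials as a DuckDB secret.
CREATE OR REPLACE SECRET my_s3_secret (
    TYPE s3,
    KEY_ID 'my-key-id',
    SECRET 'my-secret-key',
    REGION 'eu-central-1'
);

-- Attach an existing DuckDB database file read-only under the alias ref.
ATTACH IF NOT EXISTS './data/reference.duckdb' AS ref (READ_ONLY);
```

Since the block runs once per worker, idempotent forms such as CREATE OR REPLACE and ATTACH IF NOT EXISTS keep repeated runs harmless.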
Supported Data Sources (with dedicated guides)
Ten connection types are verified and have a dedicated guide:
Cloud Data Warehouses
- BigQuery — Google's cloud data warehouse with DuckLake caching
- Snowflake — Cloud data platform with cost-optimized access
Databases & Connectivity
- PostgreSQL — Open-source relational database
- ODBC — Universal database connector (Oracle, Teradata, DB2, and more)
File Formats
- Parquet — Columnar storage format (local or via httpfs on S3/GCS/Azure)
Enterprise & BI Systems
- SAP ERP — SAP NetWeaver / S/4HANA via the ERPL extension
- SAP BW — SAP Business Warehouse via ERPL
- Power BI — Query Power BI semantic models
No-Code & Collaborative
- Google Sheets — Turn spreadsheets into APIs
AI / ML
- Vector Search — Semantic search with the DuckDB VSS extension
Other Sources via Community Extensions
flAPI doesn't ship integrations beyond the ten above, but anything you can INSTALL ...; LOAD ...; in DuckDB, whether a core extension or one installed FROM community, works inside a connection's init: block. Examples you can wire up yourself include MySQL, SQLite, Iceberg, Delta, MotherDuck, Excel, and the various httpfs-based file readers. These are unverified by flAPI: test them against your version of DuckDB and follow the extension's own docs.
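For instance, a MySQL source could be wired up roughly as follows, assuming the DuckDB mysql extension is installable in your DuckDB version. The host, user, database, alias, and table name are hypothetical, and this setup is not verified by flAPI.

```sql
-- Sketch of init: contents for a MySQL source (placeholder connection values).
INSTALL mysql;
LOAD mysql;
ATTACH IF NOT EXISTS 'host=localhost port=3306 user=app_user database=shop' AS shop (TYPE mysql);

-- In a SQL template, attached tables are then addressed through the alias.
SELECT * FROM shop.orders LIMIT 10;
```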
Quick Start Examples
Local Parquet File
connections:
  my-data:
    properties:
      path: './data/customers.parquet'
-- In your SQL template
SELECT * FROM '{{{conn.path}}}'
BigQuery (community extension)
connections:
  bigquery-warehouse:
    init: |
      INSTALL 'bigquery' FROM community;
      LOAD 'bigquery';
    properties:
      project_id: 'my-project-id'
SELECT * FROM bigquery_scan('project.dataset.table')
PostgreSQL
connections:
  postgres-db:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: localhost
      port: 5432
      database: mydb
      username: ${DB_USER}
      password: ${DB_PASSWORD}
SELECT * FROM postgres_scan('dbname=mydb', 'public', 'users')
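An alternative to calling postgres_scan for each table is to attach the database once in init: and query it through an alias. This is a sketch only: the alias pg and the column names are hypothetical, and credentials are left out of the connection string.

```sql
-- In init:, after LOAD postgres (placeholder connection values).
ATTACH IF NOT EXISTS 'dbname=mydb host=localhost port=5432' AS pg (TYPE postgres, READ_ONLY);

-- In the SQL template: tables are addressed as alias.schema.table.
SELECT id, email FROM pg.public.users LIMIT 100;
```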
SQLite (core extension)
connections:
  northwind-sqlite:
    init: |
      INSTALL sqlite;
      LOAD sqlite;
      ATTACH IF NOT EXISTS './examples/data/northwind.sqlite' AS nw (TYPE sqlite);
SELECT * FROM nw.customers
Environment Variables
Use environment variables for sensitive data:
connections:
  secure-db:
    properties:
      host: ${DB_HOST}
      username: ${DB_USER}
      password: ${DB_PASSWORD}
Environment whitelist (in flapi.yaml):
template:
  environment-whitelist:
    - '^DB_.*'
    - '^GOOGLE_.*'
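To get a feel for what these two patterns admit, the query below tests a few hypothetical variable names against them in DuckDB. It assumes flAPI applies the patterns as ordinary anchored regular expressions; flAPI's own regex engine may differ from DuckDB's in edge cases.

```sql
-- Check sample (hypothetical) variable names against the whitelist patterns.
SELECT
    name,
    regexp_matches(name, '^DB_.*') OR regexp_matches(name, '^GOOGLE_.*') AS whitelisted
FROM (VALUES
    ('DB_USER'),
    ('DB_PASSWORD'),
    ('GOOGLE_APPLICATION_CREDENTIALS'),
    ('AWS_SECRET_ACCESS_KEY'),
    ('MY_DB_HOST')
) AS t(name);
```

DB_USER, DB_PASSWORD, and GOOGLE_APPLICATION_CREDENTIALS match. AWS_SECRET_ACCESS_KEY does not, and neither does MY_DB_HOST, because the leading ^ anchors each pattern to the start of the name.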
Multiple Connections
Connect to multiple sources in one project:
connections:
  # Production warehouse
  bigquery-prod:
    init: |
      INSTALL 'bigquery' FROM community;
      LOAD 'bigquery';
    properties:
      project_id: 'prod-project'

  # Reference data
  customers-parquet:
    properties:
      path: './data/customers.parquet'

  # Operational database
  postgres-ops:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: ops.example.com
      database: operations
Using Connections in Endpoints
Specify which connection to use in your endpoint YAML:
# Single connection
url-path: /customers/
template-source: customers.sql
connection:
  - customers-parquet

# Multiple connections (join across sources!)
url-path: /enriched-orders/
template-source: enriched_orders.sql
connection:
  - bigquery-warehouse
  - customers-parquet
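A cross-source template such as enriched_orders.sql might look roughly like this. The table, column, and dataset names are hypothetical, and the Parquet path is written out literally because how connection properties resolve when an endpoint lists several connections is not covered here.

```sql
-- enriched_orders.sql (sketch): join BigQuery orders to a local Parquet file.
SELECT
    o.order_id,
    o.order_total,
    c.customer_name,
    c.segment
FROM bigquery_scan('my-project-id.sales.orders') AS o
JOIN read_parquet('./data/customers.parquet') AS c
    ON o.customer_id = c.customer_id
LIMIT 100;
```

Both sources are queried in the same DuckDB session, so the join itself is ordinary SQL.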
Why Diverse Data Sources Matter
flAPI's value is making any data source accessible via REST APIs and MCP tools. A few examples of why uncommon sources are powerful:
Google Sheets as a Database
Perfect for non-technical teams, prototyping, or collaborative data management — marketing teams can manage content without touching code, and forms become instant APIs.
Vector Search for AI
Build RAG (Retrieval Augmented Generation) applications: semantic search over documentation, similar-product recommendations, and AI agents with memory.
Power BI Integration
Reuse existing BI models without rebuilding them — expose dashboard data as APIs and feed mobile apps from the same logic that powers Power BI reports.
ODBC for Legacy Systems
Connect to Oracle, Teradata, DB2, Informix, mainframe data sources and proprietary databases — anywhere you have an ODBC driver.
Next Steps
Popular Sources:
- Google Sheets: Turn spreadsheets into APIs
- BigQuery: Connect to Google BigQuery with caching
- PostgreSQL: Connect to PostgreSQL
- Parquet Files: Work with local/cloud files
AI / ML:
- Vector Search: Build RAG applications
Enterprise:
- Snowflake: Cloud data warehouse integration
Examples:
- Google Sheets API: Collaborative data API
- Parquet API: Local file APIs
- BigQuery Caching: Cloud warehouse optimization