Connecting Data Sources

flAPI leverages DuckDB's extension ecosystem to attach data sources. The ten sources below have first-class examples and dedicated guides in this documentation. Because connections are configured via an arbitrary init: SQL block, any additional DuckDB community extension can also be wired up — the list is not exhaustive.

Basic Configuration

Connections are defined in your flapi.yaml configuration file:

connections:
  connection-name:
    init: |
      -- SQL commands to initialize (load extensions)
      INSTALL extension_name;
      LOAD extension_name;
    properties:
      # Connection-specific properties
      property1: value1
      property2: value2

The init: block runs once per worker and can install community extensions, attach databases, or create DuckDB secrets — anything DuckDB SQL can do.
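
As a sketch of what an init: block might contain beyond INSTALL and LOAD, the fragment below creates a DuckDB secret for S3 access. The connection name s3-lake, the secret name s3_reader, and the bucket path are hypothetical. httpfs, aws, and CREATE SECRET with the credential_chain provider are standard DuckDB features, but verify them against the DuckDB version bundled with your flAPI build.

```yaml
connections:
  s3-lake:                      # hypothetical connection name
    init: |
      -- Extensions for remote file access and AWS credential discovery
      INSTALL httpfs;
      LOAD httpfs;
      INSTALL aws;
      LOAD aws;
      -- Resolve credentials from the environment or instance profile
      -- instead of writing keys into the config file
      CREATE OR REPLACE SECRET s3_reader (
        TYPE s3,
        PROVIDER credential_chain
      );
    properties:
      path: 's3://my-bucket/events/*.parquet'   # hypothetical bucket
```

Since the block runs once per worker, idempotent statements such as CREATE OR REPLACE SECRET and ATTACH IF NOT EXISTS are a reasonable default, so a repeated run does not fail on objects that already exist.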

Supported Data Sources (with dedicated guides)

Ten connection types are verified and have a dedicated guide:

Cloud Data Warehouses

  • BigQuery — Google's cloud data warehouse with DuckLake caching
  • Snowflake — Cloud data platform with cost-optimized access

Databases & Connectivity

  • PostgreSQL — Open-source relational database
  • ODBC — Universal database connector (Oracle, Teradata, DB2, and more)

File Formats

  • Parquet — Columnar storage format (local or via httpfs on S3/GCS/Azure)

Enterprise & BI Systems

  • SAP ERP — SAP NetWeaver / S/4HANA via the ERPL extension
  • SAP BW — SAP Business Warehouse via ERPL
  • Power BI — Query Power BI semantic models

No-Code & Collaborative

AI / ML

  • Vector Search — Semantic search with the DuckDB VSS extension

Other Sources via Community Extensions

flAPI doesn't ship integrations beyond the ten above, but anything you can INSTALL ... FROM community; LOAD ...; in DuckDB works inside a connection's init: block. Examples that the community has built and that you can wire up yourself include MySQL, SQLite, Iceberg, Delta, MotherDuck, Excel, and the various httpfs-based file readers. These are unverified by flAPI — you'll need to test them against your version of DuckDB and follow the extension's own docs.
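
A minimal sketch of wiring up one of these unverified sources, here MySQL, might look as follows. The connection name legacy-mysql, the alias shop, and the connection string values are placeholders. The ATTACH syntax follows the DuckDB MySQL extension, and whether INSTALL needs FROM community depends on where your DuckDB version distributes the extension.

```yaml
connections:
  legacy-mysql:                 # hypothetical connection name
    init: |
      -- Depending on the DuckDB version, an extension may install from the
      -- core repository (INSTALL mysql;) or require "FROM community"
      INSTALL mysql;
      LOAD mysql;
      -- Host, port, user, and database below are placeholders
      ATTACH IF NOT EXISTS 'host=localhost port=3306 user=app database=shop'
        AS shop (TYPE mysql, READ_ONLY);
```

Once attached, a SQL template could query a table such as shop.orders like any other DuckDB table. READ_ONLY is optional and is used here only to keep the API from writing to the source.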

Quick Start Examples

Local Parquet File

connections:
  my-data:
    properties:
      path: './data/customers.parquet'

-- In your SQL template
SELECT * FROM '{{{conn.path}}}'

BigQuery (community extension)

connections:
  bigquery-warehouse:
    init: |
      INSTALL 'bigquery' FROM community;
      LOAD 'bigquery';
    properties:
      project_id: 'my-project-id'

SELECT * FROM bigquery_scan('project.dataset.table')

PostgreSQL

connections:
  postgres-db:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: localhost
      port: 5432
      database: mydb
      username: ${DB_USER}
      password: ${DB_PASSWORD}

SELECT * FROM postgres_scan('mydb', 'public', 'users')

SQLite via DuckDB Extension

connections:
  northwind-sqlite:
    init: |
      INSTALL sqlite;
      LOAD sqlite;
      ATTACH IF NOT EXISTS './examples/data/northwind.sqlite' AS nw (TYPE sqlite);

SELECT * FROM nw.customers

Environment Variables

Use environment variables for sensitive data:

connections:
  secure-db:
    properties:
      host: ${DB_HOST}
      username: ${DB_USER}
      password: ${DB_PASSWORD}

Environment whitelist (in flapi.yaml):

template:
  environment-whitelist:
    - '^DB_.*'
    - '^GOOGLE_.*'
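
The whitelist entries look like anchored regular expressions matched against variable names. That reading is inferred from the pattern syntax, not from a stated rule. Under that assumption, a variable outside the DB_ and GOOGLE_ prefixes would need its own entry. The API_TOKEN name below is hypothetical:

```yaml
template:
  environment-whitelist:
    - '^DB_.*'        # matches DB_HOST, DB_USER, DB_PASSWORD
    - '^GOOGLE_.*'    # matches e.g. GOOGLE_APPLICATION_CREDENTIALS
    - '^API_TOKEN$'   # hypothetical: allow exactly one variable name
```

Anchoring a pattern with both ^ and $ keeps the whitelist narrow. A prefix pattern such as '^DB_.*' admits every variable that starts with DB_.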

Multiple Connections

Connect to multiple sources in one project:

connections:
  # Production warehouse
  bigquery-prod:
    init: |
      INSTALL 'bigquery' FROM community;
      LOAD 'bigquery';
    properties:
      project_id: 'prod-project'

  # Reference data
  customers-parquet:
    properties:
      path: './data/customers.parquet'

  # Operational database
  postgres-ops:
    init: |
      INSTALL postgres;
      LOAD postgres;
    properties:
      host: ops.example.com
      database: operations

Using Connections in Endpoints

Specify which connection to use in your endpoint YAML:

# Single connection
url-path: /customers/
template-source: customers.sql
connection:
  - customers-parquet

# Multiple connections (join across sources!)
url-path: /enriched-orders/
template-source: enriched_orders.sql
connection:
  - bigquery-warehouse
  - customers-parquet

Why Diverse Data Sources Matter

flAPI's value is making any data source accessible via REST APIs and MCP tools. A few examples of why uncommon sources are powerful:

Google Sheets as a Database

Perfect for non-technical teams, prototyping, or collaborative data management — marketing teams can manage content without touching code, and forms become instant APIs.

Vector Search for AI

Build RAG (Retrieval Augmented Generation) applications: semantic search over documentation, similar-product recommendations, and AI agents with memory.

Power BI Integration

Reuse existing BI models without rebuilding them — expose dashboard data as APIs and feed mobile apps from the same logic that powers Power BI reports.

ODBC for Legacy Systems

Connect to Oracle, Teradata, DB2, Informix, mainframe data sources and proprietary databases — anywhere you have an ODBC driver.

Next Steps

Enterprise:

  • Power BI: Query BI models programmatically
  • ODBC: Universal database connector
