Parquet File API Example
The simplest way to get started with flAPI: turn a Parquet file into a REST API in minutes.
What You'll Build
A filterable customer API that:
- ✅ Serves data from a local Parquet file
- ✅ Supports query parameters
- ✅ Validates input
- ✅ Generates automatic documentation
- ✅ Starts in milliseconds
Prerequisites
- flAPI installed (Quickstart)
- A Parquet file (we'll use sample customer data)
How It Works
Key advantage: DuckDB reads Parquet files directly with no ETL, database, or import needed!
Step-by-Step
1. Project Structure
my-api/
├── flapi.yaml
├── data/
│   └── customers.parquet
└── sqls/
    ├── customers.yaml
    └── customers.sql
2. Configuration
flapi.yaml:
project-name: customers-api
project-description: Simple customer API from Parquet
template:
  path: './sqls'
connections:
  customers-data:
    properties:
      path: './data/customers.parquet'
duckdb:
  access_mode: READ_WRITE
3. Endpoint Configuration
sqls/customers.yaml:
url-path: /customers/
request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1
  - field-name: segment
    field-in: query
    description: Market segment (AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY)
    required: false
    validators:
      - type: enum
        allowedValues: [AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY]
  - field-name: min_balance
    field-in: query
    description: Minimum account balance
    required: false
    validators:
      - type: int
        min: 0
template-source: customers.sql
connection:
  - customers-data
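To make the validator rules concrete, here is a rough Python model of the checks this configuration describes. It is a conceptual sketch only, not flAPI's implementation: the `RULES` table, the `validate_params` function, and the error strings are all hypothetical names chosen for illustration.

```python
# Conceptual model only: flAPI performs validation internally.
# RULES mirrors the validators declared in sqls/customers.yaml.
RULES = {
    "id": {"type": "int", "min": 1},
    "segment": {
        "type": "enum",
        "allowedValues": ["AUTOMOBILE", "BUILDING", "FURNITURE", "HOUSEHOLD", "MACHINERY"],
    },
    "min_balance": {"type": "int", "min": 0},
}


def validate_params(params):
    """Return a list of error strings; an empty list means the input passes."""
    errors = []
    for name, raw in params.items():
        rule = RULES.get(name)
        if rule is None:
            continue  # unknown parameters are ignored in this sketch
        if rule["type"] == "int":
            try:
                value = int(raw)
            except ValueError:
                errors.append(f"{name}: not an integer")
                continue
            if "min" in rule and value < rule["min"]:
                errors.append(f"{name}: must be >= {rule['min']}")
        elif rule["type"] == "enum":
            if raw not in rule["allowedValues"]:
                errors.append(f"{name}: not an allowed value")
    return errors
```

Query parameters arrive as strings, which is why the sketch converts with `int()` before comparing against `min`. All three fields are optional (`required: false`), so an empty parameter set passes.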
4. SQL Template
sqls/customers.sql:
SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.id}}
  AND c_custkey = {{{params.id}}}
{{/params.id}}
{{#params.segment}}
  AND c_mktsegment = '{{{params.segment}}}'
{{/params.segment}}
{{#params.min_balance}}
  AND c_acctbal >= {{{params.min_balance}}}
{{/params.min_balance}}
ORDER BY c_acctbal DESC
LIMIT 100
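The `{{#params.x}} ... {{/params.x}}` sections are what make each filter optional: a section is emitted only when its parameter is present. The Python sketch below imitates that behaviour with two regular expressions on a shortened version of the template. It is a toy stand-in for a Mustache engine, written only to show the expansion rule; `expand` and the flat `context` keys are assumptions of this example, not flAPI internals.

```python
import re

# Shortened copy of the template above, kept small for readability.
TEMPLATE = """SELECT c_custkey AS id
FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.segment}}
  AND c_mktsegment = '{{{params.segment}}}'
{{/params.segment}}
{{#params.min_balance}}
  AND c_acctbal >= {{{params.min_balance}}}
{{/params.min_balance}}
LIMIT 100"""

# {{#key}} body {{/key}} and {{{key}}}, matched naively.
SECTION = re.compile(r"\{\{#([\w.]+)\}\}\n?(.*?)\{\{/\1\}\}\n?", re.DOTALL)
VARIABLE = re.compile(r"\{\{\{([\w.]+)\}\}\}")


def expand(template, context):
    """Keep a section only if its key has a truthy value, then fill variables."""
    def keep_or_drop(match):
        key, body = match.group(1), match.group(2)
        return body if context.get(key) else ""

    without_sections = SECTION.sub(keep_or_drop, template)
    return VARIABLE.sub(lambda m: str(context[m.group(1)]), without_sections)


sql = expand(TEMPLATE, {
    "conn.path": "./data/customers.parquet",
    "params.segment": "AUTOMOBILE",
})
print(sql)
```

With only `segment` supplied, the `min_balance` clause disappears and the `WHERE 1=1` line keeps the remaining `AND` clauses syntactically valid. Note that this sketch substitutes values verbatim with no escaping, so it relies on input having been validated first.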
Running the API
Start the Server
$ ./flapi -c flapi.yaml
✓ Loaded 1 endpoints
✓ Server listening on :8080
⚡ Ready in 1.2ms
Test the Endpoints
Get all customers:
$ curl http://localhost:8080/customers/
{
  "data": [
    {
      "id": 1,
      "name": "Customer#000000001",
      "segment": "BUILDING",
      "balance": 711.56,
      "address": "IVhzIApeRb ot,c,E",
      "phone": "25-989-741-2988",
      "notes": "Regular requests..."
    },
    ...
  ]
}
Filter by segment:
$ curl "http://localhost:8080/customers/?segment=AUTOMOBILE"
Filter by minimum balance:
$ curl "http://localhost:8080/customers/?min_balance=5000"
Get specific customer:
$ curl "http://localhost:8080/customers/?id=12345"
Combine filters:
$ curl "http://localhost:8080/customers/?segment=AUTOMOBILE&min_balance=5000"
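From application code, the same filters can be sent with Python's standard library. This is a minimal sketch, assuming the server started above is listening on localhost:8080 and that rows arrive under the "data" key as in the sample response. `build_url` and `fetch_customers` are illustrative names, not part of flAPI.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/customers/"  # same address as the curl examples


def build_url(**filters):
    """Append only the filters that were actually supplied."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL


def fetch_customers(**filters):
    """Call the running API and return the rows under the "data" key."""
    with urlopen(build_url(**filters)) as response:
        return json.load(response)["data"]
```

For example, `fetch_customers(segment="AUTOMOBILE", min_balance=5000)` issues the same request as the combined-filter curl command, provided the server is running.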
Automatic Documentation
flAPI automatically generates OpenAPI (Swagger) documentation:
# Visit in your browser
http://localhost:8080/docs
You'll see:
- All endpoints listed
- Parameter descriptions
- Try-it-out functionality
- Request/response examples
Testing with CLI
Use the flapii CLI to test locally:
# Validate endpoint
$ flapii endpoints validate /customers/
✓ Configuration valid
✓ SQL template found
✓ All validators configured
# Test template expansion
$ flapii templates expand /customers/ \
--params '{"segment": "AUTOMOBILE", "min_balance": 5000}'
Expanded SQL:
SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM './data/customers.parquet'
WHERE 1=1
  AND c_mktsegment = 'AUTOMOBILE'
  AND c_acctbal >= 5000
ORDER BY c_acctbal DESC
LIMIT 100
# Run query and see results
$ flapii query run /customers/ \
--params '{"segment": "AUTOMOBILE"}' \
--limit 5
Why This Works
Performance
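DuckDB reads Parquet files efficiently because the format and the engine fit together well:
- Columnar storage: Only the columns named in the query are read
- Compression: Column data is stored compressed, so less data comes off disk
- Predicate pushdown: Filters are applied at read time, so row groups that cannot match are skipped
- Parallel processing: Scans are spread across the available CPU cores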
No Database Required
- ✅ No PostgreSQL/MySQL to install
- ✅ No server to maintain
- ✅ Just files + flAPI
- ✅ Perfect for prototypes and small-to-medium datasets
Portability
# Everything in one directory
$ tar -czf my-api.tar.gz my-api/
$ scp my-api.tar.gz server:
# On server
$ tar -xzf my-api.tar.gz
$ cd my-api && ./flapi -c flapi.yaml
# API is running!
Scaling Up
Multiple Files
connections:
  all-customers:
    properties:
      path: './data/customers/*.parquet' # All files in directory
Adding More Endpoints
# sqls/orders.yaml
url-path: /orders/
connection:
  - orders-data
Adding Caching
If you get high traffic, add caching:
cache:
  enabled: true
  table: customers_cache
  schedule: 60m
Reusing Field Definitions with Includes
Once you have more than one endpoint over the same dataset, you usually want
the validators, rate limit, and connection settings in one place. flAPI
supports YAML section includes with the syntax
{{include:section_name from path/to/file.yaml}}, which inlines the named
top-level section from the referenced file. The canonical example lives in
examples/sqls/customers/customer-common.yaml
and consolidates the shared pieces:
# Shared request validators
request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1
        max: 1000000
  - field-name: segment
    field-in: query
    required: false
    validators:
      - type: enum
        allowedValues: [AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY]
  - field-name: email
    field-in: query
    required: false
    validators:
      - type: email
# Shared connection
connection:
  - customers-data
# Shared rate limiting
rate-limit:
  enabled: true
  max: 100
  interval: 60
The customer endpoint can now declare just what's specific to itself and pull the rest from the common file:
url-path: /customers/
{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
{{include:connection from common/customer-common.yaml}}
template-source: customers.sql
A second endpoint — say, a "VIP customers" view — picks up the exact same validators and rate limit without any copy-paste:
url-path: /customers/vip/
{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
{{include:connection from common/customer-common.yaml}}
template-source: customers-vip.sql
When segment later needs a new allowed value, you update one file and
every endpoint that includes it picks the change up on the next reload.
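As a way to picture what an include does, the Python sketch below finds each `{{include:section_name from path}}` directive and looks up the named top-level section. It is an illustration under assumptions, not how flAPI resolves includes: `find_includes` and `resolve_includes` are hypothetical helpers, and the `files` dictionary stands in for already-parsed YAML files because the standard library has no YAML parser.

```python
import re

# Matches the directive syntax described above.
INCLUDE = re.compile(r"\{\{include:(?P<section>[\w-]+) from (?P<path>[^}]+)\}\}")


def find_includes(endpoint_yaml):
    """Return (section, path) pairs for every include directive in the text."""
    return [(m["section"], m["path"].strip()) for m in INCLUDE.finditer(endpoint_yaml)]


def resolve_includes(endpoint_yaml, files):
    """Look up each included section; `files` maps path -> parsed top-level sections."""
    return {section: files[path][section] for section, path in find_includes(endpoint_yaml)}


endpoint = """url-path: /customers/vip/
{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
template-source: customers-vip.sql
"""
print(find_includes(endpoint))
```

Because every endpoint resolves the same section from the same file, a change to the shared `segment` validator shows up in all of them without editing the endpoint files.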
Warming the Cache with the Heartbeat Worker
flAPI ships with a background heartbeat worker that periodically pings
endpoints you've opted in. Its job is to keep cached templates warm so the
first real user request hits a fresh snapshot instead of paying the cold
materialization cost. Enable it globally in flapi.yaml:
# Heartbeat configuration
heartbeat:
  enabled: true # turn the worker on (off by default)
  worker-interval: 10 # seconds between heartbeat passes
Then opt individual endpoints in. The params: map under the endpoint-level
heartbeat: block is passed straight to the SQL template, so you can warm
the most common filter combinations:
url-path: /customers/
# ... request, template-source, connection, cache ...
heartbeat:
  enabled: true
  params:
    segment: AUTOMOBILE # pre-warm the most popular filter
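To check by hand which requests a heartbeat pass would keep warm, you can list the equivalent URLs yourself. The sketch below assumes that a heartbeat ping behaves like a GET request carrying the configured `params` as a query string. The text above implies this but does not spell out the mechanism, so treat it as an assumption; `heartbeat_targets` and the `ENDPOINTS` list are hypothetical.

```python
from urllib.parse import urlencode

# Hand-written stand-in for parsed endpoint configs.
ENDPOINTS = [
    {
        "url-path": "/customers/",
        "heartbeat": {"enabled": True, "params": {"segment": "AUTOMOBILE"}},
    },
    {"url-path": "/orders/"},  # no heartbeat block, so it is skipped
]


def heartbeat_targets(endpoints, base="http://localhost:8080"):
    """Return one URL per endpoint that has opted in to the heartbeat."""
    targets = []
    for endpoint in endpoints:
        heartbeat = endpoint.get("heartbeat", {})
        if not heartbeat.get("enabled"):
            continue
        query = urlencode(heartbeat.get("params", {}))
        targets.append(base + endpoint["url-path"] + (f"?{query}" if query else ""))
    return targets


print(heartbeat_targets(ENDPOINTS))
```

Requesting the printed URL with curl after a restart is a quick way to confirm that the warmed filter combination responds as expected.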
Next Steps
- Quickstart Guide: Build your first API step-by-step
- SQL Templating: Learn advanced Mustache templates
- Endpoints Overview: Complete endpoint configuration
- Caching Strategy: When to use full vs. incremental cache modes (and how the heartbeat worker fits in)
- BigQuery Example: Scale to cloud warehouses with caching
- SAP ERP Example: Connect to enterprise systems
- Parquet Connection Guide: Learn more about Parquet
- Deployment: Deploy to production