Parquet File API Example
The simplest way to get started with flAPI: turn a Parquet file into a REST API in minutes.
What You'll Build
A filterable customer API that:
- ✅ Serves data from a local Parquet file
- ✅ Supports query parameters
- ✅ Validates input
- ✅ Generates automatic documentation
- ✅ Starts in milliseconds
Prerequisites
- flAPI installed (Quickstart)
- A Parquet file (we'll use sample customer data)
How It Works
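flAPI receives the HTTP request, validates the query parameters, renders the SQL template, and lets DuckDB run the resulting query against the Parquet file.
Key advantage: DuckDB reads Parquet files directly, so no ETL pipeline, separate database, or import step is needed.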
Step-by-Step
1. Project Structure
my-api/
├── flapi.yaml
├── data/
│   └── customers.parquet
└── sqls/
    ├── customers.yaml
    └── customers.sql
2. Configuration
flapi.yaml:
project_name: customers-api
project_description: Simple customer API from Parquet
template:
  path: './sqls'
connections:
  customers-data:
    properties:
      path: './data/customers.parquet'
duckdb:
  access_mode: READ_WRITE
3. Endpoint Configuration
sqls/customers.yaml:
url-path: /customers/
request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1
  - field-name: segment
    field-in: query
    description: Market segment (AUTOMOTIVE, BUILDING, FURNITURE, MACHINERY, or HOUSEHOLD)
    required: false
    validators:
      - type: string
        enum: ['AUTOMOTIVE', 'BUILDING', 'FURNITURE', 'MACHINERY', 'HOUSEHOLD']
  - field-name: min_balance
    field-in: query
    description: Minimum account balance
    required: false
    validators:
      - type: number
        min: 0
template-source: customers.sql
connection:
  - customers-data
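To make the validator rules above concrete, here is a minimal Python sketch of the checks they describe. It is an illustration only, not flAPI's implementation: the `RULES` table and the `validate` function are hypothetical names, and rejecting unknown parameters is an assumption rather than documented behavior.

```python
from typing import Any

# Hypothetical mirror of the validators declared in sqls/customers.yaml.
RULES = {
    "id": {"type": "int", "min": 1},
    "segment": {
        "type": "string",
        "enum": ["AUTOMOTIVE", "BUILDING", "FURNITURE", "MACHINERY", "HOUSEHOLD"],
    },
    "min_balance": {"type": "number", "min": 0},
}


def validate(params: dict[str, str]) -> dict[str, Any]:
    """Return typed values for the supplied parameters, or raise ValueError."""
    clean: dict[str, Any] = {}
    for name, raw in params.items():
        rule = RULES.get(name)
        if rule is None:
            # Assumption: unknown parameters are rejected rather than ignored.
            raise ValueError(f"unknown parameter: {name}")
        if rule["type"] == "int":
            value: Any = int(raw)  # int() raises ValueError for non-integers
        elif rule["type"] == "number":
            value = float(raw)  # float() raises ValueError for non-numbers
        else:
            value = raw
        if "min" in rule and value < rule["min"]:
            raise ValueError(f"{name} must be >= {rule['min']}")
        if "enum" in rule and value not in rule["enum"]:
            raise ValueError(f"{name} must be one of {rule['enum']}")
        clean[name] = value
    return clean
```

Because every parameter is optional (`required: false`), an empty dictionary passes and the endpoint returns unfiltered results.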
4. SQL Template
sqls/customers.sql:
SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.id}}
  AND c_custkey = {{{params.id}}}
{{/params.id}}
{{#params.segment}}
  AND c_mktsegment = '{{{params.segment}}}'
{{/params.segment}}
{{#params.min_balance}}
  AND c_acctbal >= {{{params.min_balance}}}
{{/params.min_balance}}
ORDER BY c_acctbal DESC
LIMIT 100
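The conditional sections in this template follow Mustache rules: a `{{#name}}...{{/name}}` block is kept only when `name` has a value, and `{{{name}}}` is replaced by that value as plain text. The Python sketch below imitates just those two rules to show why an omitted parameter removes its `AND` clause. It is a simplified stand-in, not flAPI's template engine, and the `expand` and `lookup` helpers are hypothetical.

```python
import re

# Matches {{#name}}body{{/name}}; the backreference forces matching tags.
SECTION = re.compile(r"\{\{#([\w.]+)\}\}(.*?)\{\{/\1\}\}", re.DOTALL)
# Matches {{{name}}} (unescaped substitution).
VARIABLE = re.compile(r"\{\{\{([\w.]+)\}\}\}")


def lookup(context: dict, dotted: str):
    """Resolve a dotted name such as 'params.segment'; None if absent."""
    value = context
    for key in dotted.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def expand(template: str, context: dict) -> str:
    """Drop sections whose value is missing, then substitute variables."""

    def section(match: re.Match) -> str:
        value = lookup(context, match.group(1))
        absent = value is None or value is False or value == ""
        return "" if absent else match.group(2)

    kept = SECTION.sub(section, template)
    return VARIABLE.sub(lambda m: str(lookup(context, m.group(1))), kept)
```

Since values are substituted as text, the validators in the endpoint configuration are what keep unexpected input out of the generated SQL.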
Running the API
Start the Server
$ ./flapi -c flapi.yaml
✓ Loaded 1 endpoints
✓ Server listening on :8080
⚡ Ready in 1.2ms
Test the Endpoints
Get all customers:
$ curl http://localhost:8080/customers/
{
  "data": [
    {
      "id": 1,
      "name": "Customer#000000001",
      "segment": "BUILDING",
      "balance": 711.56,
      "address": "IVhzIApeRb ot,c,E",
      "phone": "25-989-741-2988",
      "notes": "Regular requests..."
    },
    ...
  ]
}
Filter by segment:
$ curl "http://localhost:8080/customers/?segment=AUTOMOTIVE"
Filter by minimum balance:
$ curl "http://localhost:8080/customers/?min_balance=5000"
Get specific customer:
$ curl "http://localhost:8080/customers/?id=12345"
Combine filters:
$ curl "http://localhost:8080/customers/?segment=AUTOMOTIVE&min_balance=5000"
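The same requests can be made from a script. The sketch below uses only Python's standard library and assumes the server from the previous step is listening on `localhost:8080` and wraps rows in a `data` key, as in the sample response. The helper names `build_url` and `fetch_customers` are made up for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/customers/"  # taken from the curl examples


def build_url(**filters) -> str:
    """Append only the filters that were actually supplied."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL


def fetch_customers(**filters) -> list:
    """Call the running API and return the rows under the 'data' key."""
    with urlopen(build_url(**filters), timeout=5) as response:
        return json.load(response)["data"]
```

For example, `fetch_customers(segment="AUTOMOTIVE", min_balance=5000)` requests the same URL as the combined-filter curl command.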
Automatic Documentation
flAPI automatically generates OpenAPI (Swagger) documentation:
# Visit in your browser
http://localhost:8080/docs
You'll see:
- All endpoints listed
- Parameter descriptions
- Try-it-out functionality
- Request/response examples
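The parameter descriptions in the generated docs come from the `request` entries in the endpoint configuration. As a rough illustration of how such an entry could map onto an OpenAPI Parameter Object, consider the Python sketch below. The mapping is an assumption made for this example (flAPI's actual generator may differ), and `to_openapi_parameter` is a hypothetical helper.

```python
# Assumed mapping from flAPI validator types to OpenAPI schema types.
TYPE_MAP = {"int": "integer", "number": "number", "string": "string"}


def to_openapi_parameter(field: dict) -> dict:
    """Convert one request field (as parsed from YAML) to a Parameter Object."""
    schema: dict = {}
    for validator in field.get("validators", []):
        schema["type"] = TYPE_MAP.get(validator["type"], "string")
        if "min" in validator:
            schema["minimum"] = validator["min"]
        if "enum" in validator:
            schema["enum"] = list(validator["enum"])
    return {
        "name": field["field-name"],
        "in": field["field-in"],
        "description": field.get("description", ""),
        "required": field.get("required", False),
        "schema": schema,
    }
```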
Testing with CLI
Use the flapii CLI to test locally:
# Validate endpoint
$ flapii endpoints validate /customers/
✓ Configuration valid
✓ SQL template found
✓ All validators configured
# Test template expansion
$ flapii templates expand /customers/ \
--params '{"segment": "AUTOMOTIVE", "min_balance": 5000}'
Expanded SQL:
SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM './data/customers.parquet'
WHERE 1=1
  AND c_mktsegment = 'AUTOMOTIVE'
  AND c_acctbal >= 5000
ORDER BY c_acctbal DESC
LIMIT 100
# Run query and see results
$ flapii query run /customers/ \
--params '{"segment": "AUTOMOTIVE"}' \
--limit 5
Why This Works
Performance
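DuckDB reads Parquet files quickly for four reasons:
- Columnar storage: only the columns named in the query are read
- Compression: column data is stored compressed, so less data has to be read from disk
- Predicate pushdown: filters are applied at read time, so row groups that cannot match are skipped
- Parallel processing: scans are spread across the available CPU cores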
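Column pruning and predicate pushdown are easier to picture with a toy model. The Python sketch below is not how DuckDB is implemented: `RowGroup` and `scan` are hypothetical names, and the min/max statistics are computed on the fly here, whereas a real Parquet file stores them in its metadata.

```python
from dataclasses import dataclass


@dataclass
class RowGroup:
    """Toy stand-in for a Parquet row group: column name -> list of values."""

    columns: dict[str, list]

    def stats(self, column: str) -> tuple:
        # Real Parquet files keep these statistics in the footer metadata.
        values = self.columns[column]
        return min(values), max(values)


def scan(groups: list[RowGroup], wanted: list[str], column: str, minimum: float):
    """Return rows where column >= minimum, touching only the wanted columns."""
    rows, skipped = [], 0
    for group in groups:
        _, high = group.stats(column)
        if high < minimum:
            skipped += 1  # pushdown: no row in this group can match
            continue
        keep = [i for i, v in enumerate(group.columns[column]) if v >= minimum]
        # Column pruning: only the requested columns are materialized.
        rows.extend({name: group.columns[name][i] for name in wanted} for i in keep)
    return rows, skipped
```

With a filter such as `c_acctbal >= 5000`, any group whose maximum balance is below 5000 is skipped without looking at its rows.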
No Database Required
- ✅ No PostgreSQL/MySQL to install
- ✅ No server to maintain
- ✅ Just files + flAPI
- ✅ Perfect for prototypes and small-to-medium datasets
Portability
# Everything in one directory
$ tar -czf my-api.tar.gz my-api/
$ scp my-api.tar.gz server:
# On server
$ tar -xzf my-api.tar.gz
$ cd my-api && ./flapi -c flapi.yaml
# API is running!
Scaling Up
Multiple Files
connections:
  all-customers:
    properties:
      path: './data/customers/*.parquet' # All files in directory
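Before pointing a connection at a glob, it can help to check which files the pattern would pick up. The Python sketch below lists the matches for a pattern relative to a project directory. It relies on Python's own glob rules, which are assumed here to agree with DuckDB's for a simple `*.parquet` pattern, and `matching_files` is a hypothetical helper.

```python
from pathlib import Path


def matching_files(root: Path, pattern: str) -> list[str]:
    """List files under root that match a relative glob such as './data/*.parquet'."""
    relative = Path(pattern).as_posix()  # normalizes away a leading './'
    return sorted(p.relative_to(root).as_posix() for p in root.glob(relative))
```

Note that `*` does not descend into subdirectories, so Parquet files in nested folders are not matched by this pattern.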
Adding More Endpoints
# sqls/orders.yaml
# Assumes an orders-data connection is defined in flapi.yaml
url-path: /orders/
connection:
  - orders-data
Adding Caching
If you get high traffic, add caching:
cache:
  enabled: true
  table: customers_cache
  schedule: 60m
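The idea behind a scheduled cache is that results are reused until the interval has elapsed and only then reloaded from the source. The Python sketch below models just that timing. It does not reproduce flAPI's cache, which is configured with a `table` name, and both `parse_schedule` (including the `s`/`m`/`h` suffixes) and `ScheduledCache` are assumptions made for illustration.

```python
import time
from typing import Callable

# Assumed unit suffixes; only "60m" appears in the configuration above.
UNITS = {"s": 1, "m": 60, "h": 3600}


def parse_schedule(text: str) -> int:
    """Convert a schedule such as '60m' into seconds."""
    return int(text[:-1]) * UNITS[text[-1]]


class ScheduledCache:
    """Reuse loaded rows until the schedule interval has passed."""

    def __init__(self, load: Callable[[], list], schedule: str,
                 clock: Callable[[], float] = time.monotonic):
        self._load = load
        self._ttl = parse_schedule(schedule)
        self._clock = clock
        self._rows = None
        self._loaded_at = None

    def get(self) -> list:
        now = self._clock()
        if self._loaded_at is None or now - self._loaded_at >= self._ttl:
            self._rows = self._load()  # the expensive read happens only here
            self._loaded_at = now
        return self._rows
```

With `schedule: 60m`, requests arriving within an hour of the last load are served from the cached rows.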
Next Steps
- Quickstart Guide: Build your first API step-by-step
- SQL Templating: Learn advanced Mustache templates
- Endpoints Overview: Complete endpoint configuration
- BigQuery Example: Scale to cloud warehouses with caching
- SAP ERP Example: Connect to enterprise systems
- Parquet Connection Guide: Learn more about Parquet
- Deployment: Deploy to production