
Parquet File API Example

The simplest way to get started with flAPI: turn a Parquet file into a REST API in minutes.

What You'll Build

A filterable customer API that:

  • ✅ Serves data from a local Parquet file
  • ✅ Supports query parameters
  • ✅ Validates input
  • ✅ Generates automatic documentation
  • ✅ Starts in milliseconds

Prerequisites

  • flAPI installed (Quickstart)
  • A Parquet file (we'll use sample customer data)

How It Works

Key advantage: DuckDB reads Parquet files directly with no ETL, database, or import needed!

Step-by-Step

1. Project Structure

my-api/
├── flapi.yaml
├── data/
│   └── customers.parquet
└── sqls/
    ├── customers.yaml
    └── customers.sql

2. Configuration

flapi.yaml:

project-name: customers-api
project-description: Simple customer API from Parquet

template:
  path: './sqls'

connections:
  customers-data:
    properties:
      path: './data/customers.parquet'

duckdb:
  access_mode: READ_WRITE

3. Endpoint Configuration

sqls/customers.yaml:

url-path: /customers/

request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1

  - field-name: segment
    field-in: query
    description: Market segment (AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY)
    required: false
    validators:
      - type: enum
        allowedValues: [AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY]

  - field-name: min_balance
    field-in: query
    description: Minimum account balance
    required: false
    validators:
      - type: int
        min: 0

template-source: customers.sql
connection:
  - customers-data
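
To make the validator rules concrete, here is a minimal Python sketch of the checks the request block above describes. It is not flAPI's implementation: RULES and validate() are hypothetical names, and the error strings are invented for illustration. The sketch assumes query values arrive as strings and ignores parameters that have no rule.

```python
# Illustrative only: mirrors the int/enum rules from sqls/customers.yaml.
# RULES and validate() are hypothetical names, not part of flAPI.
RULES = {
    "id": {"type": "int", "min": 1},
    "segment": {
        "type": "enum",
        "allowedValues": ["AUTOMOBILE", "BUILDING", "FURNITURE", "HOUSEHOLD", "MACHINERY"],
    },
    "min_balance": {"type": "int", "min": 0},
}


def validate(params):
    """Return a list of error strings; an empty list means the input passed."""
    errors = []
    for name, raw in params.items():
        rule = RULES.get(name)
        if rule is None:
            continue  # this sketch ignores parameters without a rule
        if rule["type"] == "int":
            try:
                value = int(raw)
            except ValueError:
                errors.append(f"{name}: not an integer")
                continue
            if "min" in rule and value < rule["min"]:
                errors.append(f"{name}: must be >= {rule['min']}")
        elif rule["type"] == "enum":
            if raw not in rule["allowedValues"]:
                errors.append(f"{name}: not an allowed value")
    return errors


print(validate({"segment": "AUTOMOBILE", "min_balance": "5000"}))
print(validate({"id": "0", "segment": "RETAIL"}))
```

All three fields are optional (required: false), so an empty parameter set also passes.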

4. SQL Template

sqls/customers.sql:

SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.id}}
  AND c_custkey = {{{params.id}}}
{{/params.id}}
{{#params.segment}}
  AND c_mktsegment = '{{{params.segment}}}'
{{/params.segment}}
{{#params.min_balance}}
  AND c_acctbal >= {{{params.min_balance}}}
{{/params.min_balance}}
ORDER BY c_acctbal DESC
LIMIT 100
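
The template above uses Mustache-style sections: a {{#params.x}} ... {{/params.x}} block is kept only when the parameter is supplied, and triple braces insert the value verbatim. The Python sketch below imitates that behaviour with two regular expressions on a shortened copy of the template. It is a simplified stand-in, not the template engine flAPI uses; expand() and the flat "params.segment"-style keys are assumptions made for the example.

```python
import re

# Shortened copy of sqls/customers.sql, for illustration only.
TEMPLATE = """SELECT c_custkey AS id, c_mktsegment AS segment, c_acctbal AS balance
FROM '{{{conn.path}}}'
WHERE 1=1
{{#params.segment}}
  AND c_mktsegment = '{{{params.segment}}}'
{{/params.segment}}
{{#params.min_balance}}
  AND c_acctbal >= {{{params.min_balance}}}
{{/params.min_balance}}
ORDER BY c_acctbal DESC
LIMIT 100"""

# {{#key}} body {{/key}}, including the newline after each tag.
SECTION = re.compile(r"\{\{#([\w.]+)\}\}\n?(.*?)\{\{/\1\}\}\n?", re.DOTALL)
# {{{key}}} inserts the value without any escaping.
VARIABLE = re.compile(r"\{\{\{([\w.]+)\}\}\}")


def expand(template, values):
    """Keep sections whose key is present, then substitute the variables."""

    def section(match):
        key, body = match.group(1), match.group(2)
        return body if values.get(key) is not None else ""

    def variable(match):
        return str(values[match.group(1)])

    return VARIABLE.sub(variable, SECTION.sub(section, template))


sql = expand(
    TEMPLATE,
    {"conn.path": "./data/customers.parquet", "params.segment": "AUTOMOBILE"},
)
print(sql)
```

Because triple braces insert values without escaping, the validators in the endpoint configuration are what keep unexpected input out of the generated SQL.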

Running the API

Start the Server

$ ./flapi -c flapi.yaml

✓ Loaded 1 endpoints
✓ Server listening on :8080
⚡ Ready in 1.2ms

Test the Endpoints

Get all customers:

$ curl http://localhost:8080/customers/

{
  "data": [
    {
      "id": 1,
      "name": "Customer#000000001",
      "segment": "BUILDING",
      "balance": 711.56,
      "address": "IVhzIApeRb ot,c,E",
      "phone": "25-989-741-2988",
      "notes": "Regular requests..."
    },
    ...
  ]
}

Filter by segment:

$ curl "http://localhost:8080/customers/?segment=AUTOMOBILE"

Filter by minimum balance:

$ curl "http://localhost:8080/customers/?min_balance=5000"

Get specific customer:

$ curl "http://localhost:8080/customers/?id=12345"

Combine filters:

$ curl "http://localhost:8080/customers/?segment=AUTOMOBILE&min_balance=5000"
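
The same requests can be issued from a script. The sketch below uses only the Python standard library; customers_url() and fetch_customers() are hypothetical helper names, and the "data" key is taken from the sample response shown earlier.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

BASE_URL = "http://localhost:8080/customers/"


def customers_url(**filters):
    """Build a request URL, skipping filters that were not supplied."""
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    return f"{BASE_URL}?{query}" if query else BASE_URL


def fetch_customers(**filters):
    """Call the running API and return the rows under the "data" key."""
    with urlopen(customers_url(**filters)) as response:
        return json.load(response)["data"]


print(customers_url(segment="AUTOMOBILE", min_balance=5000))
```

fetch_customers(segment="AUTOMOBILE") only works while the server from the previous step is running.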

Automatic Documentation

flAPI automatically generates OpenAPI (Swagger) documentation:

# Visit in your browser
http://localhost:8080/docs

You'll see:

  • All endpoints listed
  • Parameter descriptions
  • Try-it-out functionality
  • Request/response examples

Testing with CLI

Use the flapii CLI to test locally:

# Validate endpoint
$ flapii endpoints validate /customers/
✓ Configuration valid
✓ SQL template found
✓ All validators configured

# Test template expansion
$ flapii templates expand /customers/ \
    --params '{"segment": "AUTOMOBILE", "min_balance": 5000}'

Expanded SQL:
SELECT
  c_custkey as id,
  c_name as name,
  c_mktsegment as segment,
  c_acctbal as balance,
  c_address as address,
  c_phone as phone,
  c_comment as notes
FROM './data/customers.parquet'
WHERE 1=1
  AND c_mktsegment = 'AUTOMOBILE'
  AND c_acctbal >= 5000
ORDER BY c_acctbal DESC
LIMIT 100

# Run query and see results
$ flapii query run /customers/ \
    --params '{"segment": "AUTOMOBILE"}' \
    --limit 5

Why This Works

Performance

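DuckDB is well suited to querying Parquet files directly:

  • Columnar storage: only the columns a query references are read
  • Compression: column data is stored compressed, so less is read from disk
  • Predicate pushdown: filters are applied while reading, so non-matching row groups can be skipped
  • Parallel processing: scans run across multiple CPU cores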

No Database Required

  • ✅ No PostgreSQL/MySQL to install
  • ✅ No server to maintain
  • ✅ Just files + flAPI
  • ✅ Perfect for prototypes and small-to-medium datasets

Portability

# Everything in one directory
$ tar -czf my-api.tar.gz my-api/
$ scp my-api.tar.gz server:

# On server
$ tar -xzf my-api.tar.gz
$ cd my-api && ./flapi -c flapi.yaml

# API is running!

Scaling Up

Multiple Files

connections:
  all-customers:
    properties:
      path: './data/customers/*.parquet' # All files in directory

Adding More Endpoints

# sqls/orders.yaml
url-path: /orders/
connection:
  - orders-data

Adding Caching

If you get high traffic, add caching:

cache:
  enabled: true
  table: customers_cache
  schedule: 60m

Reuse field definitions with includes

Once you have more than one endpoint over the same dataset, you usually want the validators, rate limit, and connection settings in one place. flAPI supports YAML section includes with the syntax {{include:section_name from path/to/file.yaml}}, which inlines the named top-level section from the referenced file. The canonical example lives in examples/sqls/customers/customer-common.yaml and consolidates the shared pieces:

sqls/common/customer-common.yaml:

# Shared request validators
request:
  - field-name: id
    field-in: query
    description: Customer ID
    required: false
    validators:
      - type: int
        min: 1
        max: 1000000

  - field-name: segment
    field-in: query
    required: false
    validators:
      - type: enum
        allowedValues: [AUTOMOBILE, BUILDING, FURNITURE, HOUSEHOLD, MACHINERY]

  - field-name: email
    field-in: query
    required: false
    validators:
      - type: email

# Shared connection
connection:
  - customers-data

# Shared rate limiting
rate-limit:
  enabled: true
  max: 100
  interval: 60

The customer endpoint can now declare just what's specific to itself and pull the rest from the common file:

sqls/customers.yaml:

url-path: /customers/

{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
{{include:connection from common/customer-common.yaml}}

template-source: customers.sql

A second endpoint, such as a "VIP customers" view, picks up the exact same validators and rate limit without any copy-paste:

sqls/customers-vip.yaml:

url-path: /customers/vip/

{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
{{include:connection from common/customer-common.yaml}}

template-source: customers-vip.sql

When segment later needs a new allowed value, you update one file and every endpoint that includes it picks the change up on the next reload.
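
To see which shared files an endpoint depends on, the include directives can be picked out with a regular expression. The Python sketch below only parses the {{include:section_name from path}} syntax described above; it does not resolve or inline anything, and find_includes() is a hypothetical helper, not a flAPI API.

```python
import re

# Matches {{include:section_name from path/to/file.yaml}}.
INCLUDE = re.compile(r"\{\{include:(?P<section>[\w-]+) from (?P<path>[^}]+)\}\}")

ENDPOINT_YAML = """url-path: /customers/

{{include:request from common/customer-common.yaml}}
{{include:rate-limit from common/customer-common.yaml}}
{{include:connection from common/customer-common.yaml}}

template-source: customers.sql
"""


def find_includes(yaml_text):
    """Return (section, path) pairs for every include directive in the text."""
    return [(m["section"], m["path"].strip()) for m in INCLUDE.finditer(yaml_text)]


includes = find_includes(ENDPOINT_YAML)
print(includes)
# Distinct files this endpoint depends on.
print(sorted({path for _, path in includes}))
```

Running this over every file in sqls/ gives a quick list of the endpoints affected when a shared file changes.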

Warm the cache with the heartbeat worker

flAPI ships with a background heartbeat worker that periodically pings endpoints you've opted in. Its job is to keep cached templates warm so the first real user request hits a fresh snapshot instead of paying the cold materialization cost. Enable it globally in flapi.yaml:

flapi.yaml:

# Heartbeat configuration
heartbeat:
  enabled: true        # turn the worker on (off by default)
  worker-interval: 10  # seconds between heartbeat passes

Then opt individual endpoints in. The params: map under the endpoint-level heartbeat: block is passed straight to the SQL template, so you can warm the most common filter combinations:

sqls/customers.yaml:

url-path: /customers/

# ... request, template-source, connection, cache ...

heartbeat:
  enabled: true
  params:
    segment: AUTOMOBILE # pre-warm the most popular filter
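
Conceptually, each heartbeat pass visits the endpoints that opted in and requests them with their configured params. The Python sketch below models only that selection step. It is an assumption-laden illustration: ENDPOINTS is a hypothetical in-memory view of the YAML files, heartbeat_targets() is an invented name, and whether flAPI's worker issues real HTTP requests or calls the endpoint internally is not stated on this page.

```python
from urllib.parse import urlencode

# Hypothetical in-memory view of endpoint configs; flAPI reads these from YAML.
ENDPOINTS = [
    {
        "url-path": "/customers/",
        "heartbeat": {"enabled": True, "params": {"segment": "AUTOMOBILE"}},
    },
    {"url-path": "/orders/"},  # no heartbeat block, so it is skipped
]


def heartbeat_targets(endpoints, base_url="http://localhost:8080"):
    """Return the URLs one heartbeat pass would request."""
    targets = []
    for endpoint in endpoints:
        heartbeat = endpoint.get("heartbeat", {})
        if not heartbeat.get("enabled", False):
            continue  # endpoint has not opted in
        query = urlencode(heartbeat.get("params", {}))
        url = base_url + endpoint["url-path"]
        targets.append(f"{url}?{query}" if query else url)
    return targets


print(heartbeat_targets(ENDPOINTS))
```

With worker-interval: 10, a pass like this would repeat every 10 seconds.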

