Caching Strategy
flAPI's caching layer is not just a feature for improving speed; it's a strategic component for cost optimization. By serving requests from its local cache, flAPI significantly reduces the query load on expensive analytical databases that are not designed (or priced) for high-throughput serving workloads.
Think of flAPI's caching layer like a coffee shop barista. Instead of running to the warehouse (the roastery) for every single cup, they brew a large, fresh batch every hour. This means you get your coffee in seconds, not minutes, and the roastery isn't constantly getting small, inefficient orders.
The Problem with Direct Warehouse Access
Performance Issues
Data warehouses like BigQuery, Snowflake, and SAP are optimized for analytical workloads, not API serving:
| Requirement | Warehouse | API Serving |
|---|---|---|
| Latency | 2-10 seconds | < 5 ms |
| Concurrency | Low (10-100) | High (1000s) |
| Query Pattern | Few large queries | Many small queries |
| Frequency | Periodic | Continuous |
Cost Explosion
Direct warehouse access gets expensive fast:
Direct BigQuery Queries:
─────────────────────────────────────
Single query: $0.05 (10GB scan)
1,000/day: $50/day = $1,500/month
10,000/day: $500/day = $15,000/month
100,000/day: $5,000/day = $150,000/month
The flAPI Solution
Cache-First Architecture
Instead of querying the warehouse for every API request, flAPI materializes the data once in DuckDB and serves all reads locally.
How it works:
- Initial Load: Query the warehouse once and materialize data in DuckDB
- Serve: Handle thousands of API requests from the fast local cache
- Refresh: Update the cache on your schedule (hourly, daily, etc.)
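Stripped to its essentials, the pattern is just two statements: a scheduled materialization plus a cheap local read. The table and source names in this sketch are illustrative; the full version follows under How It Works below.

-- Refresh step (runs on a schedule): one warehouse scan fills a local DuckDB table
CREATE OR REPLACE TABLE local_cache AS
SELECT * FROM warehouse_source;

-- Serve step (runs per API request): reads hit the local table, not the warehouse
SELECT * FROM local_cache;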
With flAPI Caching:
─────────────────────────────────────
Cache refresh: $0.05 (once per hour)
API requests: $0 (served from cache)
24 refreshes/day: $1.20/day = $36/month
Savings: 99.76% vs direct access (at 10,000 requests/day)
Speed: 1000-20,000x faster
How It Works
1. Cache Configuration
Define your cache in the endpoint YAML:
# sqls/campaign-stats.yaml
url-path: /campaign-stats

# Cache configuration
cache:
  enabled: true
  table: campaign_stats_cache
  schema: analytics
  schedule: 60m  # Refresh every hour
  template_file: campaign_stats_cache.sql

template-source: campaign_stats.sql
connection:
  - bigquery-warehouse
2. Cache Materialization
The cache template runs on schedule to populate DuckDB:
-- sqls/campaign_stats_cache.sql
-- This runs every hour (as configured above)
CREATE OR REPLACE TABLE analytics.campaign_stats_cache AS
SELECT
    campaign_id,
    campaign_type,
    country,
    SUM(clicks)       AS total_clicks,
    SUM(revenue)      AS total_revenue,
    SUM(conversions)  AS conversions,
    MAX(last_updated) AS last_updated
FROM bigquery_scan('project.dataset.campaigns')
WHERE date >= CURRENT_DATE - INTERVAL '30 days'
GROUP BY 1, 2, 3;
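The bigquery_scan table function comes from DuckDB's community BigQuery extension. flAPI's bigquery-warehouse connection is assumed to take care of loading it; if you wanted to run the template by hand in a plain DuckDB shell, the one-time setup would look like this:

-- Hypothetical manual setup; flAPI's connection config normally handles this
INSTALL bigquery FROM community;
LOAD bigquery;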
3. Fast API Serving
The API template queries the local cache:
-- sqls/campaign_stats.sql
-- This serves API requests from the cache
SELECT *
FROM analytics.campaign_stats_cache
WHERE 1=1
{{#params.country}}
  AND country = '{{{params.country}}}'
{{/params.country}}
{{#params.campaign_type}}
  AND campaign_type = '{{{params.campaign_type}}}'
{{/params.campaign_type}}
ORDER BY total_revenue DESC
LIMIT 100;
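To make the templating concrete: a request such as GET /campaign-stats?country=DE (the parameter value is made up) would render the template above into roughly this query:

SELECT *
FROM analytics.campaign_stats_cache
WHERE 1=1
  AND country = 'DE'
ORDER BY total_revenue DESC
LIMIT 100;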
Cache Strategies
Choose the right strategy for your use case:
Full Refresh (Replace)
Complete rebuild of the cache:
CREATE OR REPLACE TABLE analytics.cache_table AS
SELECT * FROM expensive_source
WHERE active = true;
Use when:
- Data set is small to medium
- Complete accuracy required
- Simplest to implement
Incremental Append
Add only new rows:
INSERT INTO analytics.cache_table
SELECT * FROM expensive_source
WHERE created_at > (
    -- COALESCE handles the first run, when the cache table is still empty
    SELECT COALESCE(MAX(created_at), TIMESTAMP '1970-01-01')
    FROM analytics.cache_table
);
Use when:
- Data is append-only (logs, events)
- Want to minimize warehouse queries
- Historical data doesn't change
Merge/Upsert
Update existing rows and insert new:
-- Note: the SET * / INSERT * shorthand is engine-specific and not part of
-- standard SQL; a portable spelling follows the list below.
MERGE INTO analytics.cache_table AS target
USING expensive_source AS source
ON target.id = source.id
WHEN MATCHED THEN UPDATE SET *
WHEN NOT MATCHED THEN INSERT *;
Use when:
- Data can be updated
- Need to capture changes
- Want both accuracy and efficiency
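If your engine rejects the SET * / INSERT * shorthand, the portable spelling names every column explicitly; the column names below are illustrative:

MERGE INTO analytics.cache_table AS target
USING expensive_source AS source
ON target.id = source.id
WHEN MATCHED THEN
    UPDATE SET clicks = source.clicks, last_updated = source.last_updated
WHEN NOT MATCHED THEN
    INSERT (id, clicks, last_updated)
    VALUES (source.id, source.clicks, source.last_updated);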
Real-World Example
Scenario: Marketing Dashboard API
Requirements:
- 50,000 dashboard loads per day
- Data from BigQuery (10GB scan per query)
- Data updates hourly
- Sub-second response time required
Without flAPI:
50,000 queries × $0.05 = $2,500/day = $75,000/month
Response time: 2-5 seconds
With flAPI:
cache:
  enabled: true
  schedule: 60m  # 24 refreshes/day
24 cache refreshes × $0.05 = $1.20/day = $36/month
API serving: Free (from cache)
Response time: 1-50ms
Savings: $74,964/month (99.95%)
Speed: 1000-10,000x faster
Cache Freshness vs Cost
Choose your refresh schedule based on your needs (costs assume $0.05 per refresh and a 30-day month):

| Refresh Schedule | Daily Cost | Monthly Cost | Use Case |
|---|---|---|---|
| Every 5 minutes | $14.40 | $432 | Real-time dashboards |
| Every 15 minutes | $4.80 | $144 | Operational reporting |
| Every hour | $1.20 | $36 | Business intelligence |
| Every 6 hours | $0.20 | $6 | Analytical reports |
| Daily | $0.05 | $1.50 | Historical analysis |
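These figures are simply refreshes per day multiplied by the $0.05-per-scan cost from the example above; a quick DuckDB query reproduces the table:

-- Recompute the cost table (assumes $0.05 per refresh and a 30-day month)
SELECT
    schedule,
    refreshes_per_day * 0.05      AS daily_cost_usd,
    refreshes_per_day * 0.05 * 30 AS monthly_cost_usd
FROM (VALUES
    ('Every 5 minutes', 288),
    ('Every 15 minutes', 96),
    ('Every hour',       24),
    ('Every 6 hours',     4),
    ('Daily',             1)
) AS t(schedule, refreshes_per_day);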
When to Use Caching
✅ Ideal Use Cases
- High-frequency access: Dashboard APIs, analytics endpoints
- Stable data: Data that updates periodically, not every second
- Multiple consumers: Many apps/users accessing the same data
- Cost-sensitive: Reducing warehouse bills is important
- AI agents: Frequent tool calls from LLMs