Google Sheets as Database
This guide uses the gsheets extension by archiewood and mharrisb1. Thanks to the DuckDB community for this excellent extension that enables seamless Google Sheets integration!
Turn Google Sheets into production-ready REST APIs. Perfect for non-technical teams, rapid prototyping, and collaborative data management - no database setup required.
Why Use Google Sheets with flAPI?
"The database your marketing team can actually use."
Google Sheets is the world's most accessible "database":
- ✅ No setup - Just create a spreadsheet
- ✅ Collaborative - Multiple people can edit simultaneously
- ✅ Familiar - Everyone knows how to use spreadsheets
- ✅ Visual - See your data, no SQL queries needed
- ✅ Version history - Built-in audit trail
Real-World Use Cases
🎯 Content Management for Marketing
Marketing teams manage content without developer intervention:
- Blog post metadata (titles, authors, tags)
- Product descriptions and pricing
- Landing page copy variants
- Campaign tracking parameters
📋 Form & Survey Backends
Google Forms → Google Sheets → flAPI → Your App:
- Customer feedback collection
- Event registrations
- Product waitlists
- Survey responses
🚀 Rapid Prototyping
Build MVP APIs in minutes:
- No database provisioning
- No schema migrations
- Just update the spreadsheet
👥 Team Collaboration
Non-technical stakeholders contribute directly:
- Support team manages FAQ data
- Sales team updates product specs
- HR maintains team directory
Installation
The Google Sheets extension requires authentication:
connections:
my-sheets:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
-- Authenticate (opens browser for OAuth)
CREATE SECRET (TYPE gsheet);
Alternative: Access Token Authentication
For production deployments, use a service account:
connections:
my-sheets:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
CREATE SECRET (
TYPE gsheet,
PROVIDER access_token,
TOKEN '${GOOGLE_ACCESS_TOKEN}'
);
Get your access token:
- Create a Google Cloud project
- Enable Google Sheets API
- Create service account credentials
- Download JSON key and extract
access_token
Configuration Examples
Basic Configuration
project_name: sheets-api
host: 0.0.0.0
port: 8080
connections:
google-sheets:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
CREATE SECRET (TYPE gsheet);
properties:
spreadsheet_id: '1A2B3C4D5E6F7G8H9I0J'
template:
environment-whitelist:
- '^GOOGLE_.*'
Multiple Sheets
connections:
# Product catalog
products-sheet:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
CREATE SECRET (TYPE gsheet);
properties:
spreadsheet_id: '1A2B3C4D5E6F7G8H9I0J'
sheet_name: 'Products'
# Customer data
customers-sheet:
init: |
LOAD gsheets; # Already installed above
CREATE SECRET (TYPE gsheet);
properties:
spreadsheet_id: '9J0I8H7G6F5E4D3C2B1A'
sheet_name: 'Customers'
Example 1: Product Catalog API
Spreadsheet Structure
product_id | name | description | price | in_stock |
---|---|---|---|---|
PROD-001 | Widget A | Amazing widget | 29.99 | TRUE |
PROD-002 | Gadget B | Cool gadget | 49.99 | FALSE |
PROD-003 | Doohickey C | Useful tool | 19.99 | TRUE |
Share URL: https://docs.google.com/spreadsheets/d/1A2B3C4D5E6F7G8H9I0J/edit
SQL Template
SELECT
product_id,
name,
description,
price,
in_stock
FROM read_gsheet('1A2B3C4D5E6F7G8H9I0J', sheet='Products')
WHERE 1=1
{{#params.in_stock}}
AND in_stock = {{{params.in_stock}}}
{{/params.in_stock}}
{{#params.max_price}}
AND price <= {{{params.max_price}}}
{{/params.max_price}}
ORDER BY name
Endpoint Configuration
url-path: /products/
template-source: products.sql
connection:
- google-sheets
description: Get products from Google Sheet
parameters:
in_stock:
type: boolean
description: Filter by stock status
max_price:
type: number
description: Maximum price filter
API Usage
# Get all products
curl http://localhost:8080/products/
# Get in-stock products under $30
curl 'http://localhost:8080/products/?in_stock=true&max_price=30'
Response:
{
"data": [
{
"product_id": "PROD-003",
"name": "Doohickey C",
"description": "Useful tool",
"price": 19.99,
"in_stock": true
}
]
}
Example 2: Blog Post Metadata
Perfect for headless CMS scenarios:
Spreadsheet
slug | title | author | published_date | tags |
---|---|---|---|---|
hello-world | Hello World | John Doe | 2024-01-15 | tech,intro |
flapi-launch | Launching flAPI | Jane Smith | 2024-02-01 | product,announcement |
SELECT
slug,
title,
author,
published_date,
tags
FROM read_gsheet('{{{conn.spreadsheet_id}}}', sheet='Posts')
WHERE 1=1
{{#params.author}}
AND author = '{{{params.author}}}'
{{/params.author}}
{{#params.tag}}
AND tags LIKE '%{{{params.tag}}}%'
{{/params.tag}}
ORDER BY published_date DESC
{{#params.limit}}
LIMIT {{{params.limit}}}
{{/params.limit}}
url-path: /blog/posts/
template-source: blog_posts.sql
connection:
- google-sheets
cache-template: blog_cache.sql
cache-schedule: "*/5 * * * *" # Refresh every 5 minutes
Example 3: Form Submissions (Read & Write)
Reading Form Data
SELECT
timestamp,
name,
email,
message,
status
FROM read_gsheet('1FormResponseSheetID', sheet='Form Responses 1')
WHERE status = 'pending'
ORDER BY timestamp DESC
LIMIT 100
Writing Data Back (Status Updates)
-- Update processed submissions
COPY (
SELECT
timestamp,
name,
email,
message,
'processed' as status
FROM existing_data
WHERE id = {{{params.submission_id}}}
) TO '1FormResponseSheetID' (FORMAT gsheet);
Performance Considerations
Caching is Essential
Google Sheets API has rate limits (100 requests/100 seconds). Always use caching:
url-path: /products/
template-source: products.sql
connection:
- google-sheets
# Critical: Cache the data locally
cache-template: products_cache.sql
cache-schedule: "*/10 * * * *" # Refresh every 10 minutes
-- Full refresh strategy
SELECT * FROM read_gsheet('1A2B3C4D5E6F7G8H9I0J')
Performance improvement:
- ❌ Direct: ~500ms per request, rate-limited
- ✅ Cached: 1-10ms per request, no limits
Optimization Tips
- Use specific sheet names instead of full URLs
- Read only required columns (reduces bandwidth)
- Set appropriate cache intervals (balance freshness vs load)
- Avoid all_varchar=true (slows type inference)
-- ❌ Slow: Reads entire sheet
SELECT * FROM read_gsheet('1ABC...')
-- ✅ Fast: Specific sheet and columns
SELECT name, price
FROM read_gsheet('1ABC...', sheet='Products')
Security Best Practices
1. Use Service Accounts in Production
Never use personal OAuth tokens:
connections:
sheets-prod:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
CREATE SECRET (
TYPE gsheet,
PROVIDER access_token,
TOKEN '${GOOGLE_SERVICE_ACCOUNT_TOKEN}'
);
2. Restrict Sheet Permissions
- Give service account read-only access when possible
- Don't make sheets publicly editable
- Use specific sheet ranges if supported
3. Validate Input
Even though data comes from Sheets, validate parameters:
parameters:
product_id:
type: string
pattern: '^PROD-[0-9]{3}$'
required: true
4. Sanitize Data
Be aware of potential injection in sheet cells:
-- Use proper escaping
WHERE product_id = '{{{params.id}}}'
Troubleshooting
Authentication Failed
Error: Authentication failed
# Re-authenticate
1. Delete existing secret
2. Run flAPI with browser access
3. Follow OAuth flow
Rate Limit Exceeded
Error: Quota exceeded for quota metric
Solution: Increase cache intervals:
cache-schedule: "*/30 * * * *" # Every 30 minutes instead of 5
Sheet Not Found
Error: Sheet 'Name' not found
-- Verify sheet name (case-sensitive!)
FROM read_gsheet('1ABC...', sheet='Products') -- Correct
FROM read_gsheet('1ABC...', sheet='products') -- Wrong!
Type Inference Issues
Some columns aren't recognized correctly:
-- Force all columns as varchar first
SELECT * FROM read_gsheet('1ABC...', all_varchar=true)
-- Then cast explicitly
SELECT
CAST(price AS DOUBLE) as price,
CAST(quantity AS INTEGER) as quantity
FROM ...
Limitations
Limitation | Impact | Workaround |
---|---|---|
100 requests/100s | Rate limiting | Use caching (essential) |
5M cells max | Large datasets | Split into multiple sheets |
No transactions | Race conditions | Use append-only patterns |
Type inference | Inconsistent types | Use all_varchar=true + cast |
OAuth complexity | Deployment | Use service account tokens |
When to Use vs When to Avoid
✅ Use Google Sheets When:
- Non-technical team needs to manage data
- Rapid prototyping or MVP
- Data volume < 10,000 rows
- Updates are infrequent (< 1/minute)
- Collaboration is key
❌ Avoid Google Sheets When:
- High-frequency writes (> 10/second)
- Data volume > 100,000 rows
- Need ACID transactions
- Sub-50ms latency required
- Complex relational queries
Migration Path
Start with Sheets, migrate later:
# Phase 1: Prototype with Sheets
connections:
data:
init: |
INSTALL gsheets FROM community;
LOAD gsheets;
properties:
spreadsheet_id: '1ABC...'
# Phase 2: Migrate to PostgreSQL (same endpoints!)
connections:
data:
init: |
INSTALL postgres;
LOAD postgres;
properties:
host: postgres.example.com
database: production
Your SQL templates and endpoints remain the same - just change the connection!
Next Steps
- Google Sheets API Example: Complete working example
- Caching Setup: Essential for production
- Authentication: Secure your APIs
- BigQuery: When you outgrow Sheets
- Vector Search: Add semantic search to Sheets data
Additional Resources
- DuckDB GSheets Extension: Official extension docs
- Google Sheets API: API documentation
- Service Accounts: Production authentication
Start every project with Google Sheets. It's the fastest way to validate an idea. You can always migrate to a "real" database later - flAPI makes the transition seamless.