# Design Decisions
Key architectural decisions and the rationale behind them.
## 1. Microservices Architecture

**Decision:** Separate API (Node.js) from workers (Python)

**Rationale:**

- **Language fit:** Node.js excels at I/O and HTTP; Python excels at ML/AI
- **Independent scaling:** The API can scale separately from the workers
- **Technology isolation:** ML code isn't forced into the JavaScript runtime
- **Team efficiency:** Teams can work independently on their own domain

**Trade-offs:**

- ✅ Better technology match
- ✅ Independent scaling
- ❌ More operational complexity
- ❌ Network latency between services

**Alternatives considered:**

- Monolith (faster to build, harder to scale)
- Kubernetes (overkill for the initial scale)
## 2. Redis Streams for Messaging

**Decision:** Use Redis Streams for the async job queue instead of RabbitMQ or Kafka

**Rationale:**

- **Lightweight:** No separate infrastructure; uses the existing Redis instance
- **Fast:** Sub-millisecond latency for job pickup
- **Simple:** Consumer groups are built in for job distribution
- **Dual-use:** Can also stream real-time events to clients

**Trade-offs:**

- ✅ Simple and fast
- ✅ Real-time event support
- ❌ Not suited to massive scale (billions of messages)
- ❌ All data lives in memory (persistence must be configured)

**Alternatives considered:**

- RabbitMQ (more features, more overhead)
- Kafka (designed for scale, too complex initially)
- In-memory queue (no durability)
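The flow the rationale describes (add a job, deliver it through a consumer group, acknowledge on completion) can be sketched with a minimal in-memory stand-in for a Redis Stream. This is an illustration of the semantics only; the real service would call `XADD`, `XREADGROUP`, and `XACK` through a client library such as redis-py, and the payload fields below are hypothetical:

```python
import itertools

class MiniStream:
    """Toy in-memory stand-in for a Redis Stream with one consumer group.

    Illustrates the XADD / XREADGROUP / XACK flow only; it has none of
    Redis's persistence, blocking reads, or multi-group support.
    """

    def __init__(self):
        self._seq = itertools.count(1)
        self.entries = {}   # id -> payload (the stream itself)
        self.pending = {}   # id -> consumer (delivered but not yet acked)
        self.cursor = 0     # last id delivered to the group

    def xadd(self, payload):
        entry_id = next(self._seq)
        self.entries[entry_id] = payload
        return entry_id

    def xreadgroup(self, consumer, count=1):
        # Deliver entries past the group's cursor and mark them pending,
        # so a crashed worker's jobs remain visible for recovery.
        new_ids = [i for i in self.entries if i > self.cursor][:count]
        for i in new_ids:
            self.pending[i] = consumer
            self.cursor = i
        return [(i, self.entries[i]) for i in new_ids]

    def xack(self, entry_id):
        # Acknowledge: the job finished, drop it from the pending list.
        return self.pending.pop(entry_id, None) is not None

stream = MiniStream()
job_id = stream.xadd({"task": "process_document", "doc_id": "42"})
delivered = stream.xreadgroup("worker-1")
assert delivered[0][1]["task"] == "process_document"
stream.xack(job_id)          # crashing before this leaves the job pending
assert stream.pending == {}  # all delivered work acknowledged
```

The pending set is what makes consumer groups attractive here: if a worker dies mid-job, the entry stays pending and can be claimed by another worker.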
## 3. PostgreSQL for Primary Storage

**Decision:** Use PostgreSQL as the single source of truth instead of NoSQL

**Rationale:**

- **ACID compliance:** Transactions ensure data consistency
- **Flexible schema:** JSONB columns for semi-structured data
- **Query power:** Complex queries for analytics
- **Proven:** Battle-tested in production at scale

**Trade-offs:**

- ✅ Strong consistency guarantees
- ✅ Rich query language
- ❌ Horizontal scaling is harder
- ❌ Schema changes require migrations

**Alternatives considered:**

- MongoDB (more flexible, weaker consistency)
- DynamoDB (serverless, but vendor lock-in)
- SQLite (simpler, not suitable for multi-user workloads)
## 4. JWT for Authentication

**Decision:** Use JWT tokens instead of session cookies

**Rationale:**

- **Stateless:** API servers don't need session storage
- **Scalable:** Each request is independent
- **Cross-origin:** Works with CORS and mobile apps
- **Revocable:** A token blacklist in Redis can be added if needed

**Trade-offs:**

- ✅ Scales to many API servers
- ✅ Works with mobile/SPA clients
- ❌ Token size increases request size
- ❌ Can't revoke immediately without a Redis check

**Alternatives considered:**

- Server sessions (require sticky sessions or shared storage)
- OAuth2 (more complex; better for third-party access)
- API keys (less secure; good for server-to-server)
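To make the stateless property concrete, here is a minimal sketch of HS256 signing and verification using only the standard library: any API server holding the shared secret can validate a token without consulting session storage. A real service would use a maintained library (e.g. PyJWT), and the claim names below are illustrative:

```python
import base64
import hashlib
import hmac
import json
import time

def _b64url(data: bytes) -> str:
    # JWT uses URL-safe base64 with padding stripped.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def sign_jwt(claims, secret):
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    signing_input = f"{header}.{payload}".encode()
    sig = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"

def verify_jwt(token, secret):
    """Return the claims dict, or None if the token is invalid or expired."""
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        return None
    signing_input = f"{header}.{payload}".encode()
    expected = hmac.new(secret.encode(), signing_input, hashlib.sha256).digest()
    # Constant-time comparison to avoid timing side channels.
    if not hmac.compare_digest(_b64url(expected), sig):
        return None
    claims = json.loads(base64.urlsafe_b64decode(payload + "=" * (-len(payload) % 4)))
    if claims.get("exp", float("inf")) < time.time():
        return None  # expired
    return claims

token = sign_jwt({"sub": "user-123", "exp": time.time() + 3600}, "server-secret")
assert verify_jwt(token, "server-secret")["sub"] == "user-123"
assert verify_jwt(token, "wrong-secret") is None
```

The `exp` check is also why immediate revocation needs the Redis blacklist noted above: a signed token stays valid until it expires unless something external rejects it.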
## 5. Vercel for Hosting

**Decision:** Deploy to Vercel instead of self-managed servers

**Rationale:**

- **Zero-ops:** No server management needed
- **Global CDN:** Automatic edge caching
- **Preview deployments:** Test changes before production
- **Easy integration:** Git-based deployments

**Trade-offs:**

- ✅ Minimal ops overhead
- ✅ Excellent for static/Next.js sites
- ❌ Limited to Vercel's supported runtimes (Node.js, Python)
- ❌ Cold-start latency (can be mitigated)

**Alternatives considered:**

- AWS EC2 (full control, more ops work)
- Fly.io (similar benefits, smaller ecosystem)
- Google Cloud Run (a good option; Vercel is simpler)
## 6. MkDocs for Documentation

**Decision:** Use MkDocs + the Material theme instead of a custom site

**Rationale:**

- **Content-first:** Focus on documentation, not tooling
- **GitHub-native:** Markdown in version control
- **Built-in search:** Full-text search out of the box
- **Mobile-responsive:** The theme handles all devices
- **Fast:** Static site generation, CDN-ready

**Trade-offs:**

- ✅ Simple to maintain
- ✅ Version-controlled content
- ❌ Limited customization without code
- ❌ Interactive features are hard to add

**Alternatives considered:**

- Docusaurus (more flexible, React-based)
- Sphinx (powerful, Python ecosystem)
- Custom Next.js site (full control, maintenance burden)
## 7. Rate Limiting per User

**Decision:** Rate-limit each user to 100 requests per minute

**Rationale:**

- **Fair usage:** Prevents a single user from monopolizing resources
- **Cost control:** Limits runaway API calls
- **DDoS mitigation:** Makes attacks less effective
- **Predictable:** Users know their limits upfront

**Trade-offs:**

- ✅ Prevents abuse
- ✅ Controls costs
- ❌ Some users may hit the limit legitimately
- ❌ Requires monitoring to tune

**Decision criteria:**

- Measured actual usage patterns
- Set the limit at roughly 10× a typical heavy user's rate
- Higher limits for premium tiers are planned
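The per-user policy above can be sketched as a fixed-window counter. In production the counters would live in Redis (e.g. an `INCR` plus `EXPIRE` per user per window) so that all API servers share them; this in-process version, with illustrative names, just shows the accounting:

```python
import time
from collections import defaultdict

class RateLimiter:
    """Fixed-window limiter: at most `limit` requests per `window` seconds,
    tracked per user. Sketch only; production counters belong in Redis so
    every API server sees the same counts."""

    def __init__(self, limit=100, window=60):
        self.limit = limit
        self.window = window
        # user -> [window_start, count]
        self.counters = defaultdict(lambda: [0, 0])

    def allow(self, user_id, now=None):
        now = time.time() if now is None else now
        window_start = int(now // self.window) * self.window
        entry = self.counters[user_id]
        if entry[0] != window_start:  # a new window began: reset the count
            entry[0], entry[1] = window_start, 0
        if entry[1] >= self.limit:
            return False              # over the limit: reject (HTTP 429)
        entry[1] += 1
        return True

limiter = RateLimiter(limit=100, window=60)
assert all(limiter.allow("u1", now=0) for _ in range(100))
assert limiter.allow("u1", now=30) is False  # 101st request in the window
assert limiter.allow("u1", now=61) is True   # next window resets the count
assert limiter.allow("u2", now=30) is True   # limits are tracked per user
```

Fixed windows allow a brief burst of up to 2× the limit across a window boundary; a sliding-window or token-bucket variant smooths that out at the cost of more bookkeeping.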
## 8. Document Processing Pipeline

**Decision:** Queue-based async processing instead of synchronous requests

**Rationale:**

- **Long-running:** Document processing can take 5-30 seconds
- **Scalability:** Workers can process jobs in parallel
- **Reliability:** Failed jobs can be retried
- **User experience:** The user gets an immediate response; progress is reported asynchronously

**Trade-offs:**

- ✅ Better UX (instant response)
- ✅ Scales to many concurrent requests
- ❌ More complex (eventual consistency)
- ❌ Webhooks or polling needed for results

**Alternatives considered:**

- Synchronous processing (simpler, slower)
- Timeout with result polling (less elegant)
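A single-process sketch of the enqueue/retry shape described above. In the real system the queue is a Redis Stream and the workers are separate Python processes; here a `queue.Queue` and a thread stand in for both, and the job fields and retry limit are illustrative:

```python
import queue
import threading

jobs = queue.Queue()
results = {}
MAX_ATTEMPTS = 3  # illustrative retry budget

def process(doc_id):
    # Stand-in for the real 5-30 second document-processing step.
    if doc_id == "bad":
        raise ValueError("unparseable document")
    return f"processed:{doc_id}"

def worker():
    while True:
        doc_id, attempt = jobs.get()
        try:
            results[doc_id] = {"status": "done", "output": process(doc_id)}
        except Exception as exc:
            if attempt + 1 < MAX_ATTEMPTS:
                jobs.put((doc_id, attempt + 1))  # re-queue the failed job
            else:
                results[doc_id] = {"status": "failed", "error": str(exc)}
        finally:
            jobs.task_done()

threading.Thread(target=worker, daemon=True).start()

# The API enqueues and returns immediately; the client polls (or receives
# a webhook) for the eventual result.
jobs.put(("doc-1", 0))
jobs.put(("bad", 0))
jobs.join()  # in the sketch only: wait for all jobs, retries included
assert results["doc-1"]["status"] == "done"
assert results["bad"]["status"] == "failed"
```

The "eventual consistency" trade-off is visible here: between enqueueing and completion, `results` simply has no entry for the job, and the client has to tolerate that window.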
## 9. Full-Text Search in PostgreSQL

**Decision:** Use PostgreSQL full-text search instead of Elasticsearch

**Rationale:**

- **Simpler ops:** One less service to manage
- **Adequate performance:** Sufficient for most use cases
- **Cost:** No separate search infrastructure
- **Consistency:** Results always match the database state

**Trade-offs:**

- ✅ Minimal infrastructure
- ✅ Always consistent
- ❌ Scaling search independently is harder
- ❌ Less powerful than Elasticsearch

**When to revisit:**

- If search latency becomes an issue
- If we need faceted search or analytics
- If the corpus exceeds ~100M documents
## 10. Environment-Based Configuration

**Decision:** Use environment variables for all configuration

**Rationale:**

- **Security:** Secrets are not in code
- **Flexibility:** Change behavior without code changes
- **Standard:** Industry best practice (12-factor app)
- **Container-friendly:** Works well with Docker

**Configuration levels:**

- `.env` (local development)
- `.env.test` (test environment)
- `.env.staging` (staging environment)
- GitHub Secrets (production)

**Trade-offs:**

- ✅ Secure by default
- ✅ Follows best practices
- ❌ More environment setup needed
- ❌ Easy to forget to set required vars
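One way to blunt the "easy to forget required vars" trade-off is to validate the environment once at startup. A sketch of that pattern, with hypothetical variable names rather than the project's actual ones:

```python
import os

REQUIRED = ("DATABASE_URL", "REDIS_URL", "JWT_SECRET")  # illustrative names

def load_config(env=None):
    """Read configuration from environment variables, failing fast at
    startup if a required one is missing, instead of failing later on
    the first request that needs it."""
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError(f"missing required env vars: {', '.join(missing)}")
    return {
        "database_url": env["DATABASE_URL"],
        "redis_url": env["REDIS_URL"],
        "jwt_secret": env["JWT_SECRET"],
        # Optional settings get explicit, documented defaults.
        "rate_limit": int(env.get("RATE_LIMIT_PER_MIN", "100")),
        "log_level": env.get("LOG_LEVEL", "info"),
    }
```

Calling `load_config()` in the service's entry point turns a forgotten variable into an immediate, readable crash rather than a confusing runtime error.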
## Summary Table
| Decision | Choice | Alternative | Reason |
|---|---|---|---|
| Architecture | Microservices | Monolith | Tech variety, independent scaling |
| Messaging | Redis Streams | RabbitMQ/Kafka | Simple, fast, dual-use |
| Database | PostgreSQL | MongoDB/DynamoDB | ACID, query power, proven scale |
| Auth | JWT | Sessions/OAuth | Stateless, scalable |
| Hosting | Vercel | AWS/GCP | Zero-ops, git integration |
| Docs | MkDocs | Custom/Docusaurus | Content-first, simple |
| Rate Limit | Per-user/min | Per-IP/hour | Fair, user-based |
| Processing | Async queue | Sync | Better UX, scalability |
| Search | PostgreSQL FTS | Elasticsearch | Simpler ops |
| Config | Env vars | Files | Security, standard |
## Evolution Plan
These decisions are good for the current stage (MVP → early scale). As the product grows:
**Month 1-3 (Current)**

- Current architecture works well
- Monitor performance metrics
- Adjust rate limits based on real usage

**Month 3-6 (Early Traction)**

- Consider a caching layer if database queries slow down
- Possibly add Elasticsearch if search latency becomes an issue
- Load-test with 1,000 concurrent users

**Month 6-12 (Scaling)**

- Move to Kubernetes if horizontal scaling becomes critical
- Consider a managed database (RDS) if ops become a burden
- Add a CDN for large-file distribution

**Year 1+ (Scale)**

- Multi-region deployment
- Separate read/write database replicas
- Advanced caching strategy
## Related Documentation
- System Design - How decisions are implemented
- Data Model - Database structure
- Architecture Overview - High-level view