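Architecture
OpenComplai is built for document processing, analysis, and AI-powered insights. A Node.js HTTP API accepts client requests, Redis Streams queues the work, Python FastAPI workers process the documents, and PostgreSQL stores the results.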
High-Level Overview

    ┌─────────────┐
    │    Users    │
    └──────┬──────┘
           │
           ▼
    ┌──────────────────┐
    │     HTTP API     │
    │    (Node.js)     │
    └──────┬───────────┘
           │
           ▼
    ┌──────────────────┐
    │  Message Queue   │
    │ (Redis Streams)  │
    └──────┬───────────┘
           │
           ▼
    ┌──────────────────┐
    │ Worker Services  │
    │ (Python FastAPI) │
    └──────┬───────────┘
           │
           ▼
    ┌──────────────────┐
    │   Data Storage   │
    │   (PostgreSQL)   │
    └──────────────────┘

Core Components
API Gateway (Node.js)
The primary entry point for all client requests:
- HTTP Request Handling - RESTful API endpoints
- Authentication & Authorization - JWT token validation and permission checking
- Rate Limiting - Per-user request quotas to prevent abuse
- Request Validation - Schema validation before processing
- Response Formatting - Consistent JSON responses
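
The per-user request quotas mentioned above can be illustrated with a token bucket. This is a minimal sketch, not the gateway's actual implementation: the gateway itself is Node.js, the logic is shown in Python only for illustration, and the names `RateLimiter`, `TokenBucket`, `capacity`, and `rate` are assumptions.

```python
import time
from dataclasses import dataclass


@dataclass
class TokenBucket:
    """Per-user bucket: holds up to `capacity` tokens, refilled at `rate` per second."""
    capacity: float
    rate: float
    tokens: float
    updated: float


class RateLimiter:
    """Hypothetical in-memory limiter; a shared store would be needed across instances."""

    def __init__(self, capacity=5, rate=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.rate = rate
        self.clock = clock  # injectable so the limiter can be exercised without sleeping
        self.buckets = {}

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        bucket = self.buckets.get(user_id)
        if bucket is None:
            bucket = TokenBucket(self.capacity, self.rate, float(self.capacity), now)
            self.buckets[user_id] = bucket
        # Refill in proportion to elapsed time, capped at the bucket capacity.
        elapsed = now - bucket.updated
        bucket.tokens = min(bucket.capacity, bucket.tokens + elapsed * bucket.rate)
        bucket.updated = now
        if bucket.tokens >= 1:
            bucket.tokens -= 1  # spend one token for this request
            return True
        return False
```

A request handler would call `allow(user_id)` before doing any work and reject the request when it returns `False`. Because the API is described as stateless with multiple instances, real bucket state would likely have to live outside the process.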
Message Queue (Redis Streams)
Asynchronous job processing for long-running operations:
- Document Processing Queue - Jobs for document analysis
- Real-Time Event Streaming - WebSocket events for live updates
- Job Persistence - Guaranteed delivery with durability
- Priority Queues - Handle urgent tasks first
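
Redis Streams has no built-in priority ordering, so one plausible way to get "urgent tasks first" is one stream per priority level, read from highest to lowest. The sketch below models that idea, plus acknowledgment and redelivery, entirely in memory. It does not talk to Redis, and `PriorityJobQueue` and its methods are hypothetical names, loosely analogous to `XADD`, `XREADGROUP`, and `XACK`.

```python
from collections import deque
from itertools import count


class PriorityJobQueue:
    """In-memory stand-in for one stream per priority level (assumed design)."""

    def __init__(self, priorities=("urgent", "normal")):
        self.priorities = priorities  # ordered from highest to lowest priority
        self.streams = {p: deque() for p in priorities}
        self.pending = {}  # job_id -> (priority, payload): delivered, not yet acked
        self._ids = count(1)

    def add(self, payload, priority="normal"):
        job_id = next(self._ids)
        self.streams[priority].append((job_id, payload))
        return job_id

    def read(self):
        """Return the next job, draining higher-priority streams first."""
        for p in self.priorities:
            if self.streams[p]:
                job_id, payload = self.streams[p].popleft()
                self.pending[job_id] = (p, payload)
                return job_id, payload
        return None

    def ack(self, job_id):
        """Mark a job as done so it is never redelivered."""
        return self.pending.pop(job_id, None) is not None

    def requeue_unacked(self):
        """Put delivered-but-unacked jobs back at the front of their streams."""
        for job_id in sorted(self.pending, reverse=True):
            p, payload = self.pending.pop(job_id)
            self.streams[p].appendleft((job_id, payload))
```

Keeping a job in `pending` until it is acknowledged is what allows a crashed worker's jobs to be handed out again, which gives at-least-once delivery; workers therefore need to tolerate seeing the same job twice.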
Worker Services (Python FastAPI)
Background services that process documents:
- Document Processing - Extract text, metadata, structure
- AI Analysis - Run ML models for insights
- Data Extraction - Parse and normalize content
- Error Handling - Retry logic and failure recovery
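
The retry logic listed above is often implemented as exponential backoff. The following is a minimal sketch under that assumption; `TransientError`, `run_with_retry`, and the default attempt count and delays are illustrative and not taken from the OpenComplai codebase.

```python
import time


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying (timeouts, busy services)."""


def run_with_retry(job, max_attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call job() until it succeeds, doubling the delay after each transient failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return job()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # Delays grow as base_delay, 2 * base_delay, 4 * base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

Only `TransientError` is retried; any other exception propagates immediately, so a malformed document fails fast instead of being retried pointlessly.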
Data Storage (PostgreSQL)
Persistent storage layer:
- Documents & Metadata - Full-text searchable content
- User Data & Auth - Secure credential storage
- Processing Results - Extracted information and analysis
- Audit Logs - Complete action history
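
To make the "complete action history" idea concrete, the sketch below writes a document and its audit entry in a single transaction, so one cannot exist without the other. It uses Python's built-in `sqlite3` as a stand-in for PostgreSQL so that it runs anywhere; the table and column names are assumptions, and the real schema belongs to the Data Model page.

```python
import sqlite3

# Hypothetical, heavily simplified schema.
SCHEMA = """
CREATE TABLE documents (
    id INTEGER PRIMARY KEY,
    owner TEXT NOT NULL,
    title TEXT NOT NULL,
    body TEXT NOT NULL
);
CREATE TABLE audit_log (
    id INTEGER PRIMARY KEY,
    actor TEXT NOT NULL,
    action TEXT NOT NULL,
    document_id INTEGER REFERENCES documents(id),
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def open_store():
    """Create an in-memory database with the sketch schema."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


def add_document(conn, owner, title, body):
    """Insert a document and its audit entry in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        cur = conn.execute(
            "INSERT INTO documents (owner, title, body) VALUES (?, ?, ?)",
            (owner, title, body),
        )
        conn.execute(
            "INSERT INTO audit_log (actor, action, document_id) VALUES (?, ?, ?)",
            (owner, "document.create", cur.lastrowid),
        )
        return cur.lastrowid
```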
Sections
- System Design - Detailed component descriptions and interactions
- Design Decisions - Key architectural choices and rationale
- Data Model - Database schema and entity relationships
Scalability
OpenComplai is designed to scale horizontally:
- Stateless API - Multiple instances behind load balancer
- Message Queue - Distributes work across workers
- Database Partitioning - Splits data by tenant/region
- Caching Layer - Redis for frequently accessed data
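
Splitting data by tenant requires a routing rule that every API instance and worker computes identically. One simple option, shown here as an assumption rather than the project's actual scheme, is to hash the tenant ID into a fixed number of partitions. `partition_for` is a hypothetical helper.

```python
import hashlib


def partition_for(tenant_id: str, partitions: int) -> int:
    """Map a tenant to a partition index that is stable across processes."""
    # Python's built-in hash() is randomized per process for strings,
    # so a cryptographic digest is used to get a stable value instead.
    digest = hashlib.sha256(tenant_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % partitions
```

With plain modulo routing, changing the partition count remaps most tenants, so the count is usually fixed up front or replaced by a lookup table or consistent hashing.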
Security
Multiple layers of security:
- Authentication - OAuth 2.0 + JWT tokens
- Authorization - Role-based access control (RBAC)
- Encryption - TLS in transit, AES at rest
- Audit Logging - All actions tracked and logged
- Rate Limiting - Per-user request quotas that limit abuse and help absorb request floods
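
The JWT and RBAC layers can be sketched together: verify the token's signature and expiry, then check the role's permissions. This standard-library example assumes HS256 signing and a `role` claim, neither of which is specified in this document, and the role names and permission strings are hypothetical. A production service would use a maintained JWT library; the OAuth 2.0 flow that issues the token is not shown.

```python
import base64
import hashlib
import hmac
import json
import time

ROLE_PERMISSIONS = {  # hypothetical roles and permissions
    "viewer": {"documents:read"},
    "editor": {"documents:read", "documents:write"},
}


def _b64url(data: bytes) -> str:
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def sign(claims: dict, secret: bytes) -> str:
    """Build an HS256-signed token (used here only to produce test input)."""
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    payload = _b64url(json.dumps(claims).encode())
    sig = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    return f"{header}.{payload}.{_b64url(sig)}"


def verify(token: str, secret: bytes, now=None) -> dict:
    """Return the claims if signature and expiry check out, else raise ValueError."""
    try:
        header, payload, sig = token.split(".")
    except ValueError:
        raise ValueError("malformed token")
    expected = hmac.new(secret, f"{header}.{payload}".encode(), hashlib.sha256).digest()
    # compare_digest avoids leaking how many signature bytes matched.
    if not hmac.compare_digest(expected, _b64url_decode(sig)):
        raise ValueError("bad signature")
    if json.loads(_b64url_decode(header)).get("alg") != "HS256":
        raise ValueError("unexpected algorithm")
    claims = json.loads(_b64url_decode(payload))
    if claims.get("exp", 0) <= (time.time() if now is None else now):
        raise ValueError("token expired")
    return claims


def is_allowed(claims: dict, permission: str) -> bool:
    """RBAC check: does the role named in the claims grant this permission?"""
    return permission in ROLE_PERMISSIONS.get(claims.get("role"), set())
```

Authentication (`verify`) and authorization (`is_allowed`) are kept as separate steps so that a valid token with an insufficient role is rejected as forbidden rather than as unauthenticated.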
Technologies
| Component | Technology | Version |
|---|---|---|
| API | Node.js + Express | Node.js 18+ |
| Workers | Python + FastAPI | Python 3.9+ |
| Queue | Redis | 7.0+ |
| Database | PostgreSQL | 14+ |
| Docs | MkDocs | 1.5+ |
| Deployment | Vercel + Docker | - |
Related Documentation
- API Reference - Complete API documentation
- Contributing Guide - Development setup and contribution guidelines
- Deployment Guide - Deployment and operations
- Troubleshooting - Common issues and solutions