Architecture Documentation
This section covers the system architecture, design patterns, and structural documentation for Arches - a high-performance data processing platform with AI-powered capabilities.
Overview
Arches is built on modern cloud-native principles with a focus on scalability, maintainability, and developer experience. The platform combines workflow automation, AI integration, and data processing in a cohesive architecture.
Documentation Structure
- System Architecture - Complete system design, patterns, and technical decisions
- Authentication - JWT-based authentication and authorization architecture
- Project Layout - Directory structure, code organization, and conventions
Core Architecture Principles
1. Hexagonal Architecture (Ports & Adapters)
Arches implements Hexagonal Architecture to achieve true separation of concerns:
Code
Benefits:
- Business logic remains independent of frameworks
- Easy to test with mock implementations
- Flexible infrastructure changes without affecting core logic
- Clear separation between technical and business concerns
2. Domain-Driven Design (DDD)
The system is organized into distinct bounded contexts, each representing a core business domain:
Auth Domain
- User authentication and session management
- JWT token generation and validation
- OAuth provider integration
- Password reset and email verification
Organizations Domain
- Multi-tenant organization management
- Member invitations and role management
- Billing and subscription handling
- Organization-level settings
Pipelines Domain
- DAG-based pipeline definitions
- Pipeline execution engine
- Task orchestration and scheduling
- Execution monitoring
Artifacts Domain
- Artifact storage and retrieval
- File management and versioning
- Content processing
- Metadata management
Tools Domain
- Tool registry and integration
- Tool capability management
- External service integration
- Tool execution framework
Runs Domain
- Pipeline run management
- Run history and monitoring
- Execution state tracking
- Result aggregation
Sessions Domain
- User session management
- Session persistence and caching
- Session-based authentication
- Activity tracking
Each domain maintains:
- Isolated data models - No shared database tables
- Independent APIs - Domain-specific endpoints
- Clear boundaries - No cross-domain imports
- Event communication - Domains interact through events
3. Code Generation Strategy
Arches leverages a unified code generation system to maintain consistency and reduce boilerplate:
OpenAPI-Driven Development
Code
Generates:
- Type definitions (
types.gen.go) - Repository interface (
repository.gen.go) - PostgreSQL implementation (
postgres.gen.go) - SQLite implementation (
sqlite.gen.go) - Service interface (
service.gen.go) - HTTP server implementation (
server.gen.go) - API client interface (
handler.gen.go) - Test mocks (
mocks_test.gen.go)
SQL-First Database Layer
Code
Generates:
- Type-safe database queries via SQLC
- Database models and interfaces
Unified Code Generation
The unified generator reads x-codegen annotations to produce:
- Repository Layer: Interface and database implementations (PostgreSQL/SQLite)
- Service Layer: Business logic interfaces and HTTP servers
- Events Layer: Optional event publishing with NATS/Redis
- Configuration: Default values and environment mappings
4. Technology Stack
Backend
- Language: Go 1.21+ for performance and simplicity
- Framework: Echo for HTTP routing
- Database: PostgreSQL with pgvector extension
- Cache: Redis for session and data caching
- Message Queue: NATS for event streaming
Frontend
- Framework: React with TanStack Router
- UI Library: Custom component library
- State Management: TanStack Query
- Build Tool: Vite for fast development
Infrastructure
- Container: Docker with multi-stage builds
- Orchestration: Kubernetes with Helm charts
- CI/CD: GitHub Actions
- Monitoring: Prometheus + Grafana
- Logging: Structured logging with Loki
5. Scalability Patterns
Horizontal Scaling
- Stateless services enable easy horizontal scaling
- Load balancing across multiple instances
- Database connection pooling
- Redis cluster for distributed caching
Async Processing
- Background job queues for heavy operations
- Event-driven architecture for decoupling
- Workflow execution in separate workers
- Rate limiting and backpressure handling
Performance Optimization
- Query optimization with indexes
- Caching at multiple layers
- CDN for static assets
- Database read replicas
Key Design Decisions
Why Go?
- Excellent performance for I/O-bound operations
- Simple deployment (single binary)
- Strong standard library
- Great tooling and testing support
Why PostgreSQL?
- ACID compliance for critical data
- pgvector for AI/ML embeddings
- Rich feature set (JSONB, full-text search)
- Mature ecosystem
Why Code Generation?
- Type safety across the stack - Generated types ensure compile-time safety
- Reduced boilerplate code - Automatic generation of CRUD operations
- Consistent patterns - Unified structure across all domains
- Self-documenting APIs - OpenAPI serves as single source of truth
- Automatic mock generation - Mockery generates test mocks from interfaces
- Multi-database support - Same interface, multiple implementations
Architecture Evolution
The architecture is designed to evolve:
- Current State: Monolithic with domain separation
- Next Phase: Service extraction for high-traffic domains
- Future State: Full microservices with service mesh
Getting Started
To understand the architecture in practice:
- Review the System Design for detailed patterns
- Explore the Project Layout to understand code organization
- Check Authentication for security architecture
- See working examples in the
internal/directory
Architecture Guidelines
When contributing to Arches:
- Respect domain boundaries - Don't import across domains
- Define first, generate second - Use OpenAPI/SQL before coding
- Test at the right level - Unit tests for logic, integration for APIs
- Document decisions - Add ADRs for significant changes
- Optimize thoughtfully - Measure before optimizing