Self-hosted PaaS on bare metal. Architecture, deployment, costs, technical debt, waitlist, and everything a buyer needs to evaluate the asset.
All traffic enters through Cloudflare, reaches Traefik for TLS termination and routing, and is then forwarded to the API, the web dashboard, or user-deployed applications. There is no Kubernetes; orchestration uses HashiCorp Nomad and Consul.
                   INTERNET
                       |
        +--------------v---------------+
        |    Cloudflare (DNS / WAF)    |
        |         cloudrail.ca         |
        +--------------+---------------+
                       | :443
        +--------------v---------------+
        |          TRAEFIK v3          |
        |      Ingress + TLS + LB      |
        |   Consul Catalog discovery   |
        +-------+------+---------+-----+
                |      |         |
   +------------v-+ +--v------+ +v--------------+
   | CloudRail    | |CloudRail| | User Apps     |
   | API (NestJS) | |Web(Next)| | (Nomad Jobs)  |
   | :3001        | | :3000   | |               |
   +--+-------+---+ +---------+ +---------------+
      |       |
+-----v---+ +-v-----------+
|Postgres | |    Redis    |
|  :5433  | |    :6380    |
|(TypeORM)| |  (BullMQ)   |
+---------+ +-------------+
| Component | Technology | Purpose |
|---|---|---|
| API | NestJS 11 GraphQL | Backend. Auth, deployments, billing, GitHub webhooks, async jobs |
| Web | Next.js 15 Tailwind v4 | Dashboard UI. Project management, deploy tracking, logs, monitoring |
| Orchestrator | Nomad | Container scheduling, rolling deploys, health checks, scaling |
| Service Mesh | Consul | Service discovery, health checks, KV config store |
| Ingress | Traefik v3 | HTTP/HTTPS routing, auto TLS via Let's Encrypt, load balancing |
| Database | PostgreSQL 17 | Primary data store (TypeORM ORM) |
| Cache/Queue | Redis 7 BullMQ | Session cache, rate limiting, async build/deploy job queue |
| Registry | Harbor | Private Docker image registry |
| Monitoring | Prometheus Grafana Loki | Metrics, dashboards, centralized logs |
| Alerting | Alertmanager | Rule-based alerts (email/webhook) |
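The Ingress row relies on Traefik reading routing rules from Consul service tags. A minimal sketch of how such tags could be generated is shown here. The helper name `traefikTags`, the `RouteSpec` shape, the `websecure` entrypoint name, and the cert resolver name are all assumptions for illustration; only the `traefik.enable` and `traefik.http.routers.<name>.*` tag conventions come from Traefik itself.

```typescript
// Hypothetical helper: builds the Consul service tags that Traefik's
// Consul Catalog provider interprets as routing configuration.
interface RouteSpec {
  serviceName: string;  // used as the Traefik router name (illustrative)
  host: string;         // public hostname to route to this service
  certResolver: string; // ACME resolver name from Traefik's static config (assumed)
}

function traefikTags(spec: RouteSpec): string[] {
  const router = `traefik.http.routers.${spec.serviceName}`;
  return [
    "traefik.enable=true",
    `${router}.rule=Host(\`${spec.host}\`)`,
    // "websecure" is a conventional entrypoint name, not confirmed by the source.
    `${router}.entrypoints=websecure`,
    `${router}.tls.certresolver=${spec.certResolver}`,
  ];
}

const tags = traefikTags({
  serviceName: "my-app",
  host: "my-app.example.com",
  certResolver: "letsencrypt",
});
console.log(tags.join("\n"));
```

Tags of this kind would typically be attached to the `service` block of a Nomad job so that a newly scheduled container becomes routable without a Traefik reload.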
Turborepo monorepo with pnpm workspaces. 8 packages sharing types via @cloudrail/shared.
cloudrail/
+-- packages/
|   +-- api/            NestJS backend
|   +-- web/            Next.js dashboard
|   +-- shared/         Shared TS types
|   +-- nomad-client/   Nomad API wrapper
|   +-- cli/            CLI tool
|   +-- mcp-server/     AI agent (MCP)
|   +-- e2e/            Playwright tests
|   +-- marketing/      Marketing site
+-- infra/
|   +-- nomad/          HCL job definitions
|   +-- production/
|   |   +-- terraform/  Azure + Cloudflare
|   |   +-- ansible/    Config + deploy
|   +-- configs/        Monitoring config
+-- .github/workflows/  CI/CD
@cloudrail/web -------> @cloudrail/shared
@cloudrail/api --+--> @cloudrail/shared
                 +--> @cloudrail/nomad-client
@cloudrail/cli -------> GraphQL -> API
@cloudrail/mcp -------> HTTP -> API
@cloudrail/e2e -------> Playwright -> Web+API
| Package | Stack |
|---|---|
| api | NestJS 11, GraphQL Apollo, TypeORM, BullMQ |
| web | Next.js 15, React 19, Tailwind v4, Shadcn |
| shared | TypeScript types (DeploymentStatus, etc.) |
| nomad-client | Nomad HTTP API wrapper |
| cli | Commander.js, GraphQL client |
| mcp-server | Model Context Protocol (AI agents) |
| e2e | Playwright browser tests |
| marketing | Next.js marketing site |
NestJS application with 24+ domain modules. GraphQL (Apollo) is the primary API, with REST available as an alternative.
User --< WorkspaceMember >-- Workspace --< Project
                                              |
                             +----------------+-----------------+
                             |                |                 |
                          Service          Database        Environment
                             |                |                 |
                        Deployment      DatabaseBackup      Variable
                             |                            StagedChange
                         BuildLog

Service --< Domain            Workspace --< GitHubInstallation
Service --< AutoscalePolicy   Workspace --< Subscription (Billing)
Service --< Variable          Workspace --< Invite
User enters email
|
v
API generates MagicLinkToken
|
v
Email sent via Resend
|
v
User clicks link
|
v
API validates -> JWT (HS256, 24h)
|
v
httpOnly cookie "cloudrail_session"
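The last two steps of the flow above (HS256 JWT with a 24-hour lifetime, delivered in an httpOnly cookie) can be sketched with Node's built-in `crypto` module. This is an illustration under stated assumptions, not the project's code: the function names, the claim set (`sub`, `iat`, `exp`), and the `Secure`, `SameSite`, and `Path` cookie attributes are hypothetical, and a real service would more likely rely on a maintained JWT library.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

const DAY_SECONDS = 24 * 60 * 60; // 24h lifetime, as stated in the flow

function b64url(text: string): string {
  return Buffer.from(text, "utf8").toString("base64url");
}

// HMAC-SHA256 over the signing input, base64url-encoded (the "HS256" algorithm).
function sign(data: string, secret: string): string {
  return createHmac("sha256", secret).update(data).digest("base64url");
}

function issueSessionToken(userId: string, secret: string, nowSec: number): string {
  const header = b64url(JSON.stringify({ alg: "HS256", typ: "JWT" }));
  const payload = b64url(
    JSON.stringify({ sub: userId, iat: nowSec, exp: nowSec + DAY_SECONDS }),
  );
  const signingInput = `${header}.${payload}`;
  return `${signingInput}.${sign(signingInput, secret)}`;
}

// Returns the claims when the signature matches and the token is unexpired, else null.
function verifySessionToken(
  token: string,
  secret: string,
  nowSec: number,
): { sub: string; exp: number } | null {
  const parts = token.split(".");
  if (parts.length !== 3) return null;
  const [header, payload, signature] = parts;
  const expected = Buffer.from(sign(`${header}.${payload}`, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual throws on unequal lengths, so check length first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) return null;
  const claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
  return typeof claims.exp === "number" && claims.exp > nowSec ? claims : null;
}

// Cookie name comes from the flow above; the other attributes are assumptions.
function sessionCookie(token: string): string {
  return `cloudrail_session=${token}; HttpOnly; Secure; SameSite=Lax; Path=/; Max-Age=${DAY_SECONDS}`;
}
```

Because the verifier always recomputes an HMAC-SHA256 signature, a token whose header claims a different algorithm is rejected rather than trusted.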
From git push to running container. Fully automated through BullMQ job queues with real-time status updates.
GitHub push event hits API. Signature validated. Repo matched to Service. Deployment created (status: queued).
BullMQ build processor clones at specific commit SHA using simple-git. Status: cloning.
Nixpacks auto-detects language/framework. Generates multi-stage Dockerfile. Docker builds image. Status: building.
Image tagged harbor.internal/project/service:sha and pushed to private registry. Status: pushing.
Job spec generated with resource limits, env vars, health checks, Traefik routing tags. Submitted to Nomad. Zero-downtime rolling update. Status: deploying.
Traefik auto-discovers via Consul Catalog. Let's Encrypt provisions TLS. Email via Resend, webhook POST, WebSocket push to dashboard. Status: succeeded.
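The six stages above imply a strictly ordered status progression. The sketch below models it as a small state machine. The status names and the image tag format are taken from the text; the `failed` status, the function names, and the rule that any in-flight stage may fail are assumptions, and the real `DeploymentStatus` type in `@cloudrail/shared` may differ.

```typescript
// Statuses named in the pipeline description, in order.
const PIPELINE = [
  "queued",
  "cloning",
  "building",
  "pushing",
  "deploying",
  "succeeded",
] as const;

type PipelineStatus = (typeof PIPELINE)[number];
// "failed" is an assumed terminal status; the source does not name one.
type DeploymentStatus = PipelineStatus | "failed";

// Next status on the happy path, or null when the status is terminal.
function nextStatus(current: DeploymentStatus): DeploymentStatus | null {
  const i = PIPELINE.indexOf(current as PipelineStatus);
  if (i === -1 || i === PIPELINE.length - 1) return null;
  return PIPELINE[i + 1];
}

// A transition is valid if it advances exactly one stage or fails an in-flight deployment.
function canTransition(from: DeploymentStatus, to: DeploymentStatus): boolean {
  if (from === "succeeded" || from === "failed") return false;
  return to === "failed" || nextStatus(from) === to;
}

// Image reference in the format given in the push step.
function imageTag(project: string, service: string, sha: string): string {
  return `harbor.internal/${project}/${service}:${sha}`;
}

console.log(nextStatus("queued"));                  // cloning
console.log(canTransition("building", "deploying")); // false: skips "pushing"
console.log(imageTag("shop", "api", "abc123"));      // harbor.internal/shop/api:abc123
```

Rejecting skipped stages in one place makes it harder for a retried or duplicated job to move a deployment backwards or jump it straight to `succeeded`.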
Node.js, Python, Go, Ruby, Rust, PHP, Java, .NET, Elixir, Haskell, Clojure, Dart, Swift, Zig, Crystal, Scala, F#, Deno, Bun, Static sites, and custom Dockerfiles as override.
Production runs on an Azure VM provisioned by Terraform and configured with Ansible. All services run as Nomad jobs.
| Service | Image | Port | Purpose |
|---|---|---|---|
| Traefik | traefik:v3 | 80, 443 | Ingress, TLS, load balancing |
| PostgreSQL | postgres:17 | 5433 | Primary data store |
| Redis | redis:7.4-alpine | 6380 | Cache + BullMQ queue |
| Harbor | goharbor/harbor | 8880 | Docker image registry |
| Prometheus | prom/prometheus | 9090 | Metrics collection |
| Grafana | grafana/grafana | 3000 | Dashboards |
| Loki | grafana/loki | 3100 | Log aggregation |
| Alloy | grafana/alloy | - | Log/metric collector |
| Alertmanager | prom/alertmanager | 9093 | Alert routing |
| MinIO | minio/minio | 9000 | S3-compatible storage |
| CloudRail API | custom build | 3001 | Backend server |
| CloudRail Web | custom build | 3000 | Dashboard |
Azure VM, VNET, NSG, Public IP, Blob Storage (state backend). Cloudflare DNS + WAF. State stored remotely in Azure Blob.
Providers: azurerm, cloudflare
One command: terraform apply
7-stage playbook: Common setup, Consul cluster, Nomad cluster, ACL hardening, platform services, backups, application deployment.
One command: ansible-playbook playbook.yml
Deploy time: ~45 minutes from bare VM
PR/push: lint, typecheck, unit tests, integration tests (real PG+Redis)
Push to main: build images, push Harbor, E2E tests, deploy staging
Manual: build, push, deploy via Ansible/Nomad, post-deploy verify
PR: static analysis security scanning
Internet --> Cloudflare WAF --> :443 (Traefik ONLY)
                                 |
                  +--------------+------------- Private (10.0.1.0/24)
                  |              |
             Nomad :4646    Consul :8500
             (localhost)    (localhost only)
             ACL enabled    ACL enabled
Azure NSG: ONLY ports 22 (SSH), 80 (HTTP), 443 (HTTPS) open
Total recurring cost: approximately $55/month. Scales linearly with customer count.
| Plan | CPU | Memory | Replicas |
|---|---|---|---|
| Trial | 512 MHz | 512 MB | 1 |
| Hobby | 2048 MHz | 2048 MB | 3 |
| Pro | 8192 MHz | 8192 MB | 10 |
Plus usage-based: $0.000463/CPU-min, $0.0000068/MB-min. $5 trial credit on signup.
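The usage-based charge can be worked through with a short calculation. The two rates and the $5 credit come from the line above; the function names, the `Usage` shape, and the assumption that the trial credit is simply subtracted from metered usage are illustrative.

```typescript
const CPU_RATE_USD = 0.000463;  // per CPU-minute
const MEM_RATE_USD = 0.0000068; // per MB-minute
const TRIAL_CREDIT_USD = 5;     // granted on signup

interface Usage {
  cpuMinutes: number;
  mbMinutes: number; // memory in MB multiplied by minutes of runtime
}

// Metered cost before any credit is applied.
function usageCost(u: Usage): number {
  return u.cpuMinutes * CPU_RATE_USD + u.mbMinutes * MEM_RATE_USD;
}

// Amount owed after subtracting a credit (assumed to never go negative).
function amountDue(u: Usage, creditUsd: number = TRIAL_CREDIT_USD): number {
  return Math.max(0, usageCost(u) - creditUsd);
}

const sample: Usage = { cpuMinutes: 1_000, mbMinutes: 100_000 };
console.log(usageCost(sample).toFixed(3)); // 1.143
console.log(amountDue(sample));            // 0
```

In this sample, 1,000 CPU-minutes cost $0.463 and 100,000 MB-minutes cost $0.68, so the total of about $1.14 is fully covered by the trial credit.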
Pre-launch status: the platform is production-ready and accepting signups at cloudrail.ca.
It is deployed and live, with all 14 Nomad jobs running. Stripe billing is configured in test mode (swap to live keys to start charging). There are no paying customers yet, and no public waitlist page has been set up.
Honest assessment of what's incomplete or needs work. Organized by priority.
The core value proposition is unconfirmed on production: the push to GitHub -> build -> deploy -> public URL path still needs a real-world test.
P1 ~30 min fix
Playwright tests still authenticate through Auth0, which was replaced by magic-link login in PR #2. The tests pass but do not exercise the real login flow.
P1 ~1 hr fix
findOrCreateFromGithub lacks error handling on workspace creation. When creation fails, users see an empty dashboard.
P1 ~5 min fix
Verify that the payment webhook handler checks the verif-hash header. Without that check, anyone can POST fake payment events.
P1 ~15 min fix
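A minimal sketch of such a header check follows. It assumes the `verif-hash` header carries a shared secret that must equal a value configured on the server; if the payment provider instead signs the request body, the check would need to recompute that signature. The function name and the lowercase header lookup are hypothetical.

```typescript
import { createHash, timingSafeEqual } from "node:crypto";

// Returns true only when the webhook's verif-hash header matches the configured secret.
function isTrustedWebhook(
  headers: Record<string, string | undefined>,
  expectedSecret: string,
): boolean {
  const received = headers["verif-hash"]; // Node lowercases incoming header names
  if (!received || !expectedSecret) return false;
  // Hash both values so the buffers have equal length, which timingSafeEqual requires,
  // and so the comparison time does not depend on where the strings differ.
  const a = createHash("sha256").update(received).digest();
  const b = createHash("sha256").update(expectedSecret).digest();
  return timingSafeEqual(a, b);
}

console.log(isTrustedWebhook({ "verif-hash": "s3cret" }, "s3cret")); // true
console.log(isTrustedWebhook({}, "s3cret"));                         // false
```

A handler would call this before parsing the event and respond with 401 when it returns false.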
Alertmanager currently uses a null receiver, so no alerts fire if services go down. It needs a real Discord or Slack webhook.
P2 ~10 min fix
Hanging builds leave users with an infinite "Building" spinner. Build jobs in BullMQ need a 15-minute timeout.
P2 ~20 min fix
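One way to bound a build is to race it against a timer inside the job processor. The sketch below is a generic promise wrapper, not a BullMQ API; `withTimeout` and `BuildTimeoutError` are hypothetical names, and only the 15-minute figure comes from the item above.

```typescript
const BUILD_TIMEOUT_MS = 15 * 60 * 1000; // 15 minutes

class BuildTimeoutError extends Error {
  constructor(ms: number) {
    super(`build exceeded ${ms} ms`);
    this.name = "BuildTimeoutError";
  }
}

// Rejects with BuildTimeoutError if `work` has not settled within `ms`.
function withTimeout<T>(work: Promise<T>, ms: number = BUILD_TIMEOUT_MS): Promise<T> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<never>((_, reject) => {
    timer = setTimeout(() => reject(new BuildTimeoutError(ms)), ms);
  });
  // Clear the timer either way so it cannot keep the process alive.
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// Usage: a slow stand-in for a build, cut off after 20 ms.
const slowBuild = new Promise<string>((resolve) => setTimeout(() => resolve("image"), 200));
withTimeout(slowBuild, 20).catch((err: Error) => console.log(err.name)); // BuildTimeoutError
```

The wrapper only stops waiting; it does not cancel the underlying work. A processor using it would still need to kill the build process and mark the deployment as failed so the dashboard spinner resolves.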
| Item | Impact |
|---|---|
| Build service: zero tests (critical path) | Test coverage gap |
| REST v1 API: zero tests | Test coverage gap |
| Integration tests: empty stubs | Test coverage gap |
| ~322 hardcoded color values | Cosmetic only |
| ~28 raw HTML inputs (not design system) | Cosmetic only |
| Redis backup not configured | BullMQ state at risk |
| Horizontal scaling docs missing | Documentation gap |
| Harbor tenant isolation unverified | Multi-tenant security |
| Build deduplication unverified | Edge case (double-push) |
With AI-assisted development: approximately 2-3 hours. Without AI: 1-2 days. All fixes are well-scoped with clear file paths and instructions in TODOS.md.
Why CloudRail is for sale and what's included in the transfer.
Shifting focus to HitchPay (fintech/credit scoring product). CloudRail is a complete, production-ready asset that needs a dedicated owner to market and grow it. It's not abandoned, just not the primary focus.
A freelancer would charge $80K-$150K+ to build this from scratch (6-12 months of work). This is a complete, deployed, production-ready PaaS with billing, monitoring, CI/CD, and admin tools.
| Metric | Value |
|---|---|
| Development effort | 400+ commits, 49 phases |
| Current MRR | $0 (pre-revenue) |
| Monthly costs | ~$55/month |
| Break-even | ~10 customers |
| Tech stack age | Latest (NestJS 11, Next 15) |
| White-label ready | Yes |
Open to offers. Preferred closing within 2-4 weeks. Transfer includes a 1-2 hour live walkthrough call and 30 days of async support via email/chat for any technical questions during onboarding.