
Building a Hosting Platform from Scratch

Umut Korkmaz · 2025-01-15 · 12 min read

Building a hosting platform from scratch is a systems problem more than a UI problem. You are not just building signup forms and dashboards. You are building provisioning flows, tenant isolation, billing states, service orchestration, and operational tooling that has to behave predictably under failure.

My stack for the hosting platform combined Next.js for the dashboard, Node.js services for orchestration, FastAPI for infrastructure-facing automation endpoints, and PostgreSQL for persistent state. The architecture was shaped around one question: how do we turn a user action in the control panel into a safe, observable backend workflow?

The Platform Needed Clear Service Boundaries

The first mistake in this kind of product is stuffing everything into one application. Provisioning, billing, identity, DNS operations, and UI rendering all evolve at different speeds. I separated them into focused services with clear contracts.

```ts
export interface ProvisioningRequest {
  tenantId: string
  product: 'shared-hosting' | 'vps' | 'database'
  region: string
  plan: string
  requestedBy: string
}

export interface ProvisioningResult {
  orderId: string
  status: 'queued' | 'running' | 'completed' | 'failed'
  resourceIds: string[]
}
```

That contract sat between the control panel and the orchestration layer. The frontend did not need to know how a VPS was actually created. It only needed a stable lifecycle to display.
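Because the lifecycle is a closed set of statuses, the dashboard can render any order without knowing what happens behind it. As a minimal sketch (the label copy here is illustrative, not from the real product):

```ts
// The status values come from the ProvisioningResult contract above.
type OrderStatus = 'queued' | 'running' | 'completed' | 'failed'

// Map each lifecycle status to user-facing copy. If the backend adds a
// status, the Record type forces this map to be updated too.
const STATUS_LABELS: Record<OrderStatus, string> = {
  queued: 'Waiting to start',
  running: 'Provisioning in progress',
  completed: 'Ready to use',
  failed: 'Something went wrong',
}

export function describeOrder(status: OrderStatus): string {
  return STATUS_LABELS[status]
}
```

The exhaustive `Record` type is the useful part: the compiler flags any status the UI forgot to handle.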

Next.js Handled the Control Panel

The control panel focused on account management, resource visibility, and action dispatch. It was not responsible for long-running tasks.

```ts
export async function createServerAction(input: CreateServerInput) {
  const response = await fetch(`${process.env.API_URL}/provisioning/orders`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(input),
  })

  if (!response.ok) {
    throw new Error('Failed to create provisioning order')
  }

  return response.json()
}
```

That separation kept the web layer clean. A dashboard request should never block while infrastructure work runs for minutes.

Orchestration Needed Durable State

The orchestration service turned business events into infrastructure tasks. The important part was durability: every order had to be resumable, debuggable, and safe to retry.

```ts
export async function enqueueProvisioningOrder(order: ProvisioningRequest) {
  const record = await db.order.create({
    data: {
      tenantId: order.tenantId,
      product: order.product,
      region: order.region,
      plan: order.plan,
      status: 'queued',
    },
  })

  await queue.add('provision-resource', { orderId: record.id })
  return record
}
```

Once the order existed in the database, the worker layer could process it asynchronously and update status transitions without losing visibility.
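On the worker side, the shape was roughly this: claim the job, record each status change, and make failure a terminal state rather than a crash. A simplified sketch, with the store interface abstracted so the flow is visible (the real code talked to PostgreSQL and used more statuses):

```ts
type OrderStatus = 'queued' | 'provisioning' | 'completed' | 'failed'

// Hypothetical abstraction over the orders table; every status change
// is persisted so the order remains visible and debuggable.
interface OrderStore {
  setStatus(orderId: string, status: OrderStatus): Promise<void>
}

export async function processProvisioningJob(
  store: OrderStore,
  orderId: string,
  provision: () => Promise<string[]>, // returns created resource ids
): Promise<OrderStatus> {
  await store.setStatus(orderId, 'provisioning')
  try {
    await provision()
    await store.setStatus(orderId, 'completed')
    return 'completed'
  } catch {
    // A failed order lands in a terminal status in the database,
    // where it can be inspected and safely retried.
    await store.setStatus(orderId, 'failed')
    return 'failed'
  }
}
```

The injected `provision` callback is what makes retries safe to reason about: the worker owns state transitions, the callback owns infrastructure side effects.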

FastAPI Was a Good Fit for Infrastructure Automation

I used FastAPI for machine-facing automation endpoints because the ergonomics for background tasks, validation, and typed request bodies were strong, especially for infrastructure workflows.

```py
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class DnsRecordRequest(BaseModel):
    zone: str
    record_type: str
    name: str
    value: str

@app.post('/dns/records')
async def create_record(payload: DnsRecordRequest):
    # call provider SDK or internal DNS service
    return {'status': 'queued', 'name': payload.name}
```

That service did not own customer-facing UX. It owned deterministic infrastructure operations with clear request and response models.
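The orchestration side benefited from validating payloads before dispatching them to the automation service, so obviously bad requests never crossed the service boundary. A hedged sketch of what that pre-flight check could look like on the Node side; the allowed record types and the trailing-dot zone convention are assumptions for illustration:

```ts
// Mirrors the Pydantic DnsRecordRequest model on the FastAPI side.
interface DnsRecordRequest {
  zone: string
  record_type: string
  name: string
  value: string
}

// Illustrative allow-list; a real platform would derive this from the
// record types its DNS provider actually supports.
const ALLOWED_TYPES = ['A', 'AAAA', 'CNAME', 'TXT', 'MX']

export function validateDnsRequest(req: DnsRecordRequest): string[] {
  const errors: string[] = []
  if (!req.zone.endsWith('.')) {
    errors.push('zone must be fully qualified and end with "."')
  }
  if (!ALLOWED_TYPES.includes(req.record_type)) {
    errors.push(`unsupported record type: ${req.record_type}`)
  }
  if (req.name.length === 0) {
    errors.push('record name must not be empty')
  }
  return errors
}
```

Returning a list of errors rather than throwing on the first one makes the feedback usable in both API responses and admin tooling.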

Provisioning Was Built as a State Machine

Long-running workflows become much easier to debug when they move through explicit states instead of ad hoc logs.

```ts
export type OrderStatus =
  | 'queued'
  | 'validating'
  | 'provisioning'
  | 'configuring'
  | 'completed'
  | 'failed'
```

Each state transition emitted an event and a timestamp. That made the admin panel, customer notifications, and support tooling much easier to implement later.
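One way to enforce that discipline is an explicit transition table, so an illegal state jump can never be recorded in the first place. A minimal sketch; the allowed edges below are an assumption based on the statuses above, not the platform's actual rules:

```ts
type OrderStatus =
  | 'queued'
  | 'validating'
  | 'provisioning'
  | 'configuring'
  | 'completed'
  | 'failed'

// Each state lists the states it may legally move to.
// 'completed' and 'failed' are terminal.
const TRANSITIONS: Record<OrderStatus, OrderStatus[]> = {
  queued: ['validating', 'failed'],
  validating: ['provisioning', 'failed'],
  provisioning: ['configuring', 'failed'],
  configuring: ['completed', 'failed'],
  completed: [],
  failed: [],
}

export function canTransition(from: OrderStatus, to: OrderStatus): boolean {
  return TRANSITIONS[from].includes(to)
}

interface OrderEvent {
  orderId: string
  from: OrderStatus
  to: OrderStatus
  at: string // ISO-8601 timestamp
}

// Reject illegal jumps, and stamp every legal one with a timestamp
// so the audit trail reconstructs itself.
export function recordTransition(orderId: string, from: OrderStatus, to: OrderStatus): OrderEvent {
  if (!canTransition(from, to)) {
    throw new Error(`illegal transition: ${from} -> ${to}`)
  }
  return { orderId, from, to, at: new Date().toISOString() }
}
```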

PostgreSQL Was the Source of Truth

I used PostgreSQL for customer accounts, orders, invoices, service assignments, and audit trails. The key decision was treating infrastructure state as application data, not just log output.

That meant support staff could answer practical questions quickly:

  1. When did an order start?
  2. Which step failed?
  3. Was DNS configured?
  4. Was billing activated before provisioning completed?

If those questions require grepping logs across multiple services, the platform becomes painful to operate.
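With transitions stored as timestamped rows, questions like the fourth one reduce to a comparison over application data. A sketch of the idea; the event type names are hypothetical:

```ts
// One row per state transition or business event, as stored in PostgreSQL.
interface AuditEvent {
  type: string
  at: string // ISO-8601 timestamp
}

// Answers: "was billing activated before provisioning completed?"
export function billingActivatedEarly(events: AuditEvent[]): boolean {
  const at = (type: string) => events.find(e => e.type === type)?.at
  const billing = at('billing.activated')
  const completed = at('provisioning.completed')
  if (!billing || !completed) return false
  // ISO-8601 strings of equal precision compare chronologically.
  return billing < completed
}
```

In production this would be a SQL query over the audit table rather than an in-memory scan, but the point stands: the answer lives in data, not in logs.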

The Biggest Lesson: Operability Is a Feature

What made the hosting platform feel solid was not one technology choice. It was the discipline of keeping workflows observable, stateful, and resumable. Every provisioning task needed a visible status. Every failure needed a place to land. Every retry needed to be safe.

That is the difference between a demo platform and a real one. In a real platform, the hardest part is not creating resources. It is making resource creation understandable to both users and operators.