k8sapi / explainer 01
cluster founding flow

POST /clusters

This endpoint is the act of turning an authenticated operator request into durable cluster truth. From first principles, no VM should be trusted to invent the cluster it belongs to. The API must define the shared trust root, create the control-plane machines, and persist the node facts that later bootstrap depends on.

What it creates

Cluster-wide PKI, service-account signing keys, control-plane VMs, and one node record per control-plane facet.

Why it exists

Because cluster identity must be consistent before any node tries to boot Kubernetes locally.

What it optimizes for

Immediate usefulness and a real boot path, not a full reconciliation engine or long-running operations system.

Section 01

Why this endpoint exists

A control plane is not one object. It is a coordination problem across shared trust, machine creation, address discovery, and durable cluster membership. If those facts are created separately by different actors, they drift. POST /clusters centralizes the founding moment.

Shared truth that must be born once

Cluster CA: Every later server or client cert needs one trust root.
Service-account signer: The control plane needs one stable signing authority for service-account identity.
Control-plane naming model: The cluster needs deterministic names for its shared endpoint and each control-plane node.

Node truth that must be persisted early

Which VMs belong to the control plane: The future bootstrap flow is keyed by VM ID.
Which certs belong to which node: etcd and apiserver serving identities are per-node, not shared secrets.
How those nodes are addressed: Bootstrap needs current network identity, not guessed placeholders.

Section 02

What the request contract really means

Scope comes from auth, not JSON

The handler does not accept org_id in the body. Organization scope comes from the bearer token already validated by authware.AuthMiddleware. That prevents a caller from claiming a different org in the request payload.

Auth first: the token's org is authoritative.

The body only describes shape

The request tells the API how many control-plane VMs to make and where to place them. It does not define cluster PKI, cluster IDs, or the internal node rows. Those are API-owned responsibilities.

Shape and placement, not identity or trust.

Accepted fields

count: control_plane_count (default 1, max 3)
image: control_plane_image_id (required, positive)
size: control_plane_instance_type (required)
zone: zone_id (default 1)
subnet: subnet_id (required)
security: security_group_ids (optional)
extras: allocate_public_ipv4, ssh_pubkey, root_volume_size_gib
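The accepted fields can be sketched as a request struct with defaults and validation. Field names follow the JSON keys above, but the struct and method names are assumptions, not the handler's actual types:

```go
package main

import (
	"errors"
	"fmt"
)

// CreateClusterRequest mirrors the accepted body fields described above.
type CreateClusterRequest struct {
	ControlPlaneCount        int     `json:"control_plane_count"`
	ControlPlaneImageID      int64   `json:"control_plane_image_id"`
	ControlPlaneInstanceType string  `json:"control_plane_instance_type"`
	ZoneID                   int64   `json:"zone_id"`
	SubnetID                 int64   `json:"subnet_id"`
	SecurityGroupIDs         []int64 `json:"security_group_ids,omitempty"`
	AllocatePublicIPv4       bool    `json:"allocate_public_ipv4"`
	SSHPubkey                string  `json:"ssh_pubkey,omitempty"`
	RootVolumeSizeGiB        int     `json:"root_volume_size_gib,omitempty"`
}

// Normalize applies defaults before validation runs.
func (r *CreateClusterRequest) Normalize() {
	if r.ControlPlaneCount == 0 {
		r.ControlPlaneCount = 1
	}
	if r.ZoneID == 0 {
		r.ZoneID = 1
	}
}

// Validate rejects impossible shapes before any side effects begin.
func (r *CreateClusterRequest) Validate() error {
	switch {
	case r.ControlPlaneCount < 1 || r.ControlPlaneCount > 3:
		return errors.New("control_plane_count must be between 1 and 3")
	case r.ControlPlaneImageID <= 0:
		return errors.New("control_plane_image_id is required and must be positive")
	case r.ControlPlaneInstanceType == "":
		return errors.New("control_plane_instance_type is required")
	case r.SubnetID <= 0:
		return errors.New("subnet_id is required")
	}
	return nil
}

func main() {
	req := CreateClusterRequest{ControlPlaneImageID: 10, ControlPlaneInstanceType: "m1.large", SubnetID: 5}
	req.Normalize()
	fmt.Println(req.ControlPlaneCount, req.Validate())
}
```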

Not part of the current handler

org: org_id is derived from the validated token
project: project_id is not part of the current contract
existing VMs: control_plane_vm_ids belonged to an older, removed path

Section 03

The execution sequence, step by step

01

Validate and normalize

The handler parses the body, applies defaults, and rejects impossible shapes. That front-loads simple failures so the expensive side effects do not begin until the request is coherent.

02

Create cluster-wide trust

CreateCA generates the cluster CA and private key. CreateRSAKeyPairPEM generates the service-account signer. This is the cluster’s shared identity material.

03

Open the database transaction

repository.GetConnTransaction starts the transaction. CreateKubeCluster then inserts the cluster row. The transaction protects repo-owned state so the cluster either has a coherent persistent foundation or it does not.

04

Resolve control-plane security groups

If the caller omitted security_group_ids, the handler asks computeapi for the subnet CIDR, creates a security group, allows all IPv4 traffic from the subnet, and opens external TCP 6443. That encodes the minimum reachability the control plane needs.
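A sketch of the resulting rule set, with an illustrative rule type rather than computeapi's real request shapes:

```go
package main

import "fmt"

// sgRule is an illustrative rule shape; the real computeapi types differ.
type sgRule struct {
	Direction string // "ingress"
	Protocol  string // "all" or "tcp"
	Port      int    // 0 means all ports
	CIDR      string
}

// defaultControlPlaneRules encodes the minimum reachability described above:
// everything inside the subnet, plus the apiserver port from anywhere.
func defaultControlPlaneRules(subnetCIDR string) []sgRule {
	return []sgRule{
		{Direction: "ingress", Protocol: "all", Port: 0, CIDR: subnetCIDR},
		{Direction: "ingress", Protocol: "tcp", Port: 6443, CIDR: "0.0.0.0/0"},
	}
}

func main() {
	for _, r := range defaultControlPlaneRules("10.0.0.0/24") {
		fmt.Printf("%+v\n", r)
	}
}
```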

05

Create the control-plane VMs

One VM is created per desired control-plane ordinal. Each request includes a role tag and the bootstrap userdata shim. That means the VM is born with the means to fetch its real bundle later, not with a giant hardcoded cloud-init secret blob.

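A sketch of building one request per ordinal; the struct shape, role tag value, and name format are assumptions based on the naming described later in this explainer:

```go
package main

import "fmt"

// vmCreateRequest is an illustrative shape for what the handler sends to
// computeapi per control-plane ordinal; real field names may differ.
type vmCreateRequest struct {
	Name     string
	RoleTag  string
	UserData string
}

// controlPlaneVMRequests builds one request per ordinal. The userdata is a
// small shim that later fetches the node's real bundle, not a hardcoded blob.
func controlPlaneVMRequests(clusterID int64, count int, bootstrapShim string) []vmCreateRequest {
	reqs := make([]vmCreateRequest, 0, count)
	for i := 1; i <= count; i++ {
		reqs = append(reqs, vmCreateRequest{
			Name:     fmt.Sprintf("cp-%d.cluster-%d.internal", i, clusterID),
			RoleTag:  "control-plane",
			UserData: bootstrapShim,
		})
	}
	return reqs
}

func main() {
	for _, r := range controlPlaneVMRequests(7, 3, "shim") {
		fmt.Println(r.Name)
	}
}
```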
06

Discover real VM addresses from the local DB

After VM creation, the handler reads back each VM’s current name, IPv4, IPv6, and public IPv4 from the database. From first principles, this is safer than assuming the create response alone is the final truth.

07

Sign one etcd cert and one apiserver cert per node

The handler derives node-specific DNS SANs and IP SANs and signs two serving identities for each control-plane VM: one for stacked etcd peer/server traffic, one for the apiserver’s serving endpoint.

08

Persist node rows in both control-plane facets

Each VM gets one row in etcdnodes and one row in apiservernodes. That is how the MVP models a single machine acting as both an etcd member and an API server host.

09

Create the external admin kubeconfig and commit

The handler signs an admin client cert, renders a kubeconfig pointing at the shared control-plane DNS name on port 6443, and commits the transaction. Only after commit does the API treat the cluster as founded.

10

Attempt DNS as a best-effort side effect

DNS record creation happens after durable state exists. This is deliberate: DNS is useful, but the cluster’s root truth is the committed row set plus the future bootstrap path. DNS failure should not erase a valid cluster foundation.

Soft failure: not rolled back.

Section 04

Why the cert and naming strategy looks like this

Deterministic node names

Names like cp-1.cluster-<id>.internal let the system reconstruct topology consistently from durable rows later.
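A sketch of the two helpers such determinism implies, using the exact formats from the example name (the real code may differ in detail):

```go
package main

import "fmt"

// clusterDomain and controlPlaneNodeName show how deterministic names fall
// out of two durable facts: the cluster ID and the control-plane ordinal.
func clusterDomain(clusterID int64) string {
	return fmt.Sprintf("cluster-%d.internal", clusterID)
}

func controlPlaneNodeName(clusterID int64, ordinal int) string {
	return fmt.Sprintf("cp-%d.%s", ordinal, clusterDomain(clusterID))
}

func main() {
	fmt.Println(controlPlaneNodeName(7, 1)) // cp-1.cluster-7.internal
}
```

Because both inputs live in committed rows, any later process can regenerate the same names without consulting external state.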

Broad SAN coverage

The apiserver cert includes the node name, the shared control-plane name, the standard in-cluster kubernetes names, and direct IPs because those are all real client access paths.

Separate etcd and apiserver identities

etcd peer/server traffic and apiserver serving traffic are different trust relationships. The certificates are separated so each one matches the role it actually performs.

Shared CA

The cluster is founded with one signing root, which derives:

Per-node serving certs

Each node gets identities that match its exact names and addresses, persisted now so later bootstrap can be pure rendering, not retroactive state creation. These enable:

Future bundle fetch

The VM can later prove identity and fetch the exact files it should run.

Section 05

Where failure stops the flow, and where it does not

Hard-stop failures

  • invalid request shape
  • CA or signer generation failure
  • transaction start failure
  • security-group resolution failure
  • VM creation failure
  • VM identity lookup failure before commit
  • certificate signing or row insertion failure

Soft failure by design

DNS configuration is soft. If it fails, the response reports the error but still returns success because the cluster’s durable bootstrap state already exists.

The current MVP does not reconcile external side effects such as already-created VMs if a later pre-commit step fails. That is one of the cleanest reasons to add an operation table and reconciliation loop later.

Section 06

What the response is trying to tell the operator

Immediate control surface

The response returns the cluster ID, the created VM IDs, the shared DNS name, the admin kubeconfig, and the cluster networking defaults.

Operational hints

It also returns the bootstrap path template, the wildcard ingress DNS name, the suggested DNS records, and whether DNS was configured automatically.

Important nuance

The wildcard ingress DNS name is returned as metadata, but the current code only suggests and attempts to create the shared control-plane A record.

Another nuance

The response kubeconfig uses the shared control-plane DNS endpoint, while internal bootstrap kubeconfigs later use the first control-plane node address. Those serve different access contexts.

Section 07

First-principles recap

Identity before execution: The system must define cluster trust and node membership before any VM can bootstrap safely.
Shared truth once, node truth many times: The cluster CA and service-account signer are created once, then used to derive node-specific identities.
Durable truth matters more than convenience layers: DNS is useful, but committed cluster and node rows are the real root of bootstrap truth.
This is an MVP orchestrator, not yet a reconciler: The code chooses a direct boot path over full compensation logic and lifecycle repair.

Key code: internal/handlers/clusterscreate.go, internal/repository/kubeclusters.go, internal/repository/etcdnodes.go, internal/repository/apiservernodes.go, internal/computeclient/client.go, internal/dnsclient/client.go.