
Cluster Setup Guide

This guide details the deployment of a Ceph storage cluster using cephadm on Ubuntu 22.04.

1. Inventory & Architecture

| Role | Hostname | IP Address | Description |
| --- | --- | --- | --- |
| Admin | ceph-mgr-1 | 192.168.202.221 | Bootstrap node, Mgr #1, Dashboard |
| Standby | ceph-mgr-2 | 192.168.202.222 | Mgr #2 |
| OSD/Mon | ceph-osd-1 | 192.168.202.231 | OSD, Mon #1 |
| OSD/Mon | ceph-osd-2 | 192.168.202.232 | OSD, Mon #2 |
| OSD/Mon | ceph-osd-3 | 192.168.202.233 | OSD, Mon #3 |
| OSD | ceph-osd-4 | 192.168.202.234 | OSD (storage only) |

2. Prerequisites

Run on ALL Nodes

The following steps ensure environment consistency and time synchronization.

2.1 Set Hostnames

```bash
# Run on every node, substituting that node's hostname (example: ceph-mgr-2)
sudo hostnamectl set-hostname ceph-mgr-2
```

2.2 Configure /etc/hosts

Ensure all nodes can resolve each other by name.

```text
192.168.202.221 ceph-mgr-1
192.168.202.222 ceph-mgr-2
192.168.202.231 ceph-osd-1
192.168.202.232 ceph-osd-2
192.168.202.233 ceph-osd-3
192.168.202.234 ceph-osd-4
```
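Before moving on, it is worth confirming that every inventory name actually resolves. A minimal check loop (a sketch; `check_host` is just a helper name used here, not a Ceph tool):

```bash
# Verify that every inventory hostname resolves. getent consults /etc/hosts
# as well as DNS, which matches what Ceph's tooling will see.
check_host() {
  if getent hosts "$1" >/dev/null; then
    echo "OK: $1"
  else
    echo "MISSING: $1"
  fi
}

for h in ceph-mgr-1 ceph-mgr-2 ceph-osd-1 ceph-osd-2 ceph-osd-3 ceph-osd-4; do
  check_host "$h"
done
```

Any host reported as `MISSING` needs its /etc/hosts entry fixed before bootstrapping.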

2.3 Dependencies

We use Docker as the container runtime, lvm2 for disk management, and chrony for time sync.

```bash
sudo apt update
sudo apt install -y docker.io lvm2 chrony python3
```

2.4 Time Synchronization

WARNING

Ceph is extremely sensitive to clock drift. Ensure chrony is active.

```bash
sudo systemctl enable --now chrony
chronyc sources
```

3. Admin Node Setup

Admin Node Only

Run these steps strictly on ceph-mgr-1.

3.1 Install Cephadm

```bash
# Download the standalone cephadm binary (the el9 path is distro-independent)
curl --silent --remote-name --location https://download.ceph.com/rpm-squid/el9/noarch/cephadm
chmod +x cephadm

# Add the Squid release repository, then install cephadm via apt
sudo ./cephadm add-repo --release squid
sudo apt update
sudo apt install -y cephadm
```

3.2 Bootstrap the Cluster

```bash
sudo cephadm bootstrap --mon-ip 192.168.202.221 --allow-fqdn-hostname
```

Wait for the process to finish. On success, it prints the dashboard URL and the initial admin credentials; save them.

3.3 Enable Ceph CLI

The bootstrap process installs a minimal config at /etc/ceph/ceph.conf. To use the ceph command without typing sudo or entering the container every time:

```bash
# Install the common tools
sudo cephadm install ceph-common

# Verify access
sudo ceph -s
# Status should be HEALTH_WARN (OSD count 0)
```

3.4 Distribute SSH Keys

cephadm uses SSH to manage the other nodes. The bootstrap process generated an SSH key at /etc/ceph/ceph.pub. Copy this key to all other nodes (ceph-mgr-2 and ceph-osd-1 through ceph-osd-4).

```bash
# Copy the cluster's public key to every other node
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-mgr-2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-osd-1
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-osd-2
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-osd-3
ssh-copy-id -f -i /etc/ceph/ceph.pub root@ceph-osd-4
```

Note: You must have root SSH access enabled on target nodes, or copy the key to a user with sudo privileges and configure cephadm to use that user.

4. Expanding the Cluster

Run all following commands from the Admin Node.

4.1 Add the Manager Node

Add the second manager node to the cluster.

```bash
sudo ceph orch host add ceph-mgr-2 192.168.202.222 --labels _admin
```

4.2 Add OSD Nodes

Add the four storage nodes. We label them `osd` (and the first three also `mon`) to help with service placement later.

```bash
sudo ceph orch host add ceph-osd-1 192.168.202.231 --labels=osd,mon
sudo ceph orch host add ceph-osd-2 192.168.202.232 --labels=osd,mon
sudo ceph orch host add ceph-osd-3 192.168.202.233 --labels=osd,mon
sudo ceph orch host add ceph-osd-4 192.168.202.234 --labels=osd
```
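Equivalently, hosts can be declared in a YAML spec file and applied in one step with `sudo ceph orch apply -i hosts.yaml` (the filename is illustrative). A sketch for two of the nodes, using cephadm's host service spec:

```yaml
service_type: host
hostname: ceph-osd-1
addr: 192.168.202.231
labels:
  - osd
  - mon
---
service_type: host
hostname: ceph-osd-4
addr: 192.168.202.234
labels:
  - osd
```

Keeping host specs in version control makes the inventory reproducible.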

4.3 Verify Host List

```bash
sudo ceph orch host ls
```

Output should show 6 hosts total.

5. Deploy Services

5.1 Finalize Monitor Placement

By default, the bootstrap process places the first Monitor on the bootstrap node (ceph-mgr-1). To run the Monitors exclusively on the storage nodes (ceph-osd-1, ceph-osd-2, ceph-osd-3), keeping quorum separate from the Manager nodes, apply an explicit placement:

CRITICAL: Always run an odd number of monitors (here, 3) so quorum can be maintained.

```bash
# Apply the monitor specification to place them explicitly on the 3 OSD nodes
sudo ceph orch apply mon --placement="ceph-osd-1,ceph-osd-2,ceph-osd-3"

# Verify the monitors have successfully migrated and are running
sudo ceph orch ps --daemon_type mon
```
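Since the three Monitor hosts already carry the `mon` label from section 4.2, the same placement can also be expressed as a label-based service spec (a sketch, applied with `sudo ceph orch apply -i mon.yaml`; the filename is illustrative):

```yaml
service_type: mon
placement:
  label: mon
```

With label-based placement, moving a Monitor later only requires changing host labels, not rewriting the spec.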

5.2 Finalize Manager Placement

By default, the Ceph bootstrap process only creates one Manager (mgr) daemon on the admin node (ceph-mgr-1). To ensure high availability of the Ceph dashboard and orchestrator, you should safely add a standby manager to your second manager node (ceph-mgr-2).

```bash
# Apply the manager specification to ensure it runs on both mgr nodes
sudo ceph orch apply mgr --placement="ceph-mgr-1,ceph-mgr-2"

# Verify that two mgr daemons are now running (one active, one standby)
sudo ceph orch ps --daemon_type mgr
```

5.3 Deploy OSDs (Storage)

We will tell Ceph to consume every unused raw disk in the cluster. Note that `--all-available-devices` applies to all hosts, not only those labeled `osd`; in this inventory only the OSD nodes have spare disks, so the effect is the same.

Ensure your OSD nodes have empty, unformatted secondary disks (e.g., /dev/sdb).

```bash
# Check available devices
sudo ceph orch device ls

# Provision OSDs on every available, unused device in the cluster
sudo ceph orch apply osd --all-available-devices
```
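If you want OSDs restricted to hosts labeled `osd` rather than the whole cluster, an OSD service spec gives that control. A minimal sketch, applied with `sudo ceph orch apply -i osd-spec.yaml` (the filename and `service_id` are illustrative):

```yaml
service_type: osd
service_id: default_drives
placement:
  label: osd
data_devices:
  all: true
```

The spec can later be narrowed further, e.g. by device size or model, without touching hosts individually.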

6. Verification

6.1 Check Cluster Status

```bash
sudo ceph -s
```

Healthy Output:

```text
health: HEALTH_OK
mon: 3 daemons (ceph-osd-1, ceph-osd-2, ceph-osd-3)
mgr: ceph-mgr-1(active, since X), standbys: ceph-mgr-2
osd: X osds: X up, X in (where X is number of disks)
```

6.2 Access Dashboard

Open the dashboard URL printed during bootstrap (by default, https://ceph-mgr-1:8443) and log in with the credentials from the bootstrap output.

Note: Your browser will warn about the self-signed certificate; this is expected.


Having issues? Check the 00-troubleshooting.md guide for solutions to common deployment errors, missing disks, or stuck orchestrator daemons!