Servers
The Server Custom Resource Definition (CRD) represents a bare metal server. It manages the state and lifecycle of
physical servers, enabling automated hardware management tasks such as power control, BIOS configuration, and
firmware updates. Interaction with a Server resource is facilitated through its associated Baseboard Management
Controller (BMC), either by referencing a BMC resource or by providing direct BMC configuration.
Example Server Resource
apiVersion: metal.ironcore.dev/v1alpha1
kind: Server
metadata:
  name: my-server
spec:
  uuid: "123e4567-e89b-12d3-a456-426614174000"
  power: "Off"
  bmcRef:
    name: my-bmc
  bootOrder:
    - name: PXE
      priority: 1
      device: Network
  BIOS:
    - version: "1.0.3"
      settings:
        BootMode: UEFI
        Virtualization: Enabled
Usage
The Server CRD is central to managing bare metal servers. It allows for:
- Power Management: Powering servers on and off.
- BIOS Configuration: Changing BIOS settings and performing BIOS updates.
- Lifecycle Management: Handling the server's lifecycle through various states.
- Hardware Discovery: Gathering hardware information via BMC and in-band agents.
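As a sketch of the power management item above: the examples on this page express the desired power state as the power field of the Server spec, so switching a server on or off amounts to updating that field. The helper below only builds a JSON merge patch body. The name power_patch is hypothetical, the accepted values are assumed to be the "On" and "Off" strings used on this page, and sending the patch to the cluster is left out.

```python
import json

# Power values that appear in the examples on this page.
# Assumption: the CRD may accept other values that are not listed here.
KNOWN_POWER_STATES = {"On", "Off"}


def power_patch(state: str) -> str:
    """Build a JSON merge patch that sets spec.power (hypothetical helper)."""
    if state not in KNOWN_POWER_STATES:
        raise ValueError(f"unknown power state: {state!r}")
    return json.dumps({"spec": {"power": state}})


if __name__ == "__main__":
    # Print the patch body for powering a server on.
    print(power_patch("On"))
```

The resulting string could, for instance, be passed to kubectl patch with --type merge against the Server object, assuming the CRD is installed in the cluster.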
Lifecycle and States
A server undergoes the following phases:
- Initial: The server object is created; hardware details are not yet known.
- Discovery:
  - The ServerReconciler interacts with the BMC to retrieve hardware details.
  - An initial boot is performed using a predefined ignition configuration.
  - An agent called metalprobe runs on the server to collect additional data (e.g., network interfaces, disks).
  - The collected data is reported back to the metal-operator and added to the ServerStatus.
- Available: The server has completed discovery and is ready for use.
- Reserved:
  - A ServerClaim resource is created to claim the server.
  - The server transitions to the Reserved state.
  - The server is allocated for a specific use or user.
- Cleanup:
  - When the ServerClaim is removed, the server enters the Cleanup state.
  - Sanitization processes are performed (e.g., wiping disks, resetting BIOS settings).
- Maintenance:
  - Servers in the Available state can transition to Maintenance.
  - Maintenance tasks such as BIOS updates or hardware repairs are performed.
- Error:
  - The server has encountered an error.
  - Intervention is required to resolve the issue before the server can return to Available.
The state diagram below represents the various server states and their transitions:
stateDiagram-v2
[*] --> Initial
Initial --> Discovery : Server object created
Discovery --> Available : Discovery complete
Available --> Reserved : ServerClaim created
Reserved --> Cleanup : ServerClaim removed
Cleanup --> Available : Cleanup complete
Available --> Maintenance : Maintenance initiated
Maintenance --> Available : Maintenance complete
Available --> Error : Error detected
Reserved --> Error : Error detected
Discovery --> Error : Error detected
Cleanup --> Error : Error detected
Maintenance --> Error : Error detected
Error --> Maintenance : Enter maintenance to fix error
Error --> Available : Error resolved
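The transitions in the diagram above can be written down as a lookup table, which makes it easy to check whether a given state change is allowed. This is a minimal sketch derived only from the diagram; the names TRANSITIONS, can_transition, and walk are hypothetical and are not part of the metal-operator code base.

```python
# Transitions copied from the state diagram above: state -> allowed next states.
TRANSITIONS = {
    "Initial": {"Discovery"},
    "Discovery": {"Available", "Error"},
    "Available": {"Reserved", "Maintenance", "Error"},
    "Reserved": {"Cleanup", "Error"},
    "Cleanup": {"Available", "Error"},
    "Maintenance": {"Available", "Error"},
    "Error": {"Maintenance", "Available"},
}


def can_transition(current: str, target: str) -> bool:
    """Return True if the diagram has an edge from current to target."""
    return target in TRANSITIONS.get(current, set())


def walk(path) -> bool:
    """Check a sequence of states; raise ValueError on the first illegal step."""
    for current, target in zip(path, path[1:]):
        if not can_transition(current, target):
            raise ValueError(f"illegal transition: {current} -> {target}")
    return True


if __name__ == "__main__":
    # A full claim cycle: discovery, reservation, cleanup, back to Available.
    walk(["Initial", "Discovery", "Available", "Reserved", "Cleanup", "Available"])
    # A reserved server cannot return to Available without passing Cleanup.
    print(can_transition("Reserved", "Available"))
```

Note that, per the diagram, a Reserved server reaches Available again only through Cleanup (or through Error), never directly.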
Interaction with BMC
Interaction with a server is done through its BMC:
Via Reference: Reference a BMC resource using bmcRef.
apiVersion: metal.ironcore.dev/v1alpha1
kind: Server
metadata:
  name: server-with-bmc-ref
spec:
  uuid: "123e4567-e89b-12d3-a456-426614174000"
  power: "On"
  bmcRef:
    name: my-bmc
  bootOrder:
    - name: PXE
      priority: 1
      device: Network
  BIOS:
    - version: "1.0.3"
      settings:
        BootMode: UEFI
        HyperThreading: Enabled
Inline Configuration: Use the bmc field to provide direct BMC access details.
apiVersion: metal.ironcore.dev/v1alpha1
kind: BMC
metadata:
  name: my-bmc
spec:
  endpointRef:
    name: my-bmc-endpoint
  bmcSecretRef:
    name: my-bmc-secret
  protocol:
    name: Redfish
    port: 8000
  consoleProtocol:
    name: SSH
    port: 22
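Since a Server reaches its BMC either through bmcRef or through the inline bmc field, a small check can tell which mode a given spec uses. The sketch below works on a plain dictionary, such as a parsed manifest. The function name bmc_access_mode is hypothetical, and the rule that exactly one of the two fields must be set is an assumption drawn from the "either ... or" wording on this page, not a confirmed validation rule of the CRD.

```python
def bmc_access_mode(spec: dict) -> str:
    """Return "reference" or "inline" depending on how the spec reaches its BMC."""
    has_ref = "bmcRef" in spec
    has_inline = "bmc" in spec
    if has_ref == has_inline:
        # Assumption: exactly one of the two fields should be set.
        raise ValueError("set exactly one of spec.bmcRef or spec.bmc")
    return "reference" if has_ref else "inline"


if __name__ == "__main__":
    # Spec fragment taken from the bmcRef example on this page.
    print(bmc_access_mode({"power": "On", "bmcRef": {"name": "my-bmc"}}))
```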