Machine Controller

A Machine is the declarative spec for a Node, as represented in Kubernetes core. When a new Machine object is created, a provider-specific controller provisions and installs a new host, which registers as a new Node matching the Machine spec. When the Machine's spec is updated, the provider-specific controller is responsible for updating the Node in place or replacing the host with a new one matching the updated spec. When a Machine object is deleted, the provider-specific controller should release the corresponding Node's external resources, and the Node should be deleted as well.

Machines can be associated with a Cluster using the custom label cluster.k8s.io/cluster-name. When the label is set and non-empty, it must reference the name of a Cluster residing in the same namespace. The label must be set only once; updates are not permitted. An admission controller will enforce this restriction in a future version.
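The label rules above can be illustrated with a minimal sketch. The helper names clusterNameFor and validateLabelUpdate are hypothetical and not part of the Cluster API; only the label key comes from the text, and the plain map stands in for a Machine's metadata labels.

```go
package main

import (
	"errors"
	"fmt"
)

// ClusterNameLabel is the label key described in the text above.
const ClusterNameLabel = "cluster.k8s.io/cluster-name"

// clusterNameFor returns the Cluster name referenced by a Machine's labels.
// The boolean is false when the label is missing or empty, which means the
// Machine is not associated with a Cluster.
func clusterNameFor(labels map[string]string) (string, bool) {
	name := labels[ClusterNameLabel] // reading from a nil map is safe in Go
	return name, name != ""
}

// validateLabelUpdate is a hypothetical check for the "set only once" rule:
// once the label holds a non-empty value, that value may not change.
func validateLabelUpdate(oldLabels, newLabels map[string]string) error {
	oldName, wasSet := clusterNameFor(oldLabels)
	newName, _ := clusterNameFor(newLabels)
	if wasSet && oldName != newName {
		return errors.New(ClusterNameLabel + " is immutable once set")
	}
	return nil
}

func main() {
	old := map[string]string{ClusterNameLabel: "prod"}
	updated := map[string]string{ClusterNameLabel: "staging"}

	name, ok := clusterNameFor(old)
	fmt.Println(name, ok)
	fmt.Println(validateLabelUpdate(old, updated))
}
```

A check of this kind is roughly what an admission controller could run on update requests; until one exists, nothing in this sketch is enforced by the API server.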

Machine

Machine has 4 fields:

Spec contains the desired machine state specified by the object. While much of the Spec is defined by users, unspecified parts may be filled in with defaults or by Controllers such as autoscalers.

Status contains only observed machine state and is only written by controllers. Status is not the source of truth for any information, but instead aggregates and publishes observed state.

TypeMeta contains metadata about the API itself - such as Group, Version, Kind.

ObjectMeta contains metadata about the specific object instance, for example its name, namespace, labels, and annotations. ObjectMeta contains data common to most objects.

// Machine is the Schema for the machines API
// +k8s:openapi-gen=true
// +kubebuilder:resource:shortName=ma
// +kubebuilder:subresource:status
// +kubebuilder:printcolumn:name="ProviderID",type="string",JSONPath=".spec.providerID",description="Provider ID"
// +kubebuilder:printcolumn:name="Phase",type="string",JSONPath=".status.phase",description="Machine status such as Terminating/Pending/Running/Failed etc"
// +kubebuilder:printcolumn:name="NodeName",type="string",JSONPath=".status.nodeRef.name",description="Node name associated with this machine",priority=1
type Machine struct {
    metav1.TypeMeta   `json:",inline"`
    metav1.ObjectMeta `json:"metadata,omitempty"`

    Spec   MachineSpec   `json:"spec,omitempty"`
    Status MachineStatus `json:"status,omitempty"`
}

MachineSpec

The ProviderSpec is recommended to be a serialized API object in a format owned by that provider. This will allow the configuration to be strongly typed, versioned, and have as much nested depth as appropriate. These provider-specific API definitions are meant to live outside of the Machine API, which will allow them to evolve independently of it. Attributes like instance type, which network to use, and the OS image all belong in the ProviderSpec.
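As a rough illustration of a serialized, provider-owned configuration, the sketch below round-trips a typed struct through raw JSON. ExampleProviderConfig and its fields are hypothetical, and providerSpec is a simplified stand-in: the Value field name and the use of json.RawMessage (in place of a Kubernetes raw-extension type) are assumptions made to keep the example self-contained.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExampleProviderConfig is a hypothetical provider-owned, versioned type.
// A real provider would define its own API group, kinds, and fields.
type ExampleProviderConfig struct {
	APIVersion   string `json:"apiVersion"`
	Kind         string `json:"kind"`
	InstanceType string `json:"instanceType"`
	Network      string `json:"network"`
	Image        string `json:"image"`
}

// providerSpec is a simplified stand-in for the Machine API's ProviderSpec.
// The Value field and its json.RawMessage type are assumptions.
type providerSpec struct {
	Value json.RawMessage `json:"value,omitempty"`
}

// encodeProviderConfig serializes the typed config into the opaque field.
func encodeProviderConfig(cfg ExampleProviderConfig) (providerSpec, error) {
	raw, err := json.Marshal(cfg)
	if err != nil {
		return providerSpec{}, fmt.Errorf("encoding provider config: %w", err)
	}
	return providerSpec{Value: raw}, nil
}

// decodeProviderConfig recovers the typed config inside the provider's controller.
func decodeProviderConfig(spec providerSpec) (ExampleProviderConfig, error) {
	var cfg ExampleProviderConfig
	if err := json.Unmarshal(spec.Value, &cfg); err != nil {
		return ExampleProviderConfig{}, fmt.Errorf("decoding provider config: %w", err)
	}
	return cfg, nil
}

func main() {
	spec, err := encodeProviderConfig(ExampleProviderConfig{
		APIVersion:   "exampleprovider.example.com/v1alpha1",
		Kind:         "ExampleProviderConfig",
		InstanceType: "small",
		Network:      "default",
		Image:        "example-os-image",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(spec.Value))

	cfg, err := decodeProviderConfig(spec)
	if err != nil {
		panic(err)
	}
	fmt.Println(cfg.InstanceType)
}
```

Because the Machine API only carries the raw bytes, the provider can add fields or publish a new apiVersion without any change to the Machine types.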

Some providers and tooling depend on an annotation to be set on the Machine to determine if provisioning has completed. For example, the clusterctl command does this here:

    // TODO: update once machine controllers have a way to indicate a machine has been provisioned. https://github.com/kubernetes-sigs/cluster-api/issues/253
    // Seeing a node cannot be purely relied upon because the machine running the control plane
    // will not be registering with the stack that provisions it.
    ready := m.Status.NodeRef != nil || len(m.Annotations) > 0
    return ready, nil

// MachineSpec defines the desired state of Machine
type MachineSpec struct {
    // ObjectMeta will autopopulate the Node created. Use this to
    // indicate what labels, annotations, name prefix, etc., should be used
    // when creating the Node.
    // +optional
    metav1.ObjectMeta `json:"metadata,omitempty"`

    // The list of the taints to be applied to the corresponding Node in additive
    // manner. This list will not overwrite any other taints added to the Node on
    // an ongoing basis by other entities. These taints should be actively reconciled
    // (e.g. if you ask the machine controller to apply a taint and then manually remove
    // the taint, the machine controller will put it back), but the machine controller
    // should not remove any other taints.
    // +optional
    Taints []corev1.Taint `json:"taints,omitempty"`

    // ProviderSpec details Provider-specific configuration to use during node creation.
    // +optional
    ProviderSpec ProviderSpec `json:"providerSpec"`

    // Versions of key software to use. This field is optional at cluster
    // creation time, and omitting the field indicates that the cluster
    // installation tool should select defaults for the user. These
    // defaults may differ based on the cluster installer, but the tool
    // should populate the values it uses when persisting Machine objects.
    // A Machine spec missing this field at runtime is invalid.
    // +optional
    Versions MachineVersionInfo `json:"versions,omitempty"`

    // ConfigSource is used to populate in the associated Node for dynamic kubelet config. This
    // field already exists in Node, so any updates to it in the Machine
    // spec will be automatically copied to the linked NodeRef from the
    // status. The rest of dynamic kubelet config support should then work
    // as-is.
    // +optional
    ConfigSource *corev1.NodeConfigSource `json:"configSource,omitempty"`

    // ProviderID is the identification ID of the machine provided by the provider.
    // This field must match the provider ID as seen on the node object corresponding to this machine.
    // This field is required by higher level consumers of cluster-api. Example use case is cluster autoscaler
    // with cluster-api as provider. Clean-up logic in the autoscaler compares machines to nodes to find out
    // machines at provider which could not get registered as Kubernetes nodes. With cluster-api as a
    // generic out-of-tree provider for autoscaler, this field is required by autoscaler to be
    // able to have a provider view of the list of machines. Another list of nodes is queried from the k8s apiserver
    // and then a comparison is done to find unregistered machines, which are then marked for deletion.
    // This field will be set by the actuators and consumed by higher level entities like autoscaler that will
    // be interfacing with cluster-api as generic provider.
    // +optional
    ProviderID *string `json:"providerID,omitempty"`
}

MachineStatus

Like ProviderSpec, ProviderStatus is recommended to be a serialized API object in a format owned by that provider.

Note that NodeRef may not be set. This can happen if the Machine and corresponding Node are not within the same cluster. Two reasons this might be the case are:

  • During bootstrapping, the control plane Machine will initially not be in the same cluster which is being created.
  • Some providers distinguish between manager and managed clusters. For these providers a Machine and its corresponding Node may never be within the same cluster. TODO: There are open issues to address this.

// MachineStatus defines the observed state of Machine
type MachineStatus struct {
    // NodeRef will point to the corresponding Node if it exists.
    // +optional
    NodeRef *corev1.ObjectReference `json:"nodeRef,omitempty"`

    // LastUpdated identifies when this status was last observed.
    // +optional
    LastUpdated *metav1.Time `json:"lastUpdated,omitempty"`

    // Versions specifies the current versions of software on the corresponding Node (if it
    // exists). This is provided for a few reasons:
    //
    // 1) It is more convenient than checking the NodeRef, traversing it to
    //    the Node, and finding the appropriate field in Node.Status.NodeInfo
    //    (which uses different field names and formatting).
    // 2) It removes some of the dependency on the structure of the Node,
    //    so that if the structure of Node.Status.NodeInfo changes, only
    //    machine controllers need to be updated, rather than every client
    //    of the Machines API.
    // 3) There is no other simple way to check the control plane
    //    version. A client would have to connect directly to the apiserver
    //    running on the target node in order to find out its version.
    // +optional
    Versions *MachineVersionInfo `json:"versions,omitempty"`

    // ErrorReason will be set in the event that there is a terminal problem
    // reconciling the Machine and will contain a succinct value suitable
    // for machine interpretation.
    //
    // This field should not be set for transient errors that a controller
    // faces that are expected to be fixed automatically over
    // time (like service outages), but instead indicate that something is
    // fundamentally wrong with the Machine's spec or the configuration of
    // the controller, and that manual intervention is required. Examples
    // of terminal errors would be invalid combinations of settings in the
    // spec, values that are unsupported by the controller, or the
    // responsible controller itself being critically misconfigured.
    //
    // Any transient errors that occur during the reconciliation of Machines
    // can be added as events to the Machine object and/or logged in the
    // controller's output.
    // +optional
    ErrorReason *common.MachineStatusError `json:"errorReason,omitempty"`

    // ErrorMessage will be set in the event that there is a terminal problem
    // reconciling the Machine and will contain a more verbose string suitable
    // for logging and human consumption.
    //
    // This field should not be set for transient errors that a controller
    // faces that are expected to be fixed automatically over
    // time (like service outages), but instead indicate that something is
    // fundamentally wrong with the Machine's spec or the configuration of
    // the controller, and that manual intervention is required. Examples
    // of terminal errors would be invalid combinations of settings in the
    // spec, values that are unsupported by the controller, or the
    // responsible controller itself being critically misconfigured.
    //
    // Any transient errors that occur during the reconciliation of Machines
    // can be added as events to the Machine object and/or logged in the
    // controller's output.
    // +optional
    ErrorMessage *string `json:"errorMessage,omitempty"`

    // ProviderStatus details a Provider-specific status.
    // It is recommended that providers maintain their
    // own versioned API types that should be
    // serialized/deserialized from this field.
    // +optional
    ProviderStatus *runtime.RawExtension `json:"providerStatus,omitempty"`

    // Addresses is a list of addresses assigned to the machine. Queried from cloud provider, if available.
    // +optional
    Addresses []corev1.NodeAddress `json:"addresses,omitempty"`

    // Conditions lists the conditions synced from the node conditions of the corresponding node-object.
    // Machine-controller is responsible for keeping conditions up-to-date.
    // MachineSet controller will be taking these conditions as a signal to decide if
    // machine is healthy or needs to be replaced.
    // Refer: https://kubernetes.io/docs/concepts/architecture/nodes/#condition
    // +optional
    Conditions []corev1.NodeCondition `json:"conditions,omitempty"`

    // LastOperation describes the last-operation performed by the machine-controller.
    // This API should be useful as a history in terms of the latest operation performed on the
    // specific machine. It should also convey the state of the latest-operation for example if
    // it is still on-going, failed or completed successfully.
    // +optional
    LastOperation *LastOperation `json:"lastOperation,omitempty"`

    // Phase represents the current phase of machine actuation.
    // E.g. Pending, Running, Terminating, Failed etc.
    // +optional
    Phase *string `json:"phase,omitempty"`
}

// LastOperation represents the detail of the last performed operation on the MachineObject.
type LastOperation struct {
    // Description is the human-readable description of the last operation.
    Description *string `json:"description,omitempty"`

    // LastUpdated is the timestamp at which LastOperation API was last-updated.
    LastUpdated *metav1.Time `json:"lastUpdated,omitempty"`

    // State is the current status of the last performed operation.
    // E.g. Processing, Failed, Successful etc
    State *string `json:"state,omitempty"`

    // Type is the type of operation which was last performed.
    // E.g. Create, Delete, Update etc
    Type *string `json:"type,omitempty"`
}
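The distinction between terminal and transient errors described for ErrorReason and ErrorMessage can be sketched as follows. This is only an illustration under stated assumptions: terminalError, machineStatus, recordError, and the reason string are hypothetical, and the status struct is trimmed to plain string pointers rather than the real field types.

```go
package main

import (
	"errors"
	"fmt"
)

// terminalError is a hypothetical marker type for problems that need manual
// intervention, such as an invalid combination of settings in the spec.
type terminalError struct {
	reason  string
	message string
}

func (e *terminalError) Error() string { return e.reason + ": " + e.message }

// errServiceOutage stands in for a transient failure that should resolve itself.
var errServiceOutage = errors.New("provider API temporarily unavailable")

// machineStatus is a trimmed stand-in holding only the two error fields.
type machineStatus struct {
	ErrorReason  *string
	ErrorMessage *string
}

// recordError writes the error fields only for terminal errors and reports
// whether it did so. Transient errors leave the status untouched; a controller
// would log them or emit an event instead.
func recordError(status *machineStatus, err error) bool {
	var te *terminalError
	if !errors.As(err, &te) {
		return false
	}
	reason, message := te.reason, te.message
	status.ErrorReason = &reason
	status.ErrorMessage = &message
	return true
}

func main() {
	var status machineStatus

	fmt.Println(recordError(&status, errServiceOutage))

	wrapped := fmt.Errorf("reconciling machine: %w", &terminalError{
		reason:  "InvalidConfiguration",
		message: "instance type is not supported by this provider",
	})
	fmt.Println(recordError(&status, wrapped))
	fmt.Println(*status.ErrorReason)
}
```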

Machine Actuator Interface

All methods should be idempotent. Each time the Machine controller attempts to reconcile the state it will call one or more of the following actuator methods.

Create() will only be called when Exists() returns false.

Update() will only be called when Exists() returns true.

Delete() will only be called when the Machine is in the process of being deleted.

The definition of Exists() is determined by the provider.

TODO: Provide more guidance on Exists().

// Actuator controls machines on a specific infrastructure. All
// methods should be idempotent unless otherwise specified.
type Actuator interface {
    // Create the machine.
    Create(context.Context, *clusterv1.Cluster, *clusterv1.Machine) error
    // Delete the machine. If no error is returned, it is assumed that all dependent resources have been cleaned up.
    Delete(context.Context, *clusterv1.Cluster, *clusterv1.Machine) error
    // Update the machine to the provided definition.
    Update(context.Context, *clusterv1.Cluster, *clusterv1.Machine) error
    // Checks if the machine currently exists.
    Exists(context.Context, *clusterv1.Cluster, *clusterv1.Machine) (bool, error)
}
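To make the idempotency requirement concrete, here is a minimal in-memory sketch. It is not a real provider: memoryActuator and the trimmed machine type are hypothetical, and the method signatures drop the context and Cluster arguments of the interface above to stay self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// machine is a trimmed, hypothetical stand-in for the Machine type.
type machine struct {
	Name         string
	InstanceType string
}

// memoryActuator records "hosts" in a map instead of calling a real provider.
type memoryActuator struct {
	mu    sync.Mutex
	hosts map[string]string // machine name -> instance type of its host
}

func newMemoryActuator() *memoryActuator {
	return &memoryActuator{hosts: make(map[string]string)}
}

// Exists reports whether a host is recorded for the Machine.
func (a *memoryActuator) Exists(m *machine) (bool, error) {
	a.mu.Lock()
	defer a.mu.Unlock()
	_, ok := a.hosts[m.Name]
	return ok, nil
}

// Create provisions a host. Calling it again for the same Machine is a no-op.
func (a *memoryActuator) Create(m *machine) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.hosts[m.Name]; !ok {
		a.hosts[m.Name] = m.InstanceType
	}
	return nil
}

// Update converges the recorded host on the Machine's current definition.
func (a *memoryActuator) Update(m *machine) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	if _, ok := a.hosts[m.Name]; !ok {
		return fmt.Errorf("machine %q has no host to update", m.Name)
	}
	a.hosts[m.Name] = m.InstanceType
	return nil
}

// Delete releases the host. Deleting a host that is already gone succeeds.
func (a *memoryActuator) Delete(m *machine) error {
	a.mu.Lock()
	defer a.mu.Unlock()
	delete(a.hosts, m.Name)
	return nil
}

func main() {
	a := newMemoryActuator()
	m := &machine{Name: "node-a", InstanceType: "small"}

	// Repeating a call leaves the state unchanged.
	fmt.Println(a.Create(m), a.Create(m), len(a.hosts))
	fmt.Println(a.Delete(m), a.Delete(m), len(a.hosts))
}
```

In this sketch Exists() simply checks the map; a real provider would decide what "exists" means for its infrastructure, for example by looking up an instance by tag or ID.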

Machine Controller Semantics

  1. Determine the Cluster associated with the Machine from its cluster.k8s.io/cluster-name label.
  2. If the Machine hasn't been deleted and doesn't have a finalizer, add one.
  3. If the Machine is being deleted and has no finalizer, we're done. Otherwise, if it is being deleted:
    • Check if the Machine is allowed to be deleted. 1
    • Call the provider-specific actuator's Delete() method.
      • If the Delete() method returns no error, remove the finalizer.
  4. Check if the Machine exists by calling the provider-specific Exists() method.
    • If it does, call the Update() method.
    • If the Update() fails and returns a retryable error:
      • Retry the Update() after N seconds.
  5. If the Machine does not exist, attempt to create it by calling the actuator's Create() method.
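The steps above can be sketched as a single reconcile function. This is a simplified illustration, not the controller's actual code: the machine type models the finalizer and deletion timestamp as booleans, the actuator signatures are trimmed, errRetryable and retryDelay are hypothetical, and the Cluster lookup (step 1) and the "allowed to be deleted" check are left out.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// retryDelay stands in for the "N seconds" in step 4; the value is arbitrary.
const retryDelay = 30 * time.Second

// errRetryable is a hypothetical marker for errors that warrant a delayed retry.
var errRetryable = errors.New("retryable error")

// machine is a trimmed stand-in. Deleting models a set deletion timestamp and
// HasFinalizer models the presence of the machine controller's finalizer.
type machine struct {
	Name         string
	Deleting     bool
	HasFinalizer bool
}

// actuator mirrors the four actuator methods with simplified signatures.
type actuator interface {
	Create(*machine) error
	Update(*machine) error
	Delete(*machine) error
	Exists(*machine) (bool, error)
}

// reconcile returns how long to wait before retrying (zero means no retry).
func reconcile(a actuator, m *machine) (time.Duration, error) {
	if !m.Deleting && !m.HasFinalizer {
		m.HasFinalizer = true // step 2
	}
	if m.Deleting { // step 3
		if !m.HasFinalizer {
			return 0, nil // cleanup already finished
		}
		if err := a.Delete(m); err != nil {
			return 0, err
		}
		m.HasFinalizer = false // Delete returned no error
		return 0, nil
	}
	exists, err := a.Exists(m) // step 4
	if err != nil {
		return 0, err
	}
	if !exists {
		return 0, a.Create(m) // step 5
	}
	if err := a.Update(m); errors.Is(err, errRetryable) {
		return retryDelay, nil
	} else if err != nil {
		return 0, err
	}
	return 0, nil
}

// fakeActuator records the calls it receives so the flow can be observed.
type fakeActuator struct {
	exists    bool
	updateErr error
	calls     []string
}

func (f *fakeActuator) log(op string, err error) error {
	f.calls = append(f.calls, op)
	return err
}

func (f *fakeActuator) Create(*machine) error         { f.exists = true; return f.log("Create", nil) }
func (f *fakeActuator) Update(*machine) error         { return f.log("Update", f.updateErr) }
func (f *fakeActuator) Delete(*machine) error         { return f.log("Delete", nil) }
func (f *fakeActuator) Exists(*machine) (bool, error) { return f.exists, f.log("Exists", nil) }

func main() {
	a := &fakeActuator{}
	m := &machine{Name: "node-a"}
	for _, deleting := range []bool{false, false, true} {
		m.Deleting = deleting
		delay, err := reconcile(a, m)
		fmt.Println(delay, err, a.calls, m.HasFinalizer)
	}
}
```

Running the loop walks one Machine through creation, an update pass, and deletion, which makes it easy to see that Create() and Update() are each guarded by the result of Exists().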

Machines depend on Clusters

The Machine actuator methods expect both a Cluster and a Machine to be passed in. While there is not a strong link between Clusters and Machines, the machine controller determines which Cluster to pass by looking for a Cluster in the same namespace as the Machine.

There are two consequences of this:

  • The machine actuator assumes there will be exactly one Cluster in the same namespace as any Machines it reconciles. See getCluster() for the details.
  • If the Cluster is deleted before the Machine it will not be possible to delete the Machine. Therefore Machines must be deleted before Clusters.
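The "exactly one Cluster per namespace" assumption can be illustrated with a small lookup. findCluster below is a hypothetical helper written for this example; it is not the real getCluster(), whose details may differ, and the cluster type is a trimmed stand-in.

```go
package main

import "fmt"

// cluster is a trimmed, hypothetical stand-in for the Cluster type.
type cluster struct {
	Namespace string
	Name      string
}

// findCluster returns the single Cluster in the given namespace. It fails when
// there is none (for example, the Cluster was deleted before its Machines) or
// when the choice would be ambiguous.
func findCluster(clusters []cluster, namespace string) (*cluster, error) {
	var matches []*cluster
	for i := range clusters {
		if clusters[i].Namespace == namespace {
			matches = append(matches, &clusters[i])
		}
	}
	switch len(matches) {
	case 0:
		return nil, fmt.Errorf("no Cluster found in namespace %q", namespace)
	case 1:
		return matches[0], nil
	default:
		return nil, fmt.Errorf("expected exactly one Cluster in namespace %q, found %d", namespace, len(matches))
	}
}

func main() {
	clusters := []cluster{
		{Namespace: "team-a", Name: "prod"},
		{Namespace: "team-b", Name: "dev"},
		{Namespace: "team-b", Name: "test"},
	}
	for _, ns := range []string{"team-a", "team-b", "team-c"} {
		c, err := findCluster(clusters, ns)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("found:", c.Name)
	}
}
```

The zero-match branch is the situation described in the second bullet: without a Cluster to pass to the actuator, the Machine's Delete() call cannot proceed.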

machine reconciliation logic

[Figure: machine reconciliation logic]

machine deletion block

[Figure: machine deletion block]

machine object creation sequence

[Figure: machine object creation]

machine object deletion sequence

[Figure: machine object deletion]


1 One reason a Machine may not be deleted is if it corresponds to the node running the Machine controller.
