Eucalyptus: Cloud, Cluster and Node Controller

This post is part of the Open Source Cloud Computing series. For an Overview, please click on the Tag.

Cloud Controller

The Cloud Controller – also known as CLC – is the highest level in Eucalyptus. There is one Cloud Controller per infrastructure. The Cloud Controller is in charge of the following tasks:

  • Connect to virtual instances via SSH
  • Provide a front end for the EC2- and S3-compatible Web Services
  • Act as a Meta Scheduler for the Cloud Infrastructure and determine which infrastructure to use
  • Collect resource information from the Cluster Controllers

By default, the Cloud Controller runs on the same machine as Walrus and the Storage Controller.

Eucalyptus architecture

The Cloud Controller acts as the main Element for a Eucalyptus Cloud. Each Eucalyptus-based Cloud starts with the Cloud Controller. Different Zones or Regions are realized with a Cluster Controller. There is exactly one Cloud Controller.

Cluster Controller

The Cluster Controller (CC) comes next in hierarchy after the Cloud Controller (CLC). There is exactly one Cluster Controller per location. A location could be compared to an Availability Zone within a Region in Amazon Web Services. The Cluster Controller is basically in charge of receiving requests from the Cloud Controller to deploy new virtual Instances. The Cluster Controller decides which Node is used for the new virtual Instance. The Cluster Controller also maintains virtual Networks available to the instances and collects information about the Node Controllers registered. This information is reported to the Cloud Controller. Each Cluster can have exactly one Cluster Controller.

Eucalyptus process

When a new Instance is started, the Cloud Controller is given the Image, the Instance Type and the number of Instances. The Cloud Controller looks for Cluster Controllers with enough available resources and selects one to start the Instance. The selected Cluster Controller then looks for a Node Controller with enough available resources and instructs it to launch the new virtual Instance. If the requested Image is not available on the Node, the Node Controller asks the Cloud Controller for it, and the Cloud Controller provides the Image to the Node via Walrus.
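The launch flow described above can be sketched in a few lines of Python. This is only an illustration and not Eucalyptus code: the names `Node`, `Cluster`, `pick_node` and `launch` are made up for this post, capacity is reduced to a single "free cores" number, and the Walrus download is simulated by adding the Image name to a set.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class Node:
    """Hypothetical stand-in for a machine running a Node Controller."""
    name: str
    free_cores: int
    cached_images: Set[str] = field(default_factory=set)


@dataclass
class Cluster:
    """Hypothetical stand-in for a Cluster Controller and its Nodes."""
    name: str
    nodes: List[Node]

    def pick_node(self, cores: int) -> Optional[Node]:
        # Cluster Controller step: take the first Node with enough capacity.
        for node in self.nodes:
            if node.free_cores >= cores:
                return node
        return None


def launch(clusters: List[Cluster], image: str,
           cores: int) -> Optional[Tuple[str, str, bool]]:
    """Cloud Controller step: find a Cluster that can place the Instance.

    Returns (cluster name, node name, image_was_fetched) or None.
    """
    for cluster in clusters:
        node = cluster.pick_node(cores)
        if node is None:
            continue  # this Cluster has no Node with enough resources
        fetched = image not in node.cached_images
        if fetched:
            # Simulates the Node receiving the Image via Walrus.
            node.cached_images.add(image)
        node.free_cores -= cores
        return cluster.name, node.name, fetched
    return None
```

A real scheduler would of course look at memory, disk and network as well. The point here is only the two-level decision: the Cloud Controller chooses the Cluster, the Cluster Controller chooses the Node.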

Node Controller

The Node Controller is the lowest Level in the Eucalyptus Stack. A Node Controller runs on each physical machine that is meant to host virtual machines. Node Controllers support Xen and KVM for virtualization. A Node Controller is in charge of collecting data on the resources available on its machine. It also reports the utilization of the Node to the Cluster Controller, so that the Cluster Controller knows about the utilization and availability of that machine. The Node Controller also takes care of Instance life cycle management.

The header image is provided by  jar (away for a while) under the creative commons licence.

Eucalyptus: Overview


Eucalyptus was developed at the University of California, Santa Barbara (UCSB) and is provided under the GNU GPL v3 Open Source License. The name Eucalyptus stands for “Elastic Utility Computing Architecture for Linking Your Programs To Useful Systems”. Its main goal is to enable the execution and control of virtual instances with Xen or KVM under Linux and to provide an API that is compatible with Amazon Web Services (AWS). Since Eucalyptus is basically built upon the Amazon APIs, it is well suited for hybrid Cloud Solutions. The first version of Eucalyptus was released in 2008.

Platform Description

Each Eucalyptus component runs as a UNIX service, and communication between the components is based on SOAP Web services. A Eucalyptus infrastructure may consist of one or more locations, which represent different datacenters. Eucalyptus consists of the Cloud Controller, Cluster Controller, Node Controller, Walrus and the Storage Controller. The Platform provides Tools called “Euca2ools”, which are written in Python. Euca2ools are inspired by the command-line tools distributed by Amazon Web Services. There are two major Tools:

  • api-tools (Command Line interface to EC2)
  • ami-tools (Command Line interface to work with Amazon Machine Image)

With Euca2ools, it is possible to:

  • Do queries on availability zones (i.e. clusters in Eucalyptus)
  • Do SSH key management (add, list, delete)
  • Manage virtual Instances (start, list, stop, reboot, get console output)
  • Configure and Manage Security groups
  • Configure and Manage Volumes and snapshots (attach, list, detach, create, bundle, delete)
  • Manage Images (bundle, upload, register, list, deregister)
  • Manage IP addresses (allocate, associate, list, release)

All Configuration Elements for Eucalyptus are stored in a config file as Key-Value Pairs. The configuration must be complete before Eucalyptus can be started. Eucalyptus needs to connect to Clients (End Users) and Cloud Components (CC, Walrus, etc.). Therefore, network management is essential. Eucalyptus supports the following networking modes:

  • Managed Mode. With Managed Mode, Eucalyptus provides all Networking Features such as VM Network Isolation, Security Groups, Elastic IPs and Metadata Service. With Managed Mode, a Cluster Controller must be in the same Broadcast Domain as its Node Controllers. Furthermore, all Cluster and Node Controllers must be configured.
  • Managed Mode without VLAN. This is basically the same as Managed Mode, but no VLAN is used. Connectivity must be provided via Ethernet, and all Cluster Controllers and Node Controllers must be in the same Broadcast Domain.
  • System Mode. Eucalyptus mostly stays out of the way in terms of VM networking and relies on a DHCP service to configure VM networks. On all Cluster Controllers, VNET_MODE="SYSTEM" must be set, and on each Node Controller a Bridge must be specified.
  • Static Mode. The Eucalyptus DHCP Server issues the Network Configuration. Nodes must be configured with VNET_MODE="STATIC".
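To make the Key-Value configuration and the VNET_MODE setting a bit more concrete, here is a small Python sketch that reads such a file and checks the networking mode. It is not part of Eucalyptus or Euca2ools. Only the values "SYSTEM" and "STATIC" are taken from the text above; the spellings "MANAGED" and "MANAGED-NOVLAN", the key VNET_BRIDGE and the bridge name br0 are assumptions for illustration and should be checked against the documentation of your Eucalyptus version.

```python
# Mode names: SYSTEM and STATIC appear in the post; the other two
# spellings are assumed here and may differ between versions.
VALID_MODES = {"SYSTEM", "STATIC", "MANAGED", "MANAGED-NOVLAN"}


def parse_config(text):
    """Parse KEY="value" lines into a dict, skipping blanks and comments."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        key, sep, value = line.partition("=")
        if not sep:
            continue  # not a key-value line
        settings[key.strip()] = value.strip().strip('"')
    return settings


def network_mode(settings):
    """Return the configured networking mode or raise if it is unknown."""
    mode = settings.get("VNET_MODE", "")
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported VNET_MODE: {mode!r}")
    return mode


if __name__ == "__main__":
    # Illustrative file content; VNET_BRIDGE and br0 are assumptions.
    sample = '# networking\nVNET_MODE="SYSTEM"\nVNET_BRIDGE="br0"\n'
    print(network_mode(parse_config(sample)))
```

Checking the mode before starting the services is a cheap way to catch a typo in the config file early.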


Open Source Cloud Computing Platforms

In the next blog posts, I will describe some major Open Source Cloud Computing platforms. I will cover the 4 major platforms, including:

  • OpenStack
  • Eucalyptus
  • OpenNebula
  • CloudStack

This series will run alongside the self service IT series. By the end of the series, I will compare these 4 platforms with the self service attributes I will evaluate during the series. So keep on reading all of them 🙂


Automation in Datacenters for Cloud Computing

When we talk about Cloud Computing, we also talk about Automation in Datacenters. Cloud Computing pushes Datacenters toward far more Automation than we have seen before. A significant transformation is under way, and more and more Projects that enable it are being launched. Well-known Automation Platforms in the Open Source area are Eucalyptus and OpenStack. Microsoft and VMware also offer Automation Tools for the Cloud. But what are the concepts behind Cloud Automation?

Let us first look at the illustration below to find out how automation in Datacenters works.

Datacenter Automation in the Cloud

As shown in the illustration above, there are several steps involved. First, we add a new physical Server. This usually happens when a new Rack or Container is deployed to a Datacenter. The new physical Server is powered on and boots a Maintenance OS. This is usually a lightweight Version of Windows Server or Linux. The Maintenance OS is the basis for virtualisation. The Maintenance OS now connects to the Controller. The Controller is a Server that acts as a kind of Master in the Datacenter. The Controller tells the Maintenance OS what to do. Normally, this means which Host VM to start. The Host VM is the container for different virtual Instances, which are called “Guest VMs”. The Guest VMs run the applications the user wants. This can either be an Operating System (Infrastructure as a Service) or a Platform.
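The steps above follow a fixed order, which can be written down as a tiny state sequence in Python. This is only a sketch to illustrate the order of events; the names `Stage`, `next_stage` and `provision` are invented for this post and do not belong to any real automation product.

```python
from enum import Enum


class Stage(Enum):
    """Stages a new physical Server passes through, in order."""
    POWERED_ON = "physical server powered on"
    MAINTENANCE_OS = "maintenance OS booted"
    REGISTERED = "connected to the controller"
    HOST_VM = "host VM started"
    GUEST_VMS = "guest VMs running"


# Enum members iterate in definition order, which is the order we need.
ORDER = list(Stage)


def next_stage(current):
    """Return the stage that follows `current`, or None at the end."""
    index = ORDER.index(current)
    if index + 1 == len(ORDER):
        return None
    return ORDER[index + 1]


def provision():
    """Walk a new Server through all stages and return the trail."""
    stage = ORDER[0]
    trail = [stage]
    while True:
        stage = next_stage(stage)
        if stage is None:
            return trail
        trail.append(stage)


if __name__ == "__main__":
    for step in provision():
        print(step.value)
```

In a real Datacenter, each transition would be triggered by the Controller. Here the sequence simply runs through from start to end.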