
GRID Seed overview

by Ezio Corso last modified 2010-04-12 12:09

This is an overview of what GRID Seed is and the motivations behind it. This document is outdated: to learn more about the latest release of GRID Seed, check out the official site gridseed.escience-grid.org.

The GRID Seed tool

GRID Seed is a tool developed at ICTP within the EU-IndiaGrid project that simplifies the setting up of a fully fledged grid infrastructure based on the gLite middleware. It was developed to easily deploy a training grid infrastructure almost everywhere in the world, since the only requirement is a set of hosts (simple PCs) locally networked together.


Motivations

The ICTP team recently promoted a few training events on grid technologies in several parts of the world, some of them in places with very limited bandwidth. It therefore happened that a remote training grid infrastructure (like, for instance, the GILDA one) was mostly inaccessible, making hands-on sessions impossible. These experiences convinced us that, in order to be productive, a fully fledged training grid infrastructure must be set up locally: this approach is in fact now followed in many training events at the European level as well.

There are several methods for setting up such a temporary infrastructure. Right now almost all proposed solutions are based on virtualization: the idea is to have many different virtual machines, each one running a specific grid service. These virtual machines can be hosted on one or more physical servers and then appropriately configured in order to start all the needed services.

GRID Seed takes this approach a step further: built on virtual machines, the tool hides the complexity of configuring by hand all the services a grid must have.



Technical description

GRID Seed (GS) consists of a set of VMware virtual machines (VMs), one for each gLite grid node; a tool called start.sh to boot and automatically configure the VMs; and a simple installer programme that sets up GS on any chosen host. More specifically: the VMs contain images of each grid service, augmented with extra scripting for the automatic configuration of a grid; start.sh is not limited to booting a VM, but also configures it automatically for grid operation; and the installer simply creates an elementary directory structure on the target host and copies the VMs, the start.sh tool, and the documentation.
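
Although the exact installer interface and layout are not documented here, a plausible sketch (all names below are illustrative assumptions) is:

    # Hypothetical installer invocation; the real command name and
    # arguments may differ.
    ./install-gridseed.sh /opt/gridseed
    #
    # Resulting layout on the target host (illustrative):
    # /opt/gridseed/
    #   vms/        one VMware image per gLite grid node
    #   start.sh    boots a VM and auto-configures it for grid operation
    #   doc/        documentation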

The prerequisite for GS is the presence of the VMware software on the target host. The VMs can all be booted from the same physical host, or alternatively from different ones, depending on the hardware available. Clearly, the more VMs are booted from the same physical host, the bigger the hardware requirements for that host.




Architectural overview

The setup of a gLite grid requires the installation of large amounts of software, followed by long and complex configuration. GS was designed precisely to automate gLite grid configuration to the maximum extent possible: this was the single most pressing technical challenge for GS.

In order to install, configure and work with a gLite grid, there must first be a set of basic general network services up and running; only then can the grid services proper be installed and executed. Within gLite, moreover, there are two classes of services: Central ones, which must be up and running before the remaining Site ones can operate. The configuration of each service proceeds in parallel with its installation: there is therefore a precise ordering for the configuration and booting of a full gLite grid. GS encapsulates this knowledge and leverages it when automatically configuring a grid on the fly.

Through start.sh, a special VM is booted first. It provides the basic set of network services that the gLite software expects in order to function; it also includes important auxiliary services used during the automatic configuration of the grid services proper. Then, again through start.sh, each VM corresponding to a gLite Central grid service is booted; a set of scripts contained in the VM interacts with the Special VM and obtains the grid configuration information it needs. Finally, each VM for the Site grid services can be booted, once more through start.sh; again, scripts included in the VM interact with the Special VM to obtain any necessary grid configuration information.
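
Concretely, the boot order might look like the following; the command-line syntax of start.sh shown here is an assumption, and only the ordering comes from the description above:

    # Illustrative boot order; arguments to start.sh are hypothetical.
    ./start.sh dns        # 1. Special VM: DNS, time, CA, configuration CGIs
    ./start.sh voms       # 2. Central services, starting with the VOMS server
    ./start.sh bdii       #    then the Top BDII
    ./start.sh lfc        #    then the LFC...
    ./start.sh wms        #    ...and the WMS
    ./start.sh ui 1       # 3. optionally, one or more UIs
    ./start.sh site 1     # 4. Sites: CE + SE + WNs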

It is important to understand that start.sh is not a simple launcher of the VMware software: it is responsible for properly configuring all IP addresses and, in general, for supplying the network configuration of the virtualised gLite grid services. Only then, thanks to the properly configured network, can the VMs interact with the Special one, specifically to obtain their grid configuration. Moreover, start.sh is smart enough to signal race conditions when fetching configuration information, such as when booting several VMs at the same time, and to point out unsupported gLite grid configurations.

GS is designed to create and configure a gLite grid made up of 1 WMS, 1 Top BDII, 1 LFC, 1 VOMS, up to 99 UIs, and up to 255 Sites, each one consisting of 1 CE + 1 SE + up to 252 WNs. In the largest configuration this amounts to over 64,000 nodes (255 sites of 254 nodes each, plus the central services): a formidable grid to play with. Any attempt to set up a broken configuration, such as booting a second VOMS, will result in a warning and in the configuration information being denied, thus stopping the VM from becoming operative.



The Architecture in more detail

(1) The Special VM, which must be booted before all others, is called DNS.

It contains several important general services that must be up and running for any gLite grid installation to be carried out, and for the installed grid to work properly. They consist of a DNS server, a time synchronisation service, and a CA. Moreover, it contains auxiliary services used during the automatic configuration of the other VMs: an Apache web server with CGIs consisting of bash scripts. These scripts carry out several operations that will be described later on; indeed, the other VMs interact with this Special one through these web CGIs.

The DNS server provides the fully qualified names needed by the GLOBUS middleware. A set of conventions is used to name the WMS, Top BDII, CEs, WNs, etc., as well as to assign the IP addresses. The 10.x.y.z subnet used by GS has been partitioned as follows:

  • x is a parameter configurable through start.sh; for the rest of this document we will assume it has been chosen and set to X.


  • Central services and UI:

10.X.0.1      dns.grid.box        Used by DNS VM
10.X.0.2      voms.grid.box       Used by VOMS VM
10.X.0.3      bdii.grid.box       Used by BDII VM
10.X.0.4      lfc.grid.box        Used by LFC VM
10.X.0.5      wms.grid.box        Used by WMS VM

10.X.0.21     ui1.grid.box        Used by UI VM for setting up UI1
[...]
10.X.0.119    ui99.grid.box       Used by UI VM for setting up UI99

  • Sites:

10.X.1.254    ce-1.grid.box       Used by SITE VM for setting up the CE of site 1
10.X.1.253    se-1.grid.box       Used by SITE VM for setting up the SE of site 1
10.X.1.1      ce-1wn1.grid.box    Used by SITE VM for setting up WN1 of site 1
[...]
10.X.1.252    ce-1wn252.grid.box  Used by SITE VM for setting up WN252 of site 1

Notice that the previous table only shows the IPs and names for Site 1; for all the remaining sites the information is the same, with 1 replaced by the specific site number.
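
As an illustration of these conventions, the following shell sketch (not part of GS) maps node names to the IP addresses listed above:

    # Illustrative only: this helper is NOT part of GS; it merely encodes
    # the naming and addressing conventions described in the tables above.
    X=1                                       # the configurable second octet
    ip_for() {
        case "$1" in
            dns)     echo "10.$X.0.1" ;;
            voms)    echo "10.$X.0.2" ;;
            bdii)    echo "10.$X.0.3" ;;
            lfc)     echo "10.$X.0.4" ;;
            wms)     echo "10.$X.0.5" ;;
            ui*)     echo "10.$X.0.$(( ${1#ui} + 20 ))" ;;   # ui1 -> 10.X.0.21
            ce-*wn*) local s=${1#ce-} n                      # ce-1wn3 -> 10.X.1.3
                     n=${s#*wn}; s=${s%%wn*}
                     echo "10.$X.$s.$n" ;;
            ce-*)    echo "10.$X.${1#ce-}.254" ;;            # ce-1 -> 10.X.1.254
            se-*)    echo "10.$X.${1#se-}.253" ;;            # se-1 -> 10.X.1.253
        esac
    }
    ip_for ui7       # prints 10.1.0.27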

The special scripts provide synchronised time: VM technology tends to have random effects on the machine clock, and correct time is critical for grids in general and especially for the gLite security component called GSI. Notice that an NTP server could not be used because it does not work in VMware.

Finally, the CA (Certification Authority) issues all host and user certificates; indeed, upon booting, the other grid node VMs execute scripts that ask for a host certificate, the CA public certificate, the CA CRL, and the VOMS server public certificate. Notice that this CA was purpose-built, and consists of an Apache web server and a set of CGIs.

With this VM up and running, all essential services are configured and the grid nodes may be installed. There are two big configuration issues that GS handles automatically: the set up, management, and handling of host certificates; and the set up of a functioning information system, which must be populated with correct information about the installed grid. Each time the VM of a grid service is started, its scripts perform these two configuration operations.


(2) The second VM to boot must be the VOMS server.

It is the first grid service proper to boot: it supports advanced digital certificates containing cryptographic extensions. VOMS proxies are not optional: by default, all gLite grid nodes support only VOMS proxies and will fail with classic proxies. Each of these two kinds of proxies, however, requires a specific and different enforcement mechanism. This is important to keep in mind because two gLite services behave differently: the Top BDII and the WMS. The first makes no use of proxies at all, while the second still uses the old authentication mechanism for classic proxies. Both of these situations are taken care of by GS.
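
For reference, on a UI a user obtains a VOMS proxy with the standard gLite command shown below; the VO name is purely illustrative, not necessarily the one configured by GS:

    # VOMS proxy (the default for gLite services); "gridseed" is an
    # assumed VO name used here only for illustration.
    voms-proxy-init --voms gridseed

    # A classic proxy, by contrast, is created the old GLOBUS way:
    grid-proxy-init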

As mentioned in the Architectural Overview, network configuration is provided by the start.sh tool; if the tool is not used, then the information must be supplied manually to VMWare following the convention explained in the table supplied earlier on.

At boot time, a specific script asks the Special VM for the certificate assigned to the host; if there is none, it asks for one to be created. General CA material is also downloaded, such as the CRL, the CA public certificate, the list of recognised CAs, etc.
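
A minimal sketch of such an exchange, assuming hypothetical CGI paths on the Special VM (the real GS CGI names and parameters are not documented here):

    # All URLs below are assumptions for illustration.
    HOST=$(hostname -f)
    curl -s "http://dns.grid.box/cgi-bin/get-host-cert?host=$HOST" \
         -o /etc/grid-security/hostcert.pem
    curl -s "http://dns.grid.box/cgi-bin/get-ca-cert" \
         -o /etc/grid-security/certificates/gridseed-ca.pem
    curl -s "http://dns.grid.box/cgi-bin/get-ca-crl" \
         -o /etc/grid-security/certificates/gridseed-ca.crl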

Specific to the VOMS VM, special scripts make sure that, every two minutes, any newly issued user certificate is uploaded into the VOMS server. The CA in the Special VM is checked periodically by querying the appropriate CGI pages on its Apache web server.
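
Such a periodic check could be driven by a cron entry like the following; the script name and path are assumptions, and only the two-minute interval comes from the text above:

    # Hypothetical cron entry: every two minutes, poll the CA on the
    # Special VM and register any newly issued user certificates with VOMS.
    */2 * * * * root /opt/gridseed/voms-sync-users.sh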



(3) The third VM to boot must be the BDII VM.

It is the second Central service to boot: the Top BDII, the root of the information system, which periodically polls all configured Site BDIIs and MDS instances.

As mentioned in the Architectural Overview, network configuration is provided by the start.sh tool; if the tool is not used, then the information must be supplied manually to VMWare following the convention explained in the table supplied earlier on.

At boot time, this VM does NOT ask the Special VM for certificates, since gLite does not require the information system to be authenticated. It does ask for the BDII configuration file, which is mostly pre-configured, similarly to the domain name configuration. Beware that only the static information is pre-configured: the dynamic part, which depends on how many sites are configured, is supplied through automatic mechanisms, as will be explained later on.
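
Fetching that file might look like the sketch below, with a hypothetical CGI name and target path:

    # Hypothetical: retrieve the mostly pre-configured Top BDII
    # configuration from the Special VM; CGI name and paths are assumptions.
    curl -s "http://dns.grid.box/cgi-bin/get-bdii-conf" -o /opt/bdii/etc/bdii.conf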


(4) The other VMs representing Central grid services to boot are the LFC and the WMS.

As mentioned in the Architectural Overview, network configuration is provided by the start.sh tool; if the tool is not used, then the information must be supplied manually to VMWare following the convention explained in the table supplied earlier on.

Both ask the Special VM for the host certificate, as explained earlier; moreover, they ask for VOMS-server-specific information, especially the VOMS server public certificate. Without it, gLite grid services will not accept any VOMS proxies from users: the VOMS server public certificate allows the services to verify that VOMS proxies indeed come from that server and were not forged.

The configuration of the part of the information system present locally in these grid services is straightforward: since there is only one of each of these services, they are pre-configured with the correct information, which the Top BDII will subsequently poll during routine grid operation.

Finally, the WMS VM carries out an extra step. Since it works with the classic proxy enforcement mechanisms, while gLite in general uses VOMS proxies, a special script runs every two minutes and checks with the CA for newly added users; it then updates the WMS configuration files appropriately. These scripts interact with the Special VM through the CGI pages exposed by the Apache server installed there.
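
For classic proxies the enforcement mechanism is typically a grid-mapfile mapping certificate subjects (DNs) to local accounts; a sketch of such an update, assuming a hypothetical CGI that lists the user DNs issued by the CA:

    # Hypothetical update script: rebuild the grid-mapfile from the list
    # of user DNs known to the CA. The CGI name and the ".gridseed" pool
    # account mapping are assumptions.
    curl -s "http://dns.grid.box/cgi-bin/list-user-dns" |
    while read -r dn; do
        echo "\"$dn\" .gridseed"
    done > /etc/grid-security/grid-mapfile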



(5) Site installation.
These are the last grid services to boot. They consist of a CE, an SE and up to 252 WNs.

As mentioned in the Architectural Overview, network configuration is provided by the start.sh tool; if the tool is not used, then the information must be supplied manually to VMWare following the convention explained in the table supplied earlier on.

The same automatic configuration mechanisms for host certificates described earlier for the other services apply here too. For the configuration of the locally resident part of the information system, however, there is a special script: upon booting the services, it automatically registers the site with the Top BDII. Again, this script obtains the needed information from the Special VM, through the CGI pages exposed by the Apache web server present there.
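
One plausible sketch of that registration, assuming a hypothetical CGI through which the Special VM adds the new site to the list of Site BDIIs polled by the Top BDII:

    # Hypothetical: announce this site so that the Top BDII starts polling
    # its Site BDII. Site number, CGI name, port and LDAP base are assumptions.
    SITE=1
    curl -s "http://dns.grid.box/cgi-bin/register-site?site=$SITE" \
         --data "bdii_url=ldap://ce-$SITE.grid.box:2170/mds-vo-name=site$SITE,o=grid"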

