
Lessons learned from the TeraGrid, Part 1: Managing large data sets on a large grid

A major goal of the TeraGrid project is to make it as easy and transparent as possible for researchers to move jobs and data freely among machines within and outside TeraGrid sites.

What is the TeraGrid?

The TeraGrid is a National Science Foundation (NSF)-funded project to provide an integrated computational and data infrastructure for research scientists within the United States. The TeraGrid currently consists of seven resource provider (RP) sites connected through a dedicated TeraGrid network. Together, the sites provide high-end computing resources totaling more than 50 tera-Floating Point Operations Per Second (teraFLOPS) of compute power and 600 TB of available storage.

TeraGrid users have the option of storing their data, managing their jobs, and performing their computations on the machine most appropriate for their tasks, using grid technology for access to each resource.

The compute platforms that TeraGrid sites provide include IBM IA64 clusters and IBM and Dell IA32 clusters, all running Linux; 32-processor and 8-processor IBM POWER4/AIX clusters; an IBM Blue Gene/L rack; a Cray XT3; and SPARC/Solaris nodes. Data resources include at least six IBM General Parallel File System (GPFS) systems and an assortment of other parallel file systems, including file systems based on Parallel Virtual File System 2 (PVFS2), Lustre, Sun Microsystems' QFS, and IBRIX, along with several Mass Storage Systems (MSS) and Hierarchical Storage Management (HSM) systems.

The advantage of this diversity is the ability to serve almost any computational or data-based scientific research requirement. However, with this many options spread across seven geographically distributed sites (as shown in Figure 1), having tools to manage complexity and provide simplicity in accessing these resources is crucial because many scientific problems have complex multistage -- and often multiplatform -- workflows.

Early in the project, the group decided that using grid technologies was the most effective way to provide the kind of transparency and ease of use scientific researchers require to accomplish their work. This article describes the process by which you deploy and verify grid software across these various resources, along with some of the development projects the TeraGrid staff have undertaken to further enable the use of grid technologies.


Figure 1. The TeraGrid network

The Common TeraGrid software stack

A core TeraGrid requirement is for users to be able to expect that the same interoperable grid software tools and servers will be available on every TeraGrid machine by default. To accomplish this, the project team defined a group of grid software tools required at each site, known as the Common TeraGrid Software Stack (CTSS), and developed an extensible version- and unit-testing tool known as Inca, which is used to visually verify the existence and functionality of all CTSS components. The current CTSS version is 2, which has been in production use on all TeraGrid machines for about two years.

A machine must provide the following items from the CTSS V2 stack to qualify as a production TeraGrid resource. Some sites run the servers on login nodes, while others run all servers on nodes dedicated to grid services. Most sites run specialized GridFTP server nodes to achieve maximum performance in bulk data transfer. In the case of tools such as Secure Shell (SSH) that are generally available on UNIX platforms, the grid-enabled client must be in the default path, and the grid-enabled server must listen on the default port.

CTSS server requirements:

  • Globus V2.4.3 Gatekeeper
  • Globus V2.4.3 GridFTP Server
  • Condor-G V6.5 or later
  • Grid Security Infrastructure (GSI)-enabled OpenSSH
  • GX-Map or other grid-mapfile management system

CTSS client requirements:

  • Globus V2.4.3 job submission, information service, and GridFTP clients
  • Condor-G V6.5 or later job submission tools
  • GSI-enabled OpenSSH client
  • MyProxy client tools
  • MPICH-G2 grid-enabled Message Passing Interface (MPI) implementation
  • Storage Resource Broker client tools
  • Accounting information client
  • Grid identity management tools
  • SoftEnv environment management tools
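The requirement that grid-enabled clients be in the default path lends itself to a simple automated check. The following is a minimal Python sketch, not part of CTSS itself. The command names in REQUIRED_CLIENTS are assumptions chosen for illustration, so substitute the names your site actually installs.

```python
import shutil

# Assumed command names for a few CTSS client tools; adjust for your site.
REQUIRED_CLIENTS = [
    "globus-url-copy",   # GridFTP client
    "globus-job-run",    # Globus job submission
    "gsissh",            # GSI-enabled OpenSSH client
    "myproxy-init",      # MyProxy client tools
    "condor_submit",     # Condor-G job submission
]


def missing_clients(commands, path=None):
    """Return the commands that cannot be found on the search path.

    When path is None, shutil.which searches the PATH of the current
    environment, which approximates a user's default path.
    """
    return [cmd for cmd in commands if shutil.which(cmd, path=path) is None]


if __name__ == "__main__":
    missing = missing_clients(REQUIRED_CLIENTS)
    if missing:
        print("Not found in default path: " + ", ".join(missing))
    else:
        print("All listed client tools were found.")
```

A check like this only confirms that a command exists. It says nothing about the version installed or whether the tool works.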

Notes on grid components

A few services may not be required on all TeraGrid computers because of platform differences. There may also be differences in standard underlying tools. These exceptions and differences are noted below:

  • Optionally, you can run the Globus V2.4.3 Monitoring and Discovery System (MDS) Information Server (which is based on the OpenLDAP server) on each machine, in which case you configure the server to report to a central TeraGrid MDS server, which in turn provides a hierarchical view of TeraGrid resources.
  • MyProxy infrastructure is centralized, with a single MyProxy server used to store credentials for users at all TeraGrid sites.
  • Specialized accounting and grid security components are described in the TeraGrid account and allocation management section below.
  • In addition to the grid software components, CTSS also requires the provision of a high-performance optimizing compiler, such as IBM's xlC compiler variant for Power Architecture machines or Intel's icc for the Itanium architecture. Of course, parallel programming libraries such as MPI are also a requirement. While these tools are not grid software, you do need to take them into account when configuring the Globus GRAM job-submission and execution services, and they are also required to build and use the MPICH-G2 software.


TeraGrid management tools

The TeraGrid is a large, widely distributed collection of heterogeneous resources, with stringent software installation and configuration requirements. The goal of the TeraGrid project is to reliably provide easy, transparent access to all compute, data, and visualization resources for all qualified researchers at all times.

In addition to the usual 24x7x365 operational support staff, the project must also have automated mechanisms to regularly perform version and unit tests on all CTSS software components so project members can identify problems with the grid infrastructure before users do. It is also a requirement that the project create a consistent environment on each machine for each user. And finally, the project requires the ability to treat users as TeraGrid users, possibly associated with a peer-reviewed scientific research project, rather than as users of only one site or as functionally anonymous individuals, and to identify the resources allocated to a user or project. This requires a centralized mechanism to store and automatically distribute functional data regarding all TeraGrid projects, the principal investigators (PIs) and other users associated with each project, and tools to allow the users to determine their remaining allocations for compute time and data storage.

Inca unit and version testing

During deployment of the current version of the TeraGrid software stack (based on the Globus Toolkit V2.4.3), it was clear that the number of machines and software components made any kind of manual verification of each component's availability and version impractical at best. As a result, the TeraGrid project developed a modular grid-enabled version- and unit-test harness called Inca.

Inca is implemented as a set of Python classes, which you can use to quickly and easily build simple version tests and more complex functionality tests for various grid components. The results are automatically made available through a secure Web interface, as shown in Figure 2, so that TeraGrid staff can quickly view the status of each machine within the TeraGrid with regard to installed software versions and the ability to perform typical grid computing tasks, such as using the globus-url-copy command to transfer a file from one site to another.


Figure 2. The Inca unit and version testing tool

The initial tests developed for the Inca system focused primarily on verifying the installed version of each software component. As the project has matured, the Inca team has developed more sophisticated functionality tests and added the ability to store test history, so you can use Inca to verify that the common grid tasks users perform will work. If a test stops working for any reason, an e-mail message is automatically generated and dispatched to the TeraGrid operations center and to the TeraGrid site in question.
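To make the idea of a version test concrete, here is a minimal Python sketch of the kind of check described in this section. It is not Inca's actual API: the names CheckResult, parse_version, at_least, check_version, and run_version_test are all hypothetical, and the sketch assumes the tool under test prints a dotted version number somewhere in its output.

```python
import re
import subprocess
from dataclasses import dataclass


@dataclass
class CheckResult:
    """Outcome of one version check (hypothetical structure)."""
    name: str
    passed: bool
    detail: str


def parse_version(text):
    """Return the first dotted version in text as a tuple, or None."""
    match = re.search(r"\d+(?:\.\d+)+", text)
    if match is None:
        return None
    return tuple(int(part) for part in match.group().split("."))


def at_least(found, minimum):
    """Compare version tuples after padding the shorter one with zeros."""
    width = max(len(found), len(minimum))
    pad = lambda version: version + (0,) * (width - len(version))
    return pad(found) >= pad(minimum)


def check_version(name, output, minimum):
    """Check captured tool output against a minimum version."""
    found = parse_version(output)
    if found is None:
        return CheckResult(name, False, "no version string found")
    detail = "found " + ".".join(str(part) for part in found)
    return CheckResult(name, at_least(found, minimum), detail)


def run_version_test(name, command, minimum):
    """Run a command (a list of arguments) and check the version it reports."""
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=30
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return CheckResult(name, False, f"could not run command: {exc}")
    return check_version(name, completed.stdout + completed.stderr, minimum)
```

Separating check_version from run_version_test keeps the comparison logic testable without the tool installed. A functionality test would follow the same pattern but examine the result of a real task, such as a file transfer, rather than a version string.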

TeraGrid account and allocation management

In grid terminology, the TeraGrid acts as a virtual organization (VO) and as a collection of smaller sites, each of which also acts as a VO. Each site within the TeraGrid may have its own account management system and mechanisms to track usage of each resource. However, for the TeraGrid to operate as a single VO, these local management and accounting systems must be interfaced with a centralized TeraGrid-wide account management system.

To provide this centralized accounting service, the Account Management Information Exchange (AMIE) system was developed in cooperation with Boston University. AMIE provides a centralized management database of all TeraGrid projects and users, including information about which resources users have requested access to and how much time on these resources they have been allocated. Each TeraGrid site with an existing account management system in place has developed an interface between the local account management infrastructure and the AMIE system. Using AMIE and the local account management systems, accounts are created automatically at all TeraGrid sites when new users and projects are added, and you can quickly access information about the user community and the use of TeraGrid resources.

Users also need to have mechanisms to access the stored accounting data for TeraGrid projects. Every time a computational task is submitted for execution on the TeraGrid, the resources that task consumes are charged to the project associated with the submitting user using an abstract currency known as service units (SUs). Some users are also members of multiple projects, so they need mechanisms to see all their projects and select the correct project to be charged for any given task.

TeraGrid provides this capability through a tool called tgusage, which interfaces with the central AMIE database and queries user and project status based on user requests. The tgusage tool also allows users to select the current project, or the project to be charged, any time they submit a computational task to any TeraGrid resource. A similar mechanism is used at the system level to verify that a user has permission to access a specific resource and that the current project has enough SUs to perform the task. If a project runs out of SUs, users won't be able to run tasks under that project until a new allocation is given out through the TeraGrid or the NSF peer-review allocation committees.
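The system-level allocation check described above can be illustrated with a short Python sketch. This is not how tgusage or AMIE is implemented: the Project class, the charge function, and the InsufficientAllocation exception are hypothetical names, and the SU cost of a task is taken as an input because the article does not say how it is computed.

```python
from dataclasses import dataclass


class InsufficientAllocation(Exception):
    """Raised when a project lacks the SUs needed for a task."""


@dataclass
class Project:
    """A project and its remaining allocation (hypothetical structure)."""
    name: str
    remaining_su: float


def charge(project, su_cost):
    """Deduct su_cost from the project, refusing if the balance is too low.

    Returns the balance remaining after the charge.
    """
    if su_cost < 0:
        raise ValueError("su_cost must not be negative")
    if su_cost > project.remaining_su:
        raise InsufficientAllocation(
            f"{project.name} has {project.remaining_su} SUs, needs {su_cost}"
        )
    project.remaining_su -= su_cost
    return project.remaining_su
```

A user who belongs to several projects would pass the selected project to charge. A refused task leaves the balance untouched.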

Taken together, the AMIE tools provide automated account management for all TeraGrid sites, as well as accountability for all use of TeraGrid resources, which is a critical capability for a project of this size and complexity.

Security infrastructure

One of the most time-consuming and manual tasks a grid system manager performs is the management of GSI account mappings. In a Globus-based grid, each user must have a certificate issued by a Certificate Authority (CA). This certificate has a Distinguished Name (DN) represented as a text string globally unique to the user and certificate.

On each machine in a grid, a file called grid-mapfile must be maintained. This file contains mappings between DNs and local user account names. Without additional tools, a system administrator must maintain this file manually. Obviously, this isn't feasible in a large distributed environment containing thousands of users, so an alternate mechanism had to be developed.
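As an illustration of what this file holds, the following Python sketch parses grid-mapfile text into a dictionary. It assumes the common layout of one entry per line, with a double-quoted DN followed by one or more comma-separated local account names. The function name parse_grid_mapfile and the sample DNs in the usage note are made up for the example.

```python
import re

# One entry per line: a double-quoted DN, whitespace, then local account names.
_ENTRY = re.compile(r'^\s*"(?P<dn>[^"]+)"\s+(?P<users>\S+)\s*$')


def parse_grid_mapfile(text):
    """Map each DN to its list of local account names.

    Blank lines and lines starting with '#' are skipped. Any other line
    that does not match the assumed layout raises ValueError.
    """
    mapping = {}
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        match = _ENTRY.match(line)
        if match is None:
            raise ValueError(f"unrecognized grid-mapfile line: {line!r}")
        mapping[match.group("dn")] = match.group("users").split(",")
    return mapping
```

For example, a line such as "/C=US/O=Example Org/CN=Jane Doe" jdoe maps that DN to the local account jdoe. With thousands of users, each of whom may hold certificates from more than one CA, the number of such lines grows quickly, which is why manual maintenance does not scale.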

In addition to account mappings, each grid site must also maintain a set of trusted CAs. A grid user must have a certificate from a trusted CA to access a TeraGrid resource. Because the project is trusting each CA to verify the identity of remote users for the purposes of granting access to large, expensive resources, it is crucial that a process exist by which project members determine whether a CA is trustworthy.

To accomplish verification, a security group within the TeraGrid is responsible for reviewing the policies of grid CAs and deciding whether each one meets the project's requirements for stringent verification of user identities through photo identification or some other means. Once the group decides to trust a given CA, each site must also maintain updated lists of that CA's revoked certificates -- credentials that may have been compromised or lost and, therefore, can no longer be used. Administrators must download the Certificate Revocation List (CRL) for each CA regularly, and the Globus Toolkit does not provide automated mechanisms to do this.

To fill these needs for account mapping and CA management, the TeraGrid project collaborated with the NSF Middleware Initiative (NMI) project to develop the gx-map tool kit. This tool kit is a set of open source Perl scripts that provide common functionality grid sites require in general and TeraGrid sites require specifically. These functions include a program users can run to map their grid credentials (DNs) to their local accounts on each machine, so users become responsible for maintaining their own credentials across the TeraGrid in whatever way they prefer. The gx-map tool kit also provides mechanisms for managing a common set of CA files, including automatic update of CRLs. Using the gx-map utilities, system engineers can focus on helping users solve problems and enhancing the TeraGrid infrastructure, rather than spending their time manually managing large lists of user credentials and downloading CRLs.
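The bookkeeping behind automatic CRL updates can be sketched in a few lines of Python. This is not gx-map code, which the article describes as a set of Perl scripts. The function name stale_crls, the CA labels, and the 24-hour default threshold are assumptions made for illustration.

```python
from datetime import datetime, timedelta


def stale_crls(last_fetched, now, max_age=timedelta(hours=24)):
    """Return the CAs whose CRL was last fetched more than max_age ago.

    last_fetched maps a CA label to the datetime of its last successful
    CRL download. The result is sorted so the output is stable.
    """
    return sorted(
        ca for ca, fetched in last_fetched.items() if now - fetched > max_age
    )
```

A scheduled job could call a function like this and download a fresh CRL for each CA it returns, which removes the manual step from the administrator's routine.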



Conclusion

This article introduced the TeraGrid, currently the largest set of public high-end computational resources in the United States. It described the motivations behind the project and briefly introduced some of the challenges inherent in managing a large geographically distributed grid. A wide variety of strategies and tools is required to address these challenges, and this article provides an overview of some of the most significant ways in which the TeraGrid project has overcome these challenges.

The TeraGrid will continue to grow and change as high-performance computing and grid technologies mature, but the strategies and tools outlined in this article provide a solid foundation on which to build additional infrastructure and add sites. By applying similar strategies or using some of the same tools, it should be possible for you to begin exploring the TeraGrid or other grids or even build your own set of integrated grid resources.

Future articles in this "Lessons learned from the TeraGrid" series will focus on the special capabilities and tools provided within the TeraGrid to orchestrate computational tasks across numerous grid resources. These articles will also examine the data management strategies currently in use or in development to further increase the mobility of users and data within the TeraGrid.

