By Do IT Now

BUILDING THE FUTURE OF HIGH-PERFORMANCE COMPUTING

Building a cutting-edge HPC environment inside Deucalion

The University of Minho in Portugal hosts the Deucalion supercomputer and is a renowned research hub focused on advancing scientific research and technological innovation.
With increasing demands for computational power to tackle complex simulations, large-scale data analysis, and AI-driven projects, the Deucalion operations team, made up of specialists from the University of Minho, INESC TEC and FCT (through its digital services unit, FCCN), faced the challenge of building a state-of-the-art high-performance computing (HPC) environment from scratch. To achieve this ambitious goal, the team needed a solution that seamlessly integrated various technologies while delivering exceptional performance.

Recognizing the need for expertise in designing and implementing advanced computing infrastructure, the Deucalion operations team published a tender, which was won by Fujitsu and Atos, and requested the expertise of Exeliz and Do IT Now (formerly HPCNow!). With a strong reputation built on extensive experience and successful HPC installations, Do IT Now and Exeliz were chosen to play a key role in installing and configuring the x86 Compute Cluster, a critical component of the new high-performance computing environment.

x86 Compute Cluster installation at Deucalion

The installation of the Deucalion high-performance computing (HPC) cluster at the University of Minho in Portugal was a major undertaking involving key industry partners such as Fujitsu, Atos, Nvidia, and DDN. This case study highlights Do IT Now’s role in the installation and configuration of the x86 Compute Cluster.

Hybrid computing cluster installation with 10 petaflops capacity

This large-scale installation features a hybrid computing cluster with a total capacity of 10 petaflops, including more than 2,000 nodes. The project’s size and its mix of x86, ARM and GPU technologies underscore its ambitious nature. Do IT Now’s extensive installation work, carried out from scratch, highlights its expertise in integrating ARM and x86 technologies to achieve exceptional computing performance.

Unified x86 and ARM compute clusters for high-performance research

The Deucalion supercomputer comprises three primary compute partitions: an x86 Compute Cluster, a GPU (x86-based) Compute Cluster, and an ARM Compute Cluster, each connected via its own InfiniBand (IB) network. Designed for high-intensity scientific research and data processing, these clusters operate as a unified system in terms of storage, user management, job scheduling and networking services.

KEY OUTCOMES

Enhanced computational power. The successful deployment of over 500 x86 nodes, including GPU-accelerated systems, has substantially increased the capability of the Deucalion supercomputer to handle complex simulations and large-scale data analyses.

Seamless integration. The robust Deucalion networking infrastructure, including a dedicated InfiniBand network, ensures high-speed, low-latency communication, facilitating optimal performance for diverse research applications.

Streamlined management. The implementation of intelligent workload management and essential networking services has optimized the performance of the Deucalion system and provided a user-friendly experience for researchers and administrators.
