AI servers for research and development of LLMs at Charles University, Prague

In terms of the number and quality of its research centres and research infrastructures, the Czech Republic is at the forefront of the European Union. One such centre is LINDAT/CLARIAH-CZ, the Digital Research Infrastructure for Language Technology, Arts and Humanities.

What is LINDAT/CLARIAH-CZ

LINDAT/CLARIAH-CZ is a Czech data centre providing certified storage and computer language processing services. It is a unique large research infrastructure, dealing mainly with linguistic but also with other digital resources and tools for their processing.

LINDAT/CLARIAH-CZ also offers know-how and software tools for processing language and other digital resources, as well as the development of language technologies for industry and services, including use in the new cultural and creative industries. It takes part in international collaborations, both between similar research infrastructures and directly between institutions across all humanities disciplines, and emphasises digital and interdisciplinary processing methods, including advanced machine learning and artificial intelligence.

The project is led by the team of Prof. Jan Hajič, PhD, of the Institute of Formal and Applied Linguistics, Faculty of Mathematics and Physics, Charles University.

About the solution

Charles University requested improvements to the IT infrastructure used to develop the necessary language technologies. These technologies now rely almost exclusively on machine learning and artificial intelligence methods, which are highly computationally intensive in the training phase and cannot be run without specialized hardware, i.e. a large cluster of powerful graphics cards (referred to as “GPUs”).

Within the framework of the ongoing Operational Programme Science, Research, Education project “LINDAT/CLARIAH-CZ – Expansion of the repository, services and computing cluster of research infrastructure”, Charles University announced a contract for the supply of servers to strengthen the application cloud that runs services in high-availability mode and to increase the capacity of the fast Lustre data storage. The contract also included the delivery of several powerful laptops for application development. We won the contract, met all its conditions and offered a solution based on servers from Hewlett Packard Enterprise (HPE).

Optimised tailor-made solutions

The solution consisted of the delivery of several different HPE ProLiant servers and HPE Apollo systems using AMD processors.

HPE ProLiant servers are at the heart of systems that automate environment management and optimize performance for a specific type of workload or job to deliver results in less time. The ProLiant server family offers versatility and a wide range of features that make it suitable for any solution – from 5G, edge and AI to hyper-converged infrastructure, containerization and various CPU and GPU architectures.

HPE Apollo systems are specifically designed to support demanding high-performance computing (HPC) workloads such as modeling or simulation. This helps companies speed up development, especially by processing large volumes of data and using digital models to simulate the real world. The systems are also adapted for artificial intelligence workloads, so that models can be trained efficiently and deliver the highest quality output.

Servers for neural network processing
2x HPE ProLiant XL675d / Apollo 6500 server
each equipped with AMD CPU (32 cores in total) and 8 interconnected NVIDIA A100 40GB graphics cards in the SXM4 version
110,592 CUDA cores in total
640 GB of GPU memory in total
Total FP16 power – 624 TFlops

HPE ProLiant XL675d Gen10 Plus

HPE ProLiant DL385 Gen10 Plus server

Hybrid CPU-GPU server
10x HPE ProLiant DL385 Gen10+
each equipped with AMD CPU (32 cores in total) and 3x NVIDIA A40 48GB graphics accelerators
322,560 CUDA cores in total
1,440 GB of GPU memory in total
Total FP16 power – 2,244 TFlops

High density CPU server: 2x 4-node
2x HPE Apollo n2600 Gen10+ / XL225n Gen10+ high density server
8 nodes in total (each node equipped with 32 computing cores)
256 GB RAM

1x MDS server for LustreFS
HPE ProLiant DL325 Gen10+ v2
16 computing cores
128GB RAM

4x OSS server for LustreFS
HPE ProLiant DL325 Gen10+ v2
16 computing cores
128GB RAM, 30.72TB NVMe SSD capacity
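
The CUDA core and GPU memory totals quoted for the two groups of GPU servers can be cross-checked with simple arithmetic. The sketch below is only an illustration: the server and GPU counts and the per-card memory come from the lists above, while the per-card CUDA core counts (6,912 for the A100, 10,752 for the A40) are taken from NVIDIA's published specifications and are an assumption not stated in this article. The class and variable names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GpuGroup:
    """One group of identical GPU servers from the delivery."""
    name: str
    servers: int             # number of servers in the group (from the article)
    gpus_per_server: int     # GPUs installed in each server (from the article)
    memory_gb_per_gpu: int   # memory per card in GB (from the article)
    cuda_cores_per_gpu: int  # ASSUMPTION: NVIDIA's published spec, not in the article

    @property
    def gpus(self) -> int:
        return self.servers * self.gpus_per_server

    @property
    def cuda_cores(self) -> int:
        return self.gpus * self.cuda_cores_per_gpu

    @property
    def memory_gb(self) -> int:
        return self.gpus * self.memory_gb_per_gpu


GROUPS = [
    GpuGroup("A100 servers (XL675d)", servers=2, gpus_per_server=8,
             memory_gb_per_gpu=40, cuda_cores_per_gpu=6912),
    GpuGroup("A40 servers (DL385)", servers=10, gpus_per_server=3,
             memory_gb_per_gpu=48, cuda_cores_per_gpu=10752),
]

for group in GROUPS:
    print(f"{group.name}: {group.gpus} GPUs, "
          f"{group.cuda_cores:,} CUDA cores, {group.memory_gb:,} GB GPU memory")
```

Under these assumptions the script reports 16 GPUs with 110,592 CUDA cores and 640 GB of memory for the A100 servers, and 30 GPUs with 322,560 CUDA cores and 1,440 GB of memory for the A40 servers, which matches the totals given in the lists.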

HPE ProLiant XL225n Gen10 Plus

Installation and commissioning

The project was implemented at two different sites, which placed greater demands on organisational and logistical planning.

The challenge was to integrate our solution into various racks within the data halls of Charles University while respecting the power and cooling requirements of the individual servers. Thanks to our experience from different types of projects, everything went smoothly, and after the functionality of all components had been verified, the project was successfully handed over to the customer for deployment.

Author

Petr Plodik
