Whit Schonbein and Scott Levy from Sandia National Laboratories
Date
Tuesday April 9, 2024, 1:30 pm - 2:30 pm
Location
69 Union St, Mitchell Hall, Room 395
Supercomputing at Sandia Overview
The talk will open with an overview by Whit Schonbein of Sandia National Laboratories, one of the largest research laboratories in the world, and of its use of supercomputing over many years. This overview is meant to be very accessible and to show the audience the many uses of supercomputing, including its emerging use in AI.
Leveraging high-performance data transfer to offload data management and analysis tasks to DPUs
Network interface controllers (NICs) with general-purpose compute capabilities (SmartNICs) present an opportunity to reduce host application overheads by offloading to the SmartNIC tasks that are not central to the execution of the target application. In this talk, we will discuss the role of SmartNICs in high-performance computing (HPC) and then describe our approach for leveraging SmartNICs to offload data management and analysis tasks.
Data management and analysis play a critical role in our application workflows. Our applications generate enormous amounts of complex data that require analysts to apply powerful tools for analysis and visualization. Offloading tasks associated with these tools to a SmartNIC has the potential to free up host resources that can be exploited to advance the target application. Effectively offloading tasks from host to SmartNIC also requires the host data associated with the offloaded computation to be transferred to the SmartNIC. To address this need, we introduce a high-performance, general-purpose data movement service that facilitates the offloading of tasks to SmartNICs: the SmartNIC Data Movement Service (SDMS). SDMS provides near-line-rate transfer bandwidths between the host and NIC with minimal host involvement. Moreover, SDMS’s In-transit Data Placement (IDP) feature can reduce (or even eliminate) the cost of serializing data on the NIC by performing the necessary data formatting during the transfer. We also present an in-depth case study based on Apache Arrow to demonstrate how SDMS can be used effectively to offload data transformation operations.