Normal view MARC view ISBD view

Comparison of HPC architectures for computing all-pairs shortest paths : Intel Xeon Phi KNL vs NVIDIA Pascal

By:

Costanzo, Manuel

Contributor(s):

Material type: Article

ArticleDescription: 1 archivo (466,1 kB)Subject(s):

Online resources:

Click here to access online

Summary: Today, one of the main challenges for high-performance computing systems is to improve their performance by keeping energy consumption at acceptable levels. In this context, a consolidated strategy consists of using accelerators such as GPUs or many-core Intel Xeon Phi processors. In this work, devices of the NVIDIA Pascal and Intel Xeon Phi Knights Landing architectures are described and compared. Selecting the Floyd-Warshall algorithm as a representative case of graph and memory-bound applications, optimized implementations were developed to analyze and compare performance and energy efficiency on both devices. As it was expected, Xeon Phi showed superior when considering double-precision data. However, contrary to what was considered in our preliminary analysis, it was found that the performance and energy efficiency of both devices were comparable using single-precision datatype.

Average rating: 0.0 (0 votes)

Holdings ( 1 )
Title notes ( 3 )

Holdings
Item type	Home library	Collection	Call number	URL	Status	Date due	Barcode
Capítulo de libro	Biblioteca de la Facultad de Informática	Biblioteca digital	A1179 (Browse shelf(Opens below))	Link to resource	No corresponde

Formato de archivo PDF. -- Este documento es producción intelectual de la Facultad de Informática - UNLP (Colección BIPA/Biblioteca)

Today, one of the main challenges for high-performance computing systems is to improve their performance by keeping energy consumption at acceptable levels. In this context, a consolidated strategy consists of using accelerators such as GPUs or many-core Intel Xeon Phi processors. In this work, devices of the NVIDIA Pascal and Intel Xeon Phi Knights Landing architectures are described and compared. Selecting the Floyd-Warshall algorithm as a representative case of graph and memory-bound applications, optimized implementations were developed to analyze and compare performance and energy efficiency on both devices. As it was expected, Xeon Phi showed superior when considering double-precision data. However, contrary to what was considered in our preliminary analysis, it was found that the performance and energy efficiency of both devices were comparable using single-precision datatype.

Congreso Argentino de Ciencias de la Computación (CACIC) (26to : 2020 : San Justo, Argentina)