Standard
Aspects of GPU perfomance in algorithms with random memory access. / Kashkovsky, Alexander V.; Shershnev, Anton A.; Vashchenkov, Pavel V.
Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. ed. / Fomin. Vol. 1893 American Institute of Physics Inc., 2017. 030047 (AIP Conference Proceedings; Vol. 1893).
Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review
Harvard
Kashkovsky, AV, Shershnev, AA & Vashchenkov, PV 2017,
Aspects of GPU perfomance in algorithms with random memory access. in Fomin (ed.),
Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. vol. 1893, 030047, AIP Conference Proceedings, vol. 1893, American Institute of Physics Inc., 25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017, Novosibirsk, Russian Federation,
05.06.2017.
https://doi.org/10.1063/1.5007505
Vancouver
Kashkovsky AV, Shershnev AA, Vashchenkov PV.
Aspects of GPU perfomance in algorithms with random memory access. In Fomin, editor, Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. Vol. 1893. American Institute of Physics Inc. 2017. 030047. (AIP Conference Proceedings). doi: 10.1063/1.5007505
Author
Kashkovsky, Alexander V. ; Shershnev, Anton A. ; Vashchenkov, Pavel V. /
Aspects of GPU perfomance in algorithms with random memory access. Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. editor / Fomin. Vol. 1893 American Institute of Physics Inc., 2017. (AIP Conference Proceedings).
BibTeX
@inproceedings{638d68b47dd948b2bc7e218d322338e0,
title = "Aspects of GPU perfomance in algorithms with random memory access",
abstract = "The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically with increase of percentage of occupied GPU memory. Testing revealed that memory access time increases tens of times after certain critical percentage of memory is occupied. Moreover, it seems to be the common problem of all NVidia's GPUs arising from its architecture. Few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on the splitting the memory into {"}virtual{"} blocks, resulted in 2.5 times speed up.",
author = "Kashkovsky, {Alexander V.} and Shershnev, {Anton A.} and Vashchenkov, {Pavel V.}",
year = "2017",
month = oct,
day = "26",
doi = "10.1063/1.5007505",
language = "English",
volume = "1893",
series = "AIP Conference Proceedings",
publisher = "American Institute of Physics Inc.",
editor = "Fomin",
booktitle = "Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017",
address = "United States",
note = "25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017 ; Conference date: 05-06-2017 Through 09-06-2017",
}
RIS
TY - GEN
T1 - Aspects of GPU perfomance in algorithms with random memory access
AU - Kashkovsky, Alexander V.
AU - Shershnev, Anton A.
AU - Vashchenkov, Pavel V.
PY - 2017/10/26
Y1 - 2017/10/26
N2 - The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically with increase of percentage of occupied GPU memory. Testing revealed that memory access time increases tens of times after certain critical percentage of memory is occupied. Moreover, it seems to be the common problem of all NVidia's GPUs arising from its architecture. Few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on the splitting the memory into "virtual" blocks, resulted in 2.5 times speed up.
AB - The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically with increase of percentage of occupied GPU memory. Testing revealed that memory access time increases tens of times after certain critical percentage of memory is occupied. Moreover, it seems to be the common problem of all NVidia's GPUs arising from its architecture. Few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on the splitting the memory into "virtual" blocks, resulted in 2.5 times speed up.
UR - http://www.scopus.com/inward/record.url?scp=85034272764&partnerID=8YFLogxK
U2 - 10.1063/1.5007505
DO - 10.1063/1.5007505
M3 - Conference contribution
AN - SCOPUS:85034272764
VL - 1893
T3 - AIP Conference Proceedings
BT - Proceedings of the XXV Conference on High-Energy Processes in Condensed Matter, HEPCM 2017
A2 - Fomin
PB - American Institute of Physics Inc.
T2 - 25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017
Y2 - 5 June 2017 through 9 June 2017
ER -