Aspects of GPU perfomance in algorithms with random memory access

Standard

Aspects of GPU perfomance in algorithms with random memory access. / Kashkovsky, Alexander V.; Shershnev, Anton A.; Vashchenkov, Pavel V.

AIP Conference Proceedings: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. ed. / Fomin. Vol. 1893 American Institute of Physics Inc., 2017. 030047 (AIP Conference Proceedings; Vol. 1893).

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Research › peer-review

Harvard

Kashkovsky, AV, Shershnev, AA & Vashchenkov, PV 2017, Aspects of GPU perfomance in algorithms with random memory access. in Fomin (ed.), AIP Conference Proceedings: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. vol. 1893, 030047, AIP Conference Proceedings, vol. 1893, American Institute of Physics Inc., 25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017, Novosibirsk, Russian Federation, 05.06.2017. https://doi.org/10.1063/1.5007505

APA

Kashkovsky, A. V., Shershnev, A. A., & Vashchenkov, P. V. (2017). Aspects of GPU perfomance in algorithms with random memory access. In Fomin (Ed.), AIP Conference Proceedings: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS (Vol. 1893). [030047] (AIP Conference Proceedings; Vol. 1893). American Institute of Physics Inc. https://doi.org/10.1063/1.5007505

Vancouver

Kashkovsky AV, Shershnev AA, Vashchenkov PV. Aspects of GPU perfomance in algorithms with random memory access. In Fomin, editor, AIP Conference Proceedings: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. Vol. 1893. American Institute of Physics Inc. 2017. 030047. (AIP Conference Proceedings). doi: 10.1063/1.5007505

Author

Kashkovsky, Alexander V. ; Shershnev, Anton A. ; Vashchenkov, Pavel V. / Aspects of GPU perfomance in algorithms with random memory access. AIP Conference Proceedings: Dedicated to the 60th Anniversary of the Khristianovich Institute of Theoretical and Applied Mechanics SB RAS. editor / Fomin. Vol. 1893 American Institute of Physics Inc., 2017. (AIP Conference Proceedings).

BibTeX

@inproceedings{638d68b47dd948b2bc7e218d322338e0,

title = "Aspects of GPU perfomance in algorithms with random memory access",

abstract = "The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically with increase of percentage of occupied GPU memory. Testing revealed that memory access time increases tens of times after certain critical percentage of memory is occupied. Moreover, it seems to be the common problem of all NVidia's GPUs arising from its architecture. Few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on the splitting the memory into {"}virtual{"} blocks, resulted in 2.5 times speed up.",

author = "Kashkovsky, {Alexander V.} and Shershnev, {Anton A.} and Vashchenkov, {Pavel V.}",

year = "2017",

month = oct,

day = "26",

doi = "10.1063/1.5007505",

language = "English",

isbn = "9780735415782",

volume = "1893",

series = "AIP Conference Proceedings",

publisher = "American Institute of Physics Inc.",

editor = "Fomin",

booktitle = "AIP Conference Proceedings",

address = "United States",

note = "25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017 ; Conference date: 05-06-2017 Through 09-06-2017",

}

RIS

TY - GEN

T1 - Aspects of GPU perfomance in algorithms with random memory access

AU - Kashkovsky, Alexander V.

AU - Shershnev, Anton A.

AU - Vashchenkov, Pavel V.

N1 - Conference code: 25

PY - 2017/10/26

Y1 - 2017/10/26

N2 - The numerical code for solving the Boltzmann equation on the hybrid computational cluster using the Direct Simulation Monte Carlo (DSMC) method showed that on Tesla K40 accelerators computational performance drops dramatically with increase of percentage of occupied GPU memory. Testing revealed that memory access time increases tens of times after certain critical percentage of memory is occupied. Moreover, it seems to be the common problem of all NVidia's GPUs arising from its architecture. Few modifications of the numerical algorithm were suggested to overcome this problem. One of them, based on the splitting the memory into "virtual" blocks, resulted in 2.5 times speed up.

UR - http://www.scopus.com/inward/record.url?scp=85034272764&partnerID=8YFLogxK

UR - https://www.mendeley.com/catalogue/6ac9b6fc-f9de-3b2c-b27e-662e45eae32c/

U2 - 10.1063/1.5007505

DO - 10.1063/1.5007505

M3 - Conference contribution

AN - SCOPUS:85034272764

SN - 9780735415782

VL - 1893

T3 - AIP Conference Proceedings

BT - AIP Conference Proceedings

A2 - Fomin

PB - American Institute of Physics Inc.

T2 - 25th Conference on High-Energy Processes in Condensed Matter, HEPCM 2017

Y2 - 5 June 2017 through 9 June 2017

ER -

ID: 9673653