The peculiarities of the parallel implementation of particle-in-cell method

Standard

The peculiarities of the parallel implementation of particle-in-cell method. / Romanenko, A. A.; Snytnikov, A. V.

In: Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki, Vol. 28, No. 3, 01.01.2018, pp. 419-426.

Research output: Contribution to journal › Article › peer-review

Harvard

Romanenko, AA & Snytnikov, AV 2018, 'The peculiarities of the parallel implementation of particle-in-cell method', Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki, vol. 28, no. 3, pp. 419-426. https://doi.org/10.20537/vm180311

APA

Romanenko, A. A., & Snytnikov, A. V. (2018). The peculiarities of the parallel implementation of particle-in-cell method. Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki, 28(3), 419-426. https://doi.org/10.20537/vm180311

Vancouver

Romanenko AA, Snytnikov AV. The peculiarities of the parallel implementation of particle-in-cell method. Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki. 2018 Jan 1;28(3):419-426. doi: 10.20537/vm180311

Author

Romanenko, A. A. ; Snytnikov, A. V. / The peculiarities of the parallel implementation of particle-in-cell method. In: Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki. 2018 ; Vol. 28, No. 3. pp. 419-426.

BibTeX

@article{ebb7b1046be244598affb0aab3f997d2,

title = "The peculiarities of the parallel implementation of particle-in-cell method",

abstract = "Particle-In-Cell (PIC) method is widely used for plasma simulation and the GPUs appear to be the most efficient way to run this method. In this work we propose a technique that enables one to speed up one of the most time-consuming operations in the GPU implementation of the PIC method. The operation is particle reordering, or redistribution of particles between cells, which is performed after pushing. The reordering operation provides data locality which is the key performance issue of the PIC method. We propose to divide the reordering into two stages. First, gather the particles that are going to leave a particular cell into arrays, the number of arrays being equal to the number of neighbor cells (26 for 3D case). Second, each neighbor cell copies the particles from the necessary array to its own particle array. The second operation is done in 26 threads independently with no synchronization or waiting and involves no critical sections, semaphores, mutexes, atomic operations etc. It results in the more than 10 times reduction of the reordering time compared to the straightforward reordering algorithm.",

keywords = "GPU, Optimization, PIC, Simulation",

author = "Romanenko, {A. A.} and Snytnikov, {A. V.}",

year = "2018",

month = jan,

day = "1",

doi = "10.20537/vm180311",

language = "English",

volume = "28",

pages = "419--426",

journal = "Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki",

issn = "1994-9197",

publisher = "Udmurt State University",

number = "3",

}

RIS

TY - JOUR

T1 - The peculiarities of the parallel implementation of particle-in-cell method

AU - Romanenko, A. A.

AU - Snytnikov, A. V.

PY - 2018/1/1

Y1 - 2018/1/1

N2 - Particle-In-Cell (PIC) method is widely used for plasma simulation and the GPUs appear to be the most efficient way to run this method. In this work we propose a technique that enables one to speed up one of the most time-consuming operations in the GPU implementation of the PIC method. The operation is particle reordering, or redistribution of particles between cells, which is performed after pushing. The reordering operation provides data locality which is the key performance issue of the PIC method. We propose to divide the reordering into two stages. First, gather the particles that are going to leave a particular cell into arrays, the number of arrays being equal to the number of neighbor cells (26 for 3D case). Second, each neighbor cell copies the particles from the necessary array to its own particle array. The second operation is done in 26 threads independently with no synchronization or waiting and involves no critical sections, semaphores, mutexes, atomic operations etc. It results in the more than 10 times reduction of the reordering time compared to the straightforward reordering algorithm.

KW - GPU

KW - Optimization

KW - PIC

KW - Simulation

UR - http://www.scopus.com/inward/record.url?scp=85055286944&partnerID=8YFLogxK

U2 - 10.20537/vm180311

DO - 10.20537/vm180311

M3 - Article

AN - SCOPUS:85055286944

VL - 28

SP - 419

EP - 426

JO - Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki

JF - Vestnik Udmurtskogo Universiteta: Matematika, Mekhanika, Komp'yuternye Nauki

SN - 1994-9197

IS - 3

ER -

ID: 17250044
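
The two-stage reordering described in the abstract can be illustrated with a small serial sketch. This is not the authors' GPU code. It is a plain Python illustration under assumptions that are not in the record: unit-sized cells, periodic boundaries, a grid of at least 3 cells per axis, and particles that cross at most one cell boundary per step. The names `reorder`, `OFFSETS`, `cells` and `shape` are hypothetical.

```python
import math
from itertools import product

# The 26 non-zero offsets from a cell to its neighbors in 3D.
OFFSETS = [d for d in product((-1, 0, 1), repeat=3) if d != (0, 0, 0)]


def reorder(cells, shape):
    """Redistribute particles between cells in two stages.

    cells: dict mapping every cell index (i, j, k) of the grid to a list
           of particle positions (x, y, z) taken after the push.
    shape: number of cells per axis (assumed >= 3, periodic boundaries).
    Returns a new dict with each particle stored in the cell that contains it.
    """
    # Stage 1: every cell sorts its leaving particles into 26 outgoing
    # arrays, one per neighbor direction. Each cell writes only to its own
    # arrays.
    outgoing = {c: {d: [] for d in OFFSETS} for c in cells}
    result = {c: [] for c in cells}
    for c, particles in cells.items():
        for p in particles:
            target = tuple(math.floor(x) % n for x, n in zip(p, shape))
            # Offset in {-1, 0, 1} per axis, taking periodic wrap into account.
            d = tuple((t - s + 1) % n - 1 for t, s, n in zip(target, c, shape))
            if d == (0, 0, 0):
                result[c].append(p)
            else:
                outgoing[c][d].append(p)

    # Stage 2: every cell pulls from the matching outgoing array of each of
    # its 26 neighbors. A cell only appends to its own particle list, so the
    # 26 copies need no locks or atomics when run concurrently.
    for c in cells:
        for d in OFFSETS:
            src = tuple((s - o) % n for s, o, n in zip(c, d, shape))
            result[c].extend(outgoing[src][d])
    return result
```

In a GPU version the inner loop of stage 2 would correspond to the 26 independent threads mentioned in the abstract. Here the loops run serially only to keep the example short.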