Standard

Wav2vec2 Without Attention: Do You Need Hopfield Networks for Self-Supervised Learning of Speech Representations? / Grebenkin, D.; Bondarenko, I.

In: Journal of Mathematical Sciences (United States), Vol. 285, No. 1, 10.2024, p. 28-35.

Research output: Contribution to journal › Article › peer-review

Vancouver

Grebenkin D, Bondarenko I. Wav2vec2 Without Attention: Do You Need Hopfield Networks for Self-Supervised Learning of Speech Representations? Journal of Mathematical Sciences (United States). 2024 Oct;285(1):28-35. doi: 10.1007/s10958-024-07420-6

Author

Grebenkin, D. ; Bondarenko, I. / Wav2vec2 Without Attention: Do You Need Hopfield Networks for Self-Supervised Learning of Speech Representations?. In: Journal of Mathematical Sciences (United States). 2024 ; Vol. 285, No. 1. pp. 28-35.

BibTeX

@article{b0168e1d4e004bbf9d3d1ced6bf9c5ac,
title = "Wav2vec2 Without Attention: Do You Need Hopfield Networks for Self-Supervised Learning of Speech Representations?",
abstract = "In this work, we consider the possibility of replacing multi-head attention with dense associative memory (DAM) layers in the wav2vec2 automatic speech recognition algorithm. We examine the hypothesis that the concept of modern Hopfield networks is better suited than multi-head attention to the task of restoring missing fragments of an audio signal and to the speech-to-text task. Our experiments indicate that the model with the new architecture improves the quality of speech recognition and can be used for pretraining models on large amounts of audio data.",
author = "D. Grebenkin and I. Bondarenko",
year = "2024",
month = oct,
doi = "10.1007/s10958-024-07420-6",
language = "English",
volume = "285",
pages = "28--35",
journal = "Journal of Mathematical Sciences (United States)",
issn = "1072-3374",
publisher = "Springer Nature",
number = "1",
}

RIS

TY - JOUR

T1 - Wav2vec2 Without Attention: Do You Need Hopfield Networks for Self-Supervised Learning of Speech Representations?

AU - Grebenkin, D.

AU - Bondarenko, I.

PY - 2024/10

Y1 - 2024/10

N2 - In this work, we consider the possibility of replacing multi-head attention with dense associative memory (DAM) layers in the wav2vec2 automatic speech recognition algorithm. We examine the hypothesis that the concept of modern Hopfield networks is better suited than multi-head attention to the task of restoring missing fragments of an audio signal and to the speech-to-text task. Our experiments indicate that the model with the new architecture improves the quality of speech recognition and can be used for pretraining models on large amounts of audio data.

AB - In this work, we consider the possibility of replacing multi-head attention with dense associative memory (DAM) layers in the wav2vec2 automatic speech recognition algorithm. We examine the hypothesis that the concept of modern Hopfield networks is better suited than multi-head attention to the task of restoring missing fragments of an audio signal and to the speech-to-text task. Our experiments indicate that the model with the new architecture improves the quality of speech recognition and can be used for pretraining models on large amounts of audio data.

UR - https://www.scopus.com/record/display.uri?eid=2-s2.0-85208801879&origin=inward&txGid=e7bf0a610121a8caecfc1191f85cf728

UR - https://www.mendeley.com/catalogue/27435cff-6d49-3e53-b0c3-31a36aed60c8/

U2 - 10.1007/s10958-024-07420-6

DO - 10.1007/s10958-024-07420-6

M3 - Article

VL - 285

SP - 28

EP - 35

JO - Journal of Mathematical Sciences (United States)

JF - Journal of Mathematical Sciences (United States)

SN - 1072-3374

IS - 1

ER -