Language model

Multipop Adaptive Continuous Stack (WT2)

DeepMind, University of Oxford
Language modeling

An architecture from DeepMind that augments recurrent neural networks with continuous stack memory. The approach substantially reduces perplexity in language modeling, demonstrating the advantage of stack-structured memory over conventional methods.

We compare and analyze sequential, random access, and stack memory architectures for recurrent neural network language models. Our experiments on the Penn Treebank and WikiText-2 datasets show that stack-based memory architectures consistently achieve the best performance in terms of held-out perplexity. We also propose a generalization of existing continuous stack models (Joulin & Mikolov, 2015; Grefenstette et al., 2015) that allows a variable number of pop operations more naturally and further improves performance. We further evaluate these language models in terms of their ability to capture non-local syntactic dependencies on a subject-verb agreement dataset (Linzen et al., 2016) and establish new state-of-the-art results using memory-augmented language models. Our results demonstrate the value of stack-structured memory for explaining the distribution of words in natural language, in line with linguistic theories claiming a context-free backbone for natural language.
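
The abstract describes a differentiable stack updated by soft push/pop/no-op weights, with the proposed "multipop" generalization replacing a single pop with a distribution over how many elements to remove at once. The abstract does not give the exact parameterization, so the following is only a minimal NumPy sketch of that mechanism in the style of Joulin & Mikolov (2015); the function name, tensor shapes, and the mixture-over-pop-counts reading of "variable number of pop operations" are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def stack_update(stack, value, action_probs, pop_dist):
    """One soft update of a continuous stack.

    stack        -- (depth, dim) array; row 0 is the top of the stack.
    value        -- (dim,) vector to (softly) push.
    action_probs -- (push, pop, no-op) weights summing to 1.
    pop_dist     -- distribution over popping 1..K elements at once
                    (the "multipop" generalization; assumes K < depth).
    """
    depth, dim = stack.shape
    a_push, a_pop, a_noop = action_probs

    # Push: shift every element down one slot, place the new value on top.
    pushed = np.vstack([value[None, :], stack[:-1]])

    # Multipop: a mixture over shifting the stack up by k slots.
    popped = np.zeros_like(stack)
    for k, p_k in enumerate(pop_dist, start=1):
        shifted = np.vstack([stack[k:], np.zeros((k, dim))])
        popped += p_k * shifted

    # Blend the three outcomes; hard one-hot weights recover a discrete stack.
    return a_push * pushed + a_pop * popped + a_noop * stack

# Demo: push twice, then pop with equal weight on removing 1 or 2 elements.
stack = np.zeros((8, 4))
stack = stack_update(stack, np.full(4, 1.0), (1.0, 0.0, 0.0), [1.0])
stack = stack_update(stack, np.full(4, 2.0), (1.0, 0.0, 0.0), [1.0])
stack = stack_update(stack, np.zeros(4), (0.0, 1.0, 0.0), [0.5, 0.5])
print(stack[0])  # 0.5*[1,1,1,1] + 0.5*[0,0,0,0] -> [0.5 0.5 0.5 0.5]
```

Because every outcome is a convex combination rather than a discrete choice, gradients flow through all possible action sequences, and in an RNN language model the action weights and pop distribution would be emitted by the recurrent controller at each timestep.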
