RECURRENT NEURAL NETWORK FOR PROCESSING LARGE TEXT DATA

Authors

  • A. Nguyen
  • Y. Sidorov

DOI:

https://doi.org/10.26906/SUNZ.2018.4.135

Keywords:

neural networks, recurrent networks, program libraries, large text data

Abstract

The subject of the article is neural networks, namely recurrent neural networks, which are characterized by the ability to capture long-term dependencies in data, as well as software libraries for implementing machine learning. The aim of the work is to analyze the Hopfield network, the Elman and Jordan networks, the echo state network, the recursive network, and the recurrent network with long short-term memory (LSTM) in order to determine the most efficient network architecture. The paper also analyzes the following software libraries: CNTK, Theano, Gluon, and TensorFlow. Tasks: to compare the pros and cons of these software tools and their capabilities for working with large text data using the neural networks described above, and to determine which of the considered software libraries is the optimal and most performance-effective choice for developing a recurrent neural network. The method of the study is load testing of the software frameworks under identical hardware conditions, using the same data set. The results of the work are as follows: an application for large-scale text data processing and summarization was chosen as the technology integration platform, namely an interactive writing environment built with .NET that automatically summarizes text according to specified criteria. To analyze the performance of the software libraries, a pre-built benchmark test was examined. At its core, the benchmark is based on training and running recurrent networks with LSTM modules on a test data set in each of the considered frameworks. Conclusion: the most suitable architectural approach is the use of LSTM modules, which solve the vanishing gradient problem. Thanks to this, neural networks based on this approach show the best results when working with long-term dependencies in data, which is an extremely important factor in text data processing. According to the results of the performance tests, CNTK and Gluon are worth considering, as they appear to be the most optimized solutions for applying LSTM units: during training they demonstrate speeds that exceed the performance of TensorFlow and Theano by 10-60%.
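
As an illustration only (not the authors' benchmark code), the sketch below shows the kind of model such a benchmark trains and times: a small LSTM text model written with the TensorFlow Keras API, run for one epoch on synthetic data with the epoch duration measured. The vocabulary size, sequence length, layer widths, and batch size are assumed values chosen for the example.

    # Minimal sketch of an LSTM training benchmark (assumed parameters, synthetic data).
    import time
    import numpy as np
    import tensorflow as tf

    VOCAB_SIZE = 10_000   # assumed vocabulary size
    SEQ_LEN = 100         # assumed input sequence length in tokens

    # Embedding -> LSTM -> softmax classifier; the LSTM cell is the component
    # that mitigates the vanishing gradient problem discussed in the abstract.
    model = tf.keras.Sequential([
        tf.keras.layers.Embedding(VOCAB_SIZE, 128),
        tf.keras.layers.LSTM(256),
        tf.keras.layers.Dense(VOCAB_SIZE, activation="softmax"),
    ])
    model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

    # Random integer sequences stand in for the tokenized text corpus.
    x = np.random.randint(0, VOCAB_SIZE, size=(512, SEQ_LEN))
    y = np.random.randint(0, VOCAB_SIZE, size=(512,))

    # Time one training epoch, analogous to comparing framework throughput.
    start = time.perf_counter()
    model.fit(x, y, batch_size=64, epochs=1, verbose=0)
    print(f"epoch time: {time.perf_counter() - start:.1f} s")

The same model and measurement would be repeated in each framework under comparison (CNTK, Gluon, Theano, TensorFlow) on identical hardware to obtain the relative training speeds reported above.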

Published

2018-09-12