CHOICE OF A MATHEMATICAL INSTRUMENT FOR CONSTRUCTING A VECTOR TEXT MESSAGE MODEL FOR TRAINING A DEEP NEURAL NETWORK TO PREDICT UNFAVORABLE AIRCRAFT ACCIDENTS IN THE FLIGHT

Authors

  • E. Grishmanov
  • I. Zakharchenko
  • P. Berdnik
  • M. Kasyanenko

DOI:

https://doi.org/10.26906/SUNZ.2019.2.018

Keywords:

flight safety, prediction, text message vector model, hybrid neural network, Word2Vec, CBOW, TF-ICF

Abstract

The paper analyzes and selects mathematical tools for constructing a dictionary and a vector model of text messages used to train a deep hybrid neural network to predict adverse aviation events in flight. To assign weights to the words in text messages about such events when the dictionary is formed, weighting models based on the TF-IDF, TF-RF, and TF-ICF measures are analyzed. For the vector representation of text, the paper analyzes the “bag of words” model, latent semantic analysis, and the vector representation models Word2Vec, Global Vectors (GloVe), and Doc2Vec. Based on this analysis, it is proposed to use the TF-ICF measure as the basic approach to forming the vocabulary of unigrams (bigrams), and to use the CBOW model for the vector representation of words (word combinations).
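As a rough illustration of the weighting step, the sketch below scores the unigrams and bigrams of one message with a TF-ICF weight. It assumes the inverse-corpus-frequency form of the measure, log(1 + tf) · log((N + 1) / (n + 1)), where N is the number of documents in a fixed reference corpus and n is the number of those documents that contain the term (Reed et al., 2006). The abstract does not state the exact formula used by the authors, so this form is an assumption. The function names, the whitespace tokenizer, and the sample messages are also hypothetical and chosen only for the example.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the n-grams of a token list, each joined by a single space."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def terms(text):
    """Lowercase, split on whitespace, and emit unigrams followed by bigrams."""
    tokens = text.lower().split()
    return ngrams(tokens, 1) + ngrams(tokens, 2)


def corpus_document_frequency(reference_corpus):
    """Count, for each term, how many reference documents contain it."""
    df = Counter()
    for doc in reference_corpus:
        df.update(set(terms(doc)))
    return df


def tf_icf(message, df, n_docs):
    """Weight each term of one message as log(1 + tf) * log((N + 1) / (n + 1))."""
    tf = Counter(terms(message))
    return {
        term: math.log(1 + count) * math.log((n_docs + 1) / (df.get(term, 0) + 1))
        for term, count in tf.items()
    }


# Hypothetical reference corpus and message, invented for illustration only.
reference = [
    "engine temperature normal",
    "landing gear normal",
    "engine vibration detected",
]
df = corpus_document_frequency(reference)
weights = tf_icf("engine fire engine fire warning", df, len(reference))

# Print terms from the highest weight to the lowest.
for term, weight in sorted(weights.items(), key=lambda kv: -kv[1]):
    print(f"{term}: {weight:.3f}")
```

Because the corpus statistics come from a fixed reference collection, a new message can be weighted without recomputing frequencies over all previously seen messages. In this toy run, terms absent from the reference corpus (such as "fire") receive higher weights than terms that are common in it (such as "engine").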

Published

2019-04-11