ACCELERATED ALGORITHM FOR WORD SEARCH IN TEXT WITH ADAPTIVE DECOMPOSITION OF THE OUTPUT

Authors

  • О. Yu. Barkovska
  • D. I. Pyvovarova
  • V. S. Serdechnyi
  • А. А. Liashova

DOI:

https://doi.org/10.26906/SUNZ.2019.4.028

Keywords:

word-image, parallelization, high-performance computing system, Boyer-Moore algorithm

Abstract

Word search algorithms in text are widely used - contextual search in databases and databases, bibliographic search, search for a text fragment and its replacement when editing, in data compression tasks, prediction algorithms, etc., which determines the urgency of developing new algorithms , as well as improving and adapting existing algorithms for implementation on highperformance computers. The purpose of the research is to modify the Boyer-Moore algorithm to search for word-images in text to achieve reduced text search time by using parallel computation methods and decomposing raw data. Results and conclusions. In the course of the work, the existing Boyer-Moore algorithm for word search in text has been improved by the use of parallel computation methods and decomposition of raw data. An overview of the existing word search algorithms in the text was performed, which showed the lowest complexity of the Boyer-Moore algorithm. Two accelerated modifications of the Boyer-Moore algorithm with simple and adaptive data decomposition are developed. Analysis of the results showed that the number of false positives in adaptive decomposition tends to 0, as opposed to simple decomposition of the input data. An analysis of the execution time of the algorithms showed that on small volumes of the source text, the use of parallel technologies for systems with shared memory is not justified, since the time for generation of parallel flows is spent more than for comparative operations.

Downloads

References

Левин М.П. Параллельное программирование с использованием OpenMP / М.П. Левин. – БИНОМ. Лаборатория знаний, Интернет-университет информационных технологий, 2008. – 120с.

Антонов А.С. Параллельное программирование с использованием технологии OpenMP. – М.: МГУ, 2009. – 77 с.

Эндрюс Г. Р. Основы многопоточного, параллельного и распределенного программирования / Г.Р. Эндрюс. – М.: Издательский дом "Вильямс", 2003. – 512 с.

Миллер Р. Последовательные и параллельные алгоритмы: Общий подход / Р. Миллер, Л. Боксер. – БИНОМ. Лаборатория знаний, 2006. – 406 с.

V. Kharchenko, A. Kovalenko, O. Siora, V. Sklyar, "Security assessment of FPGA-based safety-critical systems: US NRC requirements context", Proceedings of IEEE Information and Digital Technologies Conference (IDT), pp. 132-138, July 2015. DOI: http://doi.org/10.1109/DT.2015.7222963

Kuchuk G., Kovalenko A., Komari I.E., Svyrydov A., Kharchenko V. Improving big data centers energy efficiency: Traffic based model and method. Studies in Systems, Decision and Control, vol 171. Kharchenko, V., Kondratenko, Y., Kacprzyk, J. (Eds.). Springer Nature Switzerland AG, 2019. Pp. 161-183. DOI: http://doi.org/10.1007/978-3-030-00253-4_8

Kharchenko, V., Illiashenko, O., Kovalenko, A., Sklyar, V., Boyarchuk, A. Security informed safety assessment of NPP I&C systems: GAP-IMECA technique. Proc. 22th Int. Conf. on Nuclear Engineering (ICONE), Prague, Republic, 7-11 Jul. 2014, pp. 1-9. doi: http://doi.org/10.1115/ICONE22-31175.

Кучук Г.А. Метод мінімізації середньої затримки пакетів у віртуальних з’єднаннях мережі підтримки хмарного сервісу / Г.А. Кучук, А.А. Коваленко, Н.В. Лукова-Чуйко // Системи управління, навігації та зв’язку. – Полтава . ПНТУ, 2017. – Вип. 2(42). – С. 117-120.

Z. Xiong, "A Composite Boyer-Moore Algorithm for the String Matching Problem," 2010 International Conferenceon Parallel and Distributed Computing, Applicationsand Technologies, Wuhan, 2010, pp. 492-496.doi: 10.1109/PDCAT.2010.58

A. Domínguez, P. P. Carballoand A. Núñez, "Programmable SoC platform for deep packet inspection using enhanced BoyerMoore algorithm," 2017 12th International Symposiumon Reconfigurable Communication-centric Systems-on-Chip (ReCoSoC), Madrid, 2017, pp. 1-8. doi: 10.1109/ReCoSoC.2017.8016159

Y. Jeong, M. Lee, D. Nam, J. Kimand S. Hwang, "High Performance Parallelization of Boyer-Moore Algorithmon ManyCore Accelerators," 2014 International Conference on Cloud and Autonomic Computing, London, 2014, pp. 265-272. doi: 10.1109/ICCAC.2014.20

X. Zhaand S. Sahni, "GPU-to-GPU and Host-to-Host Multipattern String Matching on a GPU," in IEEE Transactionson Computers, vol. 62, no. 6, pp. 1156-1169, June 2013. doi: 10.1109/TC.2012.61

V. Gupta, M. Singhand V. K. Bhalla, "Pattern matching algorithms for intrusion detection and prevention system: A comparative analysis," 2014 International Conferenceon Advancesin Computing, Communications and Informatics (ICACCI), NewDelhi, 2014, pp. 50-54. doi: 10.1109/ICACCI.2014.6968595

Y. Wangand H. Kobayashi, "AnImproved Technology for Content Matching Intrusion Detection System," 2006 Int. Conf. on Software in Telecommunications and Computer Networks, Split, 2006, pp. 238-241. doi: 10.1109/SOFTCOM.2006.329755

V. R. Krishna and P. V. Kumar, "Optimal pattern search for sequence databases," 2010 3rd Int. Conf. on Advanced Computer Theory and Engineering (ICACTE), Chengdu, 2010, pp. V2-654-V2-658. doi: 10.1109/ICACTE.2010.5579518

P. Lin, S. Liu, L. Zhangand P. Huang, "Compressed Patter nMatchingin DNA Sequences Using Multithreaded Technology," 2009 3rd International Conference on Bioinformatics and Biomedical Engineering, Beijing, 2009, pp. 1-4. doi: 10.1109/ICBBE.2009.5162550

Published

2019-09-11