ANALYSIS OF THE FUNCTIONING OF DISTRIBUTED DATA PROCESSING AND STORAGE SYSTEMS

Authors

  • Anton Bilokon
  • Stanislav Borisov
  • Maxim Usatenko
  • Volodymyr Fedorchenko

DOI:

https://doi.org/10.26906/SUNZ.2024.3.084

Keywords:

distributed information system, data processing, Web-Scale, data storage, data processing center, event-driven architectures

Abstract

Relevance. As the volume of data generated by users, IoT devices, social media, and business processes grows, the need for scalable storage solutions becomes increasingly evident. Distributed systems scale efficiently, increasing storage capacity and computing power without significant cost. Modern business requires highly available and reliable systems, because even minimal downtime can lead to significant financial losses and a decline in customer confidence. Distributed systems provide high availability and resilience by recovering automatically from failures and by replicating data to preserve its integrity. The globalization of business requires working with data in different geographical locations. Distributed systems allow data storage and processing to be placed closer to end users, which reduces latency and improves overall system performance. Growing security threats and tightening regulatory requirements for data protection are forcing organizations to seek more secure data storage solutions. Distributed systems offer enhanced capabilities for data encryption, access control, auditing, and regulatory compliance. Processing large amounts of data often requires substantial computing power. Distributed data storage systems are well suited to working together with distributed computing, such as streaming data processing, machine learning, and big data, because they allow tasks to be distributed efficiently and large amounts of information to be processed. The challenges that distributed data storage systems may face include ensuring data consistency between nodes, managing network latency, and ensuring data protection and security. Various strategies and technologies are used to address these challenges, including consistent hashing algorithms, data replication, transaction protocols that guarantee atomicity, consistency, isolation, and durability (ACID), and sequential consistency models.
Thus, as data volumes keep growing and processing requirements increase, distributed data storage systems are a key element of the infrastructure of any organization striving for innovation and efficiency. The purpose of this work is to analyze the functioning of distributed data processing and storage systems. The object of research is distributed data processing and storage systems. The subject of research is the architectural solutions of distributed data storage and processing systems. Results. The functioning of distributed data processing and storage systems was analyzed. The choice of an architectural solution for a distributed system depends on the specifics of the tasks to be solved and on the requirements for performance, scalability, reliability, and availability. Effective distributed systems usually combine several architectural approaches to achieve optimal results. Choosing an architectural solution for a distributed system is a complex process that requires balancing technical, business, and operational requirements. Accounting for future growth, potential challenges, and system flexibility is key to ensuring long-term success.
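
Consistent hashing, named in the abstract as one strategy for spreading data across nodes, can be illustrated with a minimal Python sketch. This is not the authors' implementation: the class name `HashRing`, the node names, the use of MD5 as the ring hash, and the default of 100 virtual nodes per physical node are all assumptions chosen for illustration. The sketch shows how a key is mapped to the first node clockwise on the ring, and how walking further around the ring yields additional distinct nodes that could hold replicas.

```python
import bisect
import hashlib


def ring_hash(key: str) -> int:
    """Map a string to a ring position (MD5 is used only for spreading keys, not for security)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Hypothetical consistent-hash ring with virtual nodes (illustrative only)."""

    def __init__(self, nodes=(), vnodes=100):
        self.vnodes = vnodes      # virtual points per physical node (assumed value)
        self._points = []         # sorted ring positions
        self._owner = {}          # ring position -> physical node
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        """Place `vnodes` points for the node on the ring."""
        for i in range(self.vnodes):
            point = ring_hash(f"{node}#{i}")
            if point in self._owner:
                continue          # skip the (very unlikely) position collision
            bisect.insort(self._points, point)
            self._owner[point] = node

    def remove_node(self, node: str) -> None:
        """Remove every ring point that belongs to the node."""
        self._points = [p for p in self._points if self._owner[p] != node]
        self._owner = {p: n for p, n in self._owner.items() if n != node}

    def get_nodes(self, key: str, replicas: int = 1) -> list:
        """Return up to `replicas` distinct nodes, walking clockwise from the key's position."""
        if not self._points:
            return []
        start = bisect.bisect_right(self._points, ring_hash(key))
        found = []
        for step in range(len(self._points)):
            point = self._points[(start + step) % len(self._points)]
            node = self._owner[point]
            if node not in found:
                found.append(node)
                if len(found) == replicas:
                    break
        return found


if __name__ == "__main__":
    ring = HashRing(["node-a", "node-b", "node-c"])
    # Primary node plus one replica holder for a sample key.
    print(ring.get_nodes("user:42", replicas=2))
```

The property that makes this approach attractive for scaling is that removing a node only reassigns the keys that node owned; keys owned by the remaining nodes stay where they are, which limits data movement when the cluster changes.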

Published

2024-09-06