Pantykhin Andrey Maksimovich (Peter the Great St. Petersburg Polytechnic University)
Gladun Vladimir Vadimovich (Peter the Great St. Petersburg Polytechnic University)
Malinin Ilya Igorevich (Peter the Great St. Petersburg Polytechnic University)
Molodyakov Sergey Aleksandrovich (Doctor of Technical Sciences, Professor, Peter the Great St. Petersburg Polytechnic University)
This paper examines a deduplication service that uses hash functions to reduce the volume of stored data. The main algorithm splits the data into fixed-size blocks, computes a hash value for each block, stores each unique block only once, and records references in place of duplicates. The technology stack comprises Python, MongoDB, and the MongoEngine library. The paper presents research results on the use of different hashing algorithms and data segment sizes.
Keywords: data deduplication, hash functions, data storage systems, storage optimization, MongoDB, Python, MongoEngine
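The algorithm summarized in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the class name `DedupStore`, its methods, and the default block size are hypothetical, and two in-memory dictionaries stand in for the MongoDB collections that the paper accesses through MongoEngine. The block size and the hashing algorithm are left as parameters, since these are the two variables the paper investigates.

```python
import hashlib


class DedupStore:
    """Toy fixed-size-block deduplication store (illustrative only)."""

    def __init__(self, block_size=4096, algorithm="sha256"):
        self.block_size = block_size  # segment size in bytes (assumed default)
        self.algorithm = algorithm    # any name accepted by hashlib.new
        self.blocks = {}              # digest -> block bytes, one copy per unique block
        self.files = {}               # file name -> ordered list of digests (references)

    def put(self, name, data):
        """Split data into fixed-size blocks and keep only the unique ones."""
        refs = []
        for offset in range(0, len(data), self.block_size):
            block = data[offset:offset + self.block_size]
            digest = hashlib.new(self.algorithm, block).hexdigest()
            # A duplicate block is not stored again; only its reference is kept.
            self.blocks.setdefault(digest, block)
            refs.append(digest)
        self.files[name] = refs

    def get(self, name):
        """Reassemble a file from its block references."""
        return b"".join(self.blocks[d] for d in self.files[name])

    def stored_bytes(self):
        """Total size of the unique blocks actually kept."""
        return sum(len(b) for b in self.blocks.values())


store = DedupStore(block_size=4)
store.put("a", b"AAAABBBBAAAA")  # three blocks, two of them identical
store.put("b", b"BBBBCC")        # first block already stored under "a"
```

In this sketch a smaller block size tends to find more duplicates at the cost of more references per file, and the sketch treats equal digests as equal blocks, so a hash collision would silently merge distinct data.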
Citation: Pantykhin A. M., Gladun V. V., Malinin I. I., Molodyakov S. A. DEVELOPMENT AND RESEARCH OF A DATA DEDUPLICATION SERVICE FOR STORAGE SYSTEMS // Современная наука: актуальные проблемы теории и практики. Серия: Естественные и Технические Науки. 2024. No. 07. P. 118-123. DOI 10.37882/2223-2966.2024.7.32