
(2) * Yeni Herdiyeni

(3) Agus Buono

(4) Karlisa Priandana

(5) Iskandar Zulkarnaen Siregar

(6) Wisnu Ananta Kusuma

*corresponding author
Abstract
Digital data encoding is crucial for communication and data storage, but conventional techniques such as ASCII and binary coding have drawbacks in processing speed and storage capacity. DNA-based data encoding is a potential substitute that offers parallel processing and high-capacity storage. The goal of this research is to develop a DNA-based digital data encoding technique that respects biological constraints such as homopolymer runs and GC content. The process converts image pixel values into binary format and then encodes them into DNA sequences that meet these constraints. The validity of the resulting DNA sequences is assessed through transcription and translation processes. In addition, Multiple Sequence Alignment analysis is conducted to compare the similarities between the encoded DNA sequences. The results indicate that the DNA sequences derived from MNIST images share similar characteristics, which is reflected in their close clustering in the phylogenetic tree. The Multiple Sequence Alignment analysis shows that encoding under the biological constraints preserved the core visual features, allowing accurate clustering. However, the method also has drawbacks, particularly a reduction in visual information and sensitivity to changes in image intensity. Despite these challenges, DNA-based encoding shows potential for digital image representation. Further development, particularly the integration of deep learning, could lead to more efficient, secure, and sustainable data storage systems, especially for image data.
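The pipeline summarized in the abstract (pixel values to binary, binary to DNA, then a check against homopolymer and GC-content constraints) can be illustrated with a minimal sketch. The 2-bit-per-base mapping, the 40-60% GC window, and the maximum run length of 3 are assumptions chosen for illustration; the abstract does not state the mapping table or thresholds the authors used, and all function names here are hypothetical.

```python
from itertools import groupby

# Assumed 2-bit-per-base mapping (illustrative only, not taken from the paper).
BITS_TO_BASE = {"00": "A", "01": "C", "10": "G", "11": "T"}


def pixels_to_bits(pixels):
    """Concatenate the 8-bit binary form of each pixel value (0-255)."""
    return "".join(format(p, "08b") for p in pixels)


def bits_to_dna(bits):
    """Map each pair of bits to one nucleotide."""
    if len(bits) % 2:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


def gc_content(seq):
    """Fraction of bases that are G or C (0.0 for an empty sequence)."""
    return sum(base in "GC" for base in seq) / len(seq) if seq else 0.0


def max_homopolymer_run(seq):
    """Length of the longest run of one repeated base."""
    return max((len(list(group)) for _, group in groupby(seq)), default=0)


def satisfies_constraints(seq, gc_range=(0.4, 0.6), max_run=3):
    """Check a sequence against the assumed GC window and run-length limit."""
    low, high = gc_range
    return low <= gc_content(seq) <= high and max_homopolymer_run(seq) <= max_run


# Four example pixel values standing in for part of an image row.
pixels = [0, 255, 27, 228]
dna = bits_to_dna(pixels_to_bits(pixels))
print(dna, gc_content(dna), max_homopolymer_run(dna), satisfies_constraints(dna))
```

With a direct mapping like this, uniform pixel regions (for example the black background of an MNIST digit) produce long runs of a single base, which is one reason a constraint-aware encoding step is needed rather than a fixed lookup alone.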
Keywords
Biological Constraint; Digital Image Encoding; DNA Data Storage; MNIST Dataset; Multiple Sequence Alignment
DOI
https://doi.org/10.26555/ijain.v11i3.1747

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
___________________________________________________________
International Journal of Advances in Intelligent Informatics
ISSN 2442-6571 (print) | 2548-3161 (online)
Organized by UAD and ASCEE Computer Society
Published by Universitas Ahmad Dahlan
W: http://ijain.org
E: info@ijain.org (paper handling issues)
andri.pranolo.id@ieee.org (publication issues)