Application of the Image-to-Speech Method via Camera in an Artificial Intelligence-Based Application for People with Dyslexia

Daniel Aprillio; Anna Bella Atmadjaja; Bryan; Mychael Wijaya; Theresia Ratih Dewi Saputri

doi:10.32493/informatika.v9i1.39173

Authors

Daniel Aprillio Universitas Ciputra
Anna Bella Atmadjaja Universitas Ciputra
Bryan
Mychael Wijaya Universitas Ciputra
Theresia Ratih Dewi Saputri Universitas Ciputra Surabaya http://orcid.org/0000-0002-9234-2889

Keywords:

Dyslexia, computer vision, image, image-to-speech, Python

Abstract

Dyslexia occurs worldwide regardless of culture or language. It affects about 9% to 12% of the population, with 2% to 4% of the population experiencing significant reading impairments. This research aims to develop an artificial intelligence-based application that uses the Image-to-Speech method to convert digital text into audible sound for individuals with dyslexia, without requiring them to process the writing themselves. The method can assist people with dyslexia with everyday challenges such as reading traffic signs, books, or documents. Results from 10 experiments on the implementation of the proposed method indicate that individuals with dyslexia can scan the text they want to read using the camera of a smartphone or laptop. The experiments also show that the application can convert text in image form into sound comprehensible to those with dyslexia, facilitating their recognition of digital writing with 90% accuracy. The application is also efficient in terms of processing time: the average time required for image-to-audio conversion is 0.22 seconds, with an average memory usage of 163.2 MiB.
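
The evaluation summarized above (accuracy over a set of trials, average conversion time, memory usage) could be reproduced along the following lines. This is a minimal sketch, not the authors' implementation: the function names `image_to_speech` and `evaluate` are hypothetical, the OCR and speech stages are passed in as plain callables so that any backend can be plugged in, and accuracy is assumed to mean an exact match between the recognized text and the expected text, which the abstract does not specify. Memory is measured with the standard-library `tracemalloc`, which only tracks allocations made by Python itself, so its figures would not be directly comparable to the 163.2 MiB reported.

```python
import time
import tracemalloc
from statistics import mean
from typing import Any, Callable, Sequence


def image_to_speech(image: Any,
                    ocr: Callable[[Any], str],
                    speak: Callable[[str], None]) -> str:
    """Recognize the text in one image and hand it to a speech backend."""
    text = ocr(image).strip()
    if text:  # nothing to read aloud when OCR finds no text
        speak(text)
    return text


def evaluate(images: Sequence[Any],
             expected: Sequence[str],
             ocr: Callable[[Any], str],
             speak: Callable[[str], None]) -> dict:
    """Run each trial once and summarize accuracy, latency, and memory.

    Assumption: a trial counts as correct only when the recognized text
    equals the expected text exactly.
    """
    if not images or len(images) != len(expected):
        raise ValueError("need a non-empty, equally sized set of images and texts")

    latencies = []
    correct = 0
    tracemalloc.start()  # tracks Python-level allocations only
    try:
        for image, truth in zip(images, expected):
            start = time.perf_counter()
            text = image_to_speech(image, ocr, speak)
            latencies.append(time.perf_counter() - start)
            correct += int(text == truth)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()

    return {
        "accuracy": correct / len(images),
        "mean_seconds": mean(latencies),
        "peak_mib": peak_bytes / 2**20,
    }
```

In practice, `ocr` might wrap an OCR engine (for example `pytesseract.image_to_string`) and `speak` a text-to-speech engine; both choices are assumptions here, since the abstract does not name the libraries used. Keeping the stages injectable also lets the measurement code be exercised with simple stand-in functions before a camera or audio device is available.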
