Arabic Text Diacritization Using Deep Neural Networks and Transformer-Based Architectures

Authors

  • Mohamed Cherradi, Abdelmalek Essaâdi University (UAE), ENSAH, Tetouan, Morocco. https://orcid.org/0009-0003-8139-9454
  • Hajar El Mahajer, Abdelmalek Essaâdi University (UAE), FSTT, Tetouan, Morocco. https://orcid.org/0009-0008-7152-4342

DOI:

https://doi.org/10.59543/kadsa.v1i.15077

Keywords:

Arabic Diacritization; Natural Language Processing (NLP); Sequence-to-Sequence Models; Transformer Architecture

Abstract

This study investigates deep learning architectures for automatic Arabic text diacritization, with a particular focus on character-level neural networks. Four architectures were implemented: a Transformer encoder-decoder, a BiGRU model, a baseline stacked BiLSTM, and a CBHG model. Diacritic Error Rate (DER) and Word Error Rate (WER) served as evaluation metrics, with training and evaluation conducted on the Tashkeela corpus. The results show that the CBHG model achieved faster inference and slightly higher diacritic accuracy than the Transformer encoder-decoder. However, the findings also suggest that the Transformer model may perform better with larger datasets, improved parameter tuning, and increased model capacity.
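For readers less familiar with these metrics, the minimal Python sketch below illustrates one common way DER and WER are computed for diacritization: DER is the percentage of characters whose predicted diacritic differs from the gold label, and WER is the percentage of words containing at least one such error. The conventions here (aligned per-character labels, no special handling of case endings) are illustrative assumptions, not necessarily the paper's exact counting protocol.

    # Minimal sketch of Diacritic Error Rate (DER) and Word Error Rate (WER).
    # Assumes gold and predicted diacritics are aligned per character; the
    # paper's exact conventions (e.g., case-ending handling) may differ.

    def der_wer(gold_words, pred_words):
        """Each argument is a list of words; each word is a list of
        per-character diacritic labels such as 'FATHA', 'KASRA', 'NONE'."""
        char_errors = char_total = 0
        word_errors = word_total = 0
        for gold, pred in zip(gold_words, pred_words):
            word_total += 1
            wrong = sum(g != p for g, p in zip(gold, pred))
            char_errors += wrong
            char_total += len(gold)
            word_errors += bool(wrong)  # word counts as wrong if any char is
        der = 100.0 * char_errors / max(char_total, 1)
        wer = 100.0 * word_errors / max(word_total, 1)
        return der, wer

    # Hypothetical example: one diacritic error in the second word.
    gold = [["FATHA", "SUKUN"], ["KASRA", "NONE", "DAMMA"]]
    pred = [["FATHA", "SUKUN"], ["KASRA", "NONE", "FATHA"]]
    print(der_wer(gold, pred))  # DER = 20.0, WER = 50.0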

Published

2025-08-28

How to Cite

Mohamed Cherradi, & Hajar El Mahajer. (2025). Arabic Text Diacritization Using Deep Neural Networks and Transformer-Based Architectures. Knowledge and Decision Systems With Applications, 1, 257-269. https://doi.org/10.59543/kadsa.v1i.15077

Section

Articles