International Journal of Academic Research in Progressive Education and Development

Open Access Journal

ISSN: 2226-6348

Bridging the Resource Gap for Malay-to-Arabic Translation: Evaluating Machine Translation of News Headlines

Puteri Alimatul Hakim Bt. Mohamad Salleh, Muhammad Ridhuan Tony, Mashitah Sabdin, Kais Amir Kadhim

http://dx.doi.org/10.6007/IJARPED/v13-i3/22148

Machine translation (MT) technology has become essential for cross-lingual communication, surpassing traditional human-centric translation methods. Many translation studies and MT developments have focused primarily on English, leading to significant improvements in MT systems for English. Quality may be less reliable, however, when translating between less widely spoken languages such as Malay and Arabic, because resources and research on improving MT for these language pairs are scarce. More research and development is therefore needed to improve translation quality for less common language pairs like Malay-Arabic. This study addresses this gap by focusing on the translation of Malay news headlines into Arabic, contributing to the improvement of MT systems for this language pair and providing a resource for translation students and professionals, since Arabic translation materials can be scarce, especially in Malaysian institutions. The study used a mixed-methods approach: a quantitative analysis of 20 news headlines, scored on accuracy, style, and clarity by evaluators with 5 to 27 years of expertise in Arabic language and translation, was combined with a qualitative thematic analysis conducted by the researcher. Results showed significant variations in MT system performance. While some systems preserved linguistic features and accuracy, cultural nuances were often lost, with common errors in idioms and structure. The study evaluates user feedback on Google Translate (GT) and Bing Microsoft Translator (BMT). Because participants were unequally distributed, potential bias exists. The findings highlight the need for advanced MT tools for Malay-Arabic translation and can help enhance MT technology, promoting cross-cultural understanding in the news industry. Future studies should aim for a more balanced sample size for better comparability.

(Salleh et al., 2024)
Salleh, P. A. H. B. M., Ton, D. M. R., Sabdin, M. M., & Kadhim, A. D. K. A. (2024). Bridging the Resource Gap for Malay-to-Arabic Translation: Evaluating Machine Translation of News Headlines. International Journal of Academic Research in Progressive Education and Development, 13(3), 2450–2470.