Volume 28, Issue 1
  • ISSN 0142-5471
  • E-ISSN: 1569-979X

Abstract

Readability formulas are used to assess the level of difficulty of a text. These language-dependent formulas are introduced with pre-defined parameters. Deep reinforcement learning models can be used for parameter optimization. In this article, we argue that an Actor-Critic-based model can be used to optimize the parameters of readability formulas. Furthermore, a selection model is proposed for choosing the most suitable formula to assess the readability of an input text. English and Persian data sets are used for both training and testing. The experimental results show that, on average, the parameter optimization model raises the F-score for English from 24.7% in the baseline to 38.8%, and for Persian from 23.5% to 47.7%. The proposed selection model improves on the parameter optimization model further, reaching an F-score of 65.5% for both English and Persian.
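
To make concrete what "pre-defined parameters" means in a readability formula, the following minimal sketch uses the classic Flesch Reading Ease formula as an illustration. The abstract does not say which formulas the article optimizes, so the choice of formula, the names `FleschParams` and `reading_ease`, and the alternative parameter values are assumptions made only for this example. The Actor-Critic optimization itself is not shown.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FleschParams:
    """Tunable constants of the Flesch Reading Ease formula.

    The defaults are the classic published values; an optimizer
    (hypothetically, the article's Actor-Critic model) would adjust them.
    """
    intercept: float = 206.835
    sentence_weight: float = 1.015   # weight on average words per sentence
    syllable_weight: float = 84.6    # weight on average syllables per word


def reading_ease(words: int, sentences: int, syllables: int,
                 params: FleschParams = FleschParams()) -> float:
    """Score a text from its raw counts; higher scores mean easier text."""
    if words <= 0 or sentences <= 0:
        raise ValueError("words and sentences must be positive")
    avg_sentence_length = words / sentences
    avg_syllables_per_word = syllables / words
    return (params.intercept
            - params.sentence_weight * avg_sentence_length
            - params.syllable_weight * avg_syllables_per_word)


# Same text counts scored with the pre-defined and with illustrative
# (made-up) alternative parameters.
baseline_score = reading_ease(words=100, sentences=5, syllables=150)
tuned_score = reading_ease(100, 5, 150, FleschParams(200.0, 1.0, 80.0))
```

Because the constants are isolated in one immutable object, any parameter search can propose a new `FleschParams` and re-score the same text counts without touching the formula itself.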

/content/journals/10.1075/idj.22015.had
2023-05-25
2024-12-07
  • Article Type: Research Article
Keyword(s): deep reinforcement learning; parameter optimization; text readability