1887
Volume 175, Issue 1
  • ISSN 0019-0829
  • E-ISSN: 1783-1490

Abstract

The article introduces a novel lexical resource for Swedish based on word family principles. The development of the Swedish Word Family (SweWF) resource is set in the context of linguistic complexity in second language acquisition. The SweWF is particularly well suited to this context because it contains lexical items used in second language corpora, namely a corpus of coursebook texts and a corpus of learner essays. The main focus of the article is on the construction of the resource, including its user interface, and on its applicability for research, although the resource also opens up many possibilities for practical applications in language learning, testing and assessment. We demonstrate the value of the resource through several case studies.

Available under the CC BY 4.0 license.
2024-02-26
2024-10-11
