Volume 20, Issue 1
  • ISSN 1387-6732
  • E-ISSN: 1570-6001


The increased globalization of science and technology and the growing number of bilinguals and multilinguals in the world have made research with multiple languages a mainstay for scholars who study human function, especially those who focus on language, cognition, and the brain. Such research can benefit from large-scale databases and online resources that describe and measure lexical, phonological, orthographic, and semantic information. The present paper discusses currently available resources and underscores the need for tools that enable measurements both within and across multiple languages. A general review of language databases is followed by a targeted introduction to databases of orthographic and phonological neighborhoods. A specific focus on CLEARPOND illustrates how databases can be used to assess and compare neighborhood information across languages, to develop research materials, and to provide insight into broad questions about language. As an example of how large-scale databases can answer questions about language, a closer look at neighborhood effects on lexical access reveals that not only orthographic but also phonological neighborhoods can influence visual lexical access, both within and across languages. We conclude that capitalizing upon large-scale linguistic databases can advance, refine, and accelerate scientific discoveries about the human linguistic capacity.
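To make the notion of within- and cross-language neighborhoods concrete, the following is a minimal sketch, not the procedure used by CLEARPOND or any other database. It assumes one common definition of an orthographic neighbor: a word that differs from the target by a single letter substitution, addition, or deletion. The word lists `ENGLISH` and `SPANISH` are hypothetical toy lexicons chosen for illustration, and the function names are likewise illustrative.

```python
def is_neighbor(a: str, b: str) -> bool:
    """Return True if a and b differ by exactly one letter
    substitution, addition, or deletion (assumed definition)."""
    if a == b:
        return False
    if len(a) == len(b):
        # Same length: exactly one position may differ (substitution).
        return sum(x != y for x, y in zip(a, b)) == 1
    if abs(len(a) - len(b)) != 1:
        return False
    # Lengths differ by one: deleting one letter from the longer
    # word must yield the shorter word (addition/deletion).
    shorter, longer = sorted((a, b), key=len)
    return any(longer[:i] + longer[i + 1:] == shorter
               for i in range(len(longer)))


def neighbors(word: str, lexicon: set[str]) -> list[str]:
    """List the orthographic neighbors of `word` found in `lexicon`."""
    return sorted(w for w in lexicon if is_neighbor(word, w))


# Hypothetical toy lexicons; real databases hold tens of thousands of words.
ENGLISH = {"cat", "cot", "coat", "hat", "cart", "dog"}
SPANISH = {"gato", "cal", "can", "casa", "cita"}

within = neighbors("cat", ENGLISH)  # within-language neighborhood
across = neighbors("cat", SPANISH)  # cross-language neighborhood
print(len(within), within)
print(len(across), across)
```

The same comparison applies to phonological neighborhoods if the strings are replaced by sequences of phoneme symbols, which requires a pronunciation dictionary that this sketch does not include.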




