Volume 19, Issue 4
  • ISSN 1606-822X
  • E-ISSN: 2309-5067



Recurrent word sequences, referred to as “lexical bundles”, may be structurally incomplete, but they serve important communicative functions. Despite the essential roles of lexical bundles in discourse, many methodological issues have been raised in the process of identifying lexical bundles, which is generally frequency-based. The present study identifies three-word and four-word bundles in Chinese conversation and news, and efforts are made to respond to methodological challenges encountered in previous studies. We employ a more sensitive dispersion measure, DP, and an internal association measure, G, which help filter out high-frequency word sequences with no identifiable function and reduce the workload of further manual interventions. An exploratory data analysis is then conducted to compare the distributional patterns of lexical bundles in Chinese conversation and news. In Chinese, both the type number and the density of lexical bundles are higher in conversation than in news. This appears to be a strong cross-linguistic tendency that reflects the real-time pressure speakers face in spontaneous speech. The exploratory data analysis also shows that the elements in Chinese bundles are closely associated with each other. This suggests that lexical bundles are useful phrasal units in Chinese discourse, and thus invites further investigations of how lexical bundles are used in Chinese.
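The abstract names the dispersion measure DP without giving its formula. The sketch below assumes DP is the "deviation of proportions" measure commonly used in corpus linguistics: half the sum, over corpus parts, of the absolute difference between each part's share of the bundle's occurrences and its share of the corpus. The function name, arguments, and toy counts are hypothetical and are not taken from the study.

```python
def deviation_of_proportions(bundle_counts, part_sizes):
    """Return DP for one word sequence (assumed definition, see lead-in).

    bundle_counts[i]: occurrences of the sequence in corpus part i
    part_sizes[i]:    size of corpus part i (e.g. in words)
    DP near 0 means the sequence is spread in proportion to part sizes;
    DP near 1 means it is concentrated in a small portion of the corpus.
    """
    if len(bundle_counts) != len(part_sizes):
        raise ValueError("bundle_counts and part_sizes must have equal length")
    total_hits = sum(bundle_counts)
    total_size = sum(part_sizes)
    if total_hits == 0 or total_size == 0:
        raise ValueError("need at least one occurrence and a non-empty corpus")
    return 0.5 * sum(
        abs(count / total_hits - size / total_size)
        for count, size in zip(bundle_counts, part_sizes)
    )


# Toy data (hypothetical): four equally sized corpus parts.
even_dp = deviation_of_proportions([10, 10, 10, 10], [1000, 1000, 1000, 1000])
skewed_dp = deviation_of_proportions([40, 0, 0, 0], [1000, 1000, 1000, 1000])
print(even_dp, skewed_dp)
```

Under this definition, a sequence that is frequent overall but confined to a few texts receives a high DP, which is how a dispersion threshold can filter out candidates that raw frequency alone would admit.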

Available under the CC BY 4.0 license.




