Volume 14, Issue 1
  • ISSN 1877-7031
  • E-ISSN: 1877-8798



This article presents a corpus-based distributional analysis of the usage patterns of a cluster of words and compounds containing the morpheme qià (恰) ‘just, exactly’. An extended concordancer is used to retrieve representative collocations from their adjacent contexts in Chinese Gigaword. Following a survey of the historical evolution of the cluster with exemplar data, and an overview of existing proposals that account for their usages in terms of expectational match, the distributional analysis identifies salient collocational or contextual features that lead to a number of findings. Substantial evidence is provided for the non-word status of qiàrú (恰如) and qiàsì (恰似) and their similarities; the exchangeability of qiàhǎo (恰好) and qiàqiǎo (恰巧); the distinct collocational preferences of the adverbs qià (恰), qiàqià (恰恰) and the others for different subsets of verbs; the prosodic requirement of an even number of syllables for a qià-adverb and its main verb; and the contrastive popularity of qiàqià (恰恰) vs. qiàdàng (恰当), which reveals different usage tendencies between speakers in Taiwan and the Mainland. These findings about the subtle (dis)similarities in the usage and meanings of the qià (恰) cluster suggest that distributional analysis of contextual collocations using large-scale language data remains a powerful tool that can complement other analytical approaches in lexical semantic research.
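The general idea of retrieving collocations from the contexts adjacent to a target word can be illustrated with a minimal sketch. This is not the extended concordancer used in the article, whose design is not described here; the function name `collocates`, the `window` parameter, and the toy sentences are all assumptions made for illustration. The sentences are invented and pre-segmented, and are not drawn from Chinese Gigaword.

```python
from collections import Counter


def collocates(sentences, target, window=1):
    """Count tokens found within `window` positions to the left and right
    of every occurrence of `target` in pre-segmented sentences."""
    left, right = Counter(), Counter()
    for tokens in sentences:
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            # Tokens immediately preceding the hit (clipped at sentence start).
            left.update(tokens[max(0, i - window):i])
            # Tokens immediately following the hit (slicing clips at the end).
            right.update(tokens[i + 1:i + 1 + window])
    return left, right


# Toy, pre-segmented sentences invented for illustration only
# (hypothetical data, not taken from any corpus).
toy_corpus = [
    ["他", "恰好", "在", "家"],
    ["我", "恰好", "路过", "这里"],
    ["她", "恰巧", "在", "场"],
]

left, right = collocates(toy_corpus, "恰好", window=1)
print("left collocates:", dict(left))
print("right collocates:", dict(right))
```

Running the same function for two near-synonyms, such as 恰好 and 恰巧, and comparing the resulting counts is one simple way to examine how far their contexts overlap. On a real corpus, raw counts would normally be replaced by an association measure.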




