23. Combined statistical and grammatical criteria for the retrieval of phraseological units in an electronic corpus

The aim of this study is to refine and optimise the mainly statistical and distributional approach to the automatic extraction of phraseological units (PUs) from text corpora by introducing minimal linguistic elements (lemmatisation and grammatical tagging). These operations were first tested on the same corpora as in our previous research (Pamies & Pazos 2003, 2004). This gave us a new set of results, which we compared with the previous ones. We found that the detection ability had improved substantially, especially for verb + noun and verb + adjective collocations. The methodology was then applied to a larger corpus. Again, the results were encouraging, with phraseological densities of up to 64.5% for the verb + noun category.
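
The general idea of combining a grammatical filter with a statistical score can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the association measure, so pointwise mutual information (PMI) is used here purely as a stand-in, and the function name `extract_candidates`, the tag labels `VERB`, `NOUN` and `ADJ`, and the `min_count` threshold are all hypothetical. The input is assumed to be already lemmatised and tagged.

```python
import math
from collections import Counter

# Hypothetical tag labels; the abstract only names the verb + noun and
# verb + adjective categories, not a concrete tag set.
PATTERNS = {("VERB", "NOUN"), ("VERB", "ADJ")}


def extract_candidates(tagged, patterns=PATTERNS, min_count=2):
    """Rank adjacent lemma pairs that match a grammatical pattern.

    tagged: list of (lemma, pos) pairs, assumed to be lemmatised and
    tagged beforehand. PMI is an assumed stand-in for whatever
    statistic the study actually used.
    """
    n = len(tagged)
    n_pairs = max(n - 1, 1)

    # Lemma frequencies, counted regardless of part of speech.
    unigrams = Counter(lemma for lemma, _ in tagged)

    # Grammatical criterion: keep only adjacent pairs whose tags match a pattern.
    bigrams = Counter()
    for (l1, p1), (l2, p2) in zip(tagged, tagged[1:]):
        if (p1, p2) in patterns:
            bigrams[(l1, l2)] += 1

    # Statistical criterion: frequency threshold, then PMI score.
    scored = []
    for (l1, l2), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / n_pairs
        p_independent = (unigrams[l1] / n) * (unigrams[l2] / n)
        scored.append(((l1, l2), count, math.log2(p_pair / p_independent)))

    # Highest-scoring candidates first.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored
```

Counting lemmas rather than word forms means that inflected variants of the same verb are pooled into a single candidate, which is one plausible reason why lemmatisation and tagging would help with verb-headed collocations.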

Affiliation: Universidad de Granada, Spain