Analyse Lexicale et Syntaxique: Le système INTEX
  • ISSN 0378-4169
  • E-ISSN: 1569-9927

Abstract

This study deals with the pre-processing of texts, which is performed in three steps: the segmentation of texts into textual units (sentences), the rewriting of contracted forms into a standard form, and the tagging of unambiguous compounds. We describe two of these three steps here: text segmentation and the rewriting of contracted forms. Texts are segmented into textual units by the transducer Sentence, and contracted forms are rewritten into their standard forms by the transducer Normalisation. We describe in detail the steps involved in developing both transducers.
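
As a rough illustration of the two steps described above, the following Python sketch marks sentence boundaries and then expands contracted forms. It is not the Sentence or Normalisation transducer: the regular expressions, the `{S}` boundary marker, the small English contraction table, and the function names `segment`, `normalise`, and `preprocess` are all assumptions made for this example. The boundary rule is deliberately naive and would mis-split abbreviations such as "Dr. Smith".

```python
import re

# Assumed sentence-boundary marker, chosen for this sketch only.
SENTENCE_TAG = "{S}"

# Hypothetical contraction table; the abstract does not list the forms
# actually handled by the Normalisation transducer.
CONTRACTIONS = {
    "don't": "do not",
    "can't": "cannot",
    "won't": "will not",
    "i'm": "I am",
}

# Naive boundary: whitespace after ., ! or ? that precedes a capital letter.
_BOUNDARY = re.compile(r"(?<=[.!?])\s+(?=[A-Z])")

# One alternation over all known contracted forms, matched case-insensitively.
_CONTRACTION = re.compile(
    r"\b(" + "|".join(re.escape(form) for form in CONTRACTIONS) + r")\b",
    re.IGNORECASE,
)


def segment(text: str) -> str:
    """Insert SENTENCE_TAG at each naive sentence boundary."""
    return _BOUNDARY.sub(f" {SENTENCE_TAG} ", text.strip())


def normalise(text: str) -> str:
    """Rewrite each known contracted form into its standard form."""

    def expand(match: re.Match) -> str:
        original = match.group(1)
        standard = CONTRACTIONS[original.lower()]
        # Preserve sentence-initial capitalisation ("Don't" -> "Do not").
        if original[0].isupper():
            standard = standard[0].upper() + standard[1:]
        return standard

    return _CONTRACTION.sub(expand, text)


def preprocess(text: str) -> str:
    """Apply segmentation first, then normalisation."""
    return normalise(segment(text))
```

For example, `preprocess("I can't stay. Don't wait for me!")` yields `"I cannot stay. {S} Do not wait for me!"`. Running segmentation before normalisation mirrors the order in which the abstract lists the steps.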

/content/journals/10.1075/li.22.1-2.18zel
1998-01-01
2024-04-19
  • Article Type: Research Article