Abstract
A methodology for automatically summarising scientific texts is presented that uses the patterns of lexical cohesion found in such texts. Lexical cohesion is a type of cohesion whereby certain lexical features of a text connect its sentences with each other. An analysis of lexical cohesion in a text, primarily by counting repetitions, synonyms, and paraphrases, leads to the establishment of a network of sentences, some tightly bonded through lexical cohesion relations, others weakly bonded or not bonded at all. The strength of connections in this cohesion network is used to identify key sentences in a text. Some sentences open key topics, some close topics, whilst others consolidate a given topic. Topic-opening, topic-closing, and topic-consolidating (or central) sentences have different strengths and different connectivity patterns. A selection of these sentences can be construed as a summary of a given text. TELE-PATTAN (TExt and LExical cohesion PATTerns ANalysis), a system for summarising text automatically, extracts patterns of lexical cohesion in a text, categorises its sentences, and subsequently produces summaries of the text on the basis of these patterns. Experiments were conducted with human subjects to evaluate the summaries. The results of this preliminary evaluation are encouraging.
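The idea of a cohesion network can be illustrated with a minimal sketch. It is not the TELE-PATTAN implementation: it counts only simple word repetition (no synonyms or paraphrase), and the stop-word list, the `min_links` threshold, the function names, and the ranking by total bond count are all assumptions made for illustration. Counting bonds to earlier and later sentences separately is likewise one possible way to expose different connectivity patterns, not a rule taken from the system.

```python
import re
from itertools import combinations

# Hypothetical, deliberately tiny stop-word list (an assumption).
STOP_WORDS = {"the", "a", "an", "of", "in", "is", "are", "and", "to",
              "on", "by", "for", "with", "it", "this", "that"}


def content_words(sentence):
    """Return the lower-cased words of a sentence, minus stop words."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return {w for w in words if w not in STOP_WORDS}


def bond_network(sentences, min_links=2):
    """Map each pair (i, j), i < j, to its number of shared content words.

    A pair counts as bonded only if it shares at least `min_links`
    words; the threshold is an assumed parameter.
    """
    words = [content_words(s) for s in sentences]
    bonds = {}
    for i, j in combinations(range(len(sentences)), 2):
        links = len(words[i] & words[j])
        if links >= min_links:
            bonds[(i, j)] = links
    return bonds


def bond_counts(sentences, bonds):
    """Count, per sentence, its bonds to earlier and to later sentences."""
    counts = [{"earlier": 0, "later": 0} for _ in sentences]
    for i, j in bonds:
        counts[i]["later"] += 1    # sentence i bonds forward to j
        counts[j]["earlier"] += 1  # sentence j bonds back to i
    return counts


def summarise(sentences, top_n=2, min_links=2):
    """Return the `top_n` most bonded sentences in original text order."""
    bonds = bond_network(sentences, min_links)
    counts = bond_counts(sentences, bonds)
    ranked = sorted(
        range(len(sentences)),
        key=lambda k: -(counts[k]["earlier"] + counts[k]["later"]),
    )
    return [sentences[k] for k in sorted(ranked[:top_n])]


if __name__ == "__main__":
    text = [
        "Lexical cohesion links sentences in a text.",
        "The weather was pleasant yesterday.",
        "Cohesion links between sentences form a network.",
        "The network of sentences shows lexical cohesion patterns.",
    ]
    print(bond_network(text))
    print(summarise(text))
```

In this toy input the second sentence shares no content words with the others, so it forms no bonds and would never be selected. Handling synonyms and paraphrase would require a thesaurus or similar lexical resource, which the sketch leaves out.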