Automatic Text Summarization

Kumar, Trun (2014) Automatic Text Summarization. BTech thesis.

Abstract

Automatic summarization is the process of reducing a document with a computer program to produce a summary that retains the most important sentences of the original text. Summarizing a document manually is a difficult task for human beings. Generating a summary automatically therefore has to address several challenges; because the system is automated, it can only extract the required information from the original document. As the problem of information overload has grown and the volume of data has increased, so has interest in summarizing it automatically. It is very difficult for people to summarize large text documents manually. Automatic summarization systems may be classified as extractive or abstractive. An extractive method selects important sentences from the document and concatenates them into a shorter form. The importance of a sentence is determined from its statistical and semantic features. Extractive methods work by selecting a subset of the existing words or sentences in the input text to form the summary. Searching a large document for important information is a very difficult job for the user, hence the need to extract the important information, or a summary of the document, automatically. Such a summary saves users the time of reading the whole document and gives quick insight into a large text file. Extractive summarization is commonly based on sentence-extraction techniques that aim to cover the set of sentences most important for the overall understanding of a given text. With the frequency-based technique, the resulting summary is more meaningful; with k-means clustering, sentences are extracted out of order, so the summary might not make sense.
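
The frequency-based extractive approach mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the thesis's implementation: the function names, the tiny stop-word list, the naive sentence splitter, and the average-word-frequency score are all assumptions made for illustration. Selected sentences are returned in their original document order, which is one simple way to avoid the out-of-order problem the abstract attributes to cluster-based extraction.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stop-word list (an assumption, not from the thesis).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "is", "in", "it",
              "that", "for", "on", "as", "are", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lowercase the sentence, keep alphabetic tokens, drop stop words."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower())
            if w not in STOP_WORDS]


def summarize(text, num_sentences=2):
    """Return the num_sentences highest-scoring sentences in original order."""
    sentences = split_sentences(text)
    # Word frequencies over the whole document.
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(sentence):
        # Average document frequency of the sentence's content words.
        words = tokenize(sentence)
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    # Rank sentence indices by score, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:num_sentences])
    return " ".join(sentences[i] for i in chosen)
```

Calling `summarize(document_text, num_sentences=3)` on a plain-text string returns a three-sentence extract. Averaging by sentence length is a design choice that keeps long sentences from winning purely because they contain more words.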

Item Type:Thesis (BTech)
Uncontrolled Keywords:Extractive;Semantic;Clustering
Subjects:Engineering and Technology > Computer and Information Science > Data Mining
Divisions: Engineering and Technology > Department of Computer Science
ID Code:5619
Deposited By:Hemanta Biswal
Deposited On:21 Jul 2014 14:38
Last Modified:21 Jul 2014 14:38
Supervisor(s): Babu, K S
