Document summarization has become a popular topic among web developers. Why? It may seem like a mundane and menial task to some, but it has become a necessary tool for dealing with the information overload found on many websites and web pages. Document summarization skills, and the teaching of them, matter today for three reasons: people rely more and more on the Internet and therefore upload more documents to the web; too much time is wasted browsing through web pages to get to the desired text; and more people now use the Internet on small handheld devices.

Producing a summary is easy once you choose the right website or program for your document. Any tip on document summarization therefore comes down to determining which summarizer software is most appropriate for you. Each website or program uses a specific approach, or a mixture of approaches, to summarization.

Extraction-based Approach

Most of the popular summarizer websites use this approach. As its name suggests, the extraction-based approach is a technique in which the software identifies key phrases and key sentences, ranks them and puts them together to produce a summary. This type of software uses an algorithm that decides which words, phrases or sentences are most important based on how often they appear in the text and how they relate to other frequently occurring words, phrases and sentences.
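
As a rough illustration of the frequency-based ranking described above, here is a minimal Python sketch. It is not how any particular website works: the function names, the small stopword list and the scoring rule (average word frequency per sentence) are all assumptions made for this example, and real summarizers use far more refined methods.

```python
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list (an assumption for this sketch).
STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it",
             "on", "for", "that", "this", "are", "was", "with", "as"}


def tokenize(text):
    """Lowercase the text and return its words, minus stopwords."""
    return [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def summarize(text, max_sentences=2):
    """Keep the highest-scoring sentences, in their original order."""
    sentences = split_sentences(text)
    freq = Counter(tokenize(text))  # how often each word occurs in the document

    def score(sentence):
        tokens = tokenize(sentence)
        if not tokens:
            return 0.0
        # Average frequency, so long sentences are not favored automatically.
        return sum(freq[w] for w in tokens) / len(tokens)

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]),
                    reverse=True)
    keep = sorted(ranked[:max_sentences])  # restore document order
    return " ".join(sentences[i] for i in keep)


if __name__ == "__main__":
    sample = "Cats eat fish. Cats chase mice and cats nap. Dogs bark loudly."
    print(summarize(sample, max_sentences=1))
```

Sentences that repeat the document's most frequent words score highest, which is the core idea behind extraction: nothing is rewritten, the summary is assembled from sentences already in the text.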

If you want this approach, you should try Clipped. Although for now it only summarizes articles, it does so better than most other websites. It picks out the most important information, and the company behind it has developed software that checks that the grammar of the finished summary is correct.

For a free summarizer, check out a simple tool developed by Andreas Gohr on his website. Its interface lets you enter the link of the web page you want summarized and choose the summarization ratio and the language of the results.

Abstraction-based Approach

Another approach to document summarization is abstraction-based summarization. This approach is more in-depth and inevitably more tedious and time-consuming than the previous one. Where extraction-based summarization picks out and ranks the most important words, phrases and sentences, abstraction involves paraphrasing and natural language generation technology. It aims to produce summaries the way a human would.

A good website that uses this approach is tldr. Tldr is a community of netizens who share the same problem of reading more but learning less. To address this, members of the community create summaries of articles for each other. These articles are categorized by content, so they are easy to browse. The news section in particular is updated regularly, so you might want to check there when you want a summary of a specific news story.

While there is still much to develop with this approach, efforts have been made and some AI technologies are in the making. The website Clipped (mentioned above) uses a bit of this as well, although not extensively. In a few years, we might see more tips for summary writing in document summarization using this technology.

