This is a port to C# of the fantastic Open Text Summarizer (http://libots.sourceforge.net/). It uses the same dictionary files and algorithms as the original OTS, though all of the code was rewritten.
This version of OTS.Net is only a library so far. The OTS command-line tool has not been ported. Additionally (and necessarily) the public interface has been changed from the original. It should be simple to use.
See the OTSTester test project for sample code on how to call it.
So What Is It?
The most famous text summarizer I am aware of is the Copernic Summarizer. Copernic's summarizer software takes web pages and other documents (PDFs, Word documents, etc.) and creates a concise summary of each one, highlighting the most important concepts and ideas in that document. Sort of like Cliff's Notes for every web page anywhere.
Well, the Open Text Summarizer.Net is a free, open source library that performs a similar function. The original OTS was first written for the Linux platform by Nadav Rotem (of C to Verilog fame). It was so successful and such a useful tool that it started shipping with all the major Linux distributions. It was groundbreaking enough that it was mentioned in several academic publications (see the original OTS site for details: http://libots.sourceforge.net/).
OpenTextSummarizer.Net uses the same techniques, algorithms, and dictionaries to perform the same task. It is a direct port from the Linux version to C#.
Using the Open Text Summarizer.Net you can quickly determine the top concepts that are used in a document. You can also summarize the document and display as much of it as you want, expressed either as a percentage of the document or as a number of sentences.
It is a standard C# assembly and can be consumed easily by any .Net application.
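To make the "top concepts" and "percentage or number of sentences" options above more concrete, here is a minimal, self-contained sketch of a frequency-based extractive summarizer. It is not the OTS.Net API and does not use the OTS dictionary files: the class SummarySketch, its method names, and the tiny hard-coded stop-word list are all hypothetical stand-ins chosen for illustration. For the real calling code, see the OTSTester project.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Hypothetical illustration only: NOT the OTS.Net API and NOT its dictionaries.
public static class SummarySketch
{
    // Tiny stand-in stop-word list (an assumption made for this sketch).
    private static readonly HashSet<string> StopWords = new HashSet<string>(
        new[] { "a", "an", "and", "are", "in", "is", "it", "of", "on", "the", "to", "was" });

    // Lower-cases the text and returns its words with stop words removed.
    private static List<string> Words(string text)
    {
        return Regex.Matches(text.ToLowerInvariant(), "[a-z]+")
            .Cast<Match>()
            .Select(m => m.Value)
            .Where(w => !StopWords.Contains(w))
            .ToList();
    }

    // Naive sentence splitter: breaks after '.', '!' or '?' followed by whitespace.
    private static List<string> Sentences(string text)
    {
        return Regex.Split(text.Trim(), @"(?<=[.!?])\s+")
            .Where(s => s.Length > 0)
            .ToList();
    }

    // The most frequent non-stop-words, most frequent first (ties broken alphabetically).
    public static List<string> TopConcepts(string text, int count)
    {
        return Words(text)
            .GroupBy(w => w)
            .OrderByDescending(g => g.Count())
            .ThenBy(g => g.Key, StringComparer.Ordinal)
            .Take(count)
            .Select(g => g.Key)
            .ToList();
    }

    // Keeps the highest-scoring sentences and returns them in their original order.
    // A sentence's score is the sum of the document-wide frequencies of its words.
    public static List<string> SummarizeBySentences(string text, int sentenceCount)
    {
        var frequency = Words(text)
            .GroupBy(w => w)
            .ToDictionary(g => g.Key, g => g.Count());

        return Sentences(text)
            .Select((s, i) => new { Sentence = s, Index = i, Score = Words(s).Sum(w => frequency[w]) })
            .OrderByDescending(x => x.Score)
            .ThenBy(x => x.Index)
            .Take(Math.Max(0, sentenceCount))
            .OrderBy(x => x.Index)
            .Select(x => x.Sentence)
            .ToList();
    }

    // Same selection, with the size given as a percentage of the sentence count (rounded up).
    public static List<string> SummarizeByPercent(string text, int percent)
    {
        int keep = (int)Math.Ceiling(Sentences(text).Count * percent / 100.0);
        return SummarizeBySentences(text, keep);
    }
}
```

Calling SummarySketch.SummarizeBySentences(text, 5) would keep five sentences, while SummarySketch.SummarizeByPercent(text, 20) would keep roughly a fifth of the document. Summing raw frequencies favors long sentences, so treat this only as a picture of the general idea.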