Using Tika in .NET for extracting text out of documents

added by chadmyers
7/5/2010 10:45:51 AM


Tika is an open source, Java-based tool for extracting information from a variety of document formats. Among other things, it can be used with Lucene to index and search documents. In this post, Kevin Miller explains how to use Tika in .NET via the IKVM.net utility.


1 comment

KevM
7/2/2010 4:42:54 PM
Feel so guilty kicking my own post... But not that guilty.