net.sf.regain.crawler.preparator
Class PoiMsOfficePreparator
java.lang.Object
  net.sf.regain.crawler.document.AbstractPreparator
    net.sf.regain.crawler.preparator.PoiMsOfficePreparator
- All Implemented Interfaces:
- Pluggable, Preparator, WriteablePreparator
public class PoiMsOfficePreparator
- extends AbstractPreparator
Prepares all MS* documents using the POI API.
The preparator uses the generic extractor facilities of POI.
Contributions from Jorge Corona.
- Author:
- Thomas Tesche, www.thtesche.com
Field Summary
private static org.apache.log4j.Logger mLog
- The logger for this class
Methods inherited from class net.sf.regain.crawler.document.AbstractPreparator
accepts, addAdditionalField, cleanUp, close, concatenateStringParts, getAdditionalFields, getCleanedContent, getCleanedMetaData, getHeadlines, getPath, getPriority, getSummary, getTitle, init, setCleanedContent, setCleanedMetaData, setHeadlines, setPath, setPriority, setSummary, setTitle, setUrlRegex
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
mLog
private static org.apache.log4j.Logger mLog
- The logger for this class
PoiMsOfficePreparator
public PoiMsOfficePreparator()
throws RegainException
- Creates a new instance of PoiMsOfficePreparator.
- Throws:
RegainException
- If creation of the preparator failed.
prepare
public void prepare(RawDocument rawDocument)
throws RegainException
- Prepares the document.
- Parameters:
rawDocument
- the document to prepare
- Throws:
RegainException
- thrown in case of errors
createMetaDataMap
private Map<String,String> createMetaDataMap(String rawLine)
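This page gives only the signature of createMetaDataMap, so its behavior is not documented here. The following is a minimal sketch of what a method with this signature might do, under the assumption that the raw text holds one "name = value" pair per line. The class name MetaDataMapSketch, the input format, and the rule of skipping malformed lines are all hypothetical and are not taken from the Regain sources.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not the Regain implementation.
class MetaDataMapSketch {

  /**
   * Splits raw metadata text into name/value pairs.
   * Assumption: one "name = value" pair per line. Lines without '=',
   * with an empty name, or with an empty value are skipped.
   */
  static Map<String, String> createMetaDataMap(String rawLine) {
    // LinkedHashMap keeps the pairs in the order they appear in the input.
    Map<String, String> result = new LinkedHashMap<String, String>();
    if (rawLine == null) {
      return result;
    }
    for (String line : rawLine.split("\\r?\\n")) {
      // Split at the first '=' only, so values may themselves contain '='.
      int pos = line.indexOf('=');
      if (pos <= 0) {
        continue; // no separator, or nothing before it
      }
      String name = line.substring(0, pos).trim();
      String value = line.substring(pos + 1).trim();
      if (name.length() > 0 && value.length() > 0) {
        result.put(name, value);
      }
    }
    return result;
  }

  public static void main(String[] args) {
    String raw = "Title = Quarterly Report\nAuthor = Jane Doe\nnot a pair";
    System.out.println(createMetaDataMap(raw));
  }
}
```

Returning an empty map for null input, rather than throwing, is a design choice of this sketch: it lets a caller iterate over the result without a separate null check.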
Regain 2.1.0-STABLE, Copyright (C) 2004-2010 Til Schneider, www.murfman.de, Thomas Tesche, www.clustersystems.info