Regain 2.1.0-STABLE API

net.sf.regain.crawler.preparator
Class PoiMsOfficePreparator

java.lang.Object
  extended by net.sf.regain.crawler.document.AbstractPreparator
      extended by net.sf.regain.crawler.preparator.PoiMsOfficePreparator
All Implemented Interfaces:
Pluggable, Preparator, WriteablePreparator

public class PoiMsOfficePreparator
extends AbstractPreparator

Prepares all MS Office documents using the Apache POI API.

The preparator uses the generic extractor facilities of POI. Contributions from Jorge Corona.

Author:
Thomas Tesche, www.thtesche.com

Field Summary
private static org.apache.log4j.Logger mLog
          The logger for this class
 
Fields inherited from interface net.sf.regain.crawler.document.Preparator
DEFAULT_BUFFER_SIZE
 
Constructor Summary
PoiMsOfficePreparator()
          Creates a new instance of PoiMsOfficePreparator.
 
Method Summary
private  Map<String,String> createMetaDataMap(String rawLine)
           
 void prepare(RawDocument rawDocument)
          Prepares the document.
 
Methods inherited from class net.sf.regain.crawler.document.AbstractPreparator
accepts, addAdditionalField, cleanUp, close, concatenateStringParts, getAdditionalFields, getCleanedContent, getCleanedMetaData, getHeadlines, getPath, getPriority, getSummary, getTitle, init, setCleanedContent, setCleanedMetaData, setHeadlines, setPath, setPriority, setSummary, setTitle, setUrlRegex
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

mLog

private static org.apache.log4j.Logger mLog
The logger for this class

Constructor Detail

PoiMsOfficePreparator

public PoiMsOfficePreparator()
                      throws RegainException
Creates a new instance of PoiMsOfficePreparator.

Throws:
RegainException - If creation of the preparator failed.

Method Detail

prepare

public void prepare(RawDocument rawDocument)
             throws RegainException
Prepares the document.

Parameters:
rawDocument - the document to prepare
Throws:
RegainException - thrown in case of errors

createMetaDataMap

private Map<String,String> createMetaDataMap(String rawLine)
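
The API documentation gives no description for createMetaDataMap, so its actual behavior is not known from this page. Purely as an illustration of what a method with this signature might do, the sketch below assumes the raw text holds metadata as "key = value" pairs, one per line. The class name MetaDataSketch, the method name parseMetaData, the input format and the sample keys are all hypothetical and are not taken from Regain or POI.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class MetaDataSketch {

    /**
     * Hypothetical helper, NOT the Regain implementation.
     * Assumes the input has the form "key = value", one pair per line.
     */
    public static Map<String, String> parseMetaData(String rawText) {
        // LinkedHashMap keeps the pairs in the order they appear in the text.
        Map<String, String> result = new LinkedHashMap<>();
        if (rawText == null) {
            return result;
        }
        for (String line : rawText.split("\\r?\\n")) {
            // Split at the first '=' only, so values may contain '=' themselves.
            int sep = line.indexOf('=');
            if (sep <= 0) {
                continue; // no separator or empty key: skip the line
            }
            String key = line.substring(0, sep).trim();
            String value = line.substring(sep + 1).trim();
            if (!key.isEmpty()) {
                result.put(key, value);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample input is made up for this sketch.
        String raw = "PID_TITLE = Quarterly report\nPID_AUTHOR = Jane Doe\n";
        System.out.println(parseMetaData(raw));
    }
}
```

Splitting at the first '=' only is a design choice of this sketch: it keeps values such as formulas or URLs with query strings intact. Lines without a separator are dropped rather than treated as errors.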

Regain 2.1.0-STABLE, Copyright (C) 2004-2010 Til Schneider, www.murfman.de, Thomas Tesche, www.clustersystems.info