Regain 2.1.0-STABLE API

net.sf.regain.crawler.preparator
Class XmlPreparator

java.lang.Object
  extended by net.sf.regain.crawler.document.AbstractPreparator
      extended by net.sf.regain.crawler.preparator.XmlPreparator
All Implemented Interfaces:
Pluggable, Preparator, WriteablePreparator

public class XmlPreparator
extends AbstractPreparator

Prepares an XML document for indexing.

In doing so, the document's raw data is stripped of its formatting information.

Author:
Til Schneider, www.murfman.de

Field Summary
 
Fields inherited from interface net.sf.regain.crawler.document.Preparator
DEFAULT_BUFFER_SIZE
 
Constructor Summary
XmlPreparator()
          Creates a new instance of XmlPreparator.
 
Method Summary
 void prepare(RawDocument rawDocument)
          Prepares a document for indexing.
 
Methods inherited from class net.sf.regain.crawler.document.AbstractPreparator
accepts, addAdditionalField, cleanUp, close, concatenateStringParts, getAdditionalFields, getCleanedContent, getCleanedMetaData, getHeadlines, getPath, getPriority, getSummary, getTitle, init, setCleanedContent, setCleanedMetaData, setHeadlines, setPath, setPriority, setSummary, setTitle, setUrlRegex
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

XmlPreparator

public XmlPreparator()
              throws RegainException
Creates a new instance of XmlPreparator.

Throws:
RegainException - If creating the preparator failed.
Method Detail

prepare

public void prepare(RawDocument rawDocument)
             throws RegainException
Prepares a document for indexing.

Parameters:
rawDocument - The document to prepare.
Throws:
RegainException - If the preparation failed.
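
To illustrate what "stripping formatting information" from XML can mean, here is a minimal sketch that keeps only the character data of a document and discards tags and attributes. It uses only the JDK's SAX parser and is not Regain's actual implementation; the class name XmlTextSketch and the method extractText are hypothetical and appear nowhere in the Regain API.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of Regain: shows one way to reduce
// an XML document to plain text suitable for a full-text index.
public class XmlTextSketch {

    /** Returns the character data of the given XML with all markup removed. */
    public static String extractText(String xml) throws Exception {
        final StringBuilder text = new StringBuilder();
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();

        parser.parse(new InputSource(new StringReader(xml)), new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Collect text nodes; element names and attributes are ignored.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Separate the text of adjacent elements with a space.
                text.append(' ');
            }
        });

        // Collapse runs of whitespace left over from indentation and separators.
        return text.toString().trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) throws Exception {
        String xml = "<doc><title>Hello</title><body>XML world</body></doc>";
        System.out.println(extractText(xml)); // prints "Hello XML world"
    }
}
```

In this sketch a malformed document surfaces as a SAXException; a preparator following the contract above would presumably wrap such a failure in a RegainException.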


Regain 2.1.0-STABLE, Copyright (C) 2004-2010 Til Schneider, www.murfman.de, Thomas Tesche, www.clustersystems.info