Regain 2.1.0-STABLE API
java.lang.Object
  net.sf.regain.crawler.document.AbstractPreparator
    net.sf.regain.crawler.preparator.AbstractJacobMsOfficePreparator
      net.sf.regain.crawler.preparator.JacobMsPowerPointPreparator
public class JacobMsPowerPointPreparator
Prepares a Microsoft PowerPoint document for indexing using the Jacob API; Jacobgen was used to simplify access.
In the process, the raw data of the document is stripped of formatting information and the title is extracted.
| Field Summary | |
|---|---|
| private de.filiadata.lucene.spider.generated.msoffice2000.powerpoint.Application | mPowerPointApplication - The PowerPoint application. |
| private static int | MSOGROUP |
| Fields inherited from interface net.sf.regain.crawler.document.Preparator | 
|---|
| DEFAULT_BUFFER_SIZE |
| Constructor Summary | |
|---|---|
| JacobMsPowerPointPreparator() | Creates a new instance of JacobMsPowerPointPreparator. |
| Method Summary | |
|---|---|
| void | close() - Frees all resources reserved by the preparator. |
| private void | extractTextFrom(de.filiadata.lucene.spider.generated.msoffice2000.powerpoint.Shape shape, StringBuffer contentBuf) - Extracts the text from a PowerPoint shape object and adds it to the StringBuffer. |
| void | init(PreparatorConfig config) - Initializes the preparator. |
| void | prepare(RawDocument rawDocument) - Prepares a document for indexing. |
| private String | removeHyphenation(String text) - RB: Eliminates hyphenation, either -\n\r or -\013. |
| Methods inherited from class net.sf.regain.crawler.preparator.AbstractJacobMsOfficePreparator | 
|---|
| readProperties |
| Methods inherited from class net.sf.regain.crawler.document.AbstractPreparator | 
|---|
| accepts, addAdditionalField, cleanUp, concatenateStringParts, getAdditionalFields, getCleanedContent, getCleanedMetaData, getHeadlines, getPath, getPriority, getSummary, getTitle, setCleanedContent, setCleanedMetaData, setHeadlines, setPath, setPriority, setSummary, setTitle, setUrlRegex |
| Methods inherited from class java.lang.Object | 
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail | 
|---|
private de.filiadata.lucene.spider.generated.msoffice2000.powerpoint.Application mPowerPointApplication
Is null as long as no document has been processed yet.
private static int MSOGROUP
| Constructor Detail | 
|---|
public JacobMsPowerPointPreparator()
                            throws RegainException
Throws:
RegainException - If creating the preparator failed.
| Method Detail | 
|---|
public void init(PreparatorConfig config)
          throws RegainException
Specified by:
init in interface Pluggable
Overrides:
init in class AbstractJacobMsOfficePreparator
Parameters:
config - The configuration
Throws:
RegainException - If the configuration has an error.
public void prepare(RawDocument rawDocument)
             throws RegainException
Parameters:
rawDocument - The document to prepare.
Throws:
RegainException - If the preparation failed.
private void extractTextFrom(de.filiadata.lucene.spider.generated.msoffice2000.powerpoint.Shape shape,
                             StringBuffer contentBuf)
Parameters:
shape - The PowerPoint shape object to search.
contentBuf - The buffer to which any text found is added.
private String removeHyphenation(String text)
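The summary for removeHyphenation only states that it eliminates hyphenation of the form -\n\r or -\013. The following is a minimal sketch of what such a cleanup could look like. The class name HyphenationSketch and the regular expression are assumptions made for illustration; the actual Regain implementation is private and is not shown on this page.

```java
public class HyphenationSketch {

    /**
     * Hypothetical stand-in for the private removeHyphenation(String) method.
     * Removes a hyphen that is directly followed by a line break (any run of
     * CR and LF characters) or by a vertical tab (octal 013).
     */
    static String removeHyphenation(String text) {
        // "-" followed by one or more CR/LF characters, or by a vertical tab
        return text.replaceAll("-(?:[\\r\\n]+|\\x0B)", "");
    }

    public static void main(String[] args) {
        // Word split across a hard line break
        System.out.println(removeHyphenation("presen-\r\ntation"));
        // Word split across a vertical tab
        System.out.println(removeHyphenation("slide-\013show"));
        // A hyphen inside a line is left untouched
        System.out.println(removeHyphenation("well-known"));
    }
}
```

Hyphens that are not followed by a line break or vertical tab are left alone, so ordinary compound words survive the cleanup.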
public void close()
           throws RegainException
Is called at the end of the crawler process after all documents were processed.
Specified by:
close in interface Preparator
Overrides:
close in class AbstractPreparator
Throws:
RegainException - If freeing the resources failed.