Regain 2.1.0-STABLE API
java.lang.Object
  net.sf.regain.crawler.plugin.CrawlerPluginManager
public class CrawlerPluginManager
Guarantees:
- If one plugin throws an exception, the other plugins are still executed.
- Every argument of a plugin call is non-null.
Singleton pattern: get the only instance by calling getInstance().
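
The singleton access described above can be sketched as follows. This is a minimal illustration, not Regain's source: `ManagerSketch` is a hypothetical stand-in, and the lazy, synchronized initialization is an assumption, since this page does not say how `instance` is created.

```java
// Hypothetical stand-in for the manager; NOT the Regain class.
class ManagerSketch {
    // The single instance, created on first use (assumption: lazy creation).
    private static ManagerSketch instance;

    // Protected, so callers outside the package go through getInstance().
    protected ManagerSketch() {
    }

    // Assumption: synchronized, so concurrent first calls create only one instance.
    static synchronized ManagerSketch getInstance() {
        if (instance == null) {
            instance = new ManagerSketch();
        }
        return instance;
    }

    public static void main(String[] args) {
        // Both calls return the same object.
        System.out.println(ManagerSketch.getInstance() == ManagerSketch.getInstance()); // prints "true"
    }
}
```
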
Field Summary | |
---|---|
private int | insertIndex: Counts up for every inserted plugin. |
private static CrawlerPluginManager | instance: The single manager instance. |
private static int | MAX_PLUGINS: Guessed maximum number of plugins. |
private static org.apache.log4j.Logger | mLog: Logger instance. |
private int | nextOrder: Keeps a record of the next "order" value so that a plugin is inserted at the end of the queue. |
private SortedMap<Integer,CrawlerPlugin> | plugins: Registered plugins, in order of call. (Dev note: a PriorityQueue didn't work out: its iterator is not ordered, only poll is.) |
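
The dev note on `plugins` reflects documented JDK behavior: iterating a `TreeMap` visits entries in ascending key order, whereas `PriorityQueue.iterator()` makes no ordering promise and only repeated `poll()` calls return elements in priority order. A small sketch with made-up order values and names (the class and method names here are hypothetical):

```java
import java.util.ArrayList;
import java.util.List;
import java.util.PriorityQueue;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical demo class; it only contrasts the two JDK collections.
class OrderedIterationSketch {
    // Values of a TreeMap are iterated in ascending key order.
    static List<String> viaSortedMap(int[] orders, String[] names) {
        SortedMap<Integer, String> map = new TreeMap<>();
        for (int i = 0; i < orders.length; i++) {
            map.put(orders[i], names[i]);
        }
        return new ArrayList<>(map.values());
    }

    // Draining with poll() is ordered, but it empties the queue;
    // iterating the queue instead carries no ordering guarantee.
    static List<Integer> viaPoll(int[] orders) {
        PriorityQueue<Integer> queue = new PriorityQueue<>();
        for (int order : orders) {
            queue.add(order);
        }
        List<Integer> drained = new ArrayList<>();
        while (!queue.isEmpty()) {
            drained.add(queue.poll());
        }
        return drained;
    }

    public static void main(String[] args) {
        int[] orders = {30, 10, 20};
        System.out.println(viaSortedMap(orders, new String[] {"c", "a", "b"})); // prints [a, b, c]
        System.out.println(viaPoll(orders)); // prints [10, 20, 30]
    }
}
```

A sorted map therefore allows repeated, non-destructive traversal in call order, which is what an event dispatcher needs.
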
Constructor Summary | |
---|---|
protected | CrawlerPluginManager() |
Method Summary | |
---|---|
private String | argTypesToString(Class<?>[] argTypes): Convert argument types into a string representation. |
private void | checkArgsNotNull(Object[] args): Check that the array does not contain any null value. |
protected void | checkIfEventExists(String methodName, Class<?>[] argTypes): Check whether a given event name exists in the CrawlerPlugin interface. |
void | clear(): Unregister all plugins. |
void | eventAcceptURL(String url, CrawlerJob job): Trigger event: onAcceptURL. |
void | eventAfterPrepare(RawDocument document, WriteablePreparator preparator): Trigger event: onAfterPrepare. |
boolean | eventAskDynamicBlacklist(String url, String sourceUrl, String sourceLinkText): Trigger event: checkDynamicBlacklist. (This is not lazy: all plugins are called even if the first returns true.) |
void | eventBeforePrepare(RawDocument document, WriteablePreparator preparator): Trigger event: onBeforePrepare. |
void | eventCreateIndexEntry(org.apache.lucene.document.Document doc, org.apache.lucene.index.IndexWriter index): Trigger event: onCreateIndexEntry. |
void | eventDeclineURL(String url): Trigger event: onDeclineURL. |
void | eventDeleteIndexEntry(org.apache.lucene.document.Document doc, org.apache.lucene.index.IndexReader index): Trigger event: onDeleteIndexEntry. |
void | eventFinishCrawling(Crawler crawler): Trigger event: onFinishCrawling. |
void | eventStartCrawling(Crawler crawler): Trigger event: onStartCrawling. |
static CrawlerPluginManager | getInstance(): Used instead of the constructor: returns the singleton instance of the manager, so that only one manager exists at a time. |
void | registerPlugin(CrawlerPlugin plugin): Register a plugin at the end of the current queue. |
void | registerPlugin(CrawlerPlugin plugin, int order): Register a plugin at a certain position. |
String | toString(): List the contained plugins for debugging purposes. |
protected List<Object> | triggerEvent(String methodName, Class<?>[] argTypes, Object... args): Trigger an event: call the corresponding plugins. |
void | unregisterPlugin(CrawlerPlugin plugin): Unregister an already registered plugin. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Field Detail |
---|
private static final int MAX_PLUGINS
private SortedMap<Integer,CrawlerPlugin> plugins
private static CrawlerPluginManager instance
private static org.apache.log4j.Logger mLog
private int nextOrder
private int insertIndex
Constructor Detail |
---|
protected CrawlerPluginManager()
Method Detail |
---|
public static CrawlerPluginManager getInstance()
public void registerPlugin(CrawlerPlugin plugin)
Parameters:
plugin - Plugin to register
public void registerPlugin(CrawlerPlugin plugin, int order)
Parameters:
plugin - Plugin to register
order - Place where to insert the plugin (the lower the order, the earlier the plugin is called relative to other plugins)
Throws:
NullPointerException - if plugin is null
public void unregisterPlugin(CrawlerPlugin plugin)
Parameters:
plugin
public void clear()
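
The registration rules above (append at the end, or insert at an explicit order where lower values are called earlier) can be sketched with a sorted map. This is an illustration under assumptions, not Regain's implementation: `OrderedRegistrySketch` and its method names are hypothetical, and how the real class resolves two plugins registered with the same order is not stated on this page (here the later one simply replaces the earlier one).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Objects;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical registry; P stands in for the plugin type.
class OrderedRegistrySketch<P> {
    private final SortedMap<Integer, P> plugins = new TreeMap<>();
    // Next free order value, so that a plain register() lands at the end.
    private int nextOrder = 0;

    // Register at the end of the current queue.
    void register(P plugin) {
        register(plugin, nextOrder);
    }

    // Register at an explicit position; lower order means called earlier.
    void register(P plugin, int order) {
        Objects.requireNonNull(plugin, "plugin"); // NullPointerException if plugin is null
        plugins.put(order, plugin); // assumption: same order replaces the earlier plugin
        if (order >= nextOrder) {
            nextOrder = order + 1;
        }
    }

    // Remove the first mapping that holds this plugin, if any.
    void unregister(P plugin) {
        plugins.values().remove(plugin);
    }

    void clear() {
        plugins.clear();
        nextOrder = 0;
    }

    // Snapshot of the plugins in the order they would be called.
    List<P> inCallOrder() {
        return new ArrayList<>(plugins.values());
    }

    public static void main(String[] args) {
        OrderedRegistrySketch<String> registry = new OrderedRegistrySketch<>();
        registry.register("late", 10);
        registry.register("early", -5);
        registry.register("last");
        System.out.println(registry.inCallOrder()); // prints [early, late, last]
    }
}
```
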
protected List<Object> triggerEvent(String methodName, Class<?>[] argTypes, Object... args)
Parameters:
methodName - Name of the event (as in the interface: onEvent)
args - Arguments of the event (as in the interface)
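
The signature of triggerEvent (a method name, argument types, and arguments) suggests a reflective dispatch. The sketch below shows one way such a dispatcher could work while honoring the class-level guarantee that a failing plugin does not stop the others. Everything here is hypothetical: `DemoPlugin`, `onGreet`, and the decision to log failures to standard error are assumptions, not Regain's code.

```java
import java.lang.reflect.Method;
import java.util.ArrayList;
import java.util.List;

class TriggerEventSketch {
    // Hypothetical plugin interface with a single event.
    interface DemoPlugin {
        Object onGreet(String name);
    }

    // Looks up the event method once, then calls it on every plugin in list order.
    static List<Object> triggerEvent(List<DemoPlugin> plugins, String methodName,
                                     Class<?>[] argTypes, Object... args) {
        Method method;
        try {
            method = DemoPlugin.class.getMethod(methodName, argTypes);
        } catch (NoSuchMethodException e) {
            // Assumption: an unknown event is reported as an illegal argument.
            throw new IllegalArgumentException("Unknown event: " + methodName, e);
        }
        List<Object> results = new ArrayList<>();
        for (DemoPlugin plugin : plugins) {
            try {
                results.add(method.invoke(plugin, args));
            } catch (Exception e) {
                // A plugin failure arrives wrapped in InvocationTargetException;
                // it is reported and the loop continues with the next plugin.
                System.err.println("Plugin failed: " + e);
            }
        }
        return results;
    }

    public static void main(String[] args) {
        List<DemoPlugin> plugins = new ArrayList<>();
        plugins.add(name -> "hello " + name);
        plugins.add(name -> { throw new IllegalStateException("boom"); });
        plugins.add(name -> "bye " + name);
        List<Object> results =
            triggerEvent(plugins, "onGreet", new Class<?>[] {String.class}, "crawler");
        System.out.println(results); // prints [hello crawler, bye crawler]
    }
}
```
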
private void checkArgsNotNull(Object[] args)
Parameters:
args - Array of arguments
Throws:
IllegalArgumentException - if a null value is detected
protected void checkIfEventExists(String methodName, Class<?>[] argTypes)
Parameters:
methodName - "on" + eventName
argTypes - Types of the arguments
private String argTypesToString(Class<?>[] argTypes)
Parameters:
argTypes - Types of the arguments
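
The three helpers above can be approximated in a few lines. This is a sketch under stated assumptions: `DemoPlugin` is a hypothetical stand-in for the CrawlerPlugin interface, the output format of `argTypesToString` is invented for illustration, and throwing `IllegalArgumentException` for an unknown event is a guess (only the null check is documented to throw that exception).

```java
import java.util.StringJoiner;

class EventCheckSketch {
    // Hypothetical stand-in for the plugin interface.
    interface DemoPlugin {
        void onDeclineURL(String url);
    }

    // Rejects any null entry, as documented for checkArgsNotNull.
    static void checkArgsNotNull(Object[] args) {
        for (int i = 0; i < args.length; i++) {
            if (args[i] == null) {
                throw new IllegalArgumentException("Argument " + i + " is null");
            }
        }
    }

    // Assumed format: simple class names in parentheses, e.g. "(String, Integer)".
    static String argTypesToString(Class<?>[] argTypes) {
        StringJoiner joiner = new StringJoiner(", ", "(", ")");
        for (Class<?> type : argTypes) {
            joiner.add(type.getSimpleName());
        }
        return joiner.toString();
    }

    // Verifies via reflection that the interface declares the event method.
    static void checkIfEventExists(String methodName, Class<?>[] argTypes) {
        try {
            DemoPlugin.class.getMethod(methodName, argTypes);
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(
                "No such event: " + methodName + argTypesToString(argTypes), e);
        }
    }

    public static void main(String[] args) {
        System.out.println(argTypesToString(new Class<?>[] {String.class, Integer.class})); // prints (String, Integer)
        checkIfEventExists("onDeclineURL", new Class<?>[] {String.class}); // passes silently
    }
}
```
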
public void eventStartCrawling(Crawler crawler)
Parameters:
crawler - Crawler instance (caller)
See Also:
CrawlerPlugin.onStartCrawling(Crawler)
public void eventFinishCrawling(Crawler crawler)
Parameters:
crawler - Crawler instance (caller)
See Also:
CrawlerPlugin.onFinishCrawling(Crawler)
public void eventBeforePrepare(RawDocument document, WriteablePreparator preparator)
Parameters:
document - Document to prepare
preparator - Preparator that will prepare
See Also:
CrawlerPlugin.onBeforePrepare(RawDocument, WriteablePreparator)
public void eventAfterPrepare(RawDocument document, WriteablePreparator preparator)
Parameters:
document - Document to prepare
preparator - Preparator that prepared
See Also:
CrawlerPlugin.onAfterPrepare(RawDocument, WriteablePreparator)
public void eventCreateIndexEntry(org.apache.lucene.document.Document doc, org.apache.lucene.index.IndexWriter index)
Parameters:
doc - Document to add
index - Index where it will be added
See Also:
CrawlerPlugin.onCreateIndexEntry(Document, IndexWriter)
public void eventDeleteIndexEntry(org.apache.lucene.document.Document doc, org.apache.lucene.index.IndexReader index)
Parameters:
doc - Document to delete
index - Index where it will be deleted
See Also:
CrawlerPlugin.onDeleteIndexEntry(Document, IndexReader)
public void eventAcceptURL(String url, CrawlerJob job)
Parameters:
url - URL that was accepted
job - Resulting job
See Also:
CrawlerPlugin.onAcceptURL(String, CrawlerJob)
public void eventDeclineURL(String url)
Parameters:
url - URL that was declined
See Also:
CrawlerPlugin.onDeclineURL(String)
public boolean eventAskDynamicBlacklist(String url, String sourceUrl, String sourceLinkText)
Parameters:
url
sourceUrl
sourceLinkText
See Also:
CrawlerPlugin.checkDynamicBlacklist(String, String, String)
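
The method summary notes that eventAskDynamicBlacklist is not lazy: every plugin is asked even after one has already returned true. A minimal sketch of that evaluation style follows. `BlacklistCheck` and `askAll` are hypothetical names, and combining the answers with a logical OR is an assumption based on that note.

```java
import java.util.ArrayList;
import java.util.List;

class EagerBlacklistSketch {
    // Hypothetical single-method view of a plugin's blacklist check.
    interface BlacklistCheck {
        boolean checkDynamicBlacklist(String url, String sourceUrl, String sourceLinkText);
    }

    // Asks every check; the result is true if at least one check returned true.
    static boolean askAll(List<BlacklistCheck> checks, String url,
                          String sourceUrl, String sourceLinkText) {
        boolean blacklisted = false;
        for (BlacklistCheck check : checks) {
            // Compound |= always evaluates the call, unlike a short-circuit ||.
            blacklisted |= check.checkDynamicBlacklist(url, sourceUrl, sourceLinkText);
        }
        return blacklisted;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        List<BlacklistCheck> checks = new ArrayList<>();
        checks.add((url, source, text) -> { calls[0]++; return true; });
        checks.add((url, source, text) -> { calls[0]++; return false; });
        boolean result = askAll(checks, "http://example.org/a", "http://example.org/", "a");
        System.out.println(result + " after " + calls[0] + " calls"); // prints "true after 2 calls"
    }
}
```

Asking every plugin costs a few extra calls, but it lets plugins that keep statistics or logs see every URL.
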
public String toString()
Overrides:
toString in class Object