News
2007-12-01
Version 1.2.3 has been released
- Bugfix: In some cases no file contents were indexed.
2007-11-01
Version 1.2.2 has been released
- It is now possible to use any Lucene analyzer.
- Bugfix: The attibute beautified of hit_url tag was missing in the TLD definition.
- Bugfix: Fixed URL-encoding problems when using the file-to-http-bridge.
2007-10-30
Version 1.2.1 has been released
- Bugfix: In regain 1.2 some libs were missing. This was fixed with version 1.2.1.
2007-10-20
Version 1.2 has been released
- The search result show now icons indicating the file's type.
- The index fields "size" and "last-modified" are now searchable.
- New preparator: EmptyPreparator (Contributed by Gerhard Olsson). This
preparator extracts no content files assigned to it. Therefore only the path
and the filename is written to the index (helpful for all file types having no
matching preparator).
- The maximum number of terms per document is now configurable using the
maxFieldLength tag in the CrawlerConfiguration.xml. Default is 10000.
- The IfilterPreparator works now under Windows Server 2003, too.
- The values for the search:input_fieldlist tag may now be determined when
indexing. Therefore this operation beeing slow for large indexes must not be
executed any more when searching the first time. This may be configured
using the valuePrefetchFields tag in the CrawlerConfiguration.xml.
- Several bugfixes
2006-03-27
Version 1.1.1 has been released
There were two bugs in the server variant, which are fixed in the new version.
2006-02-26
Final version 1.1 has been released
- regain now searches the URLs, too.
- The desktop search now shows the last log messages.
- Better handling of HTTP-Redirects. (Thanks to Gerhard Olsson)
- Auxiliary fields have new options: "tokenize", "store" and "index".
- Added documentation of the Tag Library.
- The search mask now accepts multiple "query" parameters
(they are just concatinated)
- The Jacob preparators have been improved. (Thanks to Reinhard Balling)
- New preparator ExternalPrepartor: This preparator calls an external program
or script in order to get the text of documents. (Thanks to Paul Ortyl)
- Completed italian localization. (Thanks to Franco Lombardo)
- Some Bugfixes
2005-12-05
Version 1.1 Beta 6 has been released
- New preparator: With the PoiMsPowerPointPreparator there is now a platform
independant preparator for Powerpoint. (Thanks to Gerhard Olsson)
- New preparator: The IfilterPreparator uses the I-Filter interface of
Microsoft to read various file formats. Unfortunately it only works on Windows.
- Multi index search: In the SearchConfiguration.xml now several indexes may be
specified as default.
- The auxiliary fields have now a better handling for case sensitivity.
- The HTTP agent sent by the crawler to the web servers may now be configured in
the CrawlerConfiguration.xml. This way the crawler can identify itself as
Internet Explorer, for example.
- Several bugfixes
2005-08-15
Error in version 1.1 Beta 5
Unfortunatly there are two libraries missing in version 1.1 Beta 5, so the
search mask does not work. Thus, I provided a corrected version
1.1 Beta 5a. This error only concerns the server-variant, not the
desktop-variant.
2005-08-13
Version 1.1 Beta 5 has been released
- Multi index search: It is now possible to search several search indexes over
one search mask. The search query is executed on every index and is merged
afterwards.
- The white and the black list now allow regular expressions, too.
- Search mask: The location of ressources and the configuration is better
detected now. Therefor regain works properly now, even if Tomcat is running
as service.
- Search mask: The file-to-http-bridge is may be switched off now.
- Crawler: The crawler needs less memory now when crawling directories.
- Crawler: The crawler now adds failed documents as well to the index. Therefore
they are not retried the next time the crawler is running. But if the crawler
is executed with the option "-retryFailedDocs", all failed documents are
retried.
- The HTML preparator now preparates the extensions .jsp, .php, .php3, .php4
and .asp as well.
- It's now possible to specify in the CrawlerConfiguration.xml which documents
should be preparated with a certain preparator.
- Several bugfixes
2005-04-13
Version 1.1 Beta 4 has been released
- Access rights management: It is now possible to integrate an access rights
management, that ensures that a user only sees results for documents he has
reading rights for.
- Search: The search taglib has now a tag "hit_field", that writes an index
field value. The tag "hit_summary" was thereby removed.
- Search: If you don't want to read the search config from an XML file or if
you don't want to write the location of the XML file in the web.xml, you
may write your own SearchConfigFactory class and create the config on your
own. The SearchConfigFactory class is specified in the web.xml.
- Server search: The enclosed JSP pages did not work.
2005-03-17
Version 1.1 Beta 3 has been released
- Crawler: Bugfix: The PoiMsExcelPreparator did not get on with all number and date formats.
- Crawler: The error log is now more detailed (with stacktraces).
- Crawler: Preparators are now encapsulated in own jars. Thus the regain.jar only contains what regain itself needs and preparators may be replaced more easily. Also other developers may provide preparators that can be mounted very easily.
The configuration of the preparators is still in the CrawlerConfiguration.xml. But now not all preparators must be declared. The preparators are executed in the same order as they are configured, the unconfigured preparators afterwards.
- Desktop search: The desktop search now runs under Linux, too.
- Search: Bugfix: Files whichs URL contains a double slash (e.g. of network drives: //fileserver/bla/blubb) couldn't be loaded.
- Desktop search: Bugfix: At the search results umlauts were presentet wrong.
- Desktop search: On the status page a currently running index update can now be paused and an index update can be started.
- Crawler: Bugfix: The HtmlPräparator did not get on with all files.
2005-03-12
Version 1.1 Beta 2 has been released
- Crawler: The crawler now creates periodically so called breakpoint. When doing so the current state of the search index is copied into a separate directory. If the index update should be cancelled (e.g. if the computer is shut down), the crawler will go on from the last breakpoint.
- Desktop search: The status page now shows the timing results.
2005-03-10
Version 1.1 Beta 1 has been released
- Desktop search: reagain now provides a desktop search besides the server search. The desktop searchs provides many features that makes the use as easy as winking:
- An installer for Windows.
- Integration in the task bar under Linux and Windows.
- Configuration over the browser.
- Status monitoring over the browser.
- Crawler: There is now a preparator for OpenOffice and StarOffice documents.
- All: Updated to the newest versions of the used projects.
- Crawler: Preparators are now configurable by the CrawlerConfiguration.xml.
- Search: The Search is now configured by the SearchConfiguration.xml, not the web.xml any more. There is only the path to the SearchConfiguration.xml any more.
- Search: The Search now provides URL rewriting. This way you can index documents in file://c:/www-data/intranet/docs and show the documents in the browser as http://intranet.murfman.de/docs.
- Crawler: Auxiliary fields: The index may now be extended by auxiliary fields that are extracted from a document's URL.
Example: Assumed you have a directory with a sub directory for every project. Then you can generate an auxiliary field with the project name. Doing so you
get only documents from that directory when searching for "Offer project:otto23".
- Search: Advanced search: All values that are in the index for one field may now be provided as a combo box on the search page. Particularly together with auxiliary fields this is very useful.
- Search: Some browsers load for security reasons no file links from http pages. Thus all documents that are in the index are now provided over HTTP. Of corse at the desktop search these documents are only accessible from the local host.
- Crawler: The JacobMsWordPreparator now regards styles. Thus it is possible to extract headlines that will be weight more when searching.
- Crawler: The JacobMsOfficePreparators are now able to extract the description
fields (Title, author, etc.)
2004-07-28
The first version of regain has been released!
Today I received the official confirmation from dm-drogerie markt, which
permits it to me to publish regain under the LGPL. The project is as from now
available at the download area resp. from the CVS server of sourceforge.
|