I just developed a very simple but efficient search system, in Perl, especially targeted at mailing lists and newsgroups. What I think is unique about my approach (at least when compared to htdig + mhonarc, for example) is that the mailing lists can be supplied as archive files (uncompressed for now); one HTML file per message is not required. Additionally, the indexing is done locally, not through the WWW.
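To illustrate what indexing straight from an archive file can look like, here is a minimal sketch. It is written in Python rather than the engine's Perl, it assumes the archive is in the common mbox format (the text above does not say which format is used), and the function name `iter_archive` is made up for this example.

```python
import mailbox


def iter_archive(path):
    """Yield (Message-ID, Subject) for each message in a local mbox file.

    Assumption: the archive is mbox-formatted. The real engine may read
    its archives differently.
    """
    box = mailbox.mbox(path, create=False)  # do not create a missing file
    try:
        for msg in box:
            yield msg.get("Message-ID"), msg.get("Subject")
    finally:
        box.close()
```

An indexer along these lines would read one archive file per list from the local disk, with no HTTP fetch and no per-message HTML file involved.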
The search engine is designed to be compatible with all conforming WWW clients, including text-only clients such as UNIX/w3m or UNIX/lynx. Testing is mainly done with UNIX/w3m and UNIX/Konqueror-2.1.
However, it still needs some work to be more useful.
It is released under the GNU GPL copyleft license.
The principle is quite simple: there are four different ways to query the search engine:
Simple example: (network --OR networking) --AND ethernet --NAND copper
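One possible way to evaluate such a query is sketched below, in Python rather than the engine's Perl. Several things here are assumptions and not taken from the engine itself: operators are applied left to right with equal precedence, --NAND is read as "and not" (matches of the left side minus matches of the right side), and the index is a plain dictionary mapping each lowercase keyword to a set of message numbers. The names `evaluate`, `OPS` and `TOKEN_RE` are made up for the example.

```python
import re

# Tokens: parentheses, --OPERATOR words, or anything else (a keyword).
TOKEN_RE = re.compile(r"\(|\)|--[A-Z]+|[^\s()]+")

# Assumed semantics; --NAND is interpreted as "left AND NOT right".
OPS = {
    "--AND": lambda left, right: left & right,
    "--OR": lambda left, right: left | right,
    "--NAND": lambda left, right: left - right,
}


def evaluate(query, index):
    """Return the set of message numbers matching query.

    index is a hypothetical mapping: lowercase keyword -> set of numbers.
    """
    tokens = TOKEN_RE.findall(query)
    pos = 0

    def parse_term():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of query")
        tok = tokens[pos]
        pos += 1
        if tok == "(":
            result = parse_expr()
            if pos >= len(tokens) or tokens[pos] != ")":
                raise ValueError("missing closing parenthesis")
            pos += 1
            return result
        if tok == ")" or tok.startswith("--"):
            raise ValueError("unexpected token: " + tok)
        return set(index.get(tok.lower(), ()))

    def parse_expr():
        nonlocal pos
        result = parse_term()
        # Apply operators strictly left to right (assumed precedence).
        while pos < len(tokens) and tokens[pos] in OPS:
            op = OPS[tokens[pos]]
            pos += 1
            result = op(result, parse_term())
        return result

    result = parse_expr()
    if pos != len(tokens):
        raise ValueError("unexpected token: " + tokens[pos])
    return result
```

With a toy index in which "network" matches messages 1 and 2, "networking" matches 3, "ethernet" matches 1, 3 and 4, and "copper" matches 3, the example query above selects message 1 only.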
Keywords are alphanumeric, plus a few special characters such as -._ . However, a keyword must start with an alphanumeric character and be more than 2 and fewer than 15 characters long.
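The keyword rule can be expressed as a single regular expression. The following is a sketch in Python, not the engine's actual Perl check; it assumes that "alphanumeric" means ASCII letters and digits and that the special characters are exactly the three shown above.

```python
import re

# First character alphanumeric, then 2 to 13 more characters drawn from
# alphanumerics and "-._", giving a total length of 3 to 14.
# Assumption: ASCII only, and "-._" is the complete special-character set.
KEYWORD_RE = re.compile(r"[A-Za-z0-9][A-Za-z0-9._-]{2,13}")


def is_valid_keyword(word):
    """Return True if word satisfies the keyword rule sketched above."""
    return KEYWORD_RE.fullmatch(word) is not None
```

Under these assumptions "ethernet" and "e-mail" are accepted, while "ab" (too short) and "-abc" (does not start with an alphanumeric character) are rejected.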
In addition, all queries can be short or long: this determines whether the whole message is displayed. When a short result is displayed, you can select the Message-ID: link to get the equivalent long query on that message only.
All queries can also request a maximum number of results. Additional results are then available through the next link which is usually displayed at the end of the page. There is a built-in maximum of 100 results at a time.
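The result limit and the next link can be pictured as a clamp plus an offset. This is a minimal Python sketch, not the engine's code; the function name `page` and the idea of a numeric start offset are assumptions made for illustration.

```python
MAX_RESULTS = 100  # built-in ceiling stated in the text


def page(results, start=0, requested=MAX_RESULTS):
    """Return (results for this page, start offset of the next page).

    The second value is None when there is nothing left, in which case
    no next link would be shown.
    """
    count = max(1, min(requested, MAX_RESULTS))  # never exceed the ceiling
    end = start + count
    next_start = end if end < len(results) else None
    return results[start:end], next_start
```

A request for 500 results is therefore cut down to 100, and the returned offset is what a next link would carry to fetch the following batch.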
To get all messages with a specific subject, activate the Subject: link. To get a threaded view starting at the current message, built from the References: and/or In-Reply-To: headers, select Threaded.
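A threaded view of this kind can be built by linking each message to its parent. The sketch below is in Python and is only one way to do it, not necessarily how the engine does: it assumes the parent is the message named in In-Reply-To:, or failing that the last Message-ID listed in References:. The helper names are made up, and reference loops are not guarded against.

```python
import re
from collections import defaultdict
from email import message_from_string  # used to parse raw message text

MSGID_RE = re.compile(r"<[^<>\s]+>")


def parent_id(msg):
    """Return the assumed parent Message-ID of msg, or None."""
    for header in ("In-Reply-To", "References"):
        ids = MSGID_RE.findall(msg.get(header, ""))
        if ids:
            return ids[-1]  # last entry in References: is the direct parent
    return None


def build_children(messages):
    """Map each parent Message-ID to the list of its direct replies."""
    children = defaultdict(list)
    for msg in messages:
        own_id = msg.get("Message-ID")
        parent = parent_id(msg)
        if own_id is not None and parent is not None:
            children[parent].append(own_id.strip())
    return children


def thread_from(root_id, children, depth=0):
    """Yield (depth, Message-ID) pairs, depth first, starting at root_id."""
    yield depth, root_id
    for child in children.get(root_id, []):
        yield from thread_from(child, children, depth + 1)
```

Messages parsed with `message_from_string` can be passed to `build_children`, and `thread_from` then walks the replies below the current message, with the depth available for indentation.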
All links are plain GET anchors: this allows you to bookmark the results, and also to see which articles you have already read, for example.
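Because everything is a GET link, the whole state of a query lives in the URL. Here is a small Python sketch of how such a link could be assembled; the script path and the parameter names (`q`, `mode`, `max`, `start`) are invented for the example and are not the engine's real ones.

```python
from urllib.parse import urlencode


def search_url(base, query, long_form=False, max_results=20, start=0):
    """Build a bookmarkable GET URL for a query.

    All parameter names below are hypothetical.
    """
    params = {
        "q": query,
        "mode": "long" if long_form else "short",
        "max": max_results,
        "start": start,
    }
    return base + "?" + urlencode(params)  # urlencode escapes spaces etc.
```

Since the URL fully describes the request, a bookmark reproduces the same result page later, and browsers can mark already visited message links in the usual way.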
There are no cookies, no logins, no advertising banners, nor anything else that could get in your way.
The language is English because that makes it easier to be widely understood. Feel free to submit translations or other patches, but contact me first.
$Id: search-engine.html,v 1.7 2003/01/27 10:24:29 schaefer Exp $