antispam is a small program that detects spam on blogs and forums. Like SpamAssassin, it uses rules to compute a score. If the score is greater than zero, the message is classified as spam.

antispam is written in Python and is distributed under the GNU GPL license.

The goal of antispam is to be lightweight, fast, and easy to configure, and to block all spam (as a consequence, it may also block some non-spam).

Rules

Line rules

Line rules compute a score for each line of text. They use text patterns taken from a blacklist and a whitelist: whitelisted patterns carry a negative score, blacklisted patterns a positive one. Examples:

Match text pattern (-5.0): debian
Match text pattern (-1.0): linux
Match text pattern (10.0): viagra
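
The idea behind line rules can be illustrated with a minimal Python sketch. This is not antispam's actual code: the PATTERNS table, the function names, and the case-insensitive whole-word matching are assumptions made for the illustration. Only the three patterns and their scores come from the examples above.

```python
import re

# Hypothetical pattern table: (pattern, score). Negative scores play the
# role of the whitelist, positive scores the role of the blacklist.
PATTERNS = [
    ("debian", -5.0),
    ("linux", -1.0),
    ("viagra", 10.0),
]


def score_line(line, patterns=PATTERNS):
    """Return (total, matches) for a single line of text."""
    matches = []
    for pattern, score in patterns:
        # Case-insensitive whole-word match (an assumption of this sketch).
        if re.search(r"\b%s\b" % re.escape(pattern), line, re.IGNORECASE):
            matches.append((pattern, score))
    return sum(score for _, score in matches), matches


def score_text(text, patterns=PATTERNS):
    """Sum the line scores over every line of the text."""
    return sum(score_line(line, patterns)[0] for line in text.splitlines())


if __name__ == "__main__":
    total, matches = score_line("Buy cheap viagra for your linux box")
    for pattern, score in matches:
        print("Match text pattern (%.1f): %s" % (score, pattern))
    print("line score: %+.1f" % total)
```

In this sketch a line mentioning both "linux" and "viagra" still ends up with a positive score, because the blacklist weight is much larger than the whitelist weight.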

URL rules

URL rules find all URLs in the text. Whitelisted URLs get a negative score, and every other URL gets the default score of 1. Examples:

Match URL (-5.0): http://software.inl.fr/trac/trac.cgi/wiki/
Match URL (-1.0): http://www.nufw.org/
Match URL (+1.0): http://el-diario-de-juarez.acdiplomf.cn
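
A rough Python sketch of this URL scoring follows. The WHITELIST dictionary, the URL regular expression, and the prefix-matching strategy are assumptions of the sketch, not a description of how antispam is implemented. The URLs and scores are the ones from the examples above.

```python
import re

# Hypothetical whitelist: URL prefix -> negative score.
WHITELIST = {
    "http://software.inl.fr/trac/trac.cgi/wiki/": -5.0,
    "http://www.nufw.org/": -1.0,
}
DEFAULT_URL_SCORE = 1.0

# Simplified URL pattern, good enough for the illustration.
URL_RE = re.compile(r"https?://[^\s\"'<>]+")


def score_url(url):
    """Return the whitelist score if a prefix matches, else the default."""
    for prefix, score in WHITELIST.items():
        if url.startswith(prefix):
            return score
    return DEFAULT_URL_SCORE


def score_urls(text):
    """Find every URL in the text and pair it with its score."""
    return [(url, score_url(url)) for url in URL_RE.findall(text)]


if __name__ == "__main__":
    text = "see http://www.nufw.org/ and http://el-diario-de-juarez.acdiplomf.cn"
    for url, score in score_urls(text):
        print("Match URL (%+.1f): %s" % (score, url))
```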

After a first filtering step (URLs with a negative score are skipped), the domain rate is computed: the number of URLs divided by the number of unique main domains. The main domain of "software.inl.fr" is "inl.fr". If the rate is greater than 3, the message gets an additional score of +5. Example with 10 subdomains of acdiplomf.cn:

Match URL (+1.0): http://el-diario-de-juarez.acdiplomf.cn
Match URL (+1.0): http://aruba-teen-missing.acdiplomf.cn
Match URL (+1.0): http://lisa-raye-wedding.acdiplomf.cn
...
Match URL (+1.0): http://bach-pamela-picture.acdiplomf.cn
Domain rate (+5.0): 10.0 url/domain
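
The domain rate can be sketched in Python as follows, reading the rate as URLs per unique main domain, which is the unit shown in the example output (10.0 url/domain). The function names are hypothetical. Taking the last two labels of the hostname as the main domain is a naive assumption that fails for suffixes such as "co.uk". The threshold of 3 and the score of +5 come from the text above.

```python
from urllib.parse import urlparse


def main_domain(hostname):
    """Keep the last two labels: 'software.inl.fr' -> 'inl.fr'.

    Naive assumption: this is wrong for suffixes such as 'co.uk'.
    """
    return ".".join(hostname.lower().split(".")[-2:])


def domain_rate(urls):
    """Number of URLs divided by the number of unique main domains."""
    hosts = [urlparse(url).hostname for url in urls]
    hosts = [host for host in hosts if host]  # drop URLs without a hostname
    if not hosts:
        return 0.0
    unique = {main_domain(host) for host in hosts}
    return len(hosts) / len(unique)


def domain_rate_score(urls, threshold=3.0, penalty=5.0):
    """Return the penalty when the rate is strictly above the threshold."""
    return penalty if domain_rate(urls) > threshold else 0.0


if __name__ == "__main__":
    # Ten made-up subdomains of the same main domain.
    urls = ["http://sub%d.acdiplomf.cn" % i for i in range(10)]
    print("Domain rate (%+.1f): %.1f url/domain"
          % (domain_rate_score(urls), domain_rate(urls)))
```

URLs that were whitelisted are expected to be removed from the list before calling domain_rate, as described above.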

Text rules

Rules applied to the whole text.

ShortText removes all links, HTML tags, and non-letter characters, then counts the length of the remaining text. Example message:

CumForCover!  :) 

<a href="http://groups.google.com/group/cumforcover/web/">Cumforcover</a> | http://groups.google.com/group/cumforcover/web/ 

Message score:

Match URL (+1.0): http://groups.google.com/group/cumforcover/web/
Match URL (+1.0): http://groups.google.com/group/cumforcover/web/
Short text (+4.0): (len=11) "CumForCover"
-stdin- score: +6.00 ***SPAM***
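
The text reduction performed by ShortText can be approximated with a few regular expressions. This is only a sketch under assumptions: the regular expressions and the function name short_text are hypothetical, and removing a link is taken to mean removing the whole anchor element including its label, which is what the example suggests since only "CumForCover" remains. How the length is mapped to a score is not described here, so the sketch stops at the length.

```python
import re

# Whole <a ...>...</a> elements, including their label (assumption).
ANCHOR_RE = re.compile(r"<a\b[^>]*>.*?</a>", re.IGNORECASE | re.DOTALL)
# Bare URLs written directly in the text.
URL_RE = re.compile(r"https?://\S+")
# Any remaining HTML tag.
TAG_RE = re.compile(r"<[^>]+>")
# Everything that is not a letter: non-word characters, digits, underscore.
NON_LETTER_RE = re.compile(r"[\W\d_]+")


def short_text(message):
    """Strip links, HTML tags and non-letters; return the remaining text."""
    text = ANCHOR_RE.sub("", message)
    text = URL_RE.sub("", text)
    text = TAG_RE.sub("", text)
    return NON_LETTER_RE.sub("", text)


if __name__ == "__main__":
    message = (
        "CumForCover!  :) \n\n"
        '<a href="http://groups.google.com/group/cumforcover/web/">'
        "Cumforcover</a> | http://groups.google.com/group/cumforcover/web/ "
    )
    remaining = short_text(message)
    print('(len=%d) "%s"' % (len(remaining), remaining))
```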

Email rules

Email rules find all email addresses in the text. DomainRateRule computes the email score depending on the domain. Example:

Match email domain (+1.0): gmail.com
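
Extracting the domains of the email addresses found in a text might look like the Python sketch below. The regular expression and the function names are hypothetical, and the flat score of +1.0 per address is an assumption taken from the single example above. The real scoring logic of DomainRateRule may differ.

```python
import re

# Simplified email pattern; the capture group holds the domain part.
EMAIL_RE = re.compile(
    r"[A-Za-z0-9._%+-]+@([A-Za-z0-9-]+(?:\.[A-Za-z0-9-]+)+)"
)


def email_domains(text):
    """Return the lowercased domain of every email address in the text."""
    return [match.group(1).lower() for match in EMAIL_RE.finditer(text)]


def score_email_domains(text, default_score=1.0):
    """Pair each email domain with a flat score (assumption of the sketch)."""
    return [(domain, default_score) for domain in email_domains(text)]


if __name__ == "__main__":
    for domain, score in score_email_domains("write to John.Doe@Gmail.com"):
        print("Match email domain (%+.1f): %s" % (score, domain))
```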

Configure

To configure antispam, you have to define whitelists using the --whitelist and --domain options. To avoid false positives, you can use --default=SCORE with a negative score. For example, --default=-2 allows 2 external URLs.
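
The effect of a negative default can be checked with a little arithmetic. The sketch below assumes that the default is the starting value to which all rule scores are added, which is consistent with "--default=-2 allows 2 external URLs" but is an interpretation. The classify function is hypothetical and is not part of antispam.

```python
def classify(rule_scores, default=0.0):
    """Add the rule scores to the default; spam if the total is above zero."""
    total = default + sum(rule_scores)
    return total, total > 0


if __name__ == "__main__":
    # Each external (non-whitelisted) URL is worth +1.
    for count in (2, 3):
        total, is_spam = classify([1.0] * count, default=-2.0)
        print("%d URLs: score %+.2f%s"
              % (count, total, " ***SPAM***" if is_spam else ""))
```

With a default of -2, two external URLs bring the total to 0, which is not greater than zero, so the message passes. A third URL raises it to +1 and the message is flagged.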

Download

svn co http://haypo.hachoir.org/svn/antispam/trunk antispam

Browse antispam source code.

Why not use project xxx?

SpamAssassin and Bogofilter target email spam, which is different from blog or forum spam. On a blog or forum, we have little information about the sender (only the IP address), no attachments, no MIME encoding, etc. Other services like Akismet are commercial and non-free (the source code is not available).

Similar projects

Non-free: Akismet, ...