procmail filter to identify Russian or Arabic characters

Dave Smith dave at thesmithfam.org
Wed May 28 06:30:26 MDT 2008


Dave Smith wrote:
> Most of the spam that gets through my SpamAssassin setup is not 
> English. It seems to be Russian or Arabic. Does anyone have a cool 
> procmail rule that will filter out Russian or Arabic emails?

It seems that the subject lines of many (all?) Cyrillic emails look 
something like this:

Subject: =?KOI8-R?Q?=CF=D4=DE=C5=D4=D9_=F2=F6=E4?=

Which gets rendered like this:

отчеты РЖД

Here's another example:

Subject: =?koi8-r?B?7e/06ffh4+nxIPTy9eTh?=

Which gets rendered like this:

МОТИВАЦИЯ ТРУДА

The "KOI8-R" you see in the above Subject lines names a Cyrillic 
character encoding and tells the mail client how to decode the rest of 
the encoded text. The "Q" or "B" that follows it says whether the bytes 
are quoted-printable or base64. Wikipedia has a nice article on KOI8-R:

http://en.wikipedia.org/wiki/KOI8-R
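
For anyone who wants to check what such a Subject line says without opening the message in a mail client, here is a minimal Python sketch using the standard library's email.header module. The helper name decode_subject is made up for illustration and is not part of my procmail setup.

```python
from email.header import decode_header


def decode_subject(raw: str) -> str:
    """Decode an RFC 2047 encoded Subject value into a Unicode string."""
    parts = []
    for chunk, charset in decode_header(raw):
        if isinstance(chunk, bytes):
            # Encoded words come back as bytes plus the charset label
            # (for example "koi8-r"); unlabeled bytes are treated as ASCII.
            parts.append(chunk.decode(charset or "ascii", errors="replace"))
        else:
            # Plain, unencoded text is already a str.
            parts.append(chunk)
    return "".join(parts)


# The two Subject values quoted above.
print(decode_subject("=?KOI8-R?Q?=CF=D4=DE=C5=D4=D9_=F2=F6=E4?="))
print(decode_subject("=?koi8-r?B?7e/06ffh4+nxIPTy9eTh?="))
```

The two print calls should show the same Cyrillic text as the rendered subjects above.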

To filter out these messages, I added a super simple procmail rule 
(procmail matches case-insensitively by default, so it catches both 
"KOI8-R" and "koi8-r"):

# File anything whose Subject header mentions the koi8-r charset
:0:
* ^Subject:.*koi8-r
$HOME/Maildir/.crap/
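
The same check can be sketched outside procmail, for example to try it against saved Subject lines before trusting the rule. The Python below is only an illustration under my own assumptions: the BLOCKED_CHARSETS list (which adds a few other Cyrillic and Arabic charset labels beyond koi8-r) and the function names are hypothetical and are not part of the rule above. It would not catch Russian or Arabic text sent as UTF-8.

```python
import re

# Hypothetical blocklist. The procmail rule above only looks for koi8-r;
# the other labels are common Cyrillic and Arabic charsets added as examples.
BLOCKED_CHARSETS = {
    "koi8-r", "koi8-u", "windows-1251", "iso-8859-5",  # Cyrillic
    "windows-1256", "iso-8859-6",                      # Arabic
}

# An RFC 2047 encoded word looks like =?charset?B-or-Q?payload?=
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?[bq]\?[^?\s]*\?=", re.IGNORECASE)


def subject_charsets(subject: str) -> set:
    """Return the lowercased charset labels of all encoded words in a Subject."""
    return {m.group(1).lower() for m in ENCODED_WORD.finditer(subject)}


def is_unwanted(subject: str) -> bool:
    """True if any encoded word in the Subject uses a blocked charset."""
    return bool(subject_charsets(subject) & BLOCKED_CHARSETS)


print(is_unwanted("=?KOI8-R?Q?=CF=D4=DE=C5=D4=D9_=F2=F6=E4?="))
print(is_unwanted("Meeting notes"))
```

Matching on the charset label rather than on decoded text keeps the check as cheap as the procmail regex, at the cost of missing anything labeled with a charset that is not on the list.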

I haven't done extensive testing, but I will let this run for the coming 
weeks and report back.

--Dave



More information about the PLUG mailing list