Spam in the lists out of control
Santiago Vila
2004-05-09 14:04:49 UTC
Greetings.

The level of spam in the lists is something like 20% these days [*].

Those who still do not see the need to close the lists to
non-subscribers and unregistered people (via the whitelist), please
propose a better solution.

The current level of spam is not acceptable by any standard,
so "do nothing" does not count as a "solution".

[*] Examples, for the lists I'm subscribed to:

86 junk messages and 440 good messages yesterday.
76 junk messages and 200 good messages today.
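
For reference, the two day counts above work out to roughly 16% and 28%
junk, so "something like 20%" is a fair summary. A minimal sketch of the
arithmetic (the function name spam_share is only illustrative):

```python
def spam_share(junk: int, good: int) -> float:
    """Return junk as a fraction of all messages received."""
    total = junk + good
    if total == 0:
        raise ValueError("no messages counted")
    return junk / total


# Counts quoted in the message above.
yesterday = spam_share(86, 440)
today = spam_share(76, 200)
combined = spam_share(86 + 76, 440 + 200)

print(f"yesterday: {yesterday:.1%}")
print(f"today:     {today:.1%}")
print(f"combined:  {combined:.1%}")
```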
Marco d'Itri
2004-05-09 15:30:05 UTC
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.

Or at least the listmasters should block the spam sources which are
reported to them week after week.
--
ciao, |
Marco | [6183 soylJ5l/XYEwA]
Carlos Perelló Marín
2004-05-09 16:33:11 UTC
Post by Marco d'Itri
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
Blacklists are NOT an option; they should be killed because they are
maintained incorrectly. Here in Spain some people send spam from ADSL
connections, so from time to time one of those blacklists adds ALL
ADSL connections to the list and we cannot send mail to some places from
our own mail server (some ADSL connections have a static IP address and
our ISP lets us set up our own MTA if we want). I'm not a spammer, but
two or three years ago I was added to three or four lists of
spammers, and it's not a trivial task to be removed from them.
Post by Marco d'Itri
Or at least the listmasters should block the spam sources which are
reported to them week after week.
That's a better solution too; a good SpamAssassin setup with some feedback
could help. I think Santiago said here some months ago that Debian's
spam daemon was not working correctly; perhaps it's only a matter of
fixing it.

Cheers.
Post by Marco d'Itri
--
ciao, |
Marco | [6183 soylJ5l/XYEwA]
--
--
Carlos Perelló Marín
Debian GNU/Linux Sid (PowerPC)
Linux Registered User #121232
mailto:***@pemas.net || mailto:***@gnome.org
http://carlos.pemas.net
Valencia - Spain
Santiago Vila
2004-05-09 18:13:46 UTC
Post by Carlos Perelló Marín
Post by Marco d'Itri
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
BlackLists are NOT an option, they should be killed because they are
maintained incorrectly. Here in Spain some people does SPAM from ADSL
connections and thus from time to time one of those black lists add ALL
ADSL connections to the list and we cannot send mail to some places from
our own mail server (some ADSL connections have static IP address and
our ISP lets us setup our own MTA if we want). I'm not a spammer but
since two or three years ago I was added to three or four lists of
spammers and it's not a trivial task to be removed from them.
That's the typical FUD against blacklists.

There are blacklists and blacklists. Some of them block individual IPs.
Some of them block IP ranges. To be removed from a blacklist may be
difficult, easy, or (in some cases) absolutely trivial, depending on
the blacklist (see cbl.abuseat.org for an example of the last case).

Please let us not repeat the mistake of putting all the blacklists
in the same bag. They are very different.
Post by Carlos Perelló Marín
Post by Marco d'Itri
Or at least the listmasters should block the spam sources which are
reported to them week after week.
That's a better solution also, a good spamassassin with some feedback
could help. I think Santiago said here some months ago that the Debian'
Spam daemon was not working correctly, perhaps it's only a matter of fix
it.
No matter how well the filters at lists.debian.org may work, there is
no "fix" for the fact that list volume increases linearly (at most)
while spam increases exponentially.

We can change list policy now and require people to subscribe or
register on the whitelist before they can post, or we can
continue to discuss this while the spam level reaches 50%, then
80%, then 95%, and we collectively spend several hundred times more time
deleting junk than the time a few people would spend subscribing or
registering on the whitelist.

Why the time of the current subscribers is several orders of magnitude
less valuable than the time of those who did not subscribe is one
of the mysteries of Debian list policy.
Neil McGovern
2004-05-09 21:37:00 UTC
Post by Santiago Vila
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
We can change list policy now and require people to subscribe or
register to the white list for they to be able to post, or we can
[...]
Sounds like a fairly sensible solution to me.

Another list that I'm on requires subscription to the list, and
subscription to a separate post-access list, e.g.:
debian-devel (subscription for receiving the list)
debian-devel-post (subscription for posting rights)

However, this may cause problems with the newsgroups, unless posts to
the newsgroups aren't copied to the list (it's been a while since I used
usenet).

Regards,
Neil McGovern
--
A. Because it breaks the logical sequence of discussion
Q. Why is top posting bad?
gpg key - http://www.halon.org.uk/pubkey.txt ; the.earth.li B345BDD3
Santiago Vila
2004-05-10 09:45:15 UTC
Post by Neil McGovern
Post by Santiago Vila
We can change list policy now and require people to subscribe or
register to the white list for they to be able to post, or we can
[...]
Sounds like a fairly sensible solution to me.
Another list that I'm on requires subscription to the list, and
debian-devel (subscription for receiving the list)
debian-devel-post (subscription for posting rights)
debian-devel-post would be debian-devel + whitelist

whitelist already exists.

This would eliminate almost all the spam without a lot of complexity.

Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
Matthew Garrett
2004-05-10 11:51:31 UTC
Post by Santiago Vila
Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
This is not a desirable policy. If you want solutions, could you please
consider those that don't contain an implicit change in current policy?
--
Matthew Garrett | mjg59-***@srcf.ucam.org
Santiago Vila
2004-05-10 12:27:58 UTC
Post by Matthew Garrett
Post by Santiago Vila
Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
This is not a desirable policy. If you want solutions, could you please
consider those that don't contain an implicit change in current policy?
For example? Did you read my initial message? I'll summarize:

I propose to close the lists.

Those who disagree please propose a *better* solution.


What I propose would be simple, effective, it would only require a
small amount of time from posters (the time required to subscribe to
the white list, which has to be done only once), and it would be easy
to implement (since the white list already exists).
Wouter Verhelst
2004-05-10 12:48:33 UTC
Post by Santiago Vila
Post by Matthew Garrett
Post by Santiago Vila
Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
This is not a desirable policy. If you want solutions, could you please
consider those that don't contain an implicit change in current policy?
I propose to close the lists.
Those who disagree please propose a *better* solution.
What I propose would be simple, effective, it would only require a
small amount of time from posters (the time required to subscribe to
the white list, which has to be done only once), and it would be easy
to implement (since the white list already exists).
And it wouldn't be very useful, since all it'd take to circumvent would
be to have a spammer send mail with the right from address. I've seen
this before; whitelisting doesn't often help, really.
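
To make the objection concrete, here is a minimal sketch of a check that
trusts only the From: header. It is not how lists.debian.org actually
implements its whitelist; the WHITELIST contents, the function name and
the sample messages are all made up for illustration.

```python
from email import message_from_string
from email.utils import parseaddr

# Hypothetical whitelist; the address is a placeholder, not a real subscriber.
WHITELIST = {"subscriber@example.org"}


def sender_allowed(raw_message: str, whitelist: set) -> bool:
    """Accept a message if its From: address is on the whitelist.

    Only the From: header is inspected, and the sender controls that header.
    """
    msg = message_from_string(raw_message)
    _name, address = parseaddr(msg.get("From", ""))
    return address.lower() in whitelist


genuine = "From: A Subscriber <subscriber@example.org>\nSubject: patch\n\nHello.\n"
unknown = "From: someone@example.net\nSubject: hi\n\nHello.\n"
forged = "From: subscriber@example.org\nSubject: cheap pills\n\nBuy now.\n"

print(sender_allowed(genuine, WHITELIST))  # whitelisted sender is accepted
print(sender_allowed(unknown, WHITELIST))  # unknown sender is rejected
print(sender_allowed(forged, WHITELIST))   # forged From: is accepted as well
```

The check cannot tell the forged message from the genuine one, so it only
helps as long as spammers do not bother to guess a whitelisted address.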
--
EARTH
smog | bricks
AIR -- mud -- FIRE
soda water | tequila
WATER
-- with thanks to fortune
Pascal Hakim
2004-05-10 13:03:59 UTC
Post by Wouter Verhelst
And it wouldn't be very useful, since all it'd take to circumvent would
be to have a spammer send mail with the right from address. I've seen
this before; whitelisting doesn't often help, really.
<cue Santiago suggesting we use moderation instead in that case>


Cheers,

Pasc
Adrian 'Dagurashibanipal' von Bidder
2004-05-11 07:19:03 UTC
On Monday 10 May 2004 14.48, Wouter Verhelst wrote:
[whitelisting]
Post by Wouter Verhelst
And it wouldn't be very useful, since all it'd take to circumvent
would be to have a spammer send mail with the right from address.
I've seen this before; whitelisting doesn't often help, really.
Looking at the sender addresses of the spam that currently plagues the
Debian lists, it would probably help a lot right now.

Of course, I don't know who's subscribed to the list, but all the recent
spam comes from sender addresses that look quite spammy.

cheers
-- vbi
--
Umlaut Zebra über alles!
Pascal Hakim
2004-05-10 13:01:35 UTC
Post by Santiago Vila
Post by Matthew Garrett
Post by Santiago Vila
Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
This is not a desirable policy. If you want solutions, could you please
consider those that don't contain an implicit change in current policy?
I propose to close the lists.
Those who disagree please propose a *better* solution.
What I propose would be simple, effective, it would only require a
small amount of time from posters (the time required to subscribe to
the white list, which has to be done only once), and it would be easy
to implement (since the white list already exists).
I personally refuse to close the lists in general, and will only close
individual lists if a large enough proportion of the list agrees to do
so. Too many people choose not to post again when they get a "posting
has been denied" email, to consider it a viable thing to do. There is
still a large number of people posting on lists who are not subscribed
to the whitelist or the list they're posting to.


Cheers,

Pasc
César Martínez Izquierdo
2004-05-10 14:21:18 UTC
Post by Pascal Hakim
I personally refuse to close the lists in general, and will only close
individual lists if a large enough proportion of the list agrees to do
so. Too many people choose not to post again when they get a "posting
has been denied" email, to consider it a viable thing to do. There is
still a large number of people posting on lists who are not subscribed
to the whitelist or the list they're posting to.
Cheers,
Pasc
We could close the lists AND have a big team of moderators who let
genuine messages through to the list. If we had a team of 15-20 moderators
from all corners of the world, I'm pretty sure that we would not notice
the difference.

Cesar
Santiago Vila
2004-05-10 15:44:50 UTC
There is still a large number of people posting on lists who are not
subscribed to the whitelist or the list they're posting to.
Because it's currently not required. That's a bogus argument.

I would only expect most people to subscribe to the white list when it
becomes a requirement. The number of people currently subscribed is
quite meaningless.
Andreas Barth
2004-05-10 16:19:35 UTC
Post by Santiago Vila
There is still a large number of people posting on lists who are not
subscribed to the whitelist or the list they're posting to.
Because it's currently not required. That's a bogus argument.
I would only expect most people to subscribe to the white list when it
becomes a requirement. The number of people currently subscribed is
quite meaningless.
If you don't mind too much, you may perhaps read Pasc's mail about the
experiences with this (there _is_ such a list, so there is no reason
to guess).


Cheers,
Andi
--
http://home.arcor.de/andreas-barth/
PGP 1024/89FB5CE5 DC F1 85 6D A6 45 9C 0F 3B BE F1 D0 C5 D1 D9 0C
Matthew Garrett
2004-05-10 12:56:48 UTC
Post by Santiago Vila
I propose to close the lists.
Those who disagree please propose a *better* solution.
No problem. Don't close the lists. Reducing functionality is not the
right way of fixing bugs.
Post by Santiago Vila
What I propose would be simple, effective, it would only require a
small amount of time from posters (the time required to subscribe to
the white list, which has to be done only once), and it would be easy
to implement (since the white list already exists).
It solves the problem by making it harder to participate in the project.
I disagree that this is a reasonable tradeoff.
--
Matthew Garrett | mjg59-***@srcf.ucam.org
Pascal Hakim
2004-05-10 12:16:38 UTC
Post by Matthew Garrett
Post by Santiago Vila
Messages from non-subscribers or non-registered people would not count as
"false positives" since the list policy would be to not allow such posts.
This is not a desirable policy. If you want solutions, could you please
consider those that don't contain an implicit change in current policy?
The one list we currently have that has a subscribers-only policy has
also been a pretty good failure. The debian-***@l.d.o list has been
subscribers only for quite a while now, and it has resulted in the
number of posts going down substantially.

When non-subscribers try to post on that list, they get a message
warning them that their message wasn't posted because of
subscribers-only policy. From what I've seen, basically no one bothers
subscribing to the list, and sending the message again. If this is what
happens on a list such as debian-ctte, imagine what it would be like on
a more user-oriented list.

Cheers,

Pasc
Adam McKenna
2004-05-10 19:07:42 UTC
Post by Neil McGovern
Another list that I'm on requires subscription to the list, and
debian-devel (subscription for receiving the list)
debian-devel-post (subscription for posting rights)
I assume you are talking about NANOG. That mechanism only exists so that
the nanog "powers that be" can suspend people's posting privileges.

Anyone who is subscribed to a list should be able to post to it. If
additional people want to post to a list without subscribing it should be
fairly trivial to add them to a whitelist.

--Adam
--
Adam McKenna <***@debian.org> <***@flounder.net>
Marco d'Itri
2004-05-09 20:32:10 UTC
Post by Carlos Perelló Marín
Post by Marco d'Itri
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
BlackLists are NOT an option, they should be killed because they are
Please refrain from commenting on subjects you do not know about. I'm
suggesting the use of DNSBLs which have near-zero false positives rate
and are professionally maintained.
Post by Carlos Perelló Marín
That's a better solution also, a good spamassassin with some feedback
could help. I think Santiago said here some months ago that the Debian'
Bayesian filtering is far too expensive; our hardware apparently is not
even close to what would be needed to use it.
--
ciao, |
Marco | [6185 sucdpVQ1q8taQ]
Marek Habersack
2004-05-10 02:02:16 UTC
On Sun, May 09, 2004 at 10:32:10PM +0200, Marco d'Itri scribbled:
[snip]
Post by Marco d'Itri
Please refrain from commenting on subjects you do not know about. I'm
suggesting the use of DNSBLs which have near-zero false positives rate
and are professionally maintained.
Post by Carlos Perelló Marín
That's a better solution also, a good spamassassin with some feedback
could help. I think Santiago said here some months ago that the Debian'
Bayesian filtering is too much expensive, our hardware apparently is not
even close enough to what would be needed to use it.
http://www.nuclearelephant.com/projects/dspam/

not really a resource-intensive thing

regards,

marek
Pascal Hakim
2004-05-10 12:19:41 UTC
[snip]
Post by Marco d'Itri
Please refrain from commenting on subjects you do not know about. I'm
suggesting the use of DNSBLs which have near-zero false positives rate
and are professionally maintained.
Post by Carlos Perelló Marín
That's a better solution also, a good spamassassin with some feedback
could help. I think Santiago said here some months ago that the Debian'
Bayesian filtering is too much expensive, our hardware apparently is not
even close enough to what would be needed to use it.
http://www.nuclearelephant.com/projects/dspam/
not really a resource-intensive thing
7:16 murphy:~% uptime
07:16:56 up 13 days, 11:20, 1 user, load average: 13.39, 6.02, 3.86

That's pretty standard for murphy.

Cheers,

Pasc
Florian Weimer
2004-05-11 05:56:48 UTC
Here in Spain some people does SPAM from ADSL connections and thus
from time to time one of those black lists add ALL ADSL connections
to the list and we cannot send mail to some places from our own mail
server (some ADSL connections have static IP address and our ISP
lets us setup our own MTA if we want).
Well, switch to a decent ISP that (a) doesn't mark ADSL space as
obviously as your current one and (b) is less lenient on spam.
Alternatively, you could route mail over your ISP's mail relay.

There is no reason to accept mail from mass-market ADSL subscribers.
--
Current mail filters: many dial-up/DSL/cable modem hosts, and the
following domains: atlas.cz, bigpond.com, di-ve.com, hotmail.com,
jumpy.it, libero.it, netscape.net, postino.it, simplesnet.pt,
tiscali.co.uk, tiscali.cz, tiscali.it, voila.fr, yahoo.com.
Milan P. Stanic
2004-05-09 18:11:48 UTC
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
[snip] or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
Please, no! If Blars installs his recipes, I will no longer be able to
post to Debian mailing lists, because his list is bad.

My mail server is certainly not a source of spam, but Blars doesn't think
so because my server is on a network (194.247.0.0/16) which is
(probably) a source of spam. :-(
Bob Proulx
2004-05-09 18:45:05 UTC
Post by Milan P. Stanic
[snip] or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
Please, no! If Blars puts his recipes I will not be able to post
to Debian mailing lists anymore, because his lists is bad.
I do not know what RBLs are under discussion.
Post by Milan P. Stanic
My mail server isn't source of spam for sure, but Blars doesn't think
so because my server is on the network (194.247.0.0/16) which is
(probably) source of spam. :-(
The IP address you posted from appears to be 194.247.213.11. It does
not appear in any of the realtime blackhole lists that I checked. Of
course there are others which I did not check. But this is a pretty
wide sampling of the popular ones.

rblcheck.pl 194.247.213.11
194.247.213.11 not RBL filtered by list.dsbl.org
194.247.213.11 not RBL filtered by rbl-plus.mail-abuse.org
194.247.213.11 not RBL filtered by dul.dnsbl.sorbs.net
194.247.213.11 not RBL filtered by bl.spamcop.net
194.247.213.11 not RBL filtered by relays.ordb.org
194.247.213.11 not RBL filtered by cbl.abuseat.org
194.247.213.11 not RBL filtered by sbl.spamhaus.org
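
For anyone unfamiliar with how such a check works: a DNSBL is queried by
reversing the octets of the IPv4 address and looking that name up under
the list's zone; an answer means "listed". The sketch below is not
rblcheck.pl itself, just a minimal illustration of the same idea, and the
function names are my own.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name used to test an IPv4 address against a DNSBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")
    return ".".join(reversed(octets)) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers with an A record for the address.

    Needs network access. A failed lookup is treated as "not listed",
    although it can also mean the resolver or the zone is unreachable.
    """
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        return False
    return True


print(dnsbl_query_name("194.247.213.11", "sbl.spamhaus.org"))
# prints 11.213.247.194.sbl.spamhaus.org
```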

Bob
Kenneth Pronovici
2004-05-09 18:58:47 UTC
Post by Bob Proulx
Post by Milan P. Stanic
[snip] or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
Please, no! If Blars puts his recipes I will not be able to post
to Debian mailing lists anymore, because his lists is bad.
I do not know what RBLs are under discussion.
AFAICT, Blars has his own lists that include things like "attbi.com" or
"comcast.net". Any mail I send him - whether the mail originates from
my own SMTP server or from Comcast's official one - gets bounced back.
There are ways to contact him, but they usually involve having someone
else (who's not blocked) find him for you.

I won't quarrel with Blars' recipes for filtering mail to his personal
addresses - and I also appreciate the work he's done related to SPAM on
Debian lists up until this point - but I agree with Marco that Blars'
personal filtering recipes aren't appropriate for Debian lists.

KEN
--
Kenneth J. Pronovici <***@debian.org>
William Ballard
2004-05-09 19:31:04 UTC
Post by Kenneth Pronovici
"comcast.net". Any mail I send him - whether the mail originates from
my own SMTP server or from Comcast's official one - gets bounced back.
I submit bugs to the BTS using smtp.comcast.net smarthost all the time.
The suggestion was whatever he uses for the BTS, not what he uses
personally.
Kenneth Pronovici
2004-05-10 00:54:54 UTC
Post by William Ballard
Post by Kenneth Pronovici
"comcast.net". Any mail I send him - whether the mail originates from
my own SMTP server or from Comcast's official one - gets bounced back.
I submit bugs to the BTS using smtp.comcast.net smarthost all the time.
Yes, so do I.
Post by William Ballard
The suggestion was whatever he uses for the BTS, not what he uses
personally.
I still feel it's important to highlight the distinction.

KEN
--
Kenneth J. Pronovici <***@debian.org>
Milan P. Stanic
2004-05-09 21:28:39 UTC
[ Sorry for OT post ]
Post by Bob Proulx
I do not know what RBLs are under discussion.
http://www.blars.org/sapaf.html (IIRC).
Post by Bob Proulx
The IP address you posted from appears to be 194.247.213.11. It does
Right.
Post by Bob Proulx
not appear in any of the realtime blackhole lists that I checked. Of
course there are others which I did not check. But this is a pretty
wide sampling of the popular ones.
rblcheck.pl 194.247.213.11
194.247.213.11 not RBL filtered by list.dsbl.org
194.247.213.11 not RBL filtered by rbl-plus.mail-abuse.org
194.247.213.11 not RBL filtered by dul.dnsbl.sorbs.net
194.247.213.11 not RBL filtered by bl.spamcop.net
194.247.213.11 not RBL filtered by relays.ordb.org
194.247.213.11 not RBL filtered by cbl.abuseat.org
194.247.213.11 not RBL filtered by sbl.spamhaus.org
Your check is right. This mail server is listed only in Blars's list (did
I write that correctly?), but I forgot the exact URL where that can be
checked.

We have the 194.247.213.0/24 netblock. I contacted Blars, and here is
his answer:
---------------------------------------------------------------------
194.247.192.0/19 is in BlarsBL for sending spam and hosting spammer
web sites. ***@eunet.yu was informed about the problem multiple
times and the spam continued.
--------------------------------------------------------------------

I tried to reply with an explanation, but I gave up after I got a
bounce :-(

You see, because some machines on 194.247.192.0/19 are a source of spam
(I didn't check, but that is likely true), Blars blocked a network which
is not. He could just as well block 194.247.0.0/16 or 194.0.0.0/8.
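
How the blocks mentioned above nest can be checked with Python's standard
ipaddress module. This is only a sketch of the containment arithmetic,
using the addresses quoted in this thread; the variable names are mine.

```python
import ipaddress

host = ipaddress.ip_address("194.247.213.11")            # the posting mail server
own_block = ipaddress.ip_network("194.247.213.0/24")     # the poster's netblock
listed_block = ipaddress.ip_network("194.247.192.0/19")  # range named in Blars' reply
wider_block = ipaddress.ip_network("194.247.0.0/16")

# The host sits in its own /24, and that /24 lies inside the listed /19.
print(host in own_block)
print(host in listed_block)
print(own_block.subnet_of(listed_block))
print(listed_block.subnet_of(wider_block))

# Listing the /19 covers 32 times as many addresses as the /24 alone.
print(own_block.num_addresses, listed_block.num_addresses)
```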

So, IMHO RBL is evil.

With amavisd-new, SpamAssassin, ClamAV and some Postfix tweaking I
have less than 5% spam in my users' mailboxes.

Again, sorry for the off-topic post.
Marco d'Itri
2004-05-10 08:37:05 UTC
Post by Milan P. Stanic
Post by Bob Proulx
I do not know what RBLs are under discussion.
http://www.blars.org/sapaf.html (IIRC).
It's not, try again.
--
ciao, |
Marco | [6189 anCKaHVMKnOQk]
Colin Watson
2004-05-10 01:34:55 UTC
Post by Milan P. Stanic
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
[snip] or have Blars Blarson try on murphy the same recipes he uses
to filter spam in the BTS.
Please, no! If Blars puts his recipes I will not be able to post
to Debian mailing lists anymore, because his lists is bad.
No, Blars' personal spam filtering is considerably stricter than what he
uses in the BTS. He was quite clear when he joined ***@bugs that he
thought his personal filtering was too strict for the BTS (to general
agreement); I see no reason why he'd think differently for
lists.debian.org.

Blars' work has done fantastic things to bugs.debian.org's ham/spam
ratio, with an acceptably low rate of false positives as far as I've
been able to tell.
--
Colin Watson [***@flatline.org.uk]
Andreas Metzler
2004-05-09 19:12:55 UTC
Post by Marco d'Itri
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
[...]

At least XBL and DSBL have false positives.
DSBL:
From: Andreas Barth <aba-1hoGZ/***@public.gmane.org>
Cc: debian-devel-***@public.gmane.org
Subject: Re: incoming/katie monitoring
Message-ID: <20040418153240.GA12258-vyQ8VuFkTFKJ3FcYf6S9R9i2O/***@public.gmane.org>
[...]

XBL
From: Bernd Eckenfels <lists-***@public.gmane.org>
To: debian-devel-***@public.gmane.org, 245825-61a8vm9lEZVf4u+***@public.gmane.org
Message-ID: <20040425180604.GA25292-***@public.gmane.org>

cu andreas
--
NMUs aren't an insult, they're not an attack, and they're
not something to avoid or be ashamed of.
Anthony Towns in 2004-02 on debian-devel
Pascal Hakim
2004-05-10 12:12:38 UTC
Post by Marco d'Itri
Or at least the listmasters should block the spam sources which are
reported to them week after week.
And which ones are those? Not that you should bother emailing me, since
you have told me before that you are not interested in getting my replies.

Cheers,

Pasc
Santiago Vila
2004-05-10 15:28:57 UTC
Post by Pascal Hakim
Post by Marco d'Itri
Or at least the listmasters should block the spam sources which are
reported to them week after week.
And which ones are those? Not that you should bother emailing me, since
you have told me before that you are not interested in getting my replies.
I was sending procmail recipes to the listmasters nearly on a daily
basis for a while, but got bored of doing that when I stopped getting
replies.
Andrew Lau
2004-05-09 16:27:56 UTC
Post by Santiago Vila
The level of spam in the lists is something like 20% these days [*].
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Has debian.org's SpamAssassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep it
useful, or would it just let too much spam in after each flush?

Cheers,
Andrew "Netsnipe" Lau
--
---------------------------------------------------------------------------
Andrew "Netsnipe" Lau <http://www.cse.unsw.edu.au/~alau/>
Debian GNU/Linux Maintainer & UNSW Computing Students' Society President
-
"Nobody expects the Debian Inquisition!
Our two weapons are fear and surprise...and ruthless efficiency!"
---------------------------------------------------------------------------
Eike "zyro" Sauer
2004-05-09 16:44:36 UTC
Post by Andrew Lau
Has debian.org's Spamassassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep its
usefulness feasible or would it just let too spam in after each flush?
I'd "donate" 6000 spam mails, if this helps.

Ciao,
Eike
Marek Habersack
2004-05-10 02:09:33 UTC
Post by Eike "zyro" Sauer
Post by Andrew Lau
Has debian.org's Spamassassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep its
usefulness feasible or would it just let too spam in after each flush?
I'd "donate" 6000 spam mails, if this helps.
I could add my 14845 spams, too :)

marek
Duncan Findlay
2004-05-10 03:57:17 UTC
Post by Marek Habersack
Post by Eike "zyro" Sauer
Post by Andrew Lau
Has debian.org's Spamassassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep its
usefulness feasible or would it just let too spam in after each flush?
I'd "donate" 6000 spam mails, if this helps.
I could add my 14845 spams, too :)
Pfff... you can have my 63,286 spams if you really want, but it won't
really help you. The thing with a Bayesian database is that the mail
it's trained on needs to be similar to the mail it will be tested
against.

For what it's worth, empirical evidence indicates that SpamAssassin's
Bayesian database is difficult to poison, since it's difficult for
spammers to pick words that are learned as non-spammy (since everyone
has their own set of non-spammy words). But, since lists.debian.org
doesn't use bayes, this point is moot.

What is more likely an issue is that the scores are not ideally set to
debian's needs. I have previously volunteered my assistance to run the
"perceptron" to generate better scores for Debian; however the problem
seems to be compiling a relatively large corpus of hand-sorted spam
and non-spam from debian lists.

SpamAssassin's scores are (as of the "soon" to be released version
3.0.0) chosen using a "Stochastic Gradient Descent" method based on
results from running tens/hundreds of thousands of messages through
SpamAssassin. This is an attempt to have results that are okay for
most, but given that Debian has unique characteristics in its mail,
different scores could be generated that would improve results. (Less
allowance would be needed for HTML mail, etc., so the score could be
set higher.)
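
As a rough illustration of why per-site scores matter, here is a sketch
of threshold scoring: add up the scores of the rules that hit and compare
the total with the required score. The rule names and default scores are
copied from the SpamAssassin report quoted later in this thread (only the
rules visible there); the raised HTML_MESSAGE score of 2.0 is a made-up
value, not a proposed setting, and I am assuming a total at or above the
threshold counts as spam.

```python
# Scores as shown in the report quoted later in this thread.
DEFAULT_SCORES = {
    "HTML_MESSAGE": 1.0,
    "HTML_70_80": 1.0,
    "BIZ_TLD": 0.1,
    "RCVD_IN_SORBS_HTTP": 1.1,
    "RCVD_IN_BL_SPAMCOP_NET": 1.5,
}
REQUIRED = 5.0  # "5.0 required" in the same report


def total_score(hits, scores):
    """Sum the scores of all rules that matched the message."""
    return round(sum(scores[name] for name in hits), 1)


def is_spam(hits, scores, required=REQUIRED):
    """Flag the message when the total reaches the required score."""
    return total_score(hits, scores) >= required


hits = list(DEFAULT_SCORES)

# Hypothetical tuning for lists where legitimate HTML mail is rare.
tuned_scores = dict(DEFAULT_SCORES, HTML_MESSAGE=2.0)

print(total_score(hits, DEFAULT_SCORES), is_spam(hits, DEFAULT_SCORES))
print(total_score(hits, tuned_scores), is_spam(hits, tuned_scores))
```

With the default scores these five rules alone stay below the threshold;
raising a single score that is safe for Debian's mail tips the same
message over it.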
--
Duncan Findlay
Eike "zyro" Sauer
2004-05-10 08:07:40 UTC
Post by Duncan Findlay
Pfff... you can have my 63,286 spams if you really want, but it won't
really help you. The thing with a Bayesian database is that the mail
it's trained on needs to be similar to the mail it will be tested
against.
That's true for legitimate mail, but spam is very similar
for all people. I do get spam in Chinese although I can't
read a single glyph of it.
Post by Duncan Findlay
What is more likely an issue is that the scores are not ideally set to
debian's needs. I have previously volunteered my assistance to run the
"perceptron" to generate better scores for Debian; however the problem
seems to be compiling a relatively large corpus of hand-sorted spam
and non-spam from debian lists.
It should not be too hard for a large project to have one person
per mailing list to find hundreds of legitimate mails, which would
add up to thousands of mails. That should result in good filtering,
at least it does for me.
But as Marco d'Itri pointed out, Bayesian filtering is not an option
due to CPU limitations.

Ciao,
Eike
Bartosz Fenski aka fEnIo
2004-05-10 09:07:40 UTC
Post by Eike "zyro" Sauer
Post by Duncan Findlay
Pfff... you can have my 63,286 spams if you really want, but it won't
really help you. The thing with a Bayesian database is that the mail
it's trained on needs to be similar to the mail it will be tested
against.
That's true for legitimate mail, but spam is very similiar
for all people. I do get spam in Chinese although I can't
read a single glyph of it.
To be honest I don't think there is any efficient way to filter out this
spam.

Take a look at the spam that flooded my private mailbox yesterday:

http://skawina.eu.org/spam.gpg

It's an almost empty message with some HTML and one GIF picture which
includes some Viagra prices.

And the worst thing... it is GPG signed.

How to filter such stuff?

regards
fEnIo
--
_ Bartosz Fenski | mailto:***@o2.pl | pgp:0x13fefc40 | IRC:fEnIo
_|_|_ 32-050 Skawina - Glowackiego 3/15 - w. malopolskie - Polska
(0 0) phone:+48602383548 | Slackware - the weakest link
ooO--(_)--Ooo http://skawina.eu.org | JID:***@jabber.org | RLU:172001
Eike "zyro" Sauer
2004-05-10 10:29:27 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
How to filter such stuff?
"Your mailer do not support HTML messages. Switch to a better mailer."
will be sure death for the next mail of this kind if you feed this one
to your filter.
If you are not getting legitimate HTML mails often, the HTML tags
plus no legitimate content will suffice anyway to sort these out.

Ciao,
Eike
Adrian 'Dagurashibanipal' von Bidder
2004-05-10 11:06:16 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
http://skawina.eu.org/spam.gpg
It's an almost empty message with some HTML and one GIF picture which
includes some Viagra prices.
And the worst thing... it is GPG signed.
How to filter such stuff?
spamassassin, from current testing:

| Content analysis details:   (6.1 points, 5.0 required)
|
|  pts rule name              description
| ---- ---------------------- --------------------------------------------------
| -0.0 BAYES_44               BODY: Bayesian spam probability is 44 to 50%
|                             [score: 0.4630]
|  1.0 HTML_MESSAGE           BODY: HTML included in message
|  1.0 HTML_70_80             BODY: Message is 70% to 80% HTML
|  0.1 BIZ_TLD                URI: Contains a URL in the BIZ top-level domain
|  1.1 RCVD_IN_SORBS_HTTP     RBL: SORBS: sender is open HTTP proxy server
|                             [194.96.20.68 listed in dnsbl.sorbs.net]
|  1.5 RCVD_IN_BL_SPAMCOP_NET RBL: Received via a relay in bl.spamcop.net
|                             [Blocked - see <http://www.spamcop.net/bl.shtml?194.96.20.68>]
|  0.1 RCVD_IN_SORBS          RBL: SORBS: sender is listed in SORBS
|                             [194.96.20.68 listed in dnsbl.sorbs.net]
|  1.3 MIME_BOUND_NEXTPART    Spam tool pattern in MIME boundary


Whereas some of the spam polluting the Debian lists scores only 1.3 or
so. I'm currently looking into tweaking my scores to catch these, too,
so at least I don't have to deal with spammy Debian lists.

Some scores on this message are nonstandard:
score HTML_MESSAGE 1 (default .1)
score HTML_70_80 1 (default .1)
score MIME_BOUND_NEXTPART 1.307 (default .499)

I haven't tweaked my scores in the last 2 months, and I have had only
two false positives recently (both were HTML messages).
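
The report is simple addition: every rule that matches contributes its score, and the message is tagged once the sum reaches the required threshold. A minimal Python sketch of that arithmetic follows, using the rule names and the rounded scores printed in the report; the dictionaries and the `total` helper are illustrative and not part of SpamAssassin.

```python
# Rounded per-rule scores as printed in the report above.
REPORT_SCORES = {
    "BAYES_44": 0.0,
    "HTML_MESSAGE": 1.0,
    "HTML_70_80": 1.0,
    "BIZ_TLD": 0.1,
    "RCVD_IN_SORBS_HTTP": 1.1,
    "RCVD_IN_BL_SPAMCOP_NET": 1.5,
    "RCVD_IN_SORBS": 0.1,
    "MIME_BOUND_NEXTPART": 1.3,
}

# Default scores of the three locally raised rules, as listed above.
DEFAULT_OVERRIDES = {
    "HTML_MESSAGE": 0.1,
    "HTML_70_80": 0.1,
    "MIME_BOUND_NEXTPART": 0.499,
}

REQUIRED = 5.0  # threshold shown in the report

def total(scores):
    """Add up all rule scores, rounded to avoid float noise."""
    return round(sum(scores.values()), 3)

tuned_total = total(REPORT_SCORES)
default_total = total({**REPORT_SCORES, **DEFAULT_OVERRIDES})

print(tuned_total, tuned_total >= REQUIRED)
print(default_total, default_total >= REQUIRED)
```

With the three default scores substituted, the same rule hits add up to roughly 3.5, which is under the 5.0 threshold, so the local overrides are what push this message over.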

cheers
-- vbi
--
The content of this message may or may not reflect the opinion of me, my
employer, my girlfriend, my cat or anybody else, regardless of the fact
whether such an employer, girlfriend, cat, or anybody else exists. I
(or my employer, girlfriend, cat or whoever) disclaim any legal
obligations resulting from the above message. You, as the reader of
this message, may or may not have the permission to redistribute this
message as a whole or in parts, verbatim or in modified form, or to
distribute any message at all.
Marek Habersack
2004-05-10 12:35:25 UTC
Permalink
On Mon, May 10, 2004 at 11:07:40AM +0200, Bartosz Fenski aka fEnIo scribbled:
[snip]
Post by Bartosz Fenski aka fEnIo
http://skawina.eu.org/spam.gpg
It's an almost empty message with some HTML and one GIF picture which
includes some Viagra prices.
And the worst thing... it is GPG signed.
How to filter such stuff?
On the image URL and the GPG sig?

marek
Bartosz Fenski aka fEnIo
2004-05-10 22:44:32 UTC
Permalink
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
It's an almost empty message with some HTML and one GIF picture which
includes some Viagra prices.
And the worst thing... it is GPG signed.
How to filter such stuff?
On the image URL and the GPG sig?
Well, I was asking generally. What if almost every future spam would
consist of some image and GPG signature?

Filtering every such mail isn't a solution for me.
In fact mails with GPG signatures had some positive score in my
procmail. Now I have to remove it :/

regards
fEnIo
Marek Habersack
2004-05-10 23:00:29 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
It's an almost empty message with some HTML and one GIF picture which
includes some Viagra prices.
And the worst thing... it is GPG signed.
How to filter such stuff?
On the image URL and the GPG sig?
Well, I was asking generally. What if almost every future spam would
consist of some image and GPG signature?
Each of those elements has some constant characteristic. In fact, having
spam signed with GPG would make it easier to filter out - you could have
your LDA check the signature, verify it and cast away should it fail
verification.
Post by Bartosz Fenski aka fEnIo
Filtering every such mail isn't a solution for me.
How come? You have to filter every mail in order to see whether it's spam or
not anyway... There is a tool that does a very good job for keeping spam
away from your box if you're willing to put some effort in configuring it
(I'm not using it personally, but my boss is - with a great success) -
http://www.tmda.net/
Post by Bartosz Fenski aka fEnIo
In fact mails with GPG signatures had some positive score in my
procmail. Now I have to remove it :/
I don't think it is a good idea anyway, it's like leaving a passage for
possible spam.

regards,

marek
John Hasler
2004-05-11 00:33:00 UTC
Permalink
In fact, having spam signed with GPG would make it easier to filter out -
you could have your LDA check the signature, verify it and cast away
should it fail verification.
I certainly don't want to reject all signed messages that fail to verify.
--
John Hasler
***@dhh.gt.org (John Hasler)
Dancing Horse Hill
Elmwood, WI
Marek Habersack
2004-05-11 00:39:06 UTC
Permalink
Post by John Hasler
In fact, having spam signed with GPG would make it easier to filter out -
you could have your LDA check the signature, verify it and cast away
should it fail verification.
I certainly don't want to reject all signed messages that fail to verify.
I didn't mean only the GPG verification. With some code you could check
where the key is coming from, is it signed, by how many people etc. and
assign a score to the result based on the checks. There would probably be
quite a few checks possible using that information.
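
A rough Python sketch of that idea: the key checks feed a score adjustment instead of an outright reject, so a signature that fails to verify only counts against the message. The `KeyInfo` fields and all the weights are hypothetical, and the actual verification and key lookup (for example with GnuPG) are assumed to happen elsewhere.

```python
from dataclasses import dataclass

@dataclass
class KeyInfo:
    """Facts a filter might gather about the signing key (all hypothetical)."""
    signature_valid: bool      # did the signature verify?
    on_known_keyserver: bool   # was the key found where we expected it?
    third_party_sigs: int      # how many other people signed the key?

def key_score(info: KeyInfo) -> float:
    """Return a spam-score adjustment; negative values favour the sender."""
    if not info.signature_valid:
        return 1.0             # a broken signature is only mildly suspicious
    adjustment = 0.0
    if info.on_known_keyserver:
        adjustment -= 0.5
    # Each third-party signature adds a little trust, capped at five.
    adjustment -= 0.3 * min(info.third_party_sigs, 5)
    return round(adjustment, 2)

print(key_score(KeyInfo(True, True, 4)))
print(key_score(KeyInfo(True, False, 0)))
print(key_score(KeyInfo(False, True, 10)))
```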

regards,

marek
Eike "zyro" Sauer
2004-05-11 07:31:13 UTC
Permalink
Post by Marek Habersack
I didn't mean only the GPG verification. With some code you could check
where the key is coming from, is it signed, by how many people etc. and
assign a score to the result based on the checks. There would probably be
quite a few checks possible using that information.
Is this really less expensive than Bayesian filtering?

Ciao,
Eike
Marek Habersack
2004-05-11 12:07:26 UTC
Permalink
Post by Eike "zyro" Sauer
Post by Marek Habersack
I didn't mean only the GPG verification. With some code you could check
where the key is coming from, is it signed, by how many people etc. and
assign a score to the result based on the checks. There would probably be
quite a few checks possible using that information.
Is this really less expensive than Bayesian filtering?
Oh, I didn't say that in the context of Bayesian vs. GPG-checking. It was
only a thought related to what the original poster wrote. I don't think it
would be less expensive than Bayesian, though. And Bayesian is not that
expensive if implemented properly, it seems (judging by the time required by
SpamAssassin to perform it vs time required by DSPAM, as reported on their
page)

regards,

marek
Bartosz Fenski aka fEnIo
2004-05-11 08:49:47 UTC
Permalink
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
How to filter such stuff?
On the image URL and the GPG sig?
Well, I was asking generally. What if almost every future spam would
consist of some image and GPG signature?
Each of those elements has some constant characteristic. In fact, having
spam signed with GPG would make it easier to filter out - you could have
your LDA check the signature, verify it and cast away should it fail
verification.
It's not so easy. In fact checking GPG signatures when fetchmail
downloads mails will kill my machine.
Right now, after a night, I have to download about 200 mails. Bayesian
filtering + procmail takes my machine about 10-15 minutes to sort
this out. With GPG signatures I will have to get up one hour earlier ;)
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Filtering every such mail isn't a solution for me.
How come? You have to filter every mail in order to see whether it's spam or
not anyway...
Yes, but there are more or less complicated filtering solutions.
Sure, I can write very complicated rules for procmail + bogofilter
+ spamassassin + gnupg checks + <put whatever you want>, but hey... every
check needs CPU power and hard drive access.
Post by Marek Habersack
There is a tool that does a very good job for keeping spam
away from your box if you're willing to put some effort in configuring it
(I'm not using it personally, but my boss is - with a great success) -
http://www.tmda.net/
That looks interesting. Thanks for pointing it out to me.
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
In fact mails with GPG signatures had some positive score in my
procmail. Now I have to remove it :/
I don't think it is a good idea anyway, it's like leaving a passage for
possible spam.
Yes... but this worked perfectly so far... The mail I mentioned was the
*first* GPG-signed spam I have ever seen ;)

regards
fEnIo
Wouter Verhelst
2004-05-11 11:36:51 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
There is a tool that does a very good job for keeping spam
away from your box if you're willing to put some effort in configuring it
(I'm not using it personally, but my boss is - with a great success) -
http://www.tmda.net/
That looks interesting. Thanks for pointing it out to me.
tmda challenge-response is not an effective solution against spam. There
are a few reasons:

* When a spammer sends you a mail, the autoresponse you send out will
effectively spam other people. You're saying "I don't like to be
spammed, so I'm spamming you instead". That's annoying, at best.
* Many people (me included, but there are certainly more) do not bother
to jump through hoops for the amazing privilege to communicate with a
complete stranger. Requiring people to do so will indeed get you rid
of all your spam, but it will include a fair amount of legitimate
mail, too.
--
EARTH
smog | bricks
AIR -- mud -- FIRE
soda water | tequila
WATER
-- with thanks to fortune
Marek Habersack
2004-05-11 12:20:00 UTC
Permalink
Post by Wouter Verhelst
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
There is a tool that does a very good job for keeping spam
away from your box if you're willing to put some effort in configuring it
(I'm not using it personally, but my boss is - with a great success) -
http://www.tmda.net/
That looks interesting. Thanks for pointing it out to me.
tmda challenge-response is not an effective solution against spam. There
I beg to differ. Read below.
Post by Wouter Verhelst
* When a spammer sends you a mail, the autoresponse you send out will
effectively spam other people. You're saying "I don't like to be
spammed, so I'm spamming you instead". That's annoying, at best.
It's actually pretty effective but, I agree, not really friendly.
Post by Wouter Verhelst
* Many people (me included, but there are certainly more) do not bother
to jump through hoops for the amazing privilege to communicate with a
complete stranger. Requiring people to do so will indeed get you rid
of all your spam, but it will include a fair amount of legitimate
mail, too.
It's not the way you think it works. The challenge is not (doesn't have to
be, at least) sent in response to every email. We have set it up so that only
mails with 1.0 < score < 10.0 are challenged. Besides, tmda has several other
nice things - like addresses timing out after some time, addresses available
only for certain posters (for example for your bank statements, credit card
reports etc.) and a few other nice features. You can tune it so that the
unfriendly effect is minimized as far as possible. What you mention as a
problem, the fake sender addresses, are really a problem but, selfishly, I'd
rather ignore that issue. All in all, spam is a complex issue not easily
solvable if we'd like to do it using standard protocols, I guess... In the
ideal world everybody would sign their mail with signatures we could trust.
Let's imagine that the mailing list people are subscribed to signs the
mails with its own key but only those mails which are signed by their
subscribed users with their known pgp/gpg keys (yes, it's sort of similar to
closing the list and I realize what the pros/cons of that are) and the
unsigned mails are challenged by the mailing list software and posted only
if the sender certifies that the mail is legit. Pain in the neck, but it
might work, I guess. The same goes with the personal mail - mails signed
with known signatures are passed through, those signed with unknown
signatures are challenged but treated as 'probably legit', those unsigned
are treated as spam. But, again, that would be in an ideal world :) Just some
ramblings, really.
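
The score band for challenges can be sketched in a few lines of Python. This reads the thresholds as 1.0 < score < 10.0; what happens to mail at or above the upper bound is not stated in the post, so treating it as spam is an assumption, and the `route` function is illustrative and not part of tmda.

```python
def route(score: float, low: float = 1.0, high: float = 10.0) -> str:
    """Decide what to do with a message based on its spam score."""
    if score <= low:
        return "deliver"      # looks like ham, no challenge is sent
    if score < high:
        return "challenge"    # borderline, ask the sender to confirm
    return "discard"          # assumed: clearly spam, no reply generated

for s in (0.5, 4.2, 12.0):
    print(s, route(s))
```

Only the middle band generates an automatic reply, which is how the setup limits the number of challenges that land on forged sender addresses.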

regards,

marek
John Hasler
2004-05-11 12:49:19 UTC
Permalink
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
--
John Hasler
***@dhh.gt.org (John Hasler)
Dancing Horse Hill
Elmwood, WI
Marek Habersack
2004-05-11 13:03:53 UTC
Permalink
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Me too, and I treat them as spam... :) (what's more, it works)

marek
Wouter Verhelst
2004-05-11 14:22:57 UTC
Permalink
Post by Marek Habersack
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Me too, and I treat them as spam... :)
I think we all do. What's more, some blacklist operators do, too, so you
won't be able to send mail to other people if you do this.
Marek Habersack
2004-05-11 14:44:52 UTC
Permalink
On Tue, May 11, 2004 at 04:22:57PM +0200, Wouter Verhelst scribbled:
[snip]
Post by Wouter Verhelst
Post by Marek Habersack
Post by John Hasler
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Me too, and I treat them as spam... :)
I think we all do. What's more, some blacklist operators do, too, so you
won't be able to send mail to other people if you do this.
Well, I think I'll take that risk.

marek
Michelle Konzack
2004-05-11 16:55:11 UTC
Permalink
Post by Marek Habersack
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Me too, and I treat them as spam... :) (what's more, it works)
And me too. I have trained spamassassin to treat them as spam.

Works perfectly!
'spamassassin' has filtered 138 spams and Anti-AntiVirus spams today.
Post by Marek Habersack
marek
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Bartosz Fenski aka fEnIo
2004-05-11 13:17:54 UTC
Permalink
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would gladly kill every administrator who sets up virus
notifications to be sent to the address in the From: header ;)

regards
fEnIo
Marek Habersack
2004-05-11 13:33:30 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would kill every administrator which sets notify for sender
about viruses to the address from From: header with a pleasure ;)
Well, that's a bit different story, isn't it?

marek
Wouter Verhelst
2004-05-11 14:25:11 UTC
Permalink
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would kill every administrator which sets notify for sender
about viruses to the address from From: header with a pleasure ;)
Well, that's a bit different story, isn't it?
I don't see how. They both involve automated mails. They both involve
waste of bandwidth. They both result in annoyed people and a worse S/N
ratio in mailboxes of people completely and fully unrelated to the mail
the autoreply was replying to.

How are they a different story?
Marek Habersack
2004-05-11 15:04:51 UTC
Permalink
Post by Wouter Verhelst
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would kill every administrator which sets notify for sender
about viruses to the address from From: header with a pleasure ;)
Well, that's a bit different story, isn't it?
I don't see how. They both involve automated mails. They both involve
waste of bandwidth. They both result in annoyed people and a worse S/N
ratio in mailboxes of people completely and fully unrelated to the mail
the autoreply was replying to.
How are they a different story?
It is fair to assume that a virus mail has a bogus sender address; it is not
as simple to assume that a mail scored higher than ham has a bogus sender
address. Quite a chunk of mails scored below 4 by SpamAssassin are
legitimate mails that have one or two traits that give them the score and
yet they are perfectly legitimate. That's where they differ - a virus
notification has a high probability of hitting an innocent person, unlike a
tmda challenge.

On our servers we silently trash the virus mails, without responding to
them or generating any automated notification mail, that's obvious, and tmda
is not used by default, that's obvious too. But for personal boxes I think
everybody has the right to use tmda to protect them. Also, ISPs who blindly
treat all bounces as spam should stop doing so, I think. Say, have you ever
mailed Wietse Venema, for instance? If you did, then you know he's got an
autoresponder that will write you back sometimes. Is that spam? I don't
think so. A certain part of tmda replies will miss the target, of course,
but (again thinking selfishly) in total it will save me/you time we'd take
to read the spam and classify it as spam. It will eliminate quite a deal of
the mails which passed through the SpamAssassin (or other) filters. In the
past 8 days, SpamAssassin let through to my box 293 messages it didn't tag
as spam, 199 of them came to my debian address, all of the 199 through the
debian mailing lists. At 20s to open, read, tag and forward to sa-learn each
of those messages, I've wasted 66 minutes of my time. Is that a lot?
Probably not for a week, but it's 57 hours/year, hours which could be saved
for something better than reading stupid spam. And if you happen to send me
a mail that will be scored above 1.0, then you will have to respond to the
tmda challenge only once - your address will be whitelisted from that moment
on (which, of course, opens up a possibility for forging your address by a
spammer, that's a given). One more thing to note - tmda challenges differ a
bit from the MTA bounces, it is very easy to classify the mails based on
that difference (again, another window for spammers, but you can't win it
all) and all that remains to have is a bit of good will and understanding
for people who use tmda and take that small effort to respond to the
challenge (not to mention the responses to challenges can be automated as
well).
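
The time estimate above can be checked with a few lines of Python. The message count, seconds per message and sample length come from the post; the two ways of scaling to a year are my own reading, since the post does not say which one was used.

```python
MESSAGES = 199       # unflagged spam that arrived via the Debian lists
SECONDS_EACH = 20    # open, read, tag, forward to sa-learn
SAMPLE_DAYS = 8      # length of the sample period

minutes_in_sample = MESSAGES * SECONDS_EACH / 60

# Treating the sample as one week's worth, as the yearly figure seems to do.
hours_per_year_weekly = minutes_in_sample * 52 / 60

# Scaling strictly by the 8 days the sample actually covers.
hours_per_year_daily = minutes_in_sample / SAMPLE_DAYS * 365 / 60

print(round(minutes_in_sample, 1))
print(round(hours_per_year_weekly, 1))
print(round(hours_per_year_daily, 1))
```

The sample comes to about 66 minutes. Counted as a weekly cost that is roughly 57 hours a year, and scaled by the 8 days it actually covers it is closer to 50 hours.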

So, as long as you are free to be annoyed by tmda responses, I can be as
annoyed by the spam I have to deal with. We both have our reasons, we both
have equal rights and we both are free to do what we do and think what we
think.

regards,

marek
Bas Zoetekouw
2004-05-11 15:00:32 UTC
Permalink
Hi Marek!
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Post by John Hasler
What you mention as a problem, the fake sender addresses, are really a
problem but, selfishly, I'd rather ignore that issue.
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would kill every administrator which sets notify for sender
about viruses to the address from From: header with a pleasure ;)
Well, that's a bit different story, isn't it?
No, it's not. Both automatically send out unsolicited messages to
innocent third parties. That's called spamming in my dictionary.
--
Kind regards,
+--------------------------------------------------------------------+
| Bas Zoetekouw | GPG key: 0644fab7 |
|----------------------------| Fingerprint: c1f5 f24c d514 3fec 8bf6 |
| ***@o2w.nl, ***@debian.org | a2b1 2bae e41f 0644 fab7 |
+--------------------------------------------------------------------+
Marek Habersack
2004-05-11 16:48:26 UTC
Permalink
Post by Bas Zoetekouw
Hi Marek!
Hello,

[snip]
Post by Bas Zoetekouw
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Post by John Hasler
Selfish isn't the word for it. I get damn near as many bogus bounces as
spams.
Yeah. I would kill every administrator which sets notify for sender
about viruses to the address from From: header with a pleasure ;)
Well, that's a bit different story, isn't it?
No, it's not. Both automatically send out unsolicited messages to
innocent third parties. That's called spamming in my dictionary.
You send me a mail. My TMDA generates a response which is sent on _my_
behalf. I think if you write me, you're expecting a response - how is that
unsolicited? Besides, you can easily treat the TMDA challenges as spam and
discard them automatically - much easier to do than filtering spam. And as I
wrote in the other mail, the two cases aren't the same IMO. There is only a small
chance that mail any of us sends will ever be scored above 0.0 by
SpamAssassin and friends and, therefore, you can safely filter and discard
all the TMDA challenges out - since the only TMDA challenges would be coming
as a result of spam that impersonates you and you don't care about such
mail. So the spammer sends the mail as you, some tmda generates the
challenge, your filter dumps the challenge and you never get to see it.
Traffic? Much less of it than people sending graphics, movies, Excel
spreadsheets, voice mail etc. So, the way I see it, with little effort all
parties could be satisfied.

regards,

marek

Marek Habersack
2004-05-11 12:11:06 UTC
Permalink
On Tue, May 11, 2004 at 10:49:47AM +0200, Bartosz Fenski aka fEnIo scribbled:
[snip]
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Well, I was asking generally. What if almost every future spam would
consist of some image and GPG signature?
Each of those elements has some constant characteristic. In fact, having
spam signed with GPG would make it easier to filter out - you could have
your LDA check the signature, verify it and cast away should it fail
verification.
It's not so easy. In fact checking GPG signatures when fetchmail
downloads mails will kill my machine.
You don't have to do it when fetchmail is fetching them, I suppose. It could
as well be done in your MUA, I think.
Post by Bartosz Fenski aka fEnIo
Right now, after a night, I have to download about 200 mails. Bayesian
filtering + procmail takes my machine about 10-15 minutes to sort
this out. With GPG signatures I will have to get up one hour earlier ;)
May I ask why you aren't filtering on your server?
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
Filtering every such mail isn't a solution for me.
How come? You have to filter every mail in order to see whether it's spam or
not anyway...
Yes, but there are more or less complicated filtering solutions.
Sure, I can write very complicated rules for procmail + bogofilter
+ spamassassin + gnupg checks + <put whatever you want>, but hey... every
check needs CPU power and hard drive access.
You got that right, the programs you listed above can take all of your CPU,
indeed :) But how about integrating PGP/GPG checking (not necessarily with
gnupg) inside the spam filter? And rather not one written in Perl?
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
There is a tool that does a very good job for keeping spam
away from your box if you're willing to put some effort in configuring it
(I'm not using it personally, but my boss is - with a great success) -
http://www.tmda.net/
That looks interesting. Thanks for pointing it out to me.
I can certify it works well - my boss is subscribed to as many mailing lists
as I am, and yet he receives 1 (_one_) spam/week on average.
Post by Bartosz Fenski aka fEnIo
Post by Marek Habersack
Post by Bartosz Fenski aka fEnIo
In fact mails with GPG signatures had some positive score in my
procmail. Now I have to remove it :/
I don't think it is a good idea anyway, it's like leaving a passage for
possible spam.
Yes... but this worked perfectly so far... The mail I mentioned was the
*first* GPG-signed spam I have ever seen ;)
Do you have a pristine copy of the message perhaps?

regards,

marek
Bernd Eckenfels
2004-05-10 23:43:51 UTC
Permalink
Post by Bartosz Fenski aka fEnIo
Filtering every such mail isn't a solution for me.
In fact mails with GPG signatures had some positive score in my
procmail. Now I have to remove it :/
How about checking the signature and using it as the perfect key in a whitelist?

Others would be happy to have only authenticated mails.
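
A minimal Python sketch of such a whitelist, keyed on the fingerprint of the signing key. It assumes the signature has already been verified by some other tool (for example GnuPG), and the fingerprint shown is a placeholder, not a real key.

```python
def normalize(fingerprint: str) -> str:
    """Strip whitespace and upper-case a hex fingerprint for comparison."""
    return "".join(fingerprint.split()).upper()

# Placeholder fingerprint for illustration only, not a real key.
TRUSTED = {normalize("0123 4567 89ab cdef 0123 4567 89ab cdef 0123 4567")}

def is_whitelisted(signature_valid: bool, fingerprint: str, trusted=TRUSTED) -> bool:
    """Accept only mail whose verified signature comes from a trusted key."""
    return signature_valid and normalize(fingerprint) in trusted

print(is_whitelisted(True, "0123456789ABCDEF0123456789ABCDEF01234567"))
print(is_whitelisted(False, "0123456789ABCDEF0123456789ABCDEF01234567"))
```

Unlike a whitelist keyed on the From: address, this one cannot be satisfied by a forged sender, as long as the verification step itself is sound.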

Greetings
Bernd
--
(OO) -- ***@Mörscher_Strasse_8.76185Karlsruhe.de --
( .. ) ecki@{inka.de,linux.de,debian.org} http://www.eckes.org/
o--o 1024D/E383CD7E ***@IRCNet v:+497211603874 f:+497211603875
(O____O) When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!
Colin Watson
2004-05-10 10:59:17 UTC
Permalink
Post by Eike "zyro" Sauer
But as Marco d'Itri pointed out, bayesian filtering is not an option
due to CPU limitation.
bugs.debian.org manages it ... while lists.debian.org gets more incoming
mail than bugs.debian.org, I didn't think it was by an order of
magnitude or anything.
--
Colin Watson [***@flatline.org.uk]
Pascal Hakim
2004-05-10 12:29:15 UTC
Permalink
Post by Colin Watson
Post by Eike "zyro" Sauer
But as Marco d'Itri pointed out, bayesian filtering is not an option
due to CPU limitation.
bugs.debian.org manages it ... while lists.debian.org gets more incoming
mail than bugs.debian.org, I didn't think it was by an order of
magnitude or anything.
Murphy is currently running on much slower hardware. There are plans to
fix this, but it may be a while still.

Cheers,

Pasc
Marek Habersack
2004-05-10 12:27:20 UTC
Permalink
Post by Duncan Findlay
Post by Marek Habersack
Post by Eike "zyro" Sauer
Post by Andrew Lau
Has debian.org's SpamAssassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep it
useful, or would it just let too much spam in after each flush?
I'd "donate" 6000 spam mails, if this helps.
I could add my 14845 spams, too :)
Pfff... you can have my 63,286 spams if you really want, but it won't
really help you. The thing with a Bayesian database is that the mail
it's trained on needs to be similar to the mail it will be tested
against.
Most of my spam comes from the debian lists, so I would say it is similar
enough to the traffic down here.
Post by Duncan Findlay
For what it's worth, empirical evidence indicates that SpamAssassin's
Bayesian database is difficult to poison, since it's difficult for
spammers to pick words that are learned as non-spammy (since everyone
has their own set of non-spammy words). But, since lists.debian.org
doesn't use bayes, this point is moot.
I don't understand why SpamAssassin is thought to be the only option. SA is
a CPU/memory hog, it can easily kill even a fairly powerful machine and
there _are_ alternatives to it. One thing to use could be dspam, as I
pointed out in the other post, another (which also uses language
classification and is already packaged for debian) would be crm114 and then
there is a whole host of Bayesian filter programs that are written in a
language suited for heavy-duty tasks (C, that is :>). Both dspam and crm114
boast over 99% accuracy in spotting spam, now that would be really neat if
we had that level of protection around here.

regards,

marek
Pascal Hakim
2004-05-10 12:53:12 UTC
Permalink
Post by Marek Habersack
Most of my spam comes from the debian lists, so I would say it is similar
enough to the traffic down here.
You have to deal with emails in different languages as well.
Post by Marek Habersack
Post by Duncan Findlay
For what it's worth, empirical evidence indicates that SpamAssassin's
Bayesian database is difficult to poison, since it's difficult for
spammers to pick words that are learned as non-spammy (since everyone
has their own set of non-spammy words). But, since lists.debian.org
doesn't use bayes, this point is moot.
I don't understand why SpamAssassin is thought to be the only option. SA is
a CPU/memory hog, it can easily kill even a fairly powerful machine and
there _are_ alternatives to it. One thing to use could be dspam, as I
pointed out in the other post, another (which also uses language
classification and is already packaged for debian) would be crm114 and then
there is a whole host of Bayesian filter programs that are written in a
language suited for heavy-duty tasks (C, that is :>). Both dspam and crm114
boast over 99% accuracy in spotting spam, now that would be really neat if
we had that level of protection around here.
We're already close to 99% accuracy. We want more.

If you want to figure it out exactly head over to

http://lists.debian.org/debian-devel/2004/05/index.html

and start counting the spams. 1400 emails didn't make it on the list
this month.

Personally, I'd rather have some spam make it onto the list than block
any valid emails. I still believe we can do better than what we're
currently doing however.

Cheers,

Pasc
Santiago Vila
2004-05-10 15:20:00 UTC
Permalink
1400 emails didn't make it on the list this month.
The issue is not the spam that didn't make it onto the list, but the spam that *does*.
Is 20% of spam still not enough for you to do something about it?
Pascal Hakim
2004-05-10 15:43:39 UTC
Permalink
Post by Santiago Vila
1400 emails didn't make it on the list this month.
The issue is not the spam that didn't make it onto the list, but the spam that *does*.
Is 20% of spam still not enough for you to do something about it?
Did you even bother reading the email you're replying to?
Michelle Konzack
2004-05-10 08:50:51 UTC
Permalink
Post by Eike "zyro" Sauer
Post by Andrew Lau
Has debian.org's Spamassassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep its
usefulness feasible or would it just let too much spam in after each flush?
I'd "donate" 6000 spam mails, if this helps.
I can "donate" around 28,000 spam mails and 8,700 viruses...
Post by Eike "zyro" Sauer
Ciao,
Eike
Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Rob Weir
2004-05-10 05:55:26 UTC
Permalink
On Mon, May 10, 2004 at 02:27:56AM +1000, Andrew Lau said
Post by Andrew Lau
Post by Santiago Vila
The level of spam in the lists is something like 20% these days [*].
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
Has debian.org's Spamassassin Bayesian database been poisoned? If so,
would flushing the database at random intervals be enough to keep its
usefulness feasible or would it just let too much spam in after each flush?
lists.d.o doesn't use bayesian filtering.
--
Rob Weir <***@ertius.org> | ***@ertius.org | Do I look like I want a CC?
Words of the day: Yukon TWA CID Ron Brown Audiotel USCOI 64 Vauxhall Cross Agfa
Kevin Mark
2004-05-10 04:09:40 UTC
Permalink
Post by Santiago Vila
Greetings.
The level of spam in the lists is something like 20% these days [*].
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
The current level of spam is not acceptable by any standard,
so "do nothing" does not count as a "solution".
86 junk messages and 440 good messages yesterday.
76 junk messages and 200 good messages today.
Hi Santiago et al.
I subscribed to this list a few days ago with a similar motive.
I was combing some of the lists on lists.debian.org and was amazed what
I found:
about 20% or more spam on all types of lists (automated output lists,
user lists, bug tracking lists).
I also saw spam being submitted to the bug tracking lists as bugs, adding
to the number of bugs in the BTS.
It seems that something should be done, but I am at a loss as to what to
propose. No one wants to submit a legitimate message to a list and have
it sent to /dev/null, so it seems that there needs to be a system that
can guarantee that it lets spam through rather than tossing ham.

I realized that the amount of spam in the mailing lists greatly decreases
the signal-to-noise ratio and thus reduces their effectiveness.
Hopefully someone out there knows of something to do to alleviate this,
and also a way to comb all the past mailing list messages and remove the
massive amount of spam that is hot and steaming and pungent from the
garden of Debian knowledge.
-Kev
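
The daily counts quoted at the start of the thread can be turned into spam shares directly. The following is only a small sketch of that arithmetic; the function name `spam_share` is illustrative and not part of any tool mentioned here.

```python
def spam_share(junk: int, good: int) -> float:
    """Return the fraction of all received messages that were junk."""
    total = junk + good
    if total == 0:
        raise ValueError("no messages counted")
    return junk / total


# Counts reported by Santiago Vila for the lists he is subscribed to.
counts = {"yesterday": (86, 440), "today": (76, 200)}

for day, (junk, good) in counts.items():
    print(f"{day}: {spam_share(junk, good):.1%} spam")

# Both days combined.
total_junk = sum(junk for junk, _ in counts.values())
total_good = sum(good for _, good in counts.values())
print(f"combined: {spam_share(total_junk, total_good):.1%} spam")
```

This gives 16.3% for the first day, 27.5% for the second, and 20.2% combined, which matches the "something like 20%" estimate.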
Michelle Konzack
2004-05-10 08:34:11 UTC
Permalink
Hello Santiago Vila,
Post by Santiago Vila
Greetings.
The level of spam in the lists is something like 20% these days [*].
Yes we know !
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
The current level of spam is not acceptable by any standard,
so "do nothing" does not count as a "solution".
86 junk messages and 440 good messages yesterday.
76 junk messages and 200 good messages today.
I am on more than 150 mailing lists, and last night I filtered
more than 480 spams with spamassassin. Not one false positive!

I use 2.63

Maybe you could install it ?

Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Santiago Vila
2004-05-10 09:53:16 UTC
Permalink
Post by Michelle Konzack
Hello Santiago Vila,
Post by Santiago Vila
Greetings.
The level of spam in the lists is something like 20% these days [*].
Yes we know !
Post by Santiago Vila
Those who still do not see the need to make the lists closed for non
subscribers or non registered people (via the whitelist), please
propose a better solution.
The current level of spam is not acceptable by any standard,
so "do nothing" does not count as a "solution".
86 junk messages and 440 good messages yesterday.
76 junk messages and 200 good messages today.
I am on more than 150 mailing lists, and last night I filtered
more than 480 spams with spamassassin. Not one false positive!
I use 2.63
Maybe you could install it ?
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.

The problem is: There is also a huge amount of spam which spamassassin
does not catch. Enough (IMHO) to consider seriously closing the lists.
Jérôme Marant
2004-05-10 10:06:30 UTC
Permalink
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
The problem is: There is also a huge amount of spam which spamassassin
does not catch. Enough (IMHO) to consider seriously closing the lists.
Currently listmasters provide a white list, so closing lists would make
sense.
--
Jérôme Marant

http://marant.org
Pascal Hakim
2004-05-10 12:33:05 UTC
Permalink
Post by Jérôme Marant
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
The problem is: There is also a huge amount of spam which spamassassin
does not catch. Enough (IMHO) to consider seriously closing the lists.
Currently listmasters provide a white list, so closing lists would make
sense.
From my personal experience, closing lists results in people not wanting
to post on them anymore. When I have a look there are a huge number of
valid messages coming through that are not from people in the whitelist
or subscribed to the list, whether it is because they read the list on a
news gateway, or using different emails for receiving mail traffic, and
for posting.


Cheers,

Pasc
Jaldhar H. Vyas
2004-05-10 13:01:59 UTC
Permalink
Post by Pascal Hakim
Post by Jérôme Marant
From my personal experience, closing lists results in people not wanting
to post on them anymore.
Sounds like a winner to me.
--
Jaldhar H. Vyas <***@debian.org>
La Salle Debain - http://www.braincells.com/debian/
Russ Allbery
2004-05-11 05:33:40 UTC
Permalink
Post by Jérôme Marant
From my personal experience, closing lists results in people not wanting
to post on them anymore. When I have a look there are a huge number of
valid messages coming through that are not from people in the whitelist
or subscribed to the list, whether it is because they read the list on a
news gateway, or using different emails for receiving mail traffic, and
for posting.
People who read via the news gateway should also post via the news
gateway, which has its own whitelist. (This is what I do.) That
whitelist could be fed into the lists.debian.org whitelist, or the people
on it could be asked to sign up for both. (I'd be happy to do that.)
--
Russ Allbery (***@stanford.edu) <http://www.eyrie.org/~eagle/>
Eike "zyro" Sauer
2004-05-10 10:32:47 UTC
Permalink
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
Marco d'Itri and Rob Weir tell us they don't.
What's true?

Ciao,
Eike
Wouter Verhelst
2004-05-10 10:48:18 UTC
Permalink
Post by Eike "zyro" Sauer
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
Marco d'Itri and Rob Weir tell us they don't.
No, they didn't.
Post by Eike "zyro" Sauer
What's true?
They do run spamassassin, but with bayesian filtering disabled (because
that part is too CPU-intensive).
--
EARTH
smog | bricks
AIR -- mud -- FIRE
soda water | tequila
WATER
-- with thanks to fortune
Eike "zyro" Sauer
2004-05-10 11:14:30 UTC
Permalink
Post by Wouter Verhelst
They do run spamassassin, but with bayesian filtering disabled (because
that part is too CPU-intensive).
Ah, ok.
I fear this leaves out the most important part of spamassassin.
While loads of stupid spam can be filtered out by rules,
Bayesian filtering really solved my spam problems.

Ciao,
eike
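
For readers unfamiliar with the distinction drawn here between static rules and Bayesian filtering, a toy word-based naive Bayes classifier can be sketched as follows. This is an illustration of the general technique only, not SpamAssassin's actual algorithm or tokenizer; the class name `TinyBayes` and the training messages are made up.

```python
import math
from collections import Counter


class TinyBayes:
    """Toy naive Bayes spam classifier over whitespace-separated words."""

    LABELS = ("spam", "ham")

    def __init__(self):
        self.words = {label: Counter() for label in self.LABELS}
        self.docs = {label: 0 for label in self.LABELS}

    def train(self, label: str, text: str) -> None:
        """Record one hand-classified message under the given label."""
        self.docs[label] += 1
        self.words[label].update(text.lower().split())

    def spam_probability(self, text: str) -> float:
        """Estimate P(spam | text) with add-one (Laplace) smoothing."""
        vocab = set(self.words["spam"]) | set(self.words["ham"])
        total_docs = sum(self.docs.values())
        log_score = {}
        for label in self.LABELS:
            # Smoothed prior for this class.
            score = math.log((self.docs[label] + 1) / (total_docs + 2))
            n_words = sum(self.words[label].values())
            # The extra +1 in the denominator reserves mass for unseen words.
            denom = n_words + len(vocab) + 1
            for word in text.lower().split():
                score += math.log((self.words[label][word] + 1) / denom)
            log_score[label] = score
        # Normalise the two log scores into a probability.
        top = max(log_score.values())
        weight = {k: math.exp(v - top) for k, v in log_score.items()}
        return weight["spam"] / (weight["spam"] + weight["ham"])


clf = TinyBayes()
clf.train("spam", "cheap viagra offer")
clf.train("spam", "buy cheap pills")
clf.train("ham", "package upload failed on sparc")
clf.train("ham", "bug in apt package")
print(clf.spam_probability("cheap pills offer"))
print(clf.spam_probability("apt package bug"))
```

The point of the sketch is that the word statistics come from the recipient's own mail, which is why such a filter has to be trained and retrained by hand, and why it costs more CPU and storage than a fixed rule set.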
Stephen M. Gava
2004-05-10 12:37:04 UTC
Permalink
Post by Wouter Verhelst
Post by Eike "zyro" Sauer
Post by Santiago Vila
lists.debian.org *already* runs spamassassin
Marco d'Itri and Rob Weir tell us they don't.
No, they didn't.
Post by Eike "zyro" Sauer
What's true?
They do run spamassassin, but with bayesian filtering disabled (because
that part is too CPU-intensive).
If this really is a serious problem on the lists that's bothering lots of
folk, and it could really be helped by having the lists host being powerful
enough to easily run spamassassin with bayesian filtering enabled, and given
that (in spite of the seemingly inevitable flamewars which so many find so
boring ;) the lifeblood of this project, the fundamental means by which it is
coordinated, is through the mailing lists: then wouldn't a suitable upgrade
to the lists host be a suitable way to spend a small part of the substantial
amount of debian's money being held by SPI?

Martin?
--
Stephen M. Gava <***@debian.org>
Martin Michlmayr - Debian Project Leader
2004-05-10 13:00:36 UTC
Permalink
Post by Stephen M. Gava
Post by Wouter Verhelst
They do run spamassassin, but with bayesian filtering disabled (because
that part is too CPU-intensive).
If this really is a serious problem on the lists that's bothering lots of
folk, and it could really be helped by having the lists host being powerful
...
Post by Stephen M. Gava
Martin?
A replacement for murphy is being worked on already.
--
Martin Michlmayr
***@debian.org
Stephen M. Gava
2004-05-10 20:47:50 UTC
Permalink
Post by Martin Michlmayr - Debian Project Leader
Post by Stephen M. Gava
If this really is a serious problem on the lists that's bothering lots of
folk, and it could really be helped by having the lists host being powerful
[...]
Post by Martin Michlmayr - Debian Project Leader
A replacement for murphy is being worked on already.
Ok, well, while I don't personally find this to be a really terrible problem
(generally skipping over spam takes less time than skipping over flames on
our lists, no?)
Post by Martin Michlmayr - Debian Project Leader
Murphy is currently running on much slower hardware. There are plans to
fix this, but it may be a while still.
makes it seem like an upgrade won't be a solution for quite a while yet. Do
you, or anyone, have an approximate timeframe for the murphy upgrade?

Cheers,
--
Stephen M. Gava <***@debian.org>
John Hasler
2004-05-10 22:31:02 UTC
Permalink
Post by Stephen M. Gava
Ok, well, while I don't personally find this to be a really terrible problem
(generally skipping over spam takes less time than skipping over flames on
our lists, no?)
No. Flames are easily dealt with by appropriate scoring of threads,
subjects, and authors.
--
John Hasler
***@dhh.gt.org (John Hasler)
Dancing Horse Hill
Elmwood, WI
Michelle Konzack
2004-05-10 21:21:14 UTC
Permalink
Post by Stephen M. Gava
If this really is a serious problem on the lists that's bothering lots of
folk, and it could really be helped by having the lists host being powerful
enough to easily run spamassassin with bayesian filtering enabled, and given
that (in spite of the seemingly inevitable flamewars which so many find so
boring ;) the lifeblood of this project, the fundamental means by which it is
coordinated, is through the mailing lists: then wouldn't a suitable upgrade
to the lists host be a suitable way to spend a small part of the substantial
amount of debian's money being held by SPI?
I think if every subscriber to lists.debian.org "donates"
5 euros, we can have a really big machine ;-)
Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Marc Haber
2004-05-10 21:37:54 UTC
Permalink
On Mon, 10 May 2004 23:21:14 +0200, Michelle Konzack
Post by Michelle Konzack
I think if every subscriber to lists.debian.org "donates"
5 euros, we can have a really big machine ;-)
I am sure that getting a hardware donation is not the problem at hand.

Greetings
Marc
--
-------------------------------------- !! No courtesy copies, please !! -----
Marc Haber | " Questions are the | Mailadresse im Header
Karlsruhe, Germany | Beginning of Wisdom " | Fon: *49 721 966 32 15
Nordisch by Nature | Lt. Worf, TNG "Rightful Heir" | Fax: *49 721 966 31 29
Michelle Konzack
2004-05-10 21:18:41 UTC
Permalink
Post by Wouter Verhelst
They do run spamassassin, but with bayesian filtering disabled (because
that part is too CPU-intensive).
OK, but can you tell me what the machine is?
CPU, memory, HDD...

Maybe Debian needs a bigger machine ;-)
I know some enterprises in Strasbourg and maybe...
...we can get a bigger thing!

Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Blars Blarson
2004-05-10 23:40:05 UTC
Permalink
Post by Michelle Konzack
OK, but can you tell me what the machine is?
CPU, memory, HDD...
http://db.debian.org/machines.cgi?host=murphy
--
Blars Blarson ***@blars.org
http://www.blars.org/blars.html
With Microsoft, failure is not an option. It is a standard feature.
Michelle Konzack
2004-05-11 16:07:48 UTC
Permalink
Thanks.
Post by Blars Blarson
Post by Michelle Konzack
OK, but can you tell me what the machine is?
CPU, memory, HDD...
http://db.debian.org/machines.cgi?host=murphy
I thought it had more power than a PII/400 :-)

Hmmm, I have a quad PPro 200 (1 MB cache) and 1 GByte
of memory lying around... 19" 4U. 3Ware 4-channel
Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Adrian 'Dagurashibanipal' von Bidder
2004-05-10 11:12:27 UTC
Permalink
Post by Eike "zyro" Sauer
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already
eliminates most of the spam which is sent to the lists.
Marco d'Itri and Rob Weir tell us they don't.
What's true?
Nobody (that I saw) said spamassassin was not run. Spamassassin's
bayesian filter does not run. And maybe some of the scores (the HTML
related ones) are not optimized for the Debian lists (which are, thank
god, mostly HTML free.) And, I guess, the charset related things are
not configured optimally, either. If the spamassassin user is different
for the various lists, then each list could be configured to score
things in foreign character sets according to the expected language on
that list. (Disclaimer: I do not know anything about the spamassassin
setup on l.d.o. IANA{DD,L,...})

cheers
-- vbi
--
Could this mail be a fake? (Answer: No! - http://fortytwo.ch/gpg/intro)
Pascal Hakim
2004-05-10 12:37:12 UTC
Permalink
Post by Adrian 'Dagurashibanipal' von Bidder
Nobody (that I saw) said spamassassin was not run. Spamassassin's
bayesian filter does not run. And maybe some of the scores (the HTML
related ones) are not optimized for the Debian lists (which are, thank
god, mostly HTML free.) And, I guess, the charset related things are
not configured optimally, either. If the spamassassin user is different
for the various lists, then each list could be configured to score
things in foreign character sets according to the expected language on
that list. (Disclaimer: I do not know anything about the spamassassin
setup on l.d.o. IANA{DD,L,...})
At the moment all lists use the same setup. There's a bug open on
lists.debian.org relating to this. We could switch to having per
language configuration, but we haven't so far due to the large number of
configuration files we would then need. If someone knows how to include
other config files in spamassassin config files, please let me know =-).


Cheers,

Pasc
Michelle Konzack
2004-05-10 21:16:21 UTC
Permalink
Post by Eike "zyro" Sauer
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
Marco d'Itri and Rob Weir tell us they don't.
What's true?
AFAIK "Pascal Hakim" is a listmaster ;-)
Post by Eike "zyro" Sauer
Ciao,
Eike
Greetings
Michelle
--
Linux-User #280138 with the Linux Counter, http://counter.li.org/
Michelle Konzack Apt. 917 ICQ #328449886
50, rue de Soultz MSM LinuxMichi
0033/3/88452356 67100 Strasbourg/France IRC #Debian (irc.icq.com)
Francesco P. Lovergine
2004-05-10 11:17:03 UTC
Permalink
Post by Santiago Vila
lists.debian.org *already* runs spamassassin, and it already eliminates
most of the spam which is sent to the lists.
The problem is: There is also a huge amount of spam which spamassassin
does not catch. Enough (IMHO) to consider seriously closing the lists.
Add that I already use bogofilter AND spamassassin, and many messages
pass anyway.
--
Francesco P. Lovergine
Blars Blarson
2004-05-10 11:44:43 UTC
Permalink
Post by Marco d'Itri
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
A lot of what I am doing for the BTS is training the spamassassin bayes
filter. The current system used for lists.debian.org reportedly does
not have enough processing power to add bayes filtering. This takes
continuing human time to keep retraining as well.

Tuning the spamassassin rules in reaction to spam runs is also a big
part, and that also takes time.

Someone else will need to do the work on lists.debian.org, I just
don't have the time needed available.
--
Blars Blarson ***@blars.org
http://www.blars.org/blars.html
With Microsoft, failure is not an option. It is a standard feature.
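
As background to the DNSBL suggestion quoted above: a DNS blacklist lookup for an IPv4 sender conventionally reverses the address octets and queries the result under the list's zone, and any answer means the address is listed. A minimal sketch of that convention follows; the zone names are only examples, and `is_listed` performs a real DNS query, so it is defined but not called here.

```python
import ipaddress
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up for an IPv4 address in a DNSBL zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on a malformed address
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip: str, zone: str) -> bool:
    """Return True if the zone answers for this address (needs network access)."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
    except socket.gaierror:
        # No such name: the address is not on this list (or the lookup failed).
        return False
    return True


# 192.0.2.99 is a documentation-only address; dnsbl.example.org is a placeholder zone.
print(dnsbl_query_name("192.0.2.99", "dnsbl.example.org"))
```

The check is cheap compared with content filtering, since it happens before the message body is examined, but as noted earlier in the thread its usefulness depends entirely on how carefully the chosen list is maintained.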
William Ballard
2004-05-10 12:11:42 UTC
Permalink
Post by Blars Blarson
Post by Marco d'Itri
Start using DNSBLs like SBL, XBL and DSBL, or have Blars Blarson try on
murphy the same recipes he uses to filter spam in the BTS.
A lot of what I am doing for the BTS is training the spamassassin bayes
filter. The current system used for lists.debian.org reportedly does
not have enough processing power to add bayes filtering. This takes
continuing human time to keep retraining as well.
Since the solution is bayes and the only reason l.d.o is not using bayes
is "it can't," wouldn't it be useful to have a few people who can do
'bayes' and distribute lists of Message IDs to ignore to the rest of us?

Is the recommended solution really for each of the 10,000 readers of the
list to individually do their own filtering?
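
The idea of distributing lists of Message IDs to ignore could look roughly like this on the subscriber side. This is a sketch under assumptions: the one-ID-per-line file format and the function names are invented for illustration, and no such shared list is known to exist.

```python
from email import message_from_string


def load_ignore_list(lines):
    """Parse a hypothetical ignore list: one Message-ID per line, '#' for comments."""
    ids = set()
    for line in lines:
        entry = line.strip()
        if entry and not entry.startswith("#"):
            ids.add(entry)
    return ids


def should_ignore(raw_message: str, ignore_ids: set) -> bool:
    """Return True if the message's Message-ID header is on the ignore list."""
    msg = message_from_string(raw_message)
    message_id = (msg.get("Message-ID") or "").strip()
    return message_id in ignore_ids


ignore_ids = load_ignore_list([
    "# published by volunteers running a trained filter",
    "<spam-0001@example.invalid>",
])

raw = (
    "From: someone@example.invalid\n"
    "Message-ID: <spam-0001@example.invalid>\n"
    "Subject: cheap pills\n"
    "\n"
    "body\n"
)
print(should_ignore(raw, ignore_ids))
```

Such a list only helps readers who fetch it before reading their mail, and it shifts trust to whoever publishes it, so it complements filtering on the list host rather than replacing it.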
Andrew Suffield
2004-05-10 12:46:17 UTC
Permalink
Post by Santiago Vila
The level of spam in the lists is something like 20% these days [*].
This thread is already larger than the total amount of spam I have
observed on this list in several months. And I don't even spend hours
agonising over the spam "problem", or proposing/implementing
half-cocked schemes to "solve" it.
--
.''`. ** Debian GNU/Linux ** | Andrew Suffield
: :' : http://www.debian.org/ |
`. `' |
`- -><- |
Daniel Burrows
2004-05-10 17:09:37 UTC
Permalink
I have an even better suggestion, one GUARANTEED to stop 100% of the
spam from reaching people who are offended by such things.

We can shut down the mailing lists!

Now, I realize this might sound like a strange suggestion at first,
but think of the benefits:

- First (and, of course, of utmost importance), the number of spam
mails getting through to the lists will fall *immediately* to 0. ZERO.
ZILCH. No more Viagra ads, no more messages in unreadable
charsets, no need to run your own spam filters.

- The load on murphy will drop significantly, allowing it to perform
its non-list-related duties more efficiently.

- The listmasters' jobs will be made much easier, freeing them
to work on more important matters, like flaming each other (in
private mail, of course) over the latest non-free GR.

- The members of the Project as a whole will be able to work on their
technical tasks without being distracted by trivialities such as
user support, coordinating policy with other maintainers, or release
management. Of course, a few important tasks (such as the
flamewar-of-the-month on -devel) will become more difficult, but
accommodations can be made for these corner cases; for instance,
by putting every Developer on the Cc line of critical inflammatory
messages.

And there's no need to stop there! If the level of spam is still not
low enough to meet the approval of all Debian contributors, we could
continue with the trimming of unnecessary services. For instance, do
we really need bugs.debian.org? An obviously accessible bug reporting
service will just encourage users to file bugs, further wasting
developer time.

Instead, we should just encourage users to email developers privately...
after they've brute-force decoded their GPG-encrypted email addresses, of
course; it goes without saying that we'll have to disable @debian.org and
@packages.debian.org email addresses, since spammers know to send mail to
them, and it's certainly unacceptable to place unprotected email addresses
in packages or package metadata. Just make the key length short enough
that a Pentium III can brute-force it in a couple days, and we'll be set.

With these measures, I can guarantee that Debian will become a 100%
SPAM-FREE zone...and God knows nothing else in life is important.

Daniel
--
/-------------------- Daniel Burrows <***@debian.org> -------------------\
| Afternoon, n.: |
| That part of the day we spend worrying |
| about how we wasted the morning. |
\---- Be like the kid in the movie! Play chess! -- http://www.uschess.org ---/
Ean Schuessler
2004-05-10 19:50:19 UTC
Permalink
You are officially a funny guy.
Post by Daniel Burrows
I have an even better suggestion, one GUARANTEED to stop 100% of the
spam from reaching people who are offended by such things.
We can shut down the mailing lists!
--
Ean Schuessler, CTO
Brainfood, Inc.
http://www.brainfood.com