Christian Boltz wrote:
Maybe you should adjust your heuristics a bit, even at the risk of needing a somewhat more complex SQL query ;-) Disclaimer: This is a high-level proposal, I never looked at the database layout or content ;-) and therefore can't offer a ready-to-use SQL query.
Christian, your input is always most welcome - even when not in the form of a ready-to-use SQL query :-)
So how do we differenciate spammers from good users?
Precisely the same question I have been wondering about.
IMHO it isn't too complex and can be answered with two questions:
a) Does the _username_ contain a @ sign?
No ("example_user") -> someone with an openSUSE account -> most likely not a spammer.
Yes ("user@example.com") -> account was created by sending a mail to progress -> continue with b)
b) Does the user profile (like https://progress.opensuse.org/users/76 ) (or the underlaying tables) show activity for this user? Note: checking for "created tickets" _or_ "activity" should be enough.
Yes, at least one ticket created -> most likely it's a valid user who created one or more tickets by mail
No activity -> account created by mail, but no tickets created? That's a strong indication that tickets were deleted, and therefore a strong indication for being a spammer.
My thoughts exactly, I just had not quite formulated them. I was also thinking "account created by mail" + "no further activity" -> something's not right.
I wonder if the resulting list of spammers includes any @suse.* or @opensuse.org addresses. If so, the checks I propose are somewhat broken and need more adjustments ;-)
One idea I have been playing with is to do a simple email verification - anyone writing to admin@.o.o - new? reply with request for a verification. (an email reply). If you don't reply to verify a new account, dismissed. -- Per Jessen, Zürich (6.9°C) Member, openSUSE Heroes