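WIRED: Spam annoys everyone and has slowed the Internet by 60%, yet few people realize that spam also involves a good deal of high technology, especially in the current high-tech battle between spammers and spam fighters.
Apart from blacklists, the most promising anti-spam approach today is probably Bayesian filtering: by learning from the content of earlier spam, a filter automatically estimates the probability that the next email is spam. This is almost a contest between artificial intelligence and human intelligence, because the people who send spam are no amateurs. They often install the best anti-spam tools themselves, such as Inboxer and POPFile, and keep adjusting their sending techniques to avoid being filtered out.
A software engineer has compiled "The Spammers' Compendium", a continually updated list of the cleverest and nastiest tricks used in spam, such as various kinds of invisible HTML code (if you are an amateur spammer, a read through it should teach you quite a bit). The most entertaining trick, though, exploits a recent research topic: the mechanism of human reading (the famous "Can You Raed Tihs"). Some spam messages contain hardly a single correctly spelled word, yet you can read them with little difficulty, such as this one. These techniques are a real challenge for Bayesian filters too, and if such a message does slip through, you may well admire the sender's ingenuity.
This battle is certain to continue; these are just a few stories from spam's high-speed evolution.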
The Spammers' Compendium
Being a public exposition of tricks, secret ploys, ruses and techniques employed by those that send many scurrilous messages through the ether using the mysteries of electronics and other modern marvels to dazzle the eye, lighten the wallet and clog the recipient.
What: Simple description of the entry
Popularity: How common the trick is: common, sometimes, rare
Complexity: How complex the trick is: simple, clever, dastardly
Date added: When this entry was made
Example from the wild: Actual example from email seen in the wild
The Tricks
The Big Picture
What: The entire email consists of a small HTML page consisting of an image enclosed in a single hyperlink. Popularity: Common Complexity: Simple Date added: January 17, 2003 Example from the wild:
April 29, 2003: Scott Schram points out that some instances of this are being sent with valid but unrelated text before and after the image.
Invisible Ink
What: Use of white text on a white background containing words designed to confuse a filter. Popularity: Common Complexity: Clever Date added: January 17, 2003 Example from the wild:
What: Insert a piece of current news in a bogus HTML tag. Popularity: Rare Complexity: Clever Date added: January 17, 2003 Example from the wild:
Hypertextus Interruptus
What: Split words using HTML comments, pairs of zero width tags, or bogus tags Popularity: Common Complexity: Clever Date added: January 17, 2003 Examples from the wild:
millionaire
Find New Friends
Viagra
Free
September 15, 2003: Another example comes from Tim Peters; this one uses a Microsoft-only HTML tag to insert ignored text into the word Viagra:
Via6q5r7gra
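A filter can undo the simplest forms of this trick by deleting comments and tags without putting a space in their place, so the halves of each word rejoin. The following is a minimal sketch under that assumption; the function name `strip_markup` and the sample strings are illustrative and do not come from any particular filter.

```python
import re

# Matches HTML comments such as <!-- junk -->.
COMMENT_RE = re.compile(r"<!--.*?-->", re.DOTALL)
# Matches any remaining tag, real or bogus, such as <b>, </b> or <zzz>.
TAG_RE = re.compile(r"<[^>]*>")


def strip_markup(html_text):
    """Remove comments and tags without inserting spaces, so split words rejoin."""
    without_comments = COMMENT_RE.sub("", html_text)
    return TAG_RE.sub("", without_comments)


if __name__ == "__main__":
    # Hypothetical samples, one per variant of the trick.
    samples = [
        "mill<!-- filler -->ionaire",  # split with an HTML comment
        "Via<b></b>gra",               # split with a zero width tag pair
        "Fr<zzz>ee",                   # split with a bogus tag
    ]
    for sample in samples:
        print(strip_markup(sample))
```

Text placed inside an element that the mail client ignores would survive this simple stripping, so a real filter would also need to know which elements are not rendered.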
Slice and Dice
What: Use a table to send words through as individual letters arranged top to bottom but read left to right Popularity: Rare Complexity: Dastardly Date added: January 17, 2003 Example from the wild: (picture)
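To see why the table trick defeats a filter that reads the HTML source in order, it may help to model each cell as a vertical strip of fragments and rebuild the lines a reader actually sees. This is a minimal sketch; the function `read_across` and the sample cells are hypothetical, and a real filter would first have to parse the table out of the HTML.

```python
from itertools import zip_longest


def read_across(columns):
    """Rebuild the visible lines from cells that each hold a vertical strip of text.

    `columns` is a list of cells in source order; each cell is a list of the
    fragments it shows, top to bottom.
    """
    lines = []
    for row in zip_longest(*columns, fillvalue=""):
        lines.append("".join(row))
    return lines


if __name__ == "__main__":
    # Hypothetical table with three cells. In source order the text reads
    # "F O RE FFE E R", which contains neither of the displayed words.
    columns = [
        ["F", "O"],
        ["RE", "FFE"],
        ["E", "R"],
    ]
    print(read_across(columns))
```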
MIME is Money
What: Send a two-part MIME document: the text/plain part contains bogus text and the text/html part contains the spam message Popularity: Rare Complexity: Very clever Date added: January 17, 2003 Example from the wild:
------=_NextPart_001_2D3DF_01C29D73.26716240
Content-Type: text/plain;

The modes of letting vacant farms, the duty of supplying buildings and permanent
improvements, and the form in which rent is to be received, have all been carefully
discussed in the older financial treatises. Most of these questions belong to
practical administration, and are, moreover, not of great interest in modern times.
Certain plain rules, may, however, be stated. The claims of successors to the late
tenant should not be overlooked; it is better for the tenure to be continued without
break, and therefore the question of new letting ought rarely to
occur.

------=_NextPart_001_2D3DF_01C29D73.26716240
Content-Type: text/html;

Now is the perfect time to get a mortgage,
and we have a simple and free way for you to get started.
September 15, 2003: This trick seems to be getting more common.
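One way a filter might notice this trick is to compare the vocabulary of the two alternative parts, which should say roughly the same thing in an honest message. The sketch below uses Python's standard `email` package; the message in `RAW`, the boundary `XYZ`, and the helper names are all made up for illustration, and the overlap measure is an assumed heuristic rather than an established test.

```python
import re
from email import message_from_string

# Hypothetical two-part message modelled on the example above.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="XYZ"

--XYZ
Content-Type: text/plain

The modes of letting vacant farms have all been carefully discussed.
--XYZ
Content-Type: text/html

<html><body>Now is the perfect time to get a mortgage.</body></html>
--XYZ--
"""


def parts_by_type(raw_message):
    """Map each leaf part's content type to its text payload."""
    message = message_from_string(raw_message)
    found = {}
    for part in message.walk():
        if part.is_multipart():
            continue
        found[part.get_content_type()] = part.get_payload()
    return found


def word_set(text):
    """Lowercase words of a text, with any HTML tags replaced by spaces."""
    text = re.sub(r"<[^>]*>", " ", text)
    return set(re.findall(r"[a-z]+", text.lower()))


def overlap(plain, html_part):
    """Share of words the two parts have in common (0.0 means nothing shared)."""
    a, b = word_set(plain), word_set(html_part)
    return len(a & b) / max(1, len(a | b))


if __name__ == "__main__":
    parts = parts_by_type(RAW)
    print(overlap(parts["text/plain"], parts["text/html"]))
```

A low overlap only hints at the trick; legitimate mail sometimes carries a short text/plain stub next to a full HTML body.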
L O S T i n S P A C E
What: Insert spaces between letters to make words unrecognizable. Popularity: Common Complexity: Simple Date added: January 17, 2003 Examples from the wild:
M O R T G A G E
F*R*E*E V’I’A’G’R’A O*N*L*I*N*E
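A filter can try to reverse this by looking for runs of single letters joined by one repeated separator and squeezing the separator out. The following is a rough sketch; the separator list and the three-letter minimum are assumptions chosen for the illustration.

```python
import re

# A run of at least three single letters joined by the same separator
# (space, asterisk, apostrophe, period, underscore or hyphen).
SPACED_RE = re.compile(r"\b[A-Za-z](?:([ *'’._-])[A-Za-z])(?:\1[A-Za-z])+\b")


def collapse_spaced(text):
    """Rejoin words whose letters were separated one by one."""
    return SPACED_RE.sub(
        lambda match: re.sub(r"[^A-Za-z]", "", match.group(0)), text
    )


if __name__ == "__main__":
    print(collapse_spaced("M O R T G A G E"))
    print(collapse_spaced("F*R*E*E O*N*L*I*N*E"))
```

The heuristic is deliberately crude: a one-letter word such as "a" directly before a space-separated run would be absorbed into it, so a real filter would need a dictionary check or similar safeguard.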
Enigma
What: Use URL encoding to hide URLs Popularity: Rare Complexity: Clever Date added: January 17, 2003 Example:
What: Keep HTML body of email in a Javascript that fires when the email is opened Popularity: Rare Complexity: Clever Date added: January 17, 2003 Example from the wild:
Ze Foreign Accent
What: Replace letters with numbers or use nonsense accents Popularity: Common Complexity: Simple Date added: January 17, 2003 Example from the wild:
What: Large nonsense words designed to mess up CRC based spam identification Popularity: Common Complexity: Clever Date added: January 17, 2003 Example from the wild:
crecrephaswukutugucrovazichonuprixisluwephimajoq
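The reason this works is that a checksum over the whole message changes completely when any text is appended, so two copies of the same spam with different nonsense words no longer look identical. A minimal sketch with `zlib.crc32` from the standard library follows; the message body and the second filler word are invented for the illustration.

```python
import zlib


def checksum(body):
    """CRC-32 of the message body, as a simple stand-in for a spam signature."""
    return zlib.crc32(body.encode("utf-8"))


def with_filler(body, filler):
    """Append a nonsense word to the body, as the spammer's tool would."""
    return body + "\n" + filler


if __name__ == "__main__":
    # Hypothetical spam body; the first filler is the example word above,
    # the second is made up.
    body = "Now is the perfect time to get a mortgage."
    copy_a = with_filler(body, "crecrephaswukutugucrovazichonuprixisluwephimajoq")
    copy_b = with_filler(body, "blorvextimuqaporilenzafu")
    print(checksum(copy_a) == checksum(copy_b))
```

Signature schemes that hash smaller pieces of the message, or that ignore words never seen before, are less sensitive to this kind of padding.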
The Black Hole
What: Use of font size 0 to break up words with zero width spaces Popularity: Rare Complexity: Clever Date added: April 1, 2003 Example from the wild:
V i a g r a
A Numbers Game
What: Use HTML entities instead of letters Popularity: Rare Complexity: Simple Date added: April 1, 2003 Example from the wild:
Watch Dogs slurp young girls puss
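Undoing this trick is straightforward, because numeric character references have a single well-defined decoding. The sketch below uses `html.unescape` from the Python standard library; the encoded strings are illustrative and were not taken from a real message.

```python
import html


def decode_entities(text):
    """Replace HTML character references with the characters they stand for."""
    return html.unescape(text)


if __name__ == "__main__":
    # Decimal and hexadecimal references for ordinary letters.
    print(decode_entities("&#86;&#105;&#97;&#103;&#114;&#97;"))
    print(decode_entities("&#x46;&#x72;&#x65;&#x65;"))
```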
Bogus Login
What: Use URL username@host syntax to disguise a URL. Popularity: Rare Complexity: Simple Date added: April 6, 2003 Example from the wild: (this example also uses % encoding of the URL to further disguise it)
Click Here
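The part of the URL before the @ is only a user name, so the host that is actually contacted is whatever follows it. A short sketch with `urllib.parse` shows how a filter might separate the two and undo the % encoding; the URL and both host names are made up for the illustration.

```python
from urllib.parse import unquote, urlsplit


def disguised_parts(url):
    """Return (decoy user name, real host) for a URL using username@host syntax."""
    parts = urlsplit(url)
    decoy = unquote(parts.username) if parts.username else None
    host = unquote(parts.hostname) if parts.hostname else None
    return decoy, host


if __name__ == "__main__":
    # Hypothetical link: looks like www.example.com, goes somewhere else.
    url = "http://www.example.com@%65%76%69%6c.example.net/login"
    print(disguised_parts(url))
```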
Honey, I shrunk the font
What: Use very small (size 1) font to hide bogus text (see also The Black Hole) Popularity: Rare Complexity: Simple Date added: April 6, 2003 Example from the wild: (Notice how the spammer didn't follow the instructions and managed to leave the instructions in the spam :-) (This spam also uses Invisible Ink for these words)
Random word of BIG LETTERS with length 1 to 22 TSUTHRXJKVUVBECP
Random word of small letters with length 1 to 16 uyswdgueoclrwlf
Random word of mixed symbols with length 1 to 27 7y14R484w1m7531X
Your text 9, note, maximum length of tag is 255 symbols
No Whitespace No Cry
What: Since many languages separate words with spaces, and since many spam filters do the same, this spammer decided that replacing spaces with something else was a good idea. Popularity: Rare Complexity: Dumb Date added: May 15, 2003 Example from the wild:
What: Another way of hiding text in an HTML email by placing it in the which is unlikely to be displayed by the email client. Popularity: Rare Complexity: Simple Date added: May 27, 2003 Example from the wild:
dinosaur reptile ghueej egrjerijg gerrg
Camouflage
What: Like Invisible Ink, but instead of using identical colors (e.g. white on white) use very similar colors. Popularity: Rare Complexity: Very clever Date added: June 2, 2003 Example from the wild: (The colors 1133333, 123939, and 423939 are chosen to be very similar without being the same)
those rearing lands
Plasticine sex-cartoons. eel harness highest Absolutely new category of adu1t sites.nobody jets held Northumbria- diamond sleep
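A filter that only checks for identical foreground and background colors misses this trick, but one that measures how far apart the two colors are can still catch it. The sketch below is one possible approach; the threshold of 32 and the sample colors are assumptions picked for the illustration, not values taken from the spam above.

```python
def hex_to_rgb(color):
    """Convert a six-digit hex color such as '#fefefe' to an (r, g, b) tuple."""
    color = color.lstrip("#")
    return tuple(int(color[i:i + 2], 16) for i in (0, 2, 4))


def color_distance(first, second):
    """Largest difference between the two colors in any one channel."""
    return max(abs(x - y) for x, y in zip(hex_to_rgb(first), hex_to_rgb(second)))


def is_camouflaged(foreground, background, threshold=32):
    """Treat text as hidden when its color is within `threshold` of the background."""
    return color_distance(foreground, background) <= threshold


if __name__ == "__main__":
    # Hypothetical colors: nearly white text on white, then black on white.
    print(is_camouflaged("#fefefe", "#ffffff"))
    print(is_camouflaged("#000000", "#ffffff"))
```

A per-channel difference is a crude stand-in for perceived contrast; a more careful check would compare luminance.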
And In The Right Corner
What: Adding a legitimate but odd word at the far right of the subject line (typically preceded with lots of spaces and tabs). The word is designed to poison a Bayesian filter and alter the spam's hash value. Popularity: Rare Complexity: Clever Date added: June 18, 2003 Example from the wild: (Thanks to Gary Robinson for pointing this one out)
Subject: FEATURED IN MAJOR MAGAZINES algorithmic
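Since the decoy word is pushed out of sight by a long run of whitespace, that run itself is a usable signal. The following sketch separates the visible subject from the trailing word; the five-character minimum gap is an assumption, and the sample subject restores the padding by hand.

```python
import re

# Assumed heuristic: five or more spaces or tabs followed by one final token.
DECOY_RE = re.compile(r"[ \t]{5,}\S+[ \t]*$")


def split_decoy(subject):
    """Return (visible subject, decoy word), or (subject, None) if no decoy is found."""
    match = DECOY_RE.search(subject)
    if match is None:
        return subject, None
    return subject[:match.start()], match.group(0).strip()


if __name__ == "__main__":
    subject = "FEATURED IN MAJOR MAGAZINES" + " " * 40 + "algorithmic"
    print(split_decoy(subject))
```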
A Form of Desperation
What: Hiding text by placing it in the name of a hidden form field Popularity: Rare Complexity: Clever Date added: June 24, 2003 Example from the wild:
Get The LOWEST PRICE On Your New Car
September 15, 2003: Another example came in from Darren J. Young that uses the value attribute and fills it with a phrase from current events:
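Text hidden this way never appears on screen, so a filter may want to pull it out and treat it separately from the visible words. Here is a minimal sketch built on `html.parser.HTMLParser` from the standard library; the class name, helper name, and sample HTML are hypothetical.

```python
from html.parser import HTMLParser


class HiddenFieldCollector(HTMLParser):
    """Collect the name and value attributes of hidden form fields."""

    def __init__(self):
        super().__init__()
        self.hidden_text = []

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if (attributes.get("type") or "").lower() != "hidden":
            return
        for key in ("name", "value"):
            if attributes.get(key):
                self.hidden_text.append(attributes[key])


def hidden_field_text(html_text):
    """Return every name and value found on hidden input fields, in order."""
    collector = HiddenFieldCollector()
    collector.feed(html_text)
    collector.close()
    return collector.hidden_text


if __name__ == "__main__":
    # Hypothetical message body with filler stuffed into hidden fields.
    sample = (
        '<form><input type="hidden" name="dinosaur reptile">'
        '<input type="hidden" name="q" value="phrase from the news"></form>'
        "Get The LOWEST PRICE On Your New Car"
    )
    print(hidden_field_text(sample))
```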
It's Mini Marquee!
What: Using the
webmaster@jgc.org, Copyright (c) 1999-2003 John Graham-Cumming