[Spam] The Spammers' Compendium


(The introductory commentary, originally written in Chinese, comes from "中国数字部落 (DIGIBLOG_ORG) - 无关紧要的消息": http://digiblog.org/)

 

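(P.S.: The Spammers' Compendium uses real examples to catalogue the vast majority of the tricks spam authors use; one cannot help marvelling at the spammers' ingenuity.)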

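WIRED: Spam annoys everyone and has slowed the Internet by 60%, yet few people realize that spam also contains a good deal of high technology, especially in the current high-tech battle between spamming and anti-spam.
Besides blacklists, the anti-spam approach currently considered most promising is probably Bayesian filtering: by learning from the content of earlier spam, the filter automatically estimates the probability that the next email is spam. This is almost a contest between artificial intelligence and human intelligence, because the people who send spam are no amateurs. They often install some of the best anti-spam tools themselves, such as Inboxer and POPFile, and keep adjusting their sending techniques to avoid being filtered out.
A software engineer has produced "The Spammers' Compendium", a continually updated list of the most ingenious and underhanded tricks used in spam, such as various kinds of invisible HTML code (if you are an amateur spammer, you should learn quite a lot from it). The most amusing trick, though, exploits a recent research topic: the mechanism of human reading (the famous "Can You Raed Tihs"). As a result, some spam contains almost no correctly spelled words, yet you can read it with little difficulty, such as this one. These techniques are also a considerable challenge for Bayesian filters; if such a message slips through, you can at least admire the sender's ingenuity.
This battle is bound to continue, and these are just a few stories from the hyper-fast evolution of spam.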
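To make the Bayesian idea mentioned above concrete, here is a minimal sketch of a naive Bayes word-frequency classifier. It is not the algorithm used by Inboxer or POPFile; the training messages, the equal priors, and the function names `train` and `spam_probability` are assumptions chosen for illustration.

```python
import math
from collections import Counter

def train(messages):
    """Count how often each lowercase word appears in a list of messages."""
    counts = Counter()
    for text in messages:
        counts.update(text.lower().split())
    return counts

def spam_probability(text, spam_counts, ham_counts):
    """Naive Bayes estimate of P(spam | words), assuming equal priors."""
    vocab = set(spam_counts) | set(ham_counts)
    spam_total = sum(spam_counts.values())
    ham_total = sum(ham_counts.values())
    log_odds = 0.0
    for word in text.lower().split():
        # Add-one (Laplace) smoothing so unseen words do not zero the product.
        p_spam = (spam_counts[word] + 1) / (spam_total + len(vocab))
        p_ham = (ham_counts[word] + 1) / (ham_total + len(vocab))
        log_odds += math.log(p_spam / p_ham)
    return 1 / (1 + math.exp(-log_odds))

# Hypothetical toy training data.
spam_counts = train(["cheap mortgage now", "free mortgage offer"])
ham_counts = train(["meeting notes attached", "lunch offer tomorrow"])

print(spam_probability("free mortgage", spam_counts, ham_counts))
print(spam_probability("meeting notes", spam_counts, ham_counts))
```

Most of the tricks listed in the compendium work by changing what `text.lower().split()` sees, which is why a filter usually needs a normalization step before tokenizing.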

The Spammers' Compendium

Being a public exposition of tricks,
secret ploys, ruses and techniques
employed by those that send many
scurrilous messages through the ether
using the mysteries of electronics and
other modern marvels to dazzle the eye,
lighten the wallet and clog the recipient.

Background

I gave a talk entitled The Spammers' Compendium at the MIT Spam Conference and decided to keep it updated in a non-Powerpoint form. Hence this page was born.

I last updated it on September 15, 2003

Each entry consists of five items:

What: Simple description of the entry
Popularity: How common the trick is: common, sometimes, rare
Complexity: How complex the trick is: simple, clever, dastardly
Date added: When this entry was made
Example from the wild: Actual example from email seen in the wild

The Tricks

The Big Picture

What: The entire email consists of a small HTML page consisting of an image enclosed in a single hyperlink.
Popularity: Common
Complexity: Simple
Date added: January 17, 2003
Example from the wild:

April 29, 2003: Scott Schram points out that some instances of this are being sent with valid but unrelated text before and after the image.


Invisible Ink

What: Use of white text on a white background containing words designed to confuse a filter.
Popularity: Common
Complexity: Clever
Date added: January 17, 2003
Example from the wild:
search words: suspensory obscurearistocratical meningorachidian unafeared brahmachari

The Daily News

What: Insert a piece of current news in a bogus HTML tag.
Popularity: Rare
Complexity: Clever
Date added: January 17, 2003
Example from the wild:

Hypertextus Interruptus

What: Split words using HTML comments, pairs of zero width tags, or bogus tags
Popularity: Common
Complexity: Clever
Date added: January 17, 2003
Examples from the wild:
millionaireFind New FriendsViagraFree
September 15, 2003: Another example comes from Tim Peters; it uses a Microsoft-only HTML tag to insert ignored text into the word Viagra:
Via6q5r7gra
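A filter can undo the simpler forms of this trick by deleting comments and tags without inserting spaces. The sketch below assumes regular expressions are good enough for this purpose (they are not a full HTML parser). The sample strings are hypothetical reconstructions, since the markup in the examples above did not survive.

```python
import re

COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TAG = re.compile(r"<[^>]*>")

def rejoin_words(html_text):
    """Remove HTML comments, then any remaining tags, without adding spaces."""
    without_comments = COMMENT.sub("", html_text)
    return TAG.sub("", without_comments)

# Hypothetical samples of the three variants named in the entry.
samples = [
    "mill<!-- xyz -->ion<!-- abc -->aire",  # HTML comments
    "V<b></b>ia<i></i>gra",                 # pairs of zero-width tags
    "F<zzz>ree",                            # bogus tag
]

for sample in samples:
    print(rejoin_words(sample))
```

This does not handle tags whose enclosed text is itself ignored by the mail client, as in the Tim Peters example; that case needs knowledge of which elements the client hides.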

Slice and Dice

What: Use a table to send words through as individual letters arranged top to bottom but read left to right
Popularity: Rare
Complexity: Dastardly
Date added: January 17, 2003
Example from the wild: (picture)

MIME is Money

What: Send a two-part MIME document: the text/plain part contains bogus text and the text/html part contains the spam message.
Popularity: Rare
Complexity: Very clever
Date added: January 17, 2003
Example from the wild:
------=_NextPart_001_2D3DF_01C29D73.26716240
Content-Type: text/plain;

The modes of letting vacant farms, the duty of supplying buildings and permanent
improvements, and the form in which rent is to be received, have all been carefully
discussed in the older financial treatises. Most of these questions belong to
practical administration, and are, moreover, not of great interest in modern times.
Certain plain rules, may, however, be stated. The claims of successors to the late
tenant should not be overlooked; it is better for the tenure to be continued without
break, and therefore the question of new letting ought rarely to
occur.

------=_NextPart_001_2D3DF_01C29D73.26716240
Content-Type: text/html;

Now is the perfect time to get a mortgage,
and we have a simple and free way for you to get started.

September 15, 2003: This trick seems to be getting more common.
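One way a filter might notice this trick is to compare the words in the text/plain and text/html parts. The sketch below uses Python's standard `email` package on a shortened, hypothetical message; the boundary `XYZ`, the `word_overlap` helper, and the idea of using word overlap as a signal are assumptions, not something the compendium prescribes.

```python
import re
from email import message_from_string

# Hypothetical, shortened message modelled on the example above.
raw = (
    "MIME-Version: 1.0\n"
    'Content-Type: multipart/alternative; boundary="XYZ"\n'
    "\n"
    "--XYZ\n"
    "Content-Type: text/plain\n"
    "\n"
    "The modes of letting vacant farms have been carefully discussed.\n"
    "--XYZ\n"
    "Content-Type: text/html\n"
    "\n"
    "<html><body>Now is the perfect time to get a mortgage.</body></html>\n"
    "--XYZ--\n"
)

def words(text):
    """Lowercase alphabetic words of a text, with HTML tags stripped."""
    return set(re.findall(r"[a-z]+", re.sub(r"<[^>]*>", " ", text).lower()))

def word_overlap(a, b):
    """Jaccard similarity of the word sets of two texts."""
    wa, wb = words(a), words(b)
    return len(wa & wb) / max(1, len(wa | wb))

msg = message_from_string(raw)
parts = {p.get_content_type(): p.get_payload()
         for p in msg.walk() if not p.is_multipart()}
overlap = word_overlap(parts["text/plain"], parts["text/html"])
print(f"overlap between parts: {overlap:.2f}")
```

A legitimate multipart/alternative message normally says the same thing in both parts, so a very low overlap is suspicious, though the cut-off would need tuning on real mail.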

L O S T i n S P A C E

What: Insert spaces between letters to make words unrecognizable.
Popularity: Common
Complexity: Simple
Date added: January 17, 2003
Examples from the wild:
M O R T G A G E
F*R*E*E V'I'A'G'R'A O*N*L*I*N*E
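A rough way to undo this trick is to look for runs of single letters joined by one repeated separator character and remove the separator. The regular expression and the three-letter minimum below are assumptions for illustration; a real filter would need to guard against more false positives.

```python
import re

SPACED = re.compile(
    r"(?<![A-Za-z0-9])"    # not preceded by a letter or digit
    r"[A-Za-z]"            # first letter
    r"([^A-Za-z0-9\n])"    # the separator character (captured)
    r"[A-Za-z]"            # second letter
    r"(?:\1[A-Za-z])+"     # same separator plus a letter, one or more times
    r"(?![A-Za-z0-9])"     # not followed by a letter or digit
)

def collapse_spaced(text):
    """Rejoin words whose letters are separated by one repeated character."""
    return SPACED.sub(lambda m: m.group(0).replace(m.group(1), ""), text)

print(collapse_spaced("M O R T G A G E"))
print(collapse_spaced("F*R*E*E V'I'A'G'R'A O*N*L*I*N*E"))
```

Requiring the same separator throughout a run keeps adjacent disguised words apart when they use different separators, but two space-separated words in a row would still be merged into one.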

Enigma

What: Use URL encoding to hide URLs
Popularity: Rare
Complexity: Clever
Date added: January 17, 2003
Example:
http://7763631671/obscure.htm
http://0xCeBF9e37/obscure.htm
http://0316.0277.0236.067/obscure.htm
http://3468664375@3468664375/o%62s%63ur%65%2e%68t%6D
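All four forms above can be reduced to one canonical URL. The sketch below parses decimal, hexadecimal, and octal host notations and percent-decodes the path. It assumes that a client wraps an oversized single number modulo 2**32, which is what makes the first example equal to the others. The helper names are hypothetical.

```python
import ipaddress
from urllib.parse import urlsplit, unquote

def parse_number(text):
    """Parse decimal, 0x-prefixed hex, or leading-zero octal."""
    if len(text) > 1 and text.startswith("0") and text.isdigit():
        return int(text, 8)
    return int(text, 0)

def decode_host(host):
    """Return a dotted-quad IPv4 string for numeric hosts, else the host itself."""
    try:
        values = [parse_number(p) for p in host.split(".")]
    except ValueError:
        return host  # an ordinary hostname
    if len(values) == 1:
        # Assumption: oversized values wrap modulo 2**32.
        return str(ipaddress.IPv4Address(values[0] % 2**32))
    if len(values) == 4 and all(0 <= v <= 255 for v in values):
        return ".".join(map(str, values))
    return host

def reveal(url):
    """Rewrite a disguised URL with a plain host and a decoded path."""
    parts = urlsplit(url)
    return f"{parts.scheme}://{decode_host(parts.hostname or '')}{unquote(parts.path)}"

for url in [
    "http://7763631671/obscure.htm",
    "http://0xCeBF9e37/obscure.htm",
    "http://0316.0277.0236.067/obscure.htm",
    "http://3468664375@3468664375/o%62s%63ur%65%2e%68t%6D",
]:
    print(reveal(url))
```

Under that wrapping assumption every example resolves to the same address, 206.191.158.55, and the same path, /obscure.htm.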

Script Writer

What: Keep the HTML body of the email in a JavaScript script that fires when the email is opened
Popularity: Rare
Complexity: Clever
Date added: January 17, 2003
Example from the wild:
    

Ze Foreign Accent

What: Replace letters with numbers or use nonsense accents
Popularity: Common
Complexity: Simple
Date added: January 17, 2003
Example from the wild:
V1DE0 T4PE M0RTG4GE
Fántástìç -- eárn mõnéy thrôugh unçõlleçted judgments
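Both variants of this trick can be reversed before tokenizing: accents by Unicode decomposition, digits by a substitution table. The table below is a hypothetical, deliberately small mapping; applying it blindly will also mangle genuine numbers, so a real filter would likely apply it only to tokens that mix letters and digits.

```python
import unicodedata

# Hypothetical digit-to-letter map; real spam uses many more substitutions.
LEET = str.maketrans({"0": "o", "1": "i", "3": "e", "4": "a", "5": "s", "7": "t"})

def strip_accents(text):
    """Decompose accented characters and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def normalize(text):
    """Lowercase, remove accents, and map look-alike digits to letters."""
    return strip_accents(text).lower().translate(LEET)

print(normalize("V1DE0 T4PE M0RTG4GE"))
print(normalize("Fántástìç"))
```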

Speaking in Tongues

What: Large nonsense words designed to mess up CRC based spam identification
Popularity: Common
Complexity: Clever
Date added: January 17, 2003
Example from the wild:
crecrephaswukutugucrovazichonuprixisluwephimajoq

The Black Hole

What: Use of font size 0 to break up words with zero width spaces
Popularity: Rare
Complexity: Clever
Date added: April 1, 2003
Example from the wild:
V i a g r a

A Numbers Game

What: Use HTML entities instead of letters
Popularity: Rare
Complexity: Simple
Date added: April 1, 2003
Example from the wild:
Watch Dogs slurp young girls puss
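Decoding numeric entities is straightforward with the Python standard library. The encoded strings below are hypothetical samples, since the entities in the example above were already rendered by the time this page was captured.

```python
import html

# Hypothetical samples: decimal and hexadecimal character references.
decimal_encoded = "&#86;&#105;&#97;&#103;&#114;&#97;"
hex_encoded = "&#x46;&#x72;&#x65;&#x65;"

print(html.unescape(decimal_encoded))
print(html.unescape(hex_encoded))
```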

Bogus Login

What: Use URL username@host syntax to disguise a URL.
Popularity: Rare
Complexity: Simple
Date added: April 6, 2003
Example from the wild: (this example also uses % encoding of the URL to further disguise it)
Click Here
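In a URL of the form username@host, everything before the @ is only a login name, and the real destination is what follows. The sketch below uses `urllib.parse.urlsplit` to separate the two. The example URL and the heuristic "the username contains a dot" are assumptions for illustration.

```python
from urllib.parse import urlsplit

def bogus_login(url):
    """Return (username looks like a hostname, real destination host)."""
    parts = urlsplit(url)
    user = parts.username
    looks_like_host = bool(user) and "." in user
    return looks_like_host, parts.hostname

# Hypothetical URL: the reader sees a bank name, the browser goes to the IP.
print(bogus_login("http://www.example-bank.com@203.0.113.7/login"))
print(bogus_login("http://example.org/"))
```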

Honey, I shrunk the font

What: Use very small (size 1) font to hide bogus text (see also The Black Hole)
Popularity: Rare
Complexity: Simple
Date added: April 6, 2003
Example from the wild: (Notice how the spammer didn't follow the instructions and managed to leave them in the spam :-) This spam also uses Invisible Ink for these words.)

Random word of BIG LETTERS with length 1 to 22 TSUTHRXJKVUVBECP

Random word of small letters with length 1 to 16 uyswdgueoclrwlf

Random word of mixed symbols with length 1 to 27 7y14R484w1m7531X

Your text 9, note, maximum length of tag is 255 symbols


No Whitespace No Cry

What: Since many languages separate words with spaces, and since many spam filters do the same, this spammer decided that replacing spaces with something else was a good idea.
Popularity: Rare
Complexity: Dumb
Date added: May 15, 2003
Example from the wild:
DidAyouFknowNyouMcanBgetVprescriptionVmedications prescribedTonlineTwith       NORPRIORRPRESCRIPTIONRREQUIRED!      WeZhaveztheXlargestLselectionLofNprescriptionsNavailableZonline!      LowestzPrices -- NextzDayxDelivery

Honorary Title

What: Another way of hiding text in an HTML email by placing it in the <title> tag, which is unlikely to be displayed by the email client.
Popularity: Rare
Complexity: Simple
Date added: May 27, 2003
Example from the wild:
<title>dinosaur reptile ghueej egrjerijg gerrg

Camouflage

What: Like Invisible Ink, but instead of using identical colors (e.g. white on white) use very similar colors.
Popularity: Rare
Complexity: Very clever
Date added: June 2, 2003
Example from the wild: (The colors 1133333, 123939, and 423939 are chosen to be very similar without being the same)
those rearing lands

Plasticine sex-cartoons.
eel harness highest
Absolutely new category of adu1t sites.nobody jets held
Northumbria- diamond sleep
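A filter can treat Camouflage and Invisible Ink as the same problem by measuring how far apart the text and background colors are. The sketch below uses Euclidean distance in RGB space; the threshold of 64 is an arbitrary assumption, and `math.dist` requires Python 3.8 or later.

```python
import math

def rgb(hex_color):
    """Convert a 6-digit hex color such as '#123939' to an (r, g, b) tuple."""
    digits = hex_color.lstrip("#")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))

def nearly_invisible(text_color, background, threshold=64.0):
    """True when two colors are closer than the (assumed) threshold."""
    return math.dist(rgb(text_color), rgb(background)) < threshold

# Two of the colors quoted in the example above differ only in the red channel.
print(math.dist(rgb("#123939"), rgb("#423939")))
print(nearly_invisible("#123939", "#423939"))
```

Plain RGB distance is only a rough proxy for what the eye can distinguish; a perceptual color space would be a better fit if accuracy matters.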

And In The Right Corner

What: Adding a legitimate but odd word at the far right of the subject line (typically preceded by lots of spaces and tabs). The word is designed to poison a Bayesian filter and alter the spam's hash value.
Popularity: Rare
Complexity: Clever
Date added: June 18, 2003
Example from the wild: (Thanks to Gary Robinson for pointing this one out)
Subject: FEATURED IN MAJOR MAGAZINES                                   algorithmic

A Form of Desperation

What: Hiding text by placing it in the name of a hidden form field
Popularity: Rare
Complexity: Clever
Date added: June 24, 2003
Example from the wild:
Get The  LOWEST PRICE  On Your New Car
September 15, 2003: Another example came in from Darren J. Young that uses the value attribute and fills it with a phrase from current events:

It's Mini Marquee!

What: Using the <marquee> tag the spammer can hide text in a tiny unobtrusive square.
Popularity: Rare
Complexity: Fairly Clever
Date added: July 9, 2003
Example from the wild:
Did you ever play that game when you were a kid where the little plastic hippo tries to gobble up all your marbles?

You've been framed

What: Using frame-related tags (<frame> and <noframes> in the example below) the spammer can hide text and break up words.
Popularity: Fairly Common
Complexity: Fairly Clever
Date added: September 15, 2003
Example from the wild:
Ere<frame><noframes>ywl55ctions

Control Freak

What: Use of non-printing characters, especially in the Subject and especially NUL to mess up filters that use 0 terminated strings.
Popularity: Rare
Complexity: Clever
Date added: September 15, 2003
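The effect of an embedded NUL, and one way to defend against it, can be sketched in a few lines. The subject string is a hypothetical sample, and the choice to keep tab, newline, and carriage return is an assumption.

```python
import unicodedata

def strip_control_chars(text, keep="\t\n\r"):
    """Drop characters in Unicode category Cc, except those listed in keep."""
    return "".join(ch for ch in text
                   if ch in keep or unicodedata.category(ch) != "Cc")

subject = "FREE\x00 MORT\x07GAGE"

# What code that stops at the first NUL would see:
truncated = subject.split("\x00", 1)[0]
print(repr(truncated))

# What a filter sees after removing non-printing characters:
print(repr(strip_control_chars(subject)))
```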

Don't Cramp My Style

What: Enclose text within <style> tags to hide it from the user but confuse filters.
Popularity: Very Rare
Complexity: Fairly Clever
Date added: September 15, 2003
Example from the wild:
RANDOM

Common Encodings

Many spam emails use quoted-printable and base64 encoding on top of the tricks outlined above. Any spam filter needs to be able to understand both of these and MIME nested encoding (e.g. base64 on top of quoted-printable). A quoted-printable example from the wild (it uses the Black Hole trick):
V i a g r&nbs=p;a
A base64 example from the wild (note that this used very long base64 lines that do not meet the standard):
------=_NextPart_000_60BF_00005753.000048CC
Content-Type: text/html;charset="iso-8859-1"
Content-Transfer-Encoding: base64

PEhUTUw+PEJPRFkgQkdDT0xPUj0iIzAwMDAwMCI+PC9QPjxQIEFMSUdOPUNFTlRFUj48Rk9OVCAgQ09MT1I9IiNmZjAwMDAiIEJBQ0s9IiMwMDAwMDAiIHN0eWxlPSJCQUNLR1JPVU5ELUNPTE9SOiAjMDAwMDAwIiBTSVpFPTYgUFRTSVpFPTI0PlRoZSBob3R0ZXN0IEdpcmxzIE9ubGluZSE8QlI+DQpTdG9wIHdhc3RpbmcgeW91ciB0aW1lIHdpdGggNSBzZWM8QlI+DQp2aWRlbyBjbGlwcyEgQ29tZSB0byBvdXIgc2l0ZSBmb3I8QlI+DQpGcmVlIEZ1bGwgTGVuZ3RoIE1vdmllcyE8QlI+DQo8QSBIUkVGPSJodHRwOi8vd2NhbWF0ZXVycy5jb20vbC9ibCI+V2h5IHdhaXQsIHNlZSBmb3IgRnJlZTwvQT48L0ZPTlQ+PEZPTlQgIENPTE9SPSIjZmYwMDAwIiBCQUNLPSIjMDAwMDAwIiBzdHlsZT0iQkFDS0dST1VORC1DT0xPUjogIzAwMDAwMCIgU0laRT02IFBUU0laRT0yNCBGQU1JTFk9IlNBTlNTRVJJRiIgRkFDRT0iQXJpYWwiIExBTkc9IjAiPjxCUj4NCjwvUD48UCBBTElHTj1MRUZUPjwvRk9OVD48Rk9OVCAgQ09MT1I9IiNmZjAwMDAiIEJBQ0s9IiMwMDAwMDAiIHN0eWxlPSJCQUNLR1JPVU5ELUNPTE9SOiAjMDAwMDAwIiBTSVpFPTMgUFRTSVpFPTExIEZBTUlMWT0iU0FOU1NFUklGIiBGQUNFPSJBcmlhbCIgTEFORz0iMCI+PEJSPg0KPC9GT05UPjxGT05UICBDT0xPUj0iIzAwMDBmZiIgQkFDSz0iIzAwMDAwMCIgc3R5bGU9IkJBQ0tHUk9VTkQtQ09MT1I6ICMwMDAwMDAiIFNJWkU9NiBQVFNJWkU9MjQgR! kFNSUxZPSJTQU5TU0VSSUYiIEZBQ0U9IkFyaWFsIiBMQU5HPSIwIj48QSBIUkVGPSJodHRwOi8vd2NhbWF0ZXVycy5jb20vbC9yIj5ObyBtb3JlIG1haWwgaGVyZTwvQT48L0ZPTlQ+PC9IVE1MPg0K
------=_NextPart_000_60BF_00005753.000048CC--
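Undoing these transfer encodings, including a nested pair, can be sketched with the standard `base64` and `quopri` modules. The `decode_layers` helper and the sample payloads are hypothetical. By default `base64.b64decode` discards characters outside the base64 alphabet, such as line breaks, so over-long or oddly wrapped lines do not stop it.

```python
import base64
import quopri

def decode_layers(payload, encodings):
    """Undo transfer encodings in the order given (outermost first)."""
    for name in encodings:
        if name == "base64":
            payload = base64.b64decode(payload)
        elif name == "quoted-printable":
            payload = quopri.decodestring(payload)
        else:
            raise ValueError(f"unsupported encoding: {name}")
    return payload

# "=69" is the letter i; "=" at the end of a line is a soft line break.
qp_sample = b"V=69a=\ngra"
# base64 applied on top of quoted-printable.
nested = base64.b64encode(qp_sample)

print(decode_layers(qp_sample, ["quoted-printable"]))
print(decode_layers(nested, ["base64", "quoted-printable"]))
print(decode_layers(b"PGh0\nbWw+", ["base64"]))
```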

A Complex Example

This is an example of a real email that uses multiple techniques to disguise its contents:
PGh0bWw+DQo8YSBocmVmPSJodHRwOi8vJTc3JTc3dy5wJTYxJTczJTczNCU2NiU3MmUlNjUlMkVuZXQvcGIzLyIgVDhJPjxGT05UIFNJWkU9NT48Qj4mIzg3OyYjOTc7PCFLND50PCE0YTQ1PmMmIzEwNDs8IVBKMHV1PiAmIzY4OzwhT1UxMGRRPm88IWgzMj5nPCFOWDc4PnM8IUY0NzZ0PiAmIzExNTsmIzEwODs8IXkweDY+dSYjMTE0OzwhV1ZRPnAmIzMyOzwhMW0+eTwhS1NrUD5vPCFvMzVBZT51JiMxMTA7JiMxMDM7PCE0N2ViVTM+ICYjMTAzOyYjMTA1OyYjMTE0OyYjMTA4OyYjMTE1OyYjMzI7PCF5MjU+cCYjMTE3OzwhOFljPnMmIzExNTsmIzEyMTs8ITVSaTQ+JzwhcEdTNj5zJiMzMjsmIzk3OzwhQWgxPnMmIzMyOyYjMTE2OyYjMTA0OzwhMXJKM1JIPmU8IW84V1h1PnkmIzMyOzwhMzU+czwhMFE3ND5jJiMxMTQ7PCFSZnA+ZTwhUGw+YTwhSzQ+bTwhNGE0NT4gJiMxMDI7PCFQSjB1dT5vJiMxMTQ7PCFPVTEwZFE+IDwhaDMyPm08IU5YNzg+bzwhRjQ3NnQ+ciYjMTAxOyYjMzM7PC9mb250PjwvYT48QlI+DQo8QlIgck0wc1JhUHE+PGEgaHJlZj0iaHR0cDovL3d3dyUyRSU3MCU2MSU3MyU3MyUzNGZyZWUlMkUlNkUlNjV0L3BiMy8iIDFySjNSSEJvOFcgdW5TVlQ3PjxGT05UIFNJWkU9ND48Qj48IXkweDY+QyYjMTA4OzwhV1ZRPmkmIzk5OzwhMW0+azwhS1NrUD4gPCFvMzVBZT5IJiMxMDE7JiMxMTQ7PCE0N2ViVTM+ZTwvZm9udD48L2E+PEJSPjxCUj48QlI+PEJSPjxCUj48QlI+PEJSPiYjMTM7JiMxMDsmIzY5OyYjMTA5OyYjOTc7JiMxMDU7PCF5MjU+bCYjMzI7PCE4WWM+QiYjOTc7JiMxMDA7PCE1Umk0Pj88QlIgUlIgMk1PZHZjTT4NCm5vIG1vcmUgPGEgaHJlZj0iaHR0cDovL3JlbW92ZSUyRSU2RGUlNzMlNzNhJTY3JTY1bSU2NW4lNkYlNzcuJTZFZXQvIiBSZnBOUD5DbGljayBIZXJlPC9hPjxCUj4NCjxCUj48L2h0bWw+DQoNCmFQcTgyTU9kICAgICAgICAgICAgICAgICAgICAgICAgICAgICAgICBjTUo=
Removing the base64 encoding reveals the following odd looking HTML.
 Watch Dogs slurp younggirls pussy's as they scream for more!

Click Here






Email Bad?
no more Click Here

aPq82MOd cMJ
The email uses bad HTML tags to split words (Hypertextus Interruptus), URL encoding to hide the URLs used (Enigma), HTML entities to hide letters (A Numbers Game) and spaces (Lost in Space). Removing the bad HTML used to split words (Hypertextus Interruptus) reveals:
 WatchDogs slurp young girls pussy's as they scream for more!

Cl ic k Her e






Emai l Bad?
no more Click Here

aPq82MOd cMJ
Removing the URL encoding (Enigma) reveals:
Watch Dogs slurp young girls pussy's as they scream for more!

Clic k Her e






Emai l Bad?
no more Click Here

aPq82MOd cMJ
Then removing the HTML entities (A Numbers Game) reveals the true message:
 Watch dogs slurp young girls pussy's as they scream for more!

Cl i c k Her e






Email Bad?
no moreClick Here

aPq82MOd cMJ

webmaster@jgc.org, Copyright (c) 1999-2003 John Graham-Cumming

 

 


