Spam Attachments Features: (a) Habul Dataset, (b) Botnet Dataset

| Rank | (a) Habul: Category | (a) Habul: Feature | (b) Botnet: Category | (b) Botnet: Feature |
|------|---------------------|--------------------|----------------------|---------------------|
| 1 | Subject | Number of capitalized words | Subject | Min of the compression ratio for the bz2 compressor |
| 2 | Subject | Sum of all the character lengths of words | Subject | Min of the compression ratio for the zlib compressor |
| 3 | Subject | Number of words containing letters and numbers | Subject | Min of character diversity of each word |
| 4 | Subject | Max of ratio of digit characters to all characters of each word | Subject | Min of the compression ratio for the lzw compressor |
| 5 | Header | Hour of day when email was sent | Subject | Max of the character lengths of words |

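Spam URLs Features: (c) Habul Dataset, (d) Botnet Dataset

| Rank | (c) Habul: Category | (c) Habul: Feature | (d) Botnet: Category | (d) Botnet: Feature |
|------|---------------------|--------------------|----------------------|---------------------|
| 1 | URL | The number of all URLs in an email | Header | Day of week when email was sent |
| 2 | URL | The number of unique URLs in an email | Payload | Number of characters |
| 3 | Payload | Number of words containing letters and numbers | Payload | Sum of all the character lengths of words |
| 4 | Payload | Min of the compression ratio for the bz2 compressor | Header | Minute of hour when email was sent |
| 5 | Payload | Number of words containing only letters | Header | Hour of day when email was sent |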
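
Several of the text features listed above can be illustrated with a short Python sketch. The definitions here are assumptions made for illustration, not the source's definitions: words are taken to be whitespace-delimited, a "capitalized word" is one whose first character is uppercase, the compression ratio is compressed size divided by original size computed per word, and character diversity is the number of distinct characters divided by word length. The function names are hypothetical. The lzw compressor is omitted because the Python standard library does not provide one.

```python
import bz2
import zlib


def compression_ratio(word, compress):
    """Compressed size / original size in bytes (assumed definition)."""
    raw = word.encode("utf-8")
    return len(compress(raw)) / len(raw)


def char_diversity(word):
    """Distinct characters / word length (assumed definition)."""
    return len(set(word)) / len(word)


def digit_ratio(word):
    """Share of digit characters among all characters of the word."""
    return sum(ch.isdigit() for ch in word) / len(word)


def text_features(text):
    """Compute a few word-level features for a subject line or payload."""
    ws = text.split()  # assumed tokenization: split on whitespace
    if not ws:
        return {}
    lengths = [len(w) for w in ws]
    return {
        # Assumption: "capitalized" means the first character is uppercase.
        "num_capitalized_words": sum(w[0].isupper() for w in ws),
        "sum_word_lengths": sum(lengths),
        "max_word_length": max(lengths),
        "num_words_letters_and_numbers": sum(
            any(c.isalpha() for c in w) and any(c.isdigit() for c in w)
            for w in ws
        ),
        "num_words_only_letters": sum(w.isalpha() for w in ws),
        "max_digit_ratio": max(digit_ratio(w) for w in ws),
        "min_char_diversity": min(char_diversity(w) for w in ws),
        "min_bz2_ratio": min(compression_ratio(w, bz2.compress) for w in ws),
        "min_zlib_ratio": min(compression_ratio(w, zlib.compress) for w in ws),
    }


if __name__ == "__main__":
    for name, value in text_features("FREE V1agra 2024 now").items():
        print(f"{name}: {value}")
```

Note that for very short words the compressors' fixed overhead makes the ratio exceed 1, so under these assumed definitions the minimum tends to be driven by the longest or most repetitive word.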