Spam Attachments Features
Habul Dataset | | | Botnet Dataset | | |
Rank | Category | Feature | Rank | Category | Feature |
1 | Subject | Number of capitalized words | 1 | Subject | Min of the compression ratio for the bz2 compressor |
2 | Subject | Sum of all the character lengths of words | 2 | Subject | Min of the compression ratio for the zlib compressor |
3 | Subject | Number of words containing letters and numbers | 3 | Subject | Min of character diversity of each word |
4 | Subject | Max of ratio of digit characters to all characters of each word | 4 | Subject | Min of the compression ratio for the lzw compressor |
5 | Header | Hour of day when email was sent | 5 | Subject | Max of the character lengths of words |
(a) | | | (b) | | |
Spam URLs Features
Habul Dataset | | | Botnet Dataset | | |
Rank | Category | Feature | Rank | Category | Feature |
1 | URL | The number of all URLs in an email | 1 | Header | Day of week when email was sent |
2 | URL | The number of unique URLs in an email | 2 | Payload | Number of characters |
3 | Payload | Number of words containing letters and numbers | 3 | Payload | Sum of all the character lengths of words |
4 | Payload | Min of the compression ratio for the bz2 compressor | 4 | Header | Minute of hour when email was sent |
5 | Payload | Number of words containing only letters | 5 | Header | Hour of day when email was sent |
(c) | | | (d) | | |
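
The ranked features above are mostly simple per-word or per-message statistics. As a rough illustration only, the Python sketch below shows one way a few of them might be computed. Everything in it is an assumption made for this example rather than a definition taken from the source: the function names (`text_features`, `url_features`, `header_time_features`), whitespace tokenisation, treating a word as capitalized when its first character is uppercase, character diversity as distinct characters divided by word length, compression ratio as compressed size over original size for each word, and the URL regular expression. The lzw compressor is left out because the Python standard library does not ship an LZW implementation.

```python
import bz2
import re
import zlib
from email.utils import parsedate_to_datetime

# Assumed URL pattern for illustration; real spam parsing may need more care.
URL_RE = re.compile(r"https?://\S+")


def compression_ratio(data: bytes, compress) -> float:
    """Compressed size divided by original size (assumed definition)."""
    return len(compress(data)) / len(data)


def text_features(text: str) -> dict:
    """Per-word statistics for a subject line or payload (hypothetical helper)."""
    words = text.split()  # assumption: words are whitespace-separated tokens
    if not words:
        return {}
    encoded = [w.encode("utf-8") for w in words]
    return {
        # assumption: "capitalized" means the first character is uppercase
        "num_capitalized_words": sum(1 for w in words if w[0].isupper()),
        "sum_word_lengths": sum(len(w) for w in words),
        "num_words_letters_and_numbers": sum(
            1 for w in words
            if any(c.isalpha() for c in w) and any(c.isdigit() for c in w)
        ),
        "num_words_only_letters": sum(1 for w in words if w.isalpha()),
        "max_digit_ratio": max(
            sum(c.isdigit() for c in w) / len(w) for w in words
        ),
        "max_word_length": max(len(w) for w in words),
        # assumption: diversity = distinct characters / word length
        "min_char_diversity": min(len(set(w)) / len(w) for w in words),
        "min_bz2_ratio": min(compression_ratio(b, bz2.compress) for b in encoded),
        "min_zlib_ratio": min(compression_ratio(b, zlib.compress) for b in encoded),
    }


def url_features(text: str) -> dict:
    """Total and unique URL counts in a message body."""
    urls = URL_RE.findall(text)
    return {"num_urls": len(urls), "num_unique_urls": len(set(urls))}


def header_time_features(date_header: str) -> dict:
    """Time-of-sending features parsed from an RFC 2822 style Date header."""
    sent = parsedate_to_datetime(date_header)
    return {
        "day_of_week": sent.weekday(),  # Monday is 0
        "hour_of_day": sent.hour,
        "minute_of_hour": sent.minute,
    }
```

A call such as `text_features("WIN 2 FREE iPhone5 now")` returns a dictionary keyed by feature name, which could then be fed to whatever ranking method is in use. Note that very short words usually produce compression ratios above 1, because the compressor's fixed overhead exceeds the size of the input.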