Dataset Name | Author | Date | Real or Simulated | Features | DDoS Attack Types | Dataset Size | Availability | Advantages | Limitations |
KDD’99 Cup dataset [8] | MIT Lincoln Labs | | Simulated | -two weeks of attack-free instances and five weeks of attack instances -output divided into 5 categories: DoS, Probe, R2L, U2R, and Normal -38 attack types in total | SYN flood | 743 MB | Available | -easily obtainable -many attack types available | -heavily imbalanced dataset with 80% attack traffic |
CAIDA DDoS Attack 2007 dataset [9] | Paul Hick | Aug 4, 2007 | Simulated | -consists of data anonymized within one hour -resource consumer | UDP flood | 21 GB | Quasi-restricted | -available for public use -effective for handling large DDoS attacks above 5 Gb -traces can be read by any software that reads tcpdump format | -non-attack traffic is unavailable -does not include packet payloads |
EPA http dataset | Laura Bottomley | Aug 29, 1995 | Real | -46,014 GET requests -1,622 POST requests -107 HEAD requests -6 invalid requests -one-second timestamp accuracy | HTTP flooding | 4.4 MB | Available | -smaller dataset size | -cannot distinguish legitimate from illegitimate HTTP requests -small dataset may limit the extent of attack detection |
DARPA_2009_malware-DDoS_attack-20091104 | University of Southern California-Information Sciences Institute | Nov 4, 2009 | Real | -background traffic and malware attack on compromised hosts in the 172.28.0.0/16 IP range -attack performed on a non-local target at IP 152.162.178.254, TCP port 499 | Malware DDoS attack | 346.5 MB | Quasi-restricted | -contains attack vectors from real DDoS attacks | |
DARPA_2009_DDoS_attack-20091105 | University of Southern California-Information Sciences Institute | Nov 5, 2009 | Real | -SYN flood targeting one IP address (172.28.4.7) -attack accompanied by background traffic -DDoS traffic from 100 separate IPs | SYN flood | 1.01 GB | Quasi-restricted | -consists of attacks from multiple real sources, so attack vectors can be learned | -attack targets only one victim, so it does not reflect overall network strength |
NSL-KDD dataset [10] | Mahbod Tavallaee, Ebrahim Bagheri, Wei Lu, Ali A. Ghorbani | 2009 | Simulated | -continuous duration -discrete protocol -discrete service | Back, Land, Neptune, Process table, Worm (10), Apache2 | 124 MB | Available | | |
ISCX dataset [11] | Unknown | June 11, 2010 to June 17, 2010 | Simulated | -practical network and traffic -labeled dataset -different intrusion scenarios | HTTP, SMTP, SSH, IMAP, FTP | 84.46 GB | Available | | |
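
The table lists class imbalance as a limitation of the KDD’99 dataset (80% attack traffic). As a rough illustration of how such a figure can be checked on any labeled dataset, the sketch below computes each label's share of the records. The function name `class_shares`, the label strings, and the toy sample are assumptions made for illustration only; they are not taken from any of the datasets above.

```python
from collections import Counter


def class_shares(labels):
    """Return each label's share of the records as a fraction of the total."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {label: count / total for label, count in counts.items()}


# Hypothetical toy labels, not drawn from any dataset in the table.
sample_labels = ["dos"] * 8 + ["normal"] * 2

shares = class_shares(sample_labels)
# Everything that is not labeled "normal" is treated as attack traffic here.
attack_share = 1.0 - shares.get("normal", 0.0)
print(f"attack share: {attack_share:.0%}")
```

On a real dataset, the labels would come from the label column of the records, and the name of the benign class may differ from the `"normal"` string assumed here.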