Year & Reference

Method

Findings

Limitations

2009

[13]

Proposed a mobile agent-based mechanism combined with coloring (i.e., robust watermarking) to identify information leakage sources.

The proposed method effectively identifies potential leakage sources from both covert channels and insiders.

Implemented primarily through modification of SELinux kernel modules. Experimental details and results for the host-resident agents in the technique were not presented.

2010

[20]

Proposed a model to assess malicious and honest users.

The model effectively classifies malicious and honest users and prevents the distribution of files to malicious users, thus preventing data leakage.

The model uses a single classification technique to classify malicious and honest users.

2010

[17]

Proposed iLeak, a system for personal data loss detection that is lightweight compared with other proposed systems.

This lightweight system effectively prevents inadvertent data leaks and imposes an overhead of 4% on the protected systems and applications.

The detection approach relies on keywords to represent sensitive information, so false alerts are possible.

2011

[23]

Proposed an algorithm for automatic classification of corporate documents as sensitive or non-sensitive.

Effectively classifies sensitive corporate documents and works well on big data.

Most of the works studied employed a single machine learning technique (SVM) for the document classifiers.

2012

[16]

Developed two models: a watcher model and a guilt model. The watcher model identifies unauthorized access, and the guilt model defines the probability of identifying the guilty distribution parties.

Assesses the probability that an agent is responsible for the data leakage.

The models developed were yet to be evaluated.

2013

[12]

Makes use of the user’s guilt probability to define a file allocation plan.

Effectively identifies the leak source and provides a file allocation plan.

Provides little or no support for alert handling.

2015

[18]

Defines criteria for characterizing the significance and relevance of data attacks, along with advanced criteria for characterizing data loss incidents.

Complete protection against data loss in the corporate sector is impossible, as human involvement is a key deciding factor in data/information leakage prevention.

No practical/functional Information Leakage Detection and Prevention system had been implemented for a distributed system.

2015

[24]

Proposed a dynamic three-phase data leakage detection scheme.

The proposed method efficiently identifies anomalous behavior and detects and classifies the data leakage resources.

The results presented by the author indicated that C4.5 is the best machine learning technique, but C4.5 does not work well on a small training set.
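
The limitation listed for [17], that keyword-based detection can raise false alerts, can be illustrated with a minimal sketch. This is not the iLeak implementation. The keyword list, the function name `keyword_alerts`, and the sample messages are all hypothetical and chosen only for illustration.

```python
import re

# Hypothetical keyword list; a real system would define its own.
SENSITIVE_KEYWORDS = {"confidential", "salary", "password"}

def keyword_alerts(text, keywords=SENSITIVE_KEYWORDS):
    """Return the sorted sensitive keywords found in text (case-insensitive, whole words)."""
    words = set(re.findall(r"[a-z]+", text.lower()))
    return sorted(words & keywords)

# Hypothetical messages: one actual leak, one harmless reminder.
leak = "Attached is the confidential salary sheet for Q3."
benign = "Remember to change your password every 90 days."

print(keyword_alerts(leak))    # ['confidential', 'salary']
print(keyword_alerts(benign))  # ['password'], a false alert on harmless text
```

The second message contains no sensitive data but still triggers an alert, because a keyword match alone carries no information about context.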