Shannon Entropy of the 2-Dimensional (2D) Gabor Wavelet

Song, et al. [22]

2016

The image is filtered by a 2D Gabor wavelet and the features are then extracted. This wavelet effectively captures image texture and edge properties across different scales and orientations.

1. Detection error rate

2. Entropy value

Advantage

1. Effectively capturing the changes in image texture

Disadvantage

1. Reduced detection accuracy
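The entry above can be sketched as follows. This is an illustrative reconstruction, not the authors' implementation: the kernel size, the scale/orientation grid, the Gaussian bandwidth, and the histogram bin count are all assumed parameters.

```python
import numpy as np

def gabor_kernel(size, wavelength, theta, sigma):
    """Real part of a 2D Gabor: Gaussian envelope times an oriented cosine."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return np.exp(-(xr**2 + yr**2) / (2 * sigma**2)) * np.cos(2 * np.pi * xr / wavelength)

def shannon_entropy(values, bins=64):
    """Shannon entropy (bits) of the histogram of filter responses."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

def gabor_entropy_features(image, scales=(4, 8), orientations=4, ksize=15):
    """One entropy value per (scale, orientation) Gabor response."""
    feats = []
    for wavelength in scales:
        for k in range(orientations):
            kern = gabor_kernel(ksize, wavelength, k * np.pi / orientations,
                                sigma=wavelength / 2)  # bandwidth is an assumption
            # FFT-based circular convolution of image and kernel
            padded = np.zeros_like(image, dtype=float)
            padded[:ksize, :ksize] = kern
            resp = np.real(np.fft.ifft2(np.fft.fft2(image) * np.fft.fft2(padded)))
            feats.append(shannon_entropy(resp))
    return np.array(feats)
```

Each (scale, orientation) response contributes one entropy value, so the feature vector length equals the number of scales times the number of orientations.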

Discrete Fourier Transform (DFT) and Discrete Cosine Transform (DCT)

Deepa, et al. [23]

2012

Facial images are transformed into the frequency domain to reduce image redundancy. Five images are randomly selected for training and another five for testing.

1. Recognition rate

2. Training time

3. Testing time

4. Euclidean distance

Advantage

1. Dimensionality reduction

2. Higher recognition rate

Disadvantage

1. Reduced sustainability

2. Simple classifier is used
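A minimal sketch of the DCT branch of this pipeline, assuming numpy: the 2D DCT compresses each image into a small low-frequency block (the dimensionality reduction), and the "simple classifier" the entry mentions is stood in for by a minimum-Euclidean-distance nearest neighbour. The block size `keep=8` is an assumed parameter.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix (rows index frequencies)."""
    k = np.arange(n)
    C = np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * n)) * np.sqrt(2 / n)
    C[0] /= np.sqrt(2)
    return C

def low_freq_features(img, keep=8):
    """2D DCT of the image, keeping only the top-left low-frequency block."""
    C = dct_matrix(img.shape[0])
    D = dct_matrix(img.shape[1])
    return (C @ img @ D.T)[:keep, :keep].ravel()

def nearest_neighbor(train_feats, train_labels, query):
    """Minimum-Euclidean-distance classifier over the reduced features."""
    d = np.linalg.norm(train_feats - query, axis=1)
    return train_labels[int(np.argmin(d))]
```

Keeping an 8x8 block reduces a 16x16 image from 256 values to 64 features while retaining most of the low-frequency energy.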

Enhanced histogram features method

Song, et al. [25]

2015

Perturbed Quantization (PQ) is applied to double-compressed JPEG images. The global, local, and dual histogram features of the DCT coefficients and their differences are calculated.

1. True positive rate

2. False positive rate

3. Detection accuracy

Advantage

1. High detection accuracy

Disadvantage

1. Constrained for complex image regions
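As a hedged illustration of the histogram-feature idea above (not the authors' exact feature set): a global histogram of clipped, rounded DCT coefficients plus a histogram of horizontal neighbour differences. The clipping threshold `T=4` is an assumed parameter.

```python
import numpy as np

def dct_histogram_features(coeffs, T=4):
    """Global histogram of clipped DCT coefficients plus a histogram of
    horizontal neighbour differences, each normalised to sum to 1."""
    c = np.clip(np.round(coeffs).astype(int), -T, T)
    bins = np.arange(-T, T + 2)                     # integer bins -T .. T
    g, _ = np.histogram(c, bins=bins, density=True)
    diff = np.clip(c[:, 1:] - c[:, :-1], -T, T)     # "difference" features
    d, _ = np.histogram(diff, bins=bins, density=True)
    return np.concatenate([g, d])
```

With unit-width integer bins, `density=True` makes each histogram sum to one, so the two halves of the feature vector are directly comparable across images.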

Machine Learning Based Classification Techniques

Ensemble Classifier

Kodovsky, et al. [27]

2012

The ensemble classifier enables fast construction of a steganography detector, allowing the steganalyst to work in a high-dimensional feature space with large datasets.

1. Detection error

2. Median (MED)

3. Median Absolute Deviation (MAD)

Advantage

1. Improved accuracy

Disadvantage

1. Increased computational complexity

Ensemble based Extreme Learning Machine (EN-ELM)

Liu and Wang [28]

2010

Decisions are made using a cross-validation scheme, and the Discrete Cosine Transform (DCT) reduces the dimensionality.

1. Classification accuracy

2. Testing accuracy

3. Training time

Advantage

1. Alleviate overfitting

2. Higher testing accuracy

Disadvantage

1. Increased training time

2. Increased computational burden

Extreme Learning Machine (ELM) classifier

Huang, et al. [29]

2010

The ELM is based on the Karush-Kuhn-Tucker (KKT) theorem. All training data are linearly separable in the ELM feature space. ELM is less sensitive to learning parameters.

1. Testing rate

2. Training time

3. Testing deviation

Advantage

1. Easily implemented

2. Minimized testing error

Disadvantage

1. Average testing accuracy

Differential Evolution based Extreme Learning Machine (DE-ELM)

Bazi, et al. [31]

2014

An automatic solution based on the Differential Evolution (DE) algorithm is developed in association with the ELM classifier. Principal Component Analysis (PCA) is applied to reduce the dimensionality of the data.

1. Overall Accuracy (OA) standard deviation

2. Average Accuracy (AA) standard deviation

3. Sensitivity

Advantage

1. Faster solution

Disadvantage

1. Reduced classification accuracy

Support Vector Machine (SVM) classifier

Shankar, et al. [36]

2012

The block dependency features (inter and intra features) are used for the classification of steganography.

1. Classification percentage

2. Embedding percentage

Advantage

1. Faster solution

Disadvantage

1. Message length is not detected
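For completeness, a minimal stand-in for the SVM stage of this entry: a Pegasos-style subgradient descent on the hinge loss for a linear SVM. This is an assumed, simplified substitute; the paper's classifier operates on inter- and intra-block dependency features, which are not reproduced here.

```python
import numpy as np

def train_linear_svm(X, y, lam=0.01, epochs=200, seed=0):
    """Pegasos-style subgradient descent on the hinge loss.
    y must be in {-1, +1}; lam is the regularisation strength."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    t = 0
    for _ in range(epochs):
        for i in rng.permutation(n):
            t += 1
            eta = 1.0 / (lam * t)                 # decaying step size
            margin = y[i] * (X[i] @ w + b)
            w *= (1 - eta * lam)                  # regularisation shrink
            if margin < 1:                        # hinge-loss subgradient
                w += eta * y[i] * X[i]
                b += eta * y[i]
    return w, b
```

Prediction is `sign(X @ w + b)`; note that, as the entry's disadvantage states, such a detector only classifies cover vs. stego and gives no estimate of the embedded message length.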

Hinge loss function based cognitive ensemble of Extreme Learning Machine (ELM)

Sachnev, et al. [32]

2015

The classifier performance depends on the choice of classifier and the weight assigned to each classifier. The quality of the extracted features determines the performance of the binary classifier.

1. Number of hidden neurons

2. Testing efficiency

Advantage

1. Improved classification performance

2. Best testing efficiency

Disadvantage

1. Geometric features are not recognized

Bayesian Ensemble Classifier

Li, et al. [35]

2013

A high-dimensional feature vector is calculated from each JPEG image in the training set. The feature vectors are trained by sub-classifiers whose outputs are integrated to make the final decision.

1. Average computation time

2. Detection percentage

Advantage

1. Low computational complexity

Disadvantage

1. Increased number of feature vectors

2. Required large training set