Shannon Entropy of 2 Dimensional (2D) Gabor wavelet | Song, et al. [22] | 2016 | The image is filtered by a 2D Gabor wavelet and the features are extracted. This wavelet effectively captures the image texture and edge properties at different scales and orientations. | 1. Detection error rate 2. Entropy value | Advantage 1. Effectively captures the changes in image texture Disadvantage 1. Reduced detection accuracy | |
Discrete Fourier Transform (DFT) and Discrete Cosine Transform (DCT) | Deepa, et al. [23] | 2012 | The transformation of facial images into the frequency domain is used to reduce image redundancy. Five images are randomly selected for training and another five are selected for testing. | 1. Recognition rate 2. Training time 3. Testing time 4. Euclidean distance | Advantage 1. Dimensionality reduction 2. Higher recognition rate Disadvantage 1. Reduced sustainability 2. A simple classifier is used | |
Enhanced histogram features method | Song, et al. [25] | 2015 | Perturbed Quantization (PQ) is applied to double-compressed JPEG images. The global, local and dual histogram features of the DCT coefficients and their differences are calculated. | 1. True positive rate 2. False positive rate 3. Detection accuracy | Advantage 1. High detection accuracy Disadvantage 1. Constrained for complex image regions | |
Machine Learning Based Classification Techniques | ||||||
Ensemble Classifier | Kodovsky, et al. [27] | 2012 | The ensemble classifier provides fast construction of the steganography detector. The steganalyst is allowed to work on a high-dimensional feature space with a large dataset. | 1. Detection error 2. Median (MED) 3. Median Absolute Deviation (MAD) | Advantage 1. Improved accuracy Disadvantage 1. Increased computational complexity | |
Ensemble based Extreme Learning Machine (EN-ELM) | Liu and Wang [28] | 2010 | The decisions are made by a cross-validation scheme. The Discrete Cosine Transform (DCT) reduces the dimensionality. | 1. Classification accuracy 2. Testing accuracy 3. Training time | Advantage 1. Alleviates overfitting 2. Higher testing accuracy Disadvantage 1. Increased training time 2. Increased computational burden | |
Extreme Learning Machine (ELM) classifier | Huang, et al. [29] | 2010 | The ELM is based on the Karush-Kuhn-Tucker (KKT) conditions. All training data are linearly separable in the ELM feature space. ELM is less sensitive to the learning parameters. | 1. Testing rate 2. Training time 3. Testing deviation | Advantage 1. Easily implemented 2. Minimized testing error Disadvantage 1. Average testing accuracy | |
Differential Evolution based Extreme Learning Machine (DE-ELM) | Bazi, et al. [31] | 2014 | An automatic solution based on the Differential Evolution (DE) algorithm is developed in association with the ELM classifier. Principal Component Analysis (PCA) is applied to reduce the dimensionality of the data. | 1. Overall Accuracy (OA) standard deviation 2. Average Accuracy (AA) standard deviation 3. Sensitivity | Advantage 1. Faster solution Disadvantage 1. Reduced classification accuracy | |
Support Vector Machine (SVM) classifier | Shankar, et al. [36] | 2012 | The block dependency features (inter- and intra-block features) are used for the classification of steganography. | 1. Classification percentage 2. Embedding percentage | Advantage 1. Faster solution Disadvantage 1. The message length is not detected | |
Hinge loss function based cognitive ensemble of Extreme Learning Machines (ELM) | Sachnev, et al. [32] | 2015 | The classifier performance depends on the choice of classifier and the weightage given to each classifier. The quality of the extracted features defines the performance of the binary classifier. | 1. Number of hidden neurons 2. Testing efficiency | Advantage 1. Improved classification performance 2. Best testing efficiency Disadvantage 1. Geometric features are not recognized | |
Bayesian Ensemble Classifier | Li, et al. [35] | 2013 | A high-dimensional feature vector is calculated from each JPEG image in a training set. The feature vectors are trained by sub-classifiers and integrated to make the final decision. | 1. Average computation time 2. Detection percentage | Advantage 1. Low computational complexity Disadvantage 1. Increased number of feature vectors 2. Requires a large training set | |
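To make the tabulated techniques concrete, the Gabor-wavelet entropy feature of Song, et al. [22] can be sketched as follows. This is a minimal illustration, not the authors' implementation: it generates the real part of a 2D Gabor kernel, convolves it with a small synthetic texture, and computes the Shannon entropy of the filter responses at four orientations; the kernel size, wavelength, and bin count are assumed values chosen only for the demo.

```python
import math

def gabor_kernel(ksize=7, sigma=2.0, theta=0.0, lam=4.0, psi=0.0):
    """Real part of a 2D Gabor kernel at orientation theta and wavelength lam."""
    half = ksize // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            g = math.exp(-(xr ** 2 + yr ** 2) / (2 * sigma ** 2))
            row.append(g * math.cos(2 * math.pi * xr / lam + psi))
        kernel.append(row)
    return kernel

def convolve2d(image, kernel):
    """'Valid'-mode 2D correlation of image with kernel."""
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for i in range(len(image) - kh + 1):
        row = []
        for j in range(len(image[0]) - kw + 1):
            s = sum(image[i + u][j + v] * kernel[u][v]
                    for u in range(kh) for v in range(kw))
            row.append(s)
        out.append(row)
    return out

def shannon_entropy(values, bins=16):
    """Shannon entropy (bits) of a histogram of filter responses."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for v in values:
        counts[min(int((v - lo) / width), bins - 1)] += 1
    n = len(values)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

# Synthetic 16x16 "texture": diagonal stripes.
img = [[(i + j) % 4 for j in range(16)] for i in range(16)]
feats = []
for theta in (0, math.pi / 4, math.pi / 2, 3 * math.pi / 4):
    resp = convolve2d(img, gabor_kernel(theta=theta))
    feats.append(shannon_entropy([v for row in resp for v in row]))
print(feats)  # one entropy value per orientation
```

Because oriented textures respond differently to each orientation of the filter bank, the resulting entropy vector summarizes how texture energy is distributed across orientations.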
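The DCT-based dimensionality reduction used by Deepa, et al. [23] can likewise be sketched. Assuming a square grayscale image, the orthonormal DCT-II basis is built explicitly and only the low-frequency top-left block of coefficients is kept as the feature vector; the 32x32 image size and the 8x8 retained block are illustrative choices, not values from the paper.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix C, so C @ x is the 1-D DCT of x."""
    k = np.arange(n).reshape(-1, 1)
    i = np.arange(n).reshape(1, -1)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)
    return c

def dct2_features(image, keep=8):
    """2-D DCT of a square image; keep the keep x keep low-frequency block."""
    n = image.shape[0]
    c = dct_matrix(n)
    coeffs = c @ image @ c.T          # separable 2-D DCT
    return coeffs[:keep, :keep].ravel()

rng = np.random.default_rng(0)
face = rng.random((32, 32))           # stand-in for a 32x32 face image
feat = dct2_features(face, keep=8)
print(feat.shape)                     # 1024 pixels reduced to 64 features
```

Keeping only low frequencies discards the redundant high-frequency detail, which is the redundancy-reduction effect the row above describes.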
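The core ELM training procedure underlying the rows on Huang, et al. [29] and the ELM variants can be sketched in a few lines: input weights and biases are drawn at random and never trained, and the output weights are solved in closed form with the Moore-Penrose pseudoinverse. The toy dataset, hidden-layer size, and sigmoid activation below are assumptions for illustration only.

```python
import numpy as np

def elm_train(X, y, hidden=32, seed=0):
    """ELM: random hidden layer, output weights by least squares (pinv)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(X.shape[1], hidden))   # random input weights, untrained
    b = rng.normal(size=hidden)                 # random biases, untrained
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))      # sigmoid hidden activations
    beta = np.linalg.pinv(H) @ y                # closed-form output weights
    return W, b, beta

def elm_predict(X, model):
    W, b, beta = model
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta

# Toy binary task: class = sign of the sum of two features.
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 2))
y = np.where(X.sum(axis=1) > 0, 1.0, -1.0)
model = elm_train(X, y)
acc = np.mean(np.sign(elm_predict(X, model)) == y)
print(acc)
```

The absence of iterative weight tuning is what gives ELM the short training time and low sensitivity to learning parameters noted in the table; variants such as EN-ELM and DE-ELM wrap this core in cross-validation or evolutionary search.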
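Finally, the ensemble construction of Kodovsky, et al. [27] can be sketched as Fisher linear discriminant (FLD) base learners, each trained on a random subset of the feature dimensions, combined by majority vote. The synthetic "cover vs. stego" data, the number of base learners, and the subspace dimensionality are all assumed demo values, not the paper's settings.

```python
import numpy as np

def fld_train(X0, X1):
    """Fisher linear discriminant: w = Sw^-1 (mu1 - mu0), midpoint threshold."""
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0.T) + np.cov(X1.T)            # within-class scatter
    w = np.linalg.pinv(Sw) @ (mu1 - mu0)
    t = w @ (mu0 + mu1) / 2.0
    return w, t

def ensemble_train(X0, X1, n_learners=11, d_sub=4, seed=0):
    """Each base FLD sees only a random subset of the features."""
    rng = np.random.default_rng(seed)
    learners = []
    for _ in range(n_learners):
        idx = rng.choice(X0.shape[1], size=d_sub, replace=False)
        learners.append((idx, fld_train(X0[:, idx], X1[:, idx])))
    return learners

def ensemble_predict(X, learners):
    """Majority vote over the base learners' binary decisions."""
    votes = np.zeros(len(X))
    for idx, (w, t) in learners:
        votes += (X[:, idx] @ w > t).astype(float)
    return (votes > len(learners) / 2).astype(int)

# Toy "cover vs. stego" features: class 1 is shifted in every dimension.
rng = np.random.default_rng(2)
X0 = rng.normal(0.0, 1.0, size=(300, 10))
X1 = rng.normal(0.8, 1.0, size=(300, 10))
model = ensemble_train(X0, X1)
pred = ensemble_predict(np.vstack([X0, X1]), model)
truth = np.array([0] * 300 + [1] * 300)
acc = np.mean(pred == truth)
print(acc)
```

Training many cheap low-dimensional discriminants instead of one classifier on the full feature space is what lets the steganalyst scale to high-dimensional features, at the cost of the extra computation noted in the table.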