Using Shannon's Entropy to detect corrupted images
The above image shows the value of Shannon's Entropy for each image in a set of 1340 images.
Shannon's Entropy is defined as the sum of -(p(x) * log2 p(x)) over every distinct character x, where p(x) is the relative frequency of that character in a data stream of characters. In practical terms, every character (byte) in an image file is read, and the occurrences of each character are counted.
On ellipse, and to a lesser degree on the TeraGrid, random GL errors occasionally produce bad frames. In the past I simply had to make web pages with scripts and look at all the frames manually.
This method seems to identify corrupted images accurately: a spike in the graph of Shannon's Entropy values across all the image files marks a corrupted image. Below are 21 such frames that the algorithm picked correctly.
0056 | 0198 | 0213 | 0413 | 0442 | 0464 | 0488 | 0527 | 0535 | 0570 | 0622 | 0916 | 0917 | 0919 | 1300 | 1306 | 1320 | 1324 | 1329 | 1333 | 1337
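The calculation described above can be sketched in Python roughly as follows. This is only an illustration, not the script that produced the graph: the function names, the directory and file pattern, and especially the rule for what counts as a "spike" (distance from the median measured in median absolute deviations, with a threshold of 5) are assumptions made for the example.

```python
import math
from collections import Counter
from pathlib import Path
from statistics import median


def shannon_entropy(data: bytes) -> float:
    """Sum of -(p(x) * log2 p(x)) over the distinct byte values in data."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each byte value
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def find_spikes(entropies, threshold=5.0):
    """Return the indices of entropy values that stand out from the rest.

    Assumption: a spike is a value more than `threshold` median absolute
    deviations (MAD) from the median of the whole set. The page does not
    say how spikes were read off the graph.
    """
    entropies = list(entropies)
    if not entropies:
        return []
    mid = median(entropies)
    mad = median(abs(e - mid) for e in entropies)
    if mad == 0:
        # More than half the values are identical; flag anything different.
        return [i for i, e in enumerate(entropies) if e != mid]
    return [i for i, e in enumerate(entropies) if abs(e - mid) / mad > threshold]


def scan_directory(directory, pattern="*.png"):
    """Return the paths of suspect frames (directory and pattern are hypothetical)."""
    paths = sorted(Path(directory).glob(pattern))
    entropies = [shannon_entropy(p.read_bytes()) for p in paths]
    return [paths[i] for i in find_spikes(entropies)]
```

Calling something like scan_directory("frames") would then list the frames worth looking at by hand. The threshold will likely need tuning per image set, since compressed formats already sit close to 8 bits per byte and the gap between good and bad frames can be small.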