Using Shannon's Entropy to detect corrupted images


The above image shows the value of Shannon's Entropy for each image in a set of 1340 images. Shannon's Entropy is defined as the sum, over every distinct character x, of -(p(x) * log2 p(x)), where p(x) is the relative frequency of x in a stream of characters. In practical terms, every character in an image file is read, the occurrences of each character are counted, and each count is divided by the total number of characters to give p(x).

On ellipse, and to a lesser degree on the TeraGrid, random GL errors occasionally produce bad frames. In the past I simply had to make web pages with scripts and look through all the frames manually. This method seems to identify corrupted images accurately: a spike in the graph of Shannon's Entropy values across the image files marks a corrupted image. Below are 21 such frames that the algorithm picked correctly.
0056
0198
0213
0413
0442
0464
0488
0527
0535
0570
0622
0916
0917
0919
1300
1306
1320
1324
1329
1333
1337
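
For reference, the entropy calculation described above can be sketched in Python roughly as follows. This is a minimal sketch, not the script that produced the graph. It assumes each "character" is one byte of the image file. The names shannon_entropy and flag_spikes are made up for illustration. The spike rule (distance from the median, measured in median absolute deviations, with a threshold of 5) is also an assumption, since the text above only says that spikes in the graph mark corrupted frames.

```python
import math
from collections import Counter
from statistics import median


def shannon_entropy(data: bytes) -> float:
    """Shannon entropy of a byte string, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # occurrences of each distinct byte value
    # Sum of -(p(x) * log2 p(x)) over every byte value x that occurs.
    return sum(-(n / total) * math.log2(n / total) for n in counts.values())


def flag_spikes(values, threshold=5.0):
    """Indices of values that sit far from the median of the series.

    Hypothetical rule: a value is a spike if its distance from the median
    exceeds `threshold` median absolute deviations (MAD).
    """
    values = list(values)
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # More than half the values are identical; flag anything that differs.
        return [i for i, v in enumerate(values) if v != med]
    return [i for i, v in enumerate(values) if abs(v - med) / mad > threshold]
```

To use it on a set of frames, one would read each file as bytes (for example with pathlib's Path.read_bytes()), call shannon_entropy on each, and pass the resulting list to flag_spikes. The threshold would likely need tuning against frames already known to be bad.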