Semiconductor Computer Vision:

Our pipeline for anomaly detection in semiconductor manufacturing process images applies conditional multiple hypothesis tests over HOG (histogram of oriented gradients) features to obtain FDR-controlled (false discovery rate) discriminative features from high-dimensional n << p data: 125 images with 420,350 features each, i.e. a 125 by 420,350 matrix. This is followed by t-distributed Stochastic Neighbor Embedding (t-SNE) for anomaly detection in highly imbalanced, high-dimensional datasets with few samples yet thousands of feature variables. (Work done by Praneeth Vepakomma & Austin Bock.)
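
To illustrate what FDR control over per-feature tests means, here is a minimal sketch of the standard Benjamini-Hochberg step-up procedure in plain Python. It is a stand-in for illustration only: the conditional test used in our pipeline is not reproduced here, and the function name `benjamini_hochberg` and the level `alpha=0.05` are assumptions.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return the indices of hypotheses rejected at FDR level `alpha`.

    Illustrative stand-in: the standard Benjamini-Hochberg step-up rule,
    not the conditional test referenced in the text.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Rank the hypotheses by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k such that p_(k) <= (k / m) * alpha.
    cutoff_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            cutoff_rank = rank
    # Reject every hypothesis ranked at or below that cut-off.
    return sorted(order[:cutoff_rank])


if __name__ == "__main__":
    # One p-value per feature (hypothetical values).
    p = [0.041, 0.001, 0.205, 0.008, 0.039]
    print(benjamini_hochberg(p, alpha=0.05))
```

Each feature contributes one p-value, and the returned indices are the features kept as discriminative. Because the rule is step-up, a feature whose own p-value misses its threshold can still be kept if a higher-ranked p-value passes.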

Developed Pipeline: The first two plots are images from semiconductor manufacturing. We have 125 images of this kind, of which only a small minority (4) are anomalous and the rest are benign. These 4 need to be detected by a learning model. The pipeline we designed has four steps: i) generate computer vision features (HOG, GIST) for each image; ii) apply a conditional t-test to each feature (result in figure 3) (http://amstat.tandfonline.com/doi/abs/10.1198/sbr.2009.0003); iii) derive a cut-off rule from the output of step ii, based on the test statistic and its standard error; and iv) perform manifold learning (t-SNE) on the features retained by step iii (result in fig 4). Skipping any of these steps makes the anomalies very hard to catch. The pipeline is also applicable to microarray and microscopy images, for example to classify malignant vs. benign samples.
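
Steps ii and iii can be sketched roughly as follows, assuming the features are already in a samples-by-features NumPy matrix and that anomaly labels are available for the screening step. This is a hedged illustration, not our exact method: it uses an ordinary Welch t-statistic in place of the conditional t-test, and the function names, the threshold `t_cut`, and the floor `se_floor` are hypothetical.

```python
import numpy as np


def welch_t(X, y):
    """Per-feature Welch t-statistic and its standard error.

    X: (n_samples, n_features) feature matrix.
    y: boolean array of length n_samples, True for anomalous images.
    Assumes at least two samples in each group.
    """
    a, b = X[y], X[~y]
    diff = a.mean(axis=0) - b.mean(axis=0)
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    # Constant features have se == 0; give them t = 0 instead of inf/nan.
    with np.errstate(divide="ignore", invalid="ignore"):
        t = np.where(se > 0, diff / se, 0.0)
    return t, se


def select_features(X, y, t_cut=6.0, se_floor=1e-8):
    """Hypothetical cut-off rule: keep features with a large |t| and a
    standard error that is not degenerate."""
    t, se = welch_t(X, y)
    keep = (np.abs(t) >= t_cut) & (se > se_floor)
    return np.flatnonzero(keep)


if __name__ == "__main__":
    # Synthetic stand-in data: 125 samples, 4 anomalous, 2000 features.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(125, 2000))
    y = np.zeros(125, dtype=bool)
    y[:4] = True
    X[y, :10] += 8.0  # shift the first 10 features for the anomalous rows
    selected = select_features(X, y)
    print(len(selected), "features retained")
```

The reduced matrix `X[:, selected]` could then be passed to a manifold learner for step iv, for example scikit-learn's `TSNE`. With only 4 anomalous samples the t-statistic is heavy-tailed, so a threshold on the statistic alone tends to let some noise features through.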

© Vepakomma, Praneeth 2017