Outlier detection can be a pain point for all data-driven companies, especially as data volumes grow. At Netflix we have multiple datasets growing by 10B+ records per day, so we need automated anomaly detection tools that ensure data quality and identify suspicious anomalies. Today we are open-sourcing our outlier detection function, called Robust Anomaly Detection (RAD), as part of our Surus project.
More here: http://techblog.netflix.com/2015/02/rad-outlier-detection-on-big-data.html