Amperity’s Principal Data Scientist, Yan Yan, describes Vamperity, our vampire-based subsidiary doing entity resolution, and theoretical approaches to sorting out these elusive creatures. Software Engineer Chuck Sakoda illustrates how to build ML-oriented large-scale processing in Spark and Clojure, and shares the many pitfalls and lessons learned along the way.