Robert Holder, PhD
Hadoop and Spark
Ph.D. Computer Science, UMBC
M.S. Computer Science, Tulane University
B.S. Computer Science, Tulane University
Hardening Cybersecurity by Preventing Phishing Attacks
Developed a system to stream network and web traffic logs into a Hadoop data store using NiFi and Kafka. Analyzed web logs to identify traffic patterns and modeled behavioral profiles of normal and suspect activity. Used unsupervised models to automate alerting for security operators.
Consumer and Real Estate Data Services
Extracting Information from Scanned Real Estate Documents
Implemented a system for extracting information from scanned documents. Used deep learning and computer vision to preprocess each document and extract metadata, including signatures and barcodes. Used OCR software to convert scanned documents to text, and developed custom language models that correct OCR errors and improve overall OCR accuracy.
- Python data stack: scikit-learn, pandas, numpy, Jupyter, others
- Supervised and unsupervised machine learning, SVMs, time series analysis, outlier prediction models
- Hadoop: Spark, MapReduce, Hive, HBase, Accumulo
- Deep Learning
- Amazon Web Services: EC2, ECS, Redshift, EMR, Lambda, S3