Ellie Kloberdanz

Machine Learning Researcher/Data Scientist/Software Engineer

Education

Iowa State University, Ames, IA

PhD: Computer Science

August 2020 - May 2023
  • Research: AI & Machine Learning
  • GPA: 3.83
  • Thesis: Numerical Stability of Deep Learning Algorithms and Quantization of Deep Neural Networks

Iowa State University, Ames, IA

Master of Science: Computer Science

August 2018 - May 2020
  • Breadth Area: AI & Machine Learning
  • GPA: 3.75
  • Thesis: Reprogramming of Neural Networks: A New and Improved Learning Technique
  • Courses taken: Design & Analysis of Algorithms, Machine Learning, Principles of AI, Theory of Computation, Deep Learning, Introduction to Machine Learning, Principles of Operating Systems, Advanced Topics in Software Engineering: Foundations, Convex Optimization, Concurrent Systems, Numerical Analysis of High Performance Computing, Advanced Topics in Computational Models of Learning

University of Glasgow, Glasgow, United Kingdom

Bachelor of Accounting: Accounting with Finance

September 2012 - June 2016
  • GPA: 19.3 out of 22.0
  • Dissertation: Impact of liquidity risk regulation on US banking sector
  • Academic Prizes: Morgan Stanley Prize, ACCA Prize, Deloitte Prize, Entrepreneurship Award

Publications

An Improved (Adversarial) Reprogramming Technique for Neural Networks

September 2021

International Conference on Artificial Neural Networks

DeepStability: A Study of Unstable Numerical Methods and Their Solutions in Deep Learning

2022

International Conference on Software Engineering

Work Experience

Senior Data Scientist

February 2022 - Present

Cape Privacy, a startup specializing in privacy-preserving machine learning (Series A funding)

Moose, a framework for Secure Multi-Party Computation (MPC)

  • Implemented neural networks that perform inference on encrypted data
  • Implemented secure protocols for ReLU and Softmax that operate on encrypted data

Software Engineer

August 2021 - February 2022

Collins Aerospace, Mission Systems

Created a white paper on Adversarial Machine Learning for RF (radio frequency) signal processing for DoD use cases

  • Submitted to DARPA (Defense Advanced Research Projects Agency) and presented at the RTX symposium
  • Tested the adversarial robustness of an internal deep learning model

Resource manager IR&D (Independent Research & Development) research project

  • A codebase for simulating a multi-function antenna aperture
  • Created stress test scenarios to aid algorithm design and development
  • Prepared a practical demonstration, assisted with project management and budgeting

Data Scientist

June 2020 - July 2021

John Deere Financial

Asset value risk analysis and model development

  • Developed forecasting models for future asset values in the agriculture and construction markets
  • Developed auction price prediction machine learning models with a mean squared error of 0.1 and an R-squared of 0.9
  • Model validation and backtesting
  • Executed a monthly re-training process of an XGBoost model for asset value predictions
  • Data Engineering: created a balanced time series dataset

Machine Learning Mentor

  • Volunteered to take on an additional role as an ML mentor, teaching others relevant ML concepts and acting as a resource that any team could consult for technical advice

Financial documents classification with NLP

  • Took initiative to develop several machine learning models that use optical character recognition to classify financial documents (naive Bayes, random forest, support vector machine, KNN, decision tree)
  • Created a website with Flask, HTML, and CSS to deploy the best model, saving the company money versus buying a vendor model

Path planning for tractors with reinforcement learning

  • Joined an initiative to develop reinforcement learning models for finding an optimal path through a field
  • Completed several reinforcement learning courses with the team
  • Participated in a hackathon: contributed a demo of multi-agent reinforcement learning

Pricing analyses

  • Performed driver analysis of loan late fees leveraging machine learning (random forest, various linear models)
  • Improved and further developed a data science application for corporate pricing used to set loan interest rates

Data & Operations Research Scientist Intern

Principal Financial Group

May 2019 - August 2019

Helped to develop a Python library for portfolio optimization for equity and fixed income.

Key contributions:

  • Ported a codebase of over 30,000 lines from Python 2 to Python 3
  • Eliminated inefficiencies and refactored the code from over 100 scripts and 30k lines to 5 scripts and 3k lines
  • Restructured the codebase into classes with methods using object-oriented design
  • Implemented code that supports a new strategy: Dynamic Risk Premium
  • Made the library modular, customizable, and asset agnostic
  • Investigated the PostgreSQL database and documented how data is created and queried
  • Created web-based documentation using Sphinx

Risk Operations Investment Banking Analyst

Jefferies

July 2017 - July 2018

Analyzed market and credit risk data across fixed income and equity trading desks utilizing SQL, VBA, and Excel

Key contributions:

  • Prepared, optimized, automated, and analyzed risk reports
  • Learned to work in a fast-paced environment with tight deadlines and high exposure to top management
  • Improved analytical skills

Market Risk Contractor

Ernst & Young

October 2014 - December 2014

Conducted market research

Market Risk Intern

Ernst & Young

June 2014 - August 2014

Supported the senior market risk team on banking regulation projects such as liquidity stress testing

Brokerage Intern

Cyrrus

June 2013 - July 2013

Learned about financial markets and products

Coding Projects

Distributed Neural Network Training with MPI

MPI (Message Passing Interface) is a standardized interface for multi-node distributed computing on clusters (supercomputers). In this project, I implemented a parallel neural network in C++, using Eigen for linear algebra operations and MPI for multi-node data parallelism. Spreading the training data across multiple processors reduces training time.

GitHub

ekBLAS

Developed a multithreaded C implementation of several BLAS linear algebra routines using OpenMP.

GitHub

Algebraic Graph Connectivity

Formulated the problem of finding edge weights that maximize algebraic connectivity as a convex optimization problem. Using a numerical example, demonstrated that a greater number of edges does not necessarily lead to greater algebraic connectivity. Instead, optimal edge weights are more important.

GitHub
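
The claim about edges versus weights can be illustrated with a minimal sketch. The two three-node graphs, their weights, and the helper name `algebraic_connectivity` are assumptions chosen for illustration and are not taken from the project. Algebraic connectivity is computed here as the second-smallest eigenvalue of the weighted graph Laplacian.

```python
import numpy as np


def algebraic_connectivity(n, weighted_edges):
    """Return the second-smallest eigenvalue of the weighted graph Laplacian."""
    laplacian = np.zeros((n, n))
    for i, j, w in weighted_edges:
        laplacian[i, i] += w
        laplacian[j, j] += w
        laplacian[i, j] -= w
        laplacian[j, i] -= w
    # eigvalsh returns the eigenvalues of a symmetric matrix in ascending order.
    return np.linalg.eigvalsh(laplacian)[1]


# Hypothetical example: both graphs have a total edge weight of 2.0.
path = [(0, 1, 1.0), (1, 2, 1.0)]                     # 2 edges, balanced weights
triangle = [(0, 1, 1.9), (1, 2, 0.05), (0, 2, 0.05)]  # 3 edges, skewed weights

path_connectivity = algebraic_connectivity(3, path)
triangle_connectivity = algebraic_connectivity(3, triangle)
print(path_connectivity, triangle_connectivity)
```

With the same weight budget, the two-edge path reaches an algebraic connectivity of about 1.0, while the three-edge triangle with most of its weight on a single edge reaches only about 0.15.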

Checkers AI

An AI that plays checkers using the alpha-beta pruning search algorithm.

GitHub
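
For readers unfamiliar with the technique, alpha-beta pruning can be sketched on a toy game tree. This is a generic illustration, not the project's code: the nested-list tree, its leaf scores, and the function name `alphabeta` are hypothetical, and a checkers engine would instead generate moves and score board positions.

```python
import math


def alphabeta(node, maximizing, alpha=-math.inf, beta=math.inf):
    """Minimax value of a game tree given as nested lists with numeric leaves."""
    if not isinstance(node, list):
        return node  # leaf: static evaluation of the position
    if maximizing:
        value = -math.inf
        for child in node:
            value = max(value, alphabeta(child, False, alpha, beta))
            alpha = max(alpha, value)
            if alpha >= beta:
                break  # the minimizing player will avoid this branch
        return value
    value = math.inf
    for child in node:
        value = min(value, alphabeta(child, True, alpha, beta))
        beta = min(beta, value)
        if alpha >= beta:
            break  # the maximizing player will avoid this branch
    return value


# Hypothetical two-ply tree: the maximizer moves first, then the minimizer.
tree = [[3, 5], [6, 9], [1, 2]]
best_value = alphabeta(tree, maximizing=True)
print(best_value)  # prints 6
```

In this tree the third subtree is cut off after its first leaf: once the minimizer can force a score of 1 there, the maximizer already has a better option worth 6.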

Naive Bayes Text Classifier

Implemented a multinomial naive Bayes model from scratch to classify text documents.

GitHub

Reinforcement Learning – Prediction and Control

Implemented temporal difference learning, expected Sarsa, Q-learning, and Actor-Critic algorithms.

GitHub

Twitter Sentiment Analysis

Developed a sentiment analysis tool with the Twitter API: trained various ML models, experimented with different vectorization techniques, and implemented multithreaded text pre-processing. This tool scrapes Twitter for recent tweets containing any chosen keyword and applies the best pre-trained ML model to classify the sentiment of each tweet. A user can interact with this tool via a Jupyter Notebook or a website built with Flask, HTML, and CSS.

GitHub

Machine Learning Trading Platform

Developed a neural net for technical analysis and a natural language processing model for sentiment analysis of web-scraped news articles. Achieved returns 4% above the market.

GitHub

Ensemble Machine Learning

Developed random forest, AdaBoost, and weighted and unweighted ensemble models constructed from 5 different ML models (neural net, KNN, logistic regression, naive Bayes, decision tree) with scikit-learn. Experimented with hyperparameter values using grid search.

GitHub

Reinforcement Learning

Implemented the Double Q-learning algorithm in the Acrobot-v1 environment.

GitHub
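
The core of Double Q-learning is its update rule, sketched below in tabular form. This is an assumption-laden illustration rather than the project's implementation: Acrobot-v1 has continuous observations, so the project presumably used discretization or function approximation, and the names `double_q_update`, `q_a`, `q_b`, and the state labels are hypothetical.

```python
import random
from collections import defaultdict


def double_q_update(q_a, q_b, state, action, reward, next_state, actions,
                    alpha=0.1, gamma=0.99, done=False, rng=random):
    """Apply one tabular Double Q-learning update in place."""
    # Randomly choose which table to update; the other one evaluates the target.
    if rng.random() < 0.5:
        update, evaluate = q_a, q_b
    else:
        update, evaluate = q_b, q_a
    if done:
        target = reward
    else:
        # Select the greedy action with the table being updated ...
        best = max(actions, key=lambda a: update[(next_state, a)])
        # ... but take its value from the other table to reduce overestimation.
        target = reward + gamma * evaluate[(next_state, best)]
    update[(state, action)] += alpha * (target - update[(state, action)])


q_a = defaultdict(float)
q_b = defaultdict(float)
# Hypothetical transition: the state labels and action ids are placeholders.
double_q_update(q_a, q_b, state="s0", action=1, reward=-1.0,
                next_state="s1", actions=[0, 1, 2])
```

Keeping two tables and decoupling action selection from action evaluation is what distinguishes this update from standard Q-learning, which uses a single table for both steps.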

Decision Tree Classifiers

Implemented decision trees, imputed missing data, and performed k-fold cross-validation on congressional voting and breast cancer datasets.

GitHub

Experiments with Neural Nets

Designed feed-forward and convolutional models in Keras running on TensorFlow, and experimented with the number of layers, neurons, learning and momentum rates, input scaling, activation functions, etc.

GitHub

Model Selection

Best subset selection, forward and backward selection methods, lasso, ridge regression, ordinary linear least squares, and principal component regression - model selection with different criteria (e.g., Bayesian information criterion) and validation.

GitHub