Machine Learning/DATA

Technique: Robotic Process Automation (RPA)

Healthcare

Challenge

Automation of the Retrieval of Medical Charts

Data

  • A business process outsourcing company that codes medical charts for OASIS and ICD-10 needed a solution to improve the chart retrieval process.
  • Medical record documents were pulled from multiple Electronic Health Record (EHR) systems.

Solution

  • 75 custom-built RPA bots access medical records by logging in through Citrix anyconnect, entering the EMR system, and extracting necessary information.
  • OCR provides a searchable document within a custom user interface.

Technique: Document AI

Healthcare

Challenge

Form Recognition for Medical Bills

Data

  • Years of historical HCFA and UB forms were provided.
  • These included well-scanned documents with little to no flaws, documents with blurred backgrounds, documents with handwriting, and other complicating conditions.

Solution

An Azure-based form recognizer, layered with OCR and custom logic, identified the form type (HCFA or UB) and then extracted the data from the form into a database structure.
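The form-type step can be illustrated with a minimal sketch that assumes the OCR text is already available as a string. The keyword patterns and the classify_form helper are hypothetical stand-ins for the custom logic, not the production rules, and this sketch does not call the Azure service.

```python
import re

# Hypothetical keyword cues; real scans may need different or additional markers.
FORM_CUES = {
    "HCFA": [r"HEALTH INSURANCE CLAIM FORM", r"\b(HCFA|CMS)[- ]?1500\b"],
    "UB": [r"\bUB[- ]?(04|92)\b", r"\bCMS[- ]?1450\b"],
}


def classify_form(ocr_text: str) -> str:
    """Return 'HCFA', 'UB', or 'UNKNOWN' based on keyword hits in OCR text."""
    text = ocr_text.upper()
    scores = {
        form: sum(1 for pattern in cues if re.search(pattern, text))
        for form, cues in FORM_CUES.items()
    }
    best = max(scores, key=scores.get)
    # Zero hits or a tie is reported as UNKNOWN rather than guessed.
    if scores[best] == 0 or list(scores.values()).count(scores[best]) > 1:
        return "UNKNOWN"
    return best
```

Returning UNKNOWN on a tie or on no match lets blurred or handwritten scans be routed to manual review instead of being forced into one of the two form types.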

Technique: Advanced Analytics

Poison Control Center

Challenge

Tracking Emerging Trends in Poison Control Center Data

Data

10 years of poison control center data including:

  • Exposure Date
  • Substance
  • Category

Solution

Twitter’s open-source AnomalyDetection and BreakoutDetection models were combined with custom statistical modeling to estimate which substances are “trending” over windows of 3 months, 1 year, and 3 years.
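The idea of scoring a substance over several windows can be sketched as follows. This is a simplified z-score comparison, not Twitter's algorithms and not the custom model used in the project. It assumes exposures have already been aggregated into monthly counts per substance, and the threshold value is an arbitrary illustration.

```python
from statistics import mean, stdev

# Windows named in the text, expressed in months.
WINDOWS = {"3 months": 3, "1 year": 12, "3 years": 36}


def trend_score(monthly_counts, window):
    """Z-score of the latest `window` months against all earlier months."""
    if len(monthly_counts) < window + 2:
        raise ValueError("need at least two baseline months before the window")
    baseline, recent = monthly_counts[:-window], monthly_counts[-window:]
    spread = stdev(baseline)
    if spread == 0:
        spread = 1.0  # assumption: avoid dividing by zero on a flat baseline
    return (mean(recent) - mean(baseline)) / spread


def trending_by_window(monthly_counts, threshold=2.0):
    """Flag each window whose score exceeds `threshold` (an assumed cutoff)."""
    return {
        label: trend_score(monthly_counts, months) > threshold
        for label, months in WINDOWS.items()
        if len(monthly_counts) >= months + 2  # skip windows without enough history
    }
```

Windows that lack enough history are left out of the result rather than scored, so a substance with only one year of data is never reported as a 3-year trend.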

Technique: Advanced Analytics

Aircraft Manufacturing

Challenge

Predictive Maintenance on F-35 Fighter Jets

Data

  • 3 years of flightline maintenance data for F-35s
  • Weather data by day for each station location
  • Part supply data for each part
  • Aircraft mileage by day

Solution

  • Exponential Model to predict remaining life of a product.
  • Probability of resolution for an issue based on each potential solution.
  • Analytics dashboard to present and review final results.
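One common reading of an exponential model for remaining life is an exponential lifetime distribution with a constant failure rate. The sketch below takes that reading as an assumption; the source does not specify the exact formulation. The function names are hypothetical, and the fit ignores censored parts (those that have not failed yet), which a real analysis would need to handle.

```python
import math


def fit_failure_rate(lifetimes):
    """Maximum-likelihood rate for an exponential lifetime: n / total usage."""
    if not lifetimes or any(t <= 0 for t in lifetimes):
        raise ValueError("lifetimes must be non-empty and positive")
    return len(lifetimes) / sum(lifetimes)


def survival_probability(rate, usage):
    """P(part lasts at least `usage` more units), i.e. exp(-rate * usage)."""
    return math.exp(-rate * usage)


def expected_remaining_life(rate):
    """Mean remaining life; memoryless, so it does not depend on current age."""
    return 1.0 / rate
```

Because the exponential distribution is memoryless, the expected remaining life is the same for a new part and an old one. Distinguishing parts by age or condition would require covariates or a different lifetime distribution.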

Technique: Predictive Modeling

Debt Collection Company

Challenge

Predicting the Likelihood of Debt Repayment

Data

  • Data about individual consumer debts including balance, time since last payment, time since delinquency, debt type, etc.
  • US Census Bureau’s American Community Survey for income, unemployment, etc.
  • NPS Data for Physician Information

Solution

A set of Random Forest models predict the likelihood of the following:

  • Probability of payment based on a phone call
  • Probability of payment based on a letter
  • Probability of a contact based on the time of day for a phone call
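A set of per-channel models like the one described above can be sketched with scikit-learn. Everything in this example is an assumption for illustration: the data is randomly generated, the three feature columns are placeholders rather than the actual schema, and only the phone call and letter models are shown.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(0)

# Synthetic stand-in data; columns are illustrative, not the real schema.
n = 500
X = np.column_stack([
    rng.uniform(100, 10_000, n),  # balance
    rng.integers(0, 720, n),      # days since last payment
    rng.integers(0, 1_500, n),    # days since delinquency
])

# Synthetic outcome per contact channel: 1 = paid after contact, 0 = did not.
outcomes = {
    "phone_call": (rng.random(n) < 0.3).astype(int),
    "letter": (rng.random(n) < 0.2).astype(int),
}

# One model per channel, mirroring the set of models in the text.
models = {
    channel: RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
    for channel, y in outcomes.items()
}


def payment_probabilities(account):
    """Return P(payment) per channel for one account's feature row."""
    row = np.asarray(account, dtype=float).reshape(1, -1)
    return {ch: float(m.predict_proba(row)[0, 1]) for ch, m in models.items()}
```

Training one classifier per channel makes the outputs directly comparable for a single account, so the channel with the higher predicted probability can be tried first. The labels here are random, so the numbers this sketch produces carry no meaning.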

Technique: Natural Language Processing

Corporate Business

Challenge

Classification of Job Titles for Resumes and Job Descriptions

Data

  • 7 million Candidate Resumes
  • 1.1 million Distinct “Job Titles”
    Code Ninja is way too specific to be a great job title
  • 679 Distinct “Occupations”
    Software Development is way too generic to be a great job title

Solution

A “vector embedding” was created for each resume, and the titles were then grouped. Similar titles (e.g., JavaScript developer and Node developer) were combined. An average “anchor vector” represents each title, and new records are classified by the anchor they most closely match.
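The anchor-vector step can be sketched in a few lines. This assumes the embeddings already exist as plain lists of floats and uses cosine similarity as the matching metric, which is an assumption since the source does not name one. The function names are hypothetical.

```python
import math
from collections import defaultdict


def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def build_anchors(labeled_embeddings):
    """Map each title to the average of its resumes' embeddings."""
    grouped = defaultdict(list)
    for title, vector in labeled_embeddings:
        grouped[title].append(vector)
    return {title: mean_vector(vectors) for title, vectors in grouped.items()}


def classify(embedding, anchors):
    """Return the title whose anchor is most similar to `embedding`."""
    return max(anchors, key=lambda t: cosine_similarity(embedding, anchors[t]))
```

Classifying a new record then costs one similarity computation per title rather than one per resume, which matters when millions of resumes sit behind a much smaller set of grouped titles.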