Applied Machine Learning for Identity and Access Management Workshop at Black Hat 2018

Identity and Access Management (IAM) is one of the most important security controls, now more than ever with the advent of cloud computing and other distributed 'as a service' platforms. IAM systems generate massive amounts of log and telemetry data, and the sheer volume overwhelms security teams. Aaron Turner and Raffael Marty have designed a 2-day, vendor-agnostic training course that gives participants hands-on instruction in dealing with this data. Using the latest approaches in machine learning and design, the course will work through real-world examples of how to capture the maximum amount of data for analysis and then sort through it to find real security problems.

This hands-on training session will give participants an in-depth look at how to use the latest big data processing tools to solve real-world problems in monitoring users through IAM systems. Instruction in machine learning theory and design will be followed by hands-on labs that apply the lessons learned to IAM log data, and then by instruction on the latest IAM architectures and tools that help in designing even the toughest cloud security solutions. The final project will be a capture the flag (CTF) exercise in which each team must defend its own IAM system while simultaneously attacking its opponent's IAM infrastructure and dependent systems.

The course outline is as follows:

IAM Theory

  • What is IAM?
  • Identity stores, provisioning, AuthN vs AuthZ, MFA
  • IAM tools
  • IAM architectures - for example, BeyondCorp
  • Cloud IAM

Applied IAM

  • Implementing a federated identity system for hybrid cloud deployments
  • Attacking IAM - how to find the weakest link in the IAM architecture
  • IAM logging - Active Directory configuration, data centralization (big data, RDBMS)
  • IAM log parsing and analysis

ML Theory

  • Basic statistics and machine learning algorithms
  • Artificial intelligence and data mining
  • Data visualization basics

ML Frameworks

  • Introduction to R and some of the most important libraries
  • Intro to data analysis in Python
  • Python libraries including pandas, scikit-learn, NumPy, and SciPy
  • Python Notebooks

IAM Intelligence

  • Loading IAM logs into Python
  • Running statistics on IAM logs
  • Machine learning on IAM logs
  • Visualizing IAM logs
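
To give a flavor of the first two topics above, here is a minimal sketch of loading authentication events into Python and running a simple statistic on them. It is not taken from the course material: the log format, the column names, the sample rows, and the function names are all hypothetical, and it uses only the standard library so that it runs anywhere. It counts failed logins per user and flags users whose count sits well above the average.

```python
import csv
import io
from collections import Counter
from statistics import mean, pstdev

# Hypothetical sample log: format, column names, and values are invented for illustration.
SAMPLE_LOG = """\
timestamp,user,event,source_ip
2018-08-04T09:00:01,alice,login_success,10.0.0.5
2018-08-04T09:02:10,alice,login_failure,10.0.0.5
2018-08-04T09:03:44,bob,login_success,10.0.0.6
2018-08-04T09:05:12,carol,login_success,10.0.0.7
2018-08-04T09:06:30,dave,login_failure,10.0.0.8
2018-08-04T09:06:55,dave,login_success,10.0.0.8
2018-08-04T09:10:00,mallory,login_failure,203.0.113.9
2018-08-04T09:10:02,mallory,login_failure,203.0.113.9
2018-08-04T09:10:04,mallory,login_failure,203.0.113.9
2018-08-04T09:10:06,mallory,login_failure,203.0.113.9
2018-08-04T09:10:08,mallory,login_failure,203.0.113.9
2018-08-04T09:10:10,mallory,login_failure,203.0.113.9
"""


def load_events(text):
    """Parse CSV-formatted log text into a list of dicts, one per event."""
    return list(csv.DictReader(io.StringIO(text)))


def failed_logins_per_user(events):
    """Count login_failure events for each user."""
    return Counter(e["user"] for e in events if e["event"] == "login_failure")


def flag_outliers(counts, users, threshold=1.5):
    """Return users whose failure count is more than `threshold`
    population standard deviations above the mean across all users."""
    values = [counts.get(u, 0) for u in users]
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # every user has the same count, so nothing stands out
    return sorted(u for u in users if (counts.get(u, 0) - mu) / sigma > threshold)


events = load_events(SAMPLE_LOG)
users = sorted({e["user"] for e in events})
counts = failed_logins_per_user(events)
suspicious = flag_outliers(counts, users)
print(suspicious)
```

A z-score threshold is only one simple way to surface unusual accounts. Real IAM logs would normally call for per-user baselines and time windows, and the threshold of 1.5 here is an arbitrary choice for the sample data.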


Capture the Flag

  • Intro to the CTF setup
  • Mutual CTF exercise - each team will defend its own IAM infrastructure while simultaneously attacking its opponent's