ML-Assisted Validation of Autonomous Freight Train Controls

Development of a machine learning model to select the most appropriate simulation environments, improving test coverage for the LEADER energy management software.

Control Systems, Machine Learning, Rail Automation, Software Validation

Project Overview

This project addressed a critical validation challenge in real-time control software for the rail industry, specifically New York Air Brake's LEADER energy management system. Previous testing relied on arbitrarily selected train configurations (consists), which failed to represent the wide variation of actual trains observed in the field, leading to unreliable performance characterization.

Our solution was to develop and implement a novel methodology utilizing unsupervised machine learning to derive a small, highly representative sample set for rigorous testing.

The Challenge: Unrepresentative Testing Data

Modern energy management software, classified as Level 2 Autonomy, must safely and efficiently manage trains with tremendous variation (mass, length, power-to-weight ratio, and distributed power configurations). Testing only a few "ad hoc" trains led to:

Underestimation of Extremes: The ad hoc tests understated the extremes of key performance metrics such as transit time and fuel consumption, failing to capture the full range of expected field outcomes (e.g., a factor-of-6 difference in fuel consumption was identified between the ad hoc test consists and the clustered consists).
Unexercised Control Logic: Critical sub-systems, such as the air brake control algorithm, were not activated at all during testing with the ad hoc consists, resulting in unverified control system tuning.
Inaccurate Behaviors: The typical closed-loop throttle and air brake commands observed in the clustered sample set often fell outside the range established by the ad hoc tests.

The Bespoke Automata Solution: Hierarchical Agglomerative Clustering

We engineered a software system capable of performing automated statistical analysis on empirical field data to identify the most representative consists.

Methodology

The core of the solution is a two-pass Hierarchical Agglomerative Clustering (HAC) algorithm applied to six key features of the train consist data (aggregate weight, power, length, locomotive positions, weights, and weight distribution).

Feature Vectorization: Real-world LEADER log files were processed to create multi-dimensional feature vectors.
Normalization: All features were normalized to ensure equal weighting in the clustering distance calculation.
HAC Algorithm: A "bottom-up" approach repeatedly merged the most similar data points and clusters until a user-specified number of clusters (e.g., 10) was reached. The resulting hierarchy was visualized as a dendrogram.
Centroid Selection: The consist identified nearest to the centroid of each cluster was selected as the most representative for that "type" of train.
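
The four steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the project's implementation: it runs a single clustering pass rather than the two-pass scheme, uses min-max scaling and average linkage (the text does not name the normalization or linkage used), and works on three made-up features instead of the six listed. The function names and the toy consist values are hypothetical.

```python
import math

def normalize(rows):
    """Min-max scale each feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]
        for row in rows
    ]

def hac(points, n_clusters):
    """Bottom-up clustering: merge the two closest clusters until n_clusters remain.

    Average linkage (mean pairwise Euclidean distance) is an assumption.
    Returns a list of clusters, each a list of row indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]

    def linkage(a, b):
        total = sum(math.dist(points[i], points[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        _, a, b = min(
            (linkage(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters

def representatives(points, clusters):
    """For each cluster, pick the member nearest to the cluster centroid."""
    reps = []
    for members in clusters:
        center = [sum(c) / len(members) for c in zip(*(points[i] for i in members))]
        reps.append(min(members, key=lambda i: math.dist(points[i], center)))
    return reps

# Hypothetical toy data: (weight in tons, length in feet, total horsepower).
consists = [
    (4000, 3000, 8800),
    (4200, 3100, 8800),
    (4100, 3050, 8800),
    (15000, 7000, 13200),
    (15500, 7200, 13200),
    (16000, 7400, 13200),
]

scaled = normalize(consists)
clusters = hac(scaled, n_clusters=2)
reps = representatives(scaled, clusters)
print(sorted(reps))
```

On this toy data the light and heavy trains fall into separate clusters, and the middle consist of each group (indices 2 and 4, zero-based) is chosen as its representative. The pairwise search is cubic in the number of consists, which is fine for illustration; for a population of a few thousand trains a library implementation of hierarchical clustering would be the more practical choice.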

Results and Impact

The clustered sample set was tested against the original ad hoc consists using a simulated track segment. The findings demonstrated a significant increase in testing fidelity:

Expanded Performance Envelope: The clustered consists displayed far greater variance in aggregate performance (time and fuel) and location-specific behaviors (throttle and air brake commands).
Validation of Safety Logic: The clustered set included low power-to-weight ratio trains that forced the LEADER system to engage air brake logic, which was completely unexercised by the previous ad hoc tests.
Efficiency: The method successfully reduced a large population of 2,019 trains down to a representative set of 10, ensuring more efficient and time-effective testing while providing results that are indicative of actual field outcomes.

This project validated that using data-driven, machine learning techniques is critical for thoroughly testing Level 2 Autonomy systems in highly variable real-world environments.