Optimized Test Protocol for Railroad Energy Management Software

Train Marshalling

The railroad is a system for moving freight from an originator at some arbitrary location to a receiver at some different arbitrary location.  Certain originator and receiver pairs transport consistent freight loads at a roughly predictable rate, such as coal miners and their power plant customers.  Power plants have a (seasonally adjusted) predictable need for fuel and order this fuel from the coal mines at a very predictable rate.  

However, the great majority of originators and receivers either:

  1. Move predictable loads on an as-needed basis, an example of which is grain harvesting.  While there is a roughly-known seasonal need for transporting grain, the magnitude and timing of temperature and rainfall will dictate the quantity of grain to be moved as well as the dates when transport is needed.
  2. Move unpredictable loads at a regular rate, an example of which is intermodal transport.  These moves include originators such as Amazon and UPS.  These originators can be certain they will need to move freight at a regular rate (daily, weekly, etc.), but the weight and volumes that they will be moving are for the most part wholly unknown until the date of transit.

Because of these factors, the payloads involved in most rail freight moves are largely unpredictable beyond a very short horizon.  Additionally, since so many trains traverse North America, a particular train makeup (called a "consist" within the industry) may be seen one day and then never seen again.

Energy Management

A few large suppliers to the North American freight rail industry sell so-called "energy management" software solutions.  The term "energy management" basically refers to cruise control for freight locomotives, although driving a freight train bears only a passing resemblance to driving a road vehicle.

Energy management systems select throttle and braking commands for each of the locomotives within the train in real time to provide "optimal" driving of the train while balancing the following, usually conflicting factors:

  1. Minimizing the use of diesel fuel to improve railroad operating costs
  2. Maximizing the speed of the train to improve railroad on-time performance
  3. Obeying all of the speed limits imposed by civil authorities, track maintenance crews, railroad engineering management, and so forth
  4. Managing the forces within the train to prevent break-in-twos, derailments, damage to the lading, and unnecessary wear-and-tear on the train components

Makers of energy management systems therefore face a challenging validation and verification problem: they must continuously improve their products while ensuring that they do not introduce new problems in some corner case.  There is an essentially unbounded set of trains that their products may be called upon to manage.  Because of maintenance activities and changing track conditions, there is also an essentially unbounded set of speed limits over each piece of track where energy management systems are certified for operation.  Finally, the costs of the computing power and storage space needed to simulate so many unique trains over what are usually lengthy distances can quickly become sizable.  Even more costly is the engineering time needed to analyze, interpret, and evaluate the outcomes of such simulations.

Energy management software makers clearly have a need to limit their testing to a manageable number of tests while still maximizing coverage of train makeup so that their quality assurance function can assert that performance is "as expected."

Clustering Solution

In this project, we provided the client with a solution based on hierarchical agglomerative clustering to select a set of "most representative" trains from a population sample.  By testing the energy management system against those most representative trains in a simulated environment, the client can achieve a high degree of test coverage while limiting the amount of simulation and post-processing work.
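The selection idea described above can be sketched in a few lines of dependency-free Python.  This is only an illustration, not the project's implementation: the average-linkage merge rule, the two features (normalized train length and trailing tonnage), and the sample values are all assumptions made for the example.  In practice a library routine such as SciPy's scipy.cluster.hierarchy.linkage would normally be used instead of the hand-written loop.

```python
import math

def agglomerative(points, n_clusters):
    """Average-linkage agglomerative clustering (linkage choice is an assumption).

    Returns a list of clusters, each a list of indices into `points`.
    """
    clusters = [[i] for i in range(len(points))]  # start with one cluster per train
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                # Average pairwise distance between the two candidate clusters
                d = sum(math.dist(points[a], points[b])
                        for a in clusters[i] for b in clusters[j])
                d /= len(clusters[i]) * len(clusters[j])
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]
    return clusters

def representative(points, members):
    """Index of the cluster member closest to the cluster centroid."""
    dim = len(points[0])
    centroid = [sum(points[m][k] for m in members) / len(members) for k in range(dim)]
    return min(members, key=lambda m: math.dist(points[m], centroid))

# Hypothetical, already-normalized features per train: (length, trailing tonnage)
trains = [
    (0.10, 0.12), (0.15, 0.10), (0.12, 0.18),   # short, light trains
    (0.80, 0.85), (0.85, 0.86), (0.90, 0.80),   # long, heavy trains
    (0.50, 0.10), (0.55, 0.15), (0.52, 0.20),   # medium-length, light trains
]

clusters = agglomerative(trains, n_clusters=3)
test_set = sorted(representative(trains, c) for c in clusters)
print(clusters)
print(test_set)
```

With this toy sample the three visually obvious groups are recovered, and one train per group (indices 0, 4, and 7) is selected as the simulation test set.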

Like all forms of clustering, hierarchical clustering is a form of unsupervised learning, which means that significantly less human involvement is needed and, once the method is tuned, it can be run periodically to re-calibrate the results as railroad operational policies change.  For example, with the advent of precision scheduled railroading (PSR), trains became much longer and heavier.  To ensure that this style of train is represented in the clustering, the algorithm would need to be run again once PSR-style trains appeared in the sample data.

Results

For this project, we were provided with the set of trains that the client's energy management software encountered over a year of operation on a single railroad territory.  Over several iterations, we selected a normalized feature set for which evaluation using the Hopkins statistic indicated that the data was suitable for clustering.  We then applied the clustering algorithm with this feature set to assign each provided train to a cluster, with the number of clusters set to a value that the client considered acceptable.
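As a rough illustration of the suitability check described above, the sketch below normalizes a feature table and computes a Hopkins statistic.  Several things here are assumptions rather than details of the project: min-max scaling as the normalization, the convention in which values near 1 indicate clustered data and values near 0.5 indicate uniformly random data, the use of distances raised to the power of the feature dimension, and the synthetic train features (length in feet, trailing tons).

```python
import math
import random

def min_max_normalize(rows):
    """Scale each feature column to the range [0, 1] (assumed normalization)."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [
        tuple((v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi))
        for row in rows
    ]

def nearest_distance(p, points, skip=None):
    """Distance from p to the nearest point, optionally skipping one index."""
    return min(math.dist(p, q) for i, q in enumerate(points) if i != skip)

def hopkins(points, m, rng):
    """Hopkins statistic from m sampled data points and m uniform random points."""
    dim = len(points[0])
    lo = [min(p[k] for p in points) for k in range(dim)]
    hi = [max(p[k] for p in points) for k in range(dim)]
    # w: nearest-neighbor distances for real points sampled from the data
    sample_idx = rng.sample(range(len(points)), m)
    w = [nearest_distance(points[i], points, skip=i) for i in sample_idx]
    # u: distances from uniform random points (in the bounding box) to the data
    uniform = [tuple(rng.uniform(lo[k], hi[k]) for k in range(dim)) for _ in range(m)]
    u = [nearest_distance(p, points) for p in uniform]
    su = sum(x ** dim for x in u)
    sw = sum(x ** dim for x in w)
    return su / (su + sw)

# Synthetic sample: two hypothetical train populations (length in feet, trailing tons)
rng = random.Random(0)
raw = ([(rng.gauss(5000, 100), rng.gauss(6000, 150)) for _ in range(40)]
       + [(rng.gauss(12000, 100), rng.gauss(18000, 150)) for _ in range(40)])
features = min_max_normalize(raw)
h = hopkins(features, m=10, rng=rng)
print(round(h, 3))
```

Because the synthetic sample consists of two tight, well-separated groups, the statistic comes out well above 0.5 under this convention.  A feature set that scored near 0.5 would suggest that clustering is unlikely to find meaningful structure.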

The clusters themselves were evaluated using the Davies-Bouldin index and the silhouette coefficient and were shown to be representative of the provided data set.
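For readers unfamiliar with these two measures, the following is a minimal sketch of how each is computed, using Euclidean distance and a toy two-cluster example.  The data points are hypothetical, and the functions are written out by hand only to show the arithmetic; scikit-learn provides davies_bouldin_score and silhouette_score for real use.  A lower Davies-Bouldin index and a silhouette coefficient closer to 1 both indicate compact, well-separated clusters.

```python
import math

def centroid(pts):
    """Component-wise mean of a list of points."""
    return tuple(sum(c) / len(pts) for c in zip(*pts))

def davies_bouldin(clusters):
    """Mean, over clusters, of the worst (scatter_i + scatter_j) / centroid distance."""
    cents = [centroid(c) for c in clusters]
    scatter = [sum(math.dist(p, cents[i]) for p in c) / len(c)
               for i, c in enumerate(clusters)]
    total = 0.0
    for i in range(len(clusters)):
        total += max((scatter[i] + scatter[j]) / math.dist(cents[i], cents[j])
                     for j in range(len(clusters)) if j != i)
    return total / len(clusters)

def silhouette(clusters):
    """Mean silhouette value (b - a) / max(a, b) over all points."""
    scores = []
    for i, ci in enumerate(clusters):
        for p_idx, p in enumerate(ci):
            if len(ci) == 1:
                scores.append(0.0)  # common convention for singleton clusters
                continue
            # a: mean distance to the other members of the same cluster
            a = sum(math.dist(p, q) for k, q in enumerate(ci) if k != p_idx) / (len(ci) - 1)
            # b: mean distance to the members of the nearest other cluster
            b = min(sum(math.dist(p, q) for q in cj) / len(cj)
                    for j, cj in enumerate(clusters) if j != i)
            scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)

# Hypothetical 2-D feature vectors grouped two different ways
good = [[(0, 0), (0, 1), (1, 0)], [(10, 10), (10, 11), (11, 10)]]
bad = [[(0, 0), (10, 10), (1, 0)], [(0, 1), (10, 11), (11, 10)]]

db_good, db_bad = davies_bouldin(good), davies_bouldin(bad)
sil_good, sil_bad = silhouette(good), silhouette(bad)
print(round(db_good, 3), round(db_bad, 3))
print(round(sil_good, 3), round(sil_bad, 3))
```

The sensible grouping scores a much lower Davies-Bouldin index and a much higher silhouette coefficient than the deliberately mixed-up grouping, which is the kind of comparison used to judge whether a clustering is worth keeping.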

Prior to execution of this project, the client had been testing their software using two arbitrarily selected trains on the territory of interest.  We provided them with trains near the centroid of each of our identified clusters, as shown below (reprinted from the ASME whitepaper referenced below).  


Their energy management product was simulated on each of the 2 arbitrary consists and the 10 cluster centroid consists.  The results showed that certain aspects of their product (such as the air braking algorithm) were not being exercised at all by the two arbitrarily selected consists.  Further, those two consists were not representative of the more modern, PSR-style trains that their customer had transitioned to.

Use of the cluster centroid trains in simulation revealed much greater variation in the performance of their energy management product than did the two arbitrarily selected trains.  This variation gave them a more accurate picture of the results to expect in the field and allowed them to refine their software before release, avoiding undesirable behaviors that their customer would likely have reported as defects.

Presentation of Results

The development of this solution was presented at the 2020 ASME Joint Rail Conference.  The whitepaper can be found in the proceedings of that conference.