Telematics data presents the opportunity to characterize how a vehicle is used over its lifetime. This information is used to validate development and testing targets. Because the data can be bad or missing, it needs to be reviewed prior to analysis so that the results can be trusted.
Preparation for Analysis
Because of the amount of data involved, preparation is critical in any analysis. Generally, the crucial steps in the workflow are:
- Determine the business reasons for the study.
- Define the scope of the study.
- Define the data source and how it will be reported.
- Assure data integrity.
Business Reason
First, determine who needs the analysis, why it is needed, when it is needed, and what types of output are required. Sometimes the user is focused on parameters that affect only their component. Other times, the objective is to compare different customer usages, like retail vs. fleet customer usage, or to compare different products.
For vehicle telematics data, engineers and management are the usual customers. Both are focused on a market segment. In the automobile industry, that may be a particular vehicle type with a specific engine and transmission. Another factor may be the demographics of the market segment. Frequently, the focus is on component usage parameters for a specific vehicle, engine, or transmission. Management generally wants comparisons of different vehicles. Both may need to characterize the 5th, 50th, and 95th percentiles of usage for a vehicle population.
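As a minimal sketch of the percentile characterization described above, the following uses Python's standard `statistics` module. The annual distance figures and the function name `usage_percentiles` are hypothetical and only illustrate the calculation; a real study would use one summarized usage value per vehicle in the population.

```python
import statistics

def usage_percentiles(values):
    """Return the 5th, 50th, and 95th percentiles of per-vehicle usage values."""
    # quantiles(n=100) returns the 99 cut points P1..P99 of the sample
    cuts = statistics.quantiles(values, n=100, method="inclusive")
    return {"p5": cuts[4], "p50": cuts[49], "p95": cuts[94]}

# Hypothetical annual distance (km), one value per vehicle in a market segment
annual_km = [8000, 9500, 11000, 12500, 14000, 15000,
             16500, 18000, 21000, 26000, 40000]

summary = usage_percentiles(annual_km)
print(summary)
```

The `inclusive` method treats the sample as spanning the full range of the population; whether that or the default `exclusive` method is more appropriate depends on how the vehicles were sampled.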
Scope
“How do you eat an elephant? One bite at a time!” So how do you process massive telediagnostic data files contained in a very large database? Define the scope of the study to limit the amount of information that needs to be extracted for analysis. The 1-second data will contain records of upwards of 1000 CAN channels, but only a few are required for a study. Engineers and management are organized to support one development project, which may be defined by a vehicle type, engine size, and transmission. A defined scope limits the number of CAN channels required.
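Limiting the extraction to the defined scope might look like the sketch below. The record layout, field names, and channel names are assumptions made for illustration; an actual telematics database will have its own schema and query interface.

```python
def extract_in_scope(records, scope, channels):
    """Yield only the requested channels from records that match the scope."""
    for rec in records:
        if all(rec.get(key) == value for key, value in scope.items()):
            yield {ch: rec.get(ch) for ch in channels}

# Hypothetical 1-second records; a real record could carry about 1000 channels
records = [
    {"vehicle_type": "van", "engine": "3.5L",
     "vehicle_speed_kmh": 62.0, "door_rr_open": False, "cabin_temp_c": 21.5},
    {"vehicle_type": "sedan", "engine": "2.0L",
     "vehicle_speed_kmh": 88.0, "door_rr_open": False, "cabin_temp_c": 22.0},
    {"vehicle_type": "van", "engine": "3.5L",
     "vehicle_speed_kmh": 0.0, "door_rr_open": True, "cabin_temp_c": 20.0},
]

scope = {"vehicle_type": "van", "engine": "3.5L"}   # the development project
channels = ["vehicle_speed_kmh", "door_rr_open"]     # the few channels needed

in_scope = list(extract_in_scope(records, scope, channels))
print(in_scope)
```

Filtering on both the vehicle population and the channel list keeps the extracted data set small enough to analyze.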
If the project is to design a commercial van, then it does not make sense to include other types of vehicles. For example, anybody who uses airport shuttle vans knows that sliding doors frequently appear ready to fall off. A good use of telediagnostics would be to monitor the number of open/close cycles for each door to create validated design targets for durability tests. The durability tests would then be used to identify the mechanical parts that need improvement.
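Counting door cycles from a 1-second door status signal could be sketched as follows. The boolean signal encoding, the use of `None` for missing samples, and the function name `count_door_cycles` are assumptions; the actual CAN signal definition would come from the vehicle's signal database.

```python
def count_door_cycles(door_open_samples):
    """Count open/close cycles in a 1 Hz door-status series.

    One cycle is counted on each closed-to-open transition. Missing
    samples (None) are skipped, so the comparison is made between the
    known samples on either side of a gap.
    """
    cycles = 0
    previous = None
    for sample in door_open_samples:
        if sample is None:
            continue
        if previous is False and sample is True:
            cycles += 1
        previous = sample
    return cycles

# Hypothetical samples for one sliding door: True = open, None = missing
samples = [False, False, True, True, False, None, False, True, None, True, False]
cycles = count_door_cycles(samples)
print(cycles)
```

A door that opens and closes entirely within a gap of missing data would not be counted, so the result is best read as a lower bound when data coverage is poor.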
Data Source
A telematics module may be installed to monitor the CAN bus and to store one record per second, where each record consists of the status of as many as 1000 CAN channels. The telematics module stores the data for later transmission to a server. The 1 Hz records are preferred because time series can be analyzed and associations between different channels can be made. For example, an engine torque x engine speed map can be generated for a single vehicle or for many vehicles in a fleet.
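An engine torque x engine speed map can be built from 1 Hz records by counting the seconds spent in each cell of a two-dimensional grid. This is a sketch only; the channel names and bin widths are assumptions chosen for illustration.

```python
from collections import Counter

def torque_speed_map(records, speed_bin=500, torque_bin=50):
    """Accumulate seconds of operation in each (engine speed, torque) cell."""
    cells = Counter()
    for rec in records:
        rpm = rec.get("engine_speed_rpm")
        torque = rec.get("engine_torque_nm")
        if rpm is None or torque is None:
            continue  # skip records where either channel is missing
        cell = (int(rpm // speed_bin) * speed_bin,
                int(torque // torque_bin) * torque_bin)
        cells[cell] += 1  # each 1 Hz record represents one second
    return cells

# Hypothetical 1 Hz records for one vehicle
records = [
    {"engine_speed_rpm": 800, "engine_torque_nm": 20},
    {"engine_speed_rpm": 1500, "engine_torque_nm": 120},
    {"engine_speed_rpm": 1600, "engine_torque_nm": 130},
    {"engine_speed_rpm": 2100, "engine_torque_nm": None},
    {"engine_speed_rpm": 2200, "engine_torque_nm": 210},
]

cells = torque_speed_map(records)
for cell, seconds in sorted(cells.items()):
    print(cell, seconds)
```

Because each cell is keyed by the lower edge of its bin, maps from several vehicles can be combined by adding the counters together.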
Alternatively, the telematics module can preprocess data into histograms and increment counters, reducing the amount of data that needs to be communicated. However, a histogram essentially filters the data when it is binned. The analysis of any time series or correlations between different channels is not possible with a histogram.
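The loss of time-series information can be illustrated with two hypothetical speed traces that produce identical histograms but describe very different driving. The data, bin width, and the 50 km/h jump threshold are all assumptions for the sake of the example.

```python
from collections import Counter

def speed_histogram(speeds_kmh, bin_width=20):
    """Bin speeds by the lower edge of each bin, as a module might on board."""
    return Counter(int(s // bin_width) * bin_width for s in speeds_kmh)

def count_speed_jumps(speeds_kmh, threshold=50):
    """Count second-to-second speed increases larger than the threshold."""
    return sum(1 for a, b in zip(speeds_kmh, speeds_kmh[1:]) if b - a > threshold)

# Two hypothetical 1 Hz traces with the same values in a different order
steady = [30, 30, 30, 90, 90, 90]
stop_and_go = [30, 90, 30, 90, 30, 90]

same_histogram = speed_histogram(steady) == speed_histogram(stop_and_go)
jumps = (count_speed_jumps(steady), count_speed_jumps(stop_and_go))
print(same_histogram, jumps)
```

Once only the histogram is transmitted, the order of the samples is gone and the two traces can no longer be told apart.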
Frequently, a specialized supplier collects and stores the telematics data. Some suppliers provide the raw data while others provide only histogram summaries.
Data Integrity
As the saying goes: “Garbage in, garbage out”. Vehicle telematics has data integrity issues. The data stream is corrupted when data is missing or out-of-range data is encountered. There needs to be a plan to deal with corrupted data.
In electronic control of vehicle functions, priorities are imposed on the CAN signal data. For example, when a vehicle is started, there is a lot of coordination between different modules and sensors, which are powered up at different times. Data will be missing until a module or sensor is operating. Similarly, sensors can be temporarily inoperative, causing missing or bad data. One plan would be to interpolate between known good values, but that “creates” data. An alternative plan is to treat the known data as a statistical sample and ignore missing data, preserving data integrity.
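The second plan, treating the known data as a sample and ignoring missing values, might be sketched as follows. The coolant temperature values and the use of `None` to mark missing samples are assumptions; reporting the coverage alongside the statistic shows how much of the record was actually observed.

```python
import statistics

def summarize_known(samples):
    """Summarize a channel using only its known samples; None marks missing."""
    known = [s for s in samples if s is not None]
    return {
        "n_total": len(samples),
        "n_known": len(known),
        "coverage": len(known) / len(samples) if samples else 0.0,
        "mean": statistics.fmean(known) if known else None,
    }

# Hypothetical coolant temperature (deg C) at 1 Hz; the sensor is not yet
# reporting during the first two seconds after start and drops out once
coolant_temp_c = [None, None, 20.0, 22.0, None, 26.0, 28.0, 30.0]

summary = summarize_known(coolant_temp_c)
print(summary)
```

No values are created to fill the gaps, so every number in the summary traces back to a recorded sample.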
During an analysis of vehicle speed, it was noted that one vehicle showed a speed below 180 km/hr 30% of the time and a speed of 180 km/hr 70% of the time. This usage is not reasonable, as speed limits are generally about 110 km/hr and such high-speed operation is a rare event. After contacting the data supplier, it was determined that the vehicle speed sensor was shorted to supply voltage and provided a voltage signal that was interpreted as 180 km/hr. The analysis was modified to treat vehicle speeds of 180 km/hr as missing data. The remaining good speed data was similar to the vehicle speeds of other vehicles in the fleet.
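A check of this kind could be sketched as below: if one suspect value accounts for an implausible share of a vehicle's samples, it is replaced with a missing-data marker. The 5% threshold, the function name `mask_stuck_value`, and the sample data are assumptions for illustration and are not taken from the study described above.

```python
def mask_stuck_value(speeds_kmh, stuck_value=180.0, max_share=0.05):
    """Mark a suspect value as missing when it dominates the known samples.

    Returns the cleaned series and the share of known samples that were
    equal to stuck_value. The max_share threshold is an assumed value.
    """
    known = [s for s in speeds_kmh if s is not None]
    if not known:
        return list(speeds_kmh), 0.0
    share = sum(1 for s in known if s == stuck_value) / len(known)
    if share <= max_share:
        return list(speeds_kmh), share  # plausible: leave the data unchanged
    cleaned = [None if s == stuck_value else s for s in speeds_kmh]
    return cleaned, share

# Hypothetical 1 Hz speeds from a vehicle with a shorted speed sensor
speeds = [50.0, 60.0, 70.0] + [180.0] * 7

cleaned, share = mask_stuck_value(speeds)
good = [s for s in cleaned if s is not None]
print(share, good)
```

The appropriate suspect value and threshold should be agreed with the engineer responsible for the sensor, since a legitimate signal limit can look similar to a fault.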
Verifying data integrity is an ongoing activity that requires the close cooperation of the analyst and the engineer to determine a strategy for any analysis.
Conclusion
Prior to analyzing telematics data, the key steps are:
- Determine the business reasons for the study.
- Define the scope of the study.
- Define the data source and how it will be reported.
- Assure data integrity.
If anybody wants to engage me on this or other topics, please contact me. I offer a free hour for the first contact to discuss your problem/concerns and to determine how I can help you.
I have worked in Quality, Reliability, Applied Statistics, and Data Analytics for over 30 years in design engineering and manufacturing. At the university, I taught at the graduate level. I provide Minitab seminars to corporate clients, write articles, and have presented and written papers at SAE, ISSAT, and ASQ. I want to assist you.
Dennis Craggs, Consultant
810-964-1529
dlcraggs@me.com
John Carston says
I like how you mentioned that it is important to consider different modules and sensors when driving. My uncle mentioned to me last night that his friend is hoping to find a solution to increase the productivity of his fleets and asked if I have any idea what is the best option to do. Thanks to this informative article and I’ll be sure to tell him that he can consult a fleet management software company as they can answer all his inquiries.
Dennis Craggs says
I don’t know if a fleet management software company would do more than report simple event counts and parameter averages for a single vehicle. The value of telematics is to look across vehicles to determine the statistics across a sample of vehicles. Then one may infer the distribution and percentiles for a type of data. A 95% usage range allows one to set reasonable expectations. Identification of high or low outliers (vehicles) presents an opportunity for improvement or corrective actions. For example, a driver who exceeds a 95% population limit may be driving too fast or too many hours/day. Alternatively, they may have a route with better highways and less congestion.
James Lucas says
Hey team, great post.
One should consider certain points before examining telematics data:
Determine the study’s commercial purpose.
Observe the study’s breadth.
Define the data source and the manner in which it will be presented.
Ascertain the accuracy of the data.