Telematics data makes it possible to characterize how a vehicle is used over its lifetime. This information is used to validate development and testing targets. Because the data can be bad or incomplete, it must be reviewed before any analysis so that the results can be trusted.
Preparation for Analysis
Because of the sheer amount of data, careful preparation is critical in any analysis. Generally, the crucial steps in the workflow are:
- Determine the business reasons for the study.
- Define the scope of the study.
- Verify the data source and integrity.
First, we need to determine who needs the analysis, why it is needed, when it is needed, and what types of output are expected. Sometimes the user is interested in parameters that affect only one component. Other times, the objective is to compare different customer usages, such as retail vs. fleet usage, or to compare different products.
For vehicle telematics data, engineers and management are the usual customers. Both are typically focused on a market segment. In the automobile industry, that may be a particular vehicle type with a specific engine and transmission. Another factor may be the demographics of the market segment. Engineers frequently focus on component usage parameters for a specific vehicle, engine, or transmission. Management generally wants comparisons of different vehicles. Both may need to characterize usage at the 5th, 50th, and 95th percentiles of a vehicle population.
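As an illustration of the percentile characterization, the sketch below computes the 5th, 50th, and 95th percentiles of a usage parameter across a population. It assumes one summary value per vehicle and uses linear interpolation between ranked values, which is only one of several common percentile definitions. The `percentile` helper and the `annual_km` figures are hypothetical and not taken from any real fleet.

```python
def percentile(values, p):
    """Return the p-th percentile (0-100) of values by linear interpolation."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = (len(ordered) - 1) * p / 100.0   # fractional index into the sorted list
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    fraction = rank - lower
    return ordered[lower] + (ordered[upper] - ordered[lower]) * fraction


# Hypothetical annual distance (km), one value per vehicle in the population.
annual_km = [8000, 12000, 15000, 18000, 21000, 25000, 40000]
usage_summary = {p: percentile(annual_km, p) for p in (5, 50, 95)}
```

With so few vehicles the tail percentiles are dominated by one or two values, so a real study would need a much larger population before the 5th and 95th percentiles are used as design targets.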
“How do you eat an elephant? One bite at a time!” So how do you process massive telediagnostic data files? Define the scope of the study to limit the amount of information to be analyzed. Engineers and management are usually organized to support one development project, which may be defined by a vehicle type, engine size, and transmission. With a defined scope, the number of vehicles and CAN channels that need to be analyzed is reduced.
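Scoping can be sketched as a simple filter applied before any analysis. The record layout used here (`vehicle_id`, `vehicle_type`, `timestamp`, and a `signals` dictionary keyed by channel name) and the channel names are assumptions made for illustration; real telematics exports will differ.

```python
def apply_scope(records, vehicle_type, channels):
    """Keep only in-scope vehicles and only the CAN channels of interest."""
    scoped = []
    for record in records:
        if record["vehicle_type"] != vehicle_type:
            continue  # out-of-scope vehicle
        row = {"vehicle_id": record["vehicle_id"], "timestamp": record["timestamp"]}
        for name in channels:
            row[name] = record["signals"].get(name)  # None if the channel is absent
        scoped.append(row)
    return scoped


# Hypothetical records from two vehicles with made-up channel names.
records = [
    {"vehicle_id": "V1", "vehicle_type": "van", "timestamp": 0,
     "signals": {"door_rear_open": True, "engine_speed_rpm": 800, "cabin_temp_c": 21}},
    {"vehicle_id": "V2", "vehicle_type": "sedan", "timestamp": 0,
     "signals": {"door_rear_open": False, "engine_speed_rpm": 900}},
]
van_doors = apply_scope(records, "van", ["door_rear_open"])
```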
If the project is to design a commercial van, it does not make sense to include other types of vehicles. For example, anybody who uses airport shuttle vans knows that the doors appear ready to fall off. A good use of telediagnostics would be to monitor the number of open/close cycles for each door to create validated design targets for durability tests. The durability tests would then be used to identify the mechanical parts that need improvement.
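Counting door cycles could look like the sketch below. It assumes the door status is available as a sequence of samples that are `True` (open), `False` (closed), or `None` (missing), and it counts one cycle for each open-to-closed transition. The function name and the sample data are hypothetical.

```python
def count_door_cycles(door_open_samples):
    """Count open-to-closed transitions in a sequence of door status samples."""
    cycles = 0
    previous = None  # last known good state
    for state in door_open_samples:
        if state is None:
            continue  # skip missing samples and keep the last known state
        if previous is True and state is False:
            cycles += 1
        previous = state
    return cycles


# Hypothetical samples for one door: the door is still open at the end.
samples = [False, True, True, False, None, False, True, False, True]
cycles = count_door_cycles(samples)
```

Skipping missing samples means a cycle that happens entirely inside a gap is not counted, so the result is a lower bound on the true number of cycles.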
A telematics module may be installed to monitor the CAN bus and to store one record per second, where each record consists of the status of as many as 1000 CAN channels. The telematics module stores the data for later transmission to a server. The 1 Hz records are preferred because time series and associations between different channels can be analyzed. For example, an engine torque vs. engine speed map can be generated for a single vehicle or aggregated over the many vehicles of a fleet.
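A torque vs. speed map of this kind can be sketched as a two-dimensional count of 1 Hz records, where each record contributes one second to the bin it falls in. The bin widths, the `(speed, torque)` tuple layout, and the sample values are assumptions for illustration only.

```python
from collections import Counter


def torque_speed_map(records, speed_bin_rpm=500, torque_bin_nm=50):
    """Accumulate seconds spent in each (speed, torque) bin from 1 Hz records."""
    time_in_bin = Counter()
    for speed_rpm, torque_nm in records:
        if speed_rpm is None or torque_nm is None:
            continue  # both channels are needed to place the sample on the map
        speed_bin = int(speed_rpm // speed_bin_rpm) * speed_bin_rpm
        torque_bin = int(torque_nm // torque_bin_nm) * torque_bin_nm
        time_in_bin[(speed_bin, torque_bin)] += 1  # one record = one second
    return time_in_bin


# Hypothetical 1 Hz samples: (engine speed in rpm, engine torque in Nm).
samples = [(800, 20), (820, 30), (2100, 180), (None, 50), (2400, 160)]
usage_map = torque_speed_map(samples)
```

Because `Counter` objects can be added together, per-vehicle maps built this way can be summed into a fleet map.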
Alternatively, a telematics module can preprocess the data into histograms and incrementing counters, reducing the amount of data that needs to be transmitted. However, binning essentially filters the data: the order of the samples and their relationship to other channels are discarded. The analysis of any time series or of correlations between different channels is therefore not possible with a histogram.
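The loss of information can be seen in a small sketch: two very different drives produce identical histograms once the samples are binned. The bin width and the speed traces are made-up values chosen only to show the effect.

```python
from collections import Counter


def speed_histogram(speeds_kmh, bin_width=20):
    """Bin speed samples; the order of the samples is discarded."""
    return Counter(int(s // bin_width) * bin_width for s in speeds_kmh)


# Hypothetical 1 Hz speed traces (km/h) with the same values in a different order.
steady = [30, 30, 30, 90, 90, 90]        # one speed change
stop_and_go = [30, 90, 30, 90, 30, 90]   # five speed changes

same_histogram = speed_histogram(steady) == speed_histogram(stop_and_go)
```

Here `same_histogram` is `True` even though the second trace implies far more acceleration events, which is exactly the kind of detail a histogram-only data source cannot recover.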
Frequently, a specialized supplier collects and stores the telematics data. Some suppliers provide the raw data while others provide only histogram summaries.
As the saying goes in analysis: “Garbage in, garbage out”. Vehicle telematics has data integrity issues. The data stream is corrupted when data is missing or measurements are unreasonable. There needs to be a plan to deal with corrupted data.
In electronic control of vehicle functions, priorities are imposed on the CAN signal data. For example, when a vehicle is started, there is a lot of coordination between different modules and sensors, which are powered up at different times. Data will be missing until a module or sensor is operating. Similarly, sensors can be temporarily inoperative, causing missing or bad data. One plan would be to interpolate between known good values, but that “creates” data. An alternative plan is to treat the known data as a statistical sample and ignore missing data, preserving data integrity.
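The second plan, treating the known data as a sample, can be sketched as follows. Missing samples are assumed to be represented by `None`. The function name and the coolant temperature values are hypothetical. The coverage fraction is reported alongside the statistic so the reader can judge how much data the result rests on.

```python
def mean_of_valid(samples):
    """Return (mean of non-missing samples, fraction of samples that were valid)."""
    valid = [s for s in samples if s is not None]
    if not valid:
        return None, 0.0  # nothing to summarize
    return sum(valid) / len(valid), len(valid) / len(samples)


# Hypothetical coolant temperature (deg C); the sensor reports late after start-up.
coolant_temp_c = [None, None, 80.0, 82.0, None, 84.0]
mean_temp, coverage = mean_of_valid(coolant_temp_c)
```

No values are invented here: the mean uses only the three measured samples, and the coverage of 0.5 makes the missing half of the record visible.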
During an analysis of vehicle speed, it was noted that one vehicle appeared to travel at less than 180 km/h for 30% of the time and at exactly 180 km/h for the remaining 70%. This usage is not reasonable, as speed limits are generally about 110 km/h and such high-speed operation is a rare event. After contacting the data supplier, it was determined that the vehicle speed sensor was shorted to the supply voltage and provided a voltage signal that was interpreted as 180 km/h. The analysis was modified to treat vehicle speeds of 180 km/h as missing data. The distribution of the remaining valid speed data was similar to that of the other vehicles in the fleet.
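A check of this kind might be sketched as below: find the most frequent reading and its share of the valid samples, then mask that value once it has been confirmed as a fault. The helper names and the speed data are hypothetical, and the decision that 180 km/h is a fault value comes from the supplier's diagnosis, not from the code.

```python
from collections import Counter


def dominant_value_share(speeds_kmh):
    """Return (most frequent reading, its share of the non-missing samples)."""
    valid = [s for s in speeds_kmh if s is not None]
    if not valid:
        return None, 0.0
    value, count = Counter(valid).most_common(1)[0]
    return value, count / len(valid)


def mask_value(speeds_kmh, bad_value):
    """Replace a known fault value with None so it is treated as missing data."""
    return [None if s == bad_value else s for s in speeds_kmh]


# Hypothetical trace: 70% of the samples are stuck at 180 km/h.
speeds = [180] * 7 + [50, 60, 70]
value, share = dominant_value_share(speeds)
cleaned = mask_value(speeds, 180)
```

A dominant value is not proof of a fault on its own, since a vehicle that idles a lot will legitimately show a large share at 0 km/h. The flagged value still needs an engineering review.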
Verifying data integrity is an ongoing activity that requires the close cooperation of the analyst and the engineer to determine a strategy for any analysis.
Prior to analyzing telematics data, the key steps are:
- Define a customer and business focus for the analysis.
- Segment the data to focus only on the vehicles and CAN channels required.
- Verify the data integrity.