Background

Predicting the charging energy of a power battery is the core problem in assessing battery degradation. Accurate predictions of charging energy support residual value evaluation, fault detection, charging planning, and similar tasks.

The charging energy of a power battery depends on several coupled factors, such as cumulative mileage and temperature. Competitors must design a model that predicts it. The task provides the charging state and charging energy of the first n charging processes, and participants are required to predict the charging energy of the (n+1)th charging process.

Data

Data Description

Various parameters of the charging process are provided, and participants need to identify any anomalies in the data.

Table 1: Training Sample Data Format and Description

| COLUMNS | TYPE | NOTES |
| --- | --- | --- |
| vehicle_id | STRING | Unique vehicle identification code |
| charge_start_time | INT | Charging start time |
| charge_end_time | INT | Charging end time |
| mileage | FLOAT | Vehicle odometer mileage at the start of charging (km) |
| charge_start_soc | INT | Battery SOC at the start of charging |
| charge_end_soc | INT | Battery SOC at the end of charging |
| charge_start_U | FLOAT | Total battery voltage at the start of charging (V) |
| charge_end_U | FLOAT | Total battery voltage at the end of charging (V) |
| charge_start_I | FLOAT | Total battery current at the start of charging (A) |
| charge_end_I | FLOAT | Total battery current at the end of charging (A) |
| charge_max_temp | FLOAT | Maximum temperature of the battery system during charging (°C) |
| charge_min_temp | FLOAT | Minimum temperature of the battery system during charging (°C) |
| charge_energy | FLOAT | Charging energy for this process (kWh) |
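
A few basic consistency checks on the Table 1 columns can be sketched as follows. The rules used here (end time after start time, SOC as a percentage between 0 and 100 that does not drop during charging, maximum temperature not below minimum temperature, positive charging energy) are illustrative assumptions, not the organisers' definition of an anomaly. The function name `find_anomalies` and the sample records are hypothetical.

```python
def find_anomalies(record):
    """Return the names of the sanity rules that a charging record violates.

    All rules are illustrative assumptions about what counts as an anomaly.
    """
    problems = []
    # Charging must end after it starts (works for any increasing time encoding).
    if record["charge_end_time"] <= record["charge_start_time"]:
        problems.append("non_positive_duration")
    # Assumes SOC is expressed as a percentage.
    for key in ("charge_start_soc", "charge_end_soc"):
        if not 0 <= record[key] <= 100:
            problems.append(key + "_out_of_range")
    # SOC should not fall during a charging process.
    if record["charge_end_soc"] < record["charge_start_soc"]:
        problems.append("soc_decreased")
    if record["charge_max_temp"] < record["charge_min_temp"]:
        problems.append("max_temp_below_min_temp")
    # charge_energy is absent from test samples, so only check it when present.
    energy = record.get("charge_energy")
    if energy is not None and energy <= 0:
        problems.append("non_positive_energy")
    return problems


# Made-up records for illustration only.
good = {
    "charge_start_time": 1000, "charge_end_time": 4600,
    "charge_start_soc": 30, "charge_end_soc": 80,
    "charge_max_temp": 28.0, "charge_min_temp": 22.0,
    "charge_energy": 25.0,
}
bad = dict(good, charge_end_soc=20, charge_max_temp=10.0)

print(find_anomalies(good))  # []
print(find_anomalies(bad))   # ['soc_decreased', 'max_temp_below_min_temp']
```

Flagged records could then be corrected or dropped, depending on how much training data remains.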

Table 2: Test Sample Data Format and Description

| COLUMNS | TYPE | NOTES |
| --- | --- | --- |
| vehicle_id | STRING | Unique vehicle identification code |
| charge_start_time | INT | Charging start time |
| charge_end_time | INT | Charging end time |
| mileage | FLOAT | Vehicle odometer mileage at the start of charging (km) |
| charge_start_soc | INT | Battery SOC at the start of charging |
| charge_end_soc | INT | Battery SOC at the end of charging |
| charge_start_U | FLOAT | Total battery voltage at the start of charging (V) |
| charge_end_U | FLOAT | Total battery voltage at the end of charging (V) |
| charge_start_I | FLOAT | Total battery current at the start of charging (A) |
| charge_end_I | FLOAT | Total battery current at the end of charging (A) |
| charge_max_temp | FLOAT | Maximum temperature of the battery system during charging (°C) |
| charge_min_temp | FLOAT | Minimum temperature of the battery system during charging (°C) |

Table 3: Submission Content Data Format and Description

| COLUMNS | TYPE | NOTES |
| --- | --- | --- |
| vehicle_id | STRING | Unique vehicle identification code |
| charge_energy | FLOAT | Charging energy for this process (kWh) |

Table 4: Driving Track Data Format and Description

| COLUMNS | TYPE | NOTES |
| --- | --- | --- |
| vehicle_id | STRING | Unique vehicle identification code |
| time | INT | Time |
| state | INT | Vehicle state (1 for start, 2 for stop, 3 for other) |
| GPS_lat | FLOAT | Latitude |
| GPS_lon | FLOAT | Longitude |

Table 5: Submission Content Data Format and Description

| COLUMNS | TYPE | NOTES |
| --- | --- | --- |
| vehicle_id | STRING | Unique vehicle identification code |
| track_mileage | FLOAT | Track mileage (km) |
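
The task description does not say how track mileage should be derived from the driving track data. One plausible approach, shown here purely as an assumption, is to sort a vehicle's GPS fixes by time and sum the great-circle (haversine) distances between consecutive points. The names `haversine_km` and `track_mileage_km` are hypothetical, and the sketch ignores the `state` column.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, an approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    # Clamp to 1.0 to guard against rounding slightly above the asin domain.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def track_mileage_km(points):
    """Sum segment lengths for one vehicle.

    points: iterable of (time, GPS_lat, GPS_lon) tuples in any order.
    """
    ordered = sorted(points)  # tuples sort by time first
    return sum(
        haversine_km(p[1], p[2], q[1], q[2])
        for p, q in zip(ordered, ordered[1:])
    )


# Made-up track along the equator, deliberately out of time order.
track = [(2, 0.0, 2.0), (0, 0.0, 0.0), (1, 0.0, 1.0)]
print(round(track_mileage_km(track), 1))  # about 222.4 km
```

Real tracks would likely need filtering of GPS jitter while the vehicle is stopped, which is where the `state` column could help.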

Evaluation Rule

### Scoring Formula

For both the charging energy prediction and the driving track mileage calculation, the scoring formula is:

$$e = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (r_i - a_i)^2}$$

Where:
– $e$ is the evaluation parameter; a smaller value indicates a result closer to the actual answer.
– $r_i$ is the calculated energy or track mileage for sample $i$.
– $a_i$ is the actual energy or mileage for sample $i$.
– $n$ is the total number of samples.

This formula calculates the root mean square error (RMSE) between the predicted values and the actual values.
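
As a minimal sketch, the score can be computed as follows. The function name `rmse` and the sample values are illustrative, not part of the competition tooling.

```python
import math


def rmse(predicted, actual):
    """Root mean square error between two equal-length, non-empty sequences."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(r - a) ** 2 for r, a in zip(predicted, actual)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Made-up values: the errors are -2, 0 and 4, so the score is sqrt(20 / 3).
print(rmse([10.0, 20.0, 30.0], [12.0, 20.0, 26.0]))
```

Because the errors are squared before averaging, a single badly mispredicted sample raises the score more than several small misses.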

Solution

Data Analysis and Cleaning
  • Data analysis of the charging energy distribution in the training set shows that it is log-normal, which suggests that the factors affecting charging energy combine multiplicatively.
  • Data cleaning uses missing value detection and anomaly correction, with random forests and gradient boosting trees used to correct anomalies and fill in missing values.
  • The model's features are divided into basic, representative, and time feature groups, which cover original features, difference features, ratio features, memory features, and time dimension features.
  • Regularized regression methods (such as Ridge Regression, Lasso Regression, and ElasticNet Regression) are used to address multicollinearity and feature selection issues.
  • Techniques such as K-fold cross-validation, Support Vector Regression (SVR), Gradient Boosting, and XGBoost are employed to enhance the robustness and accuracy of the model.
  • The document describes the applicable scenarios and advantages and disadvantages of different algorithms in detail.
  • Emphasis is placed on the portability of the model, making it suitable for edge computing scenarios and cloud computing environments. The model design focuses on simplification and efficiency to adapt to different data distributions and computing resources.
  • Practical applications in engineering are mentioned, such as using the Internet of Vehicles for big data aggregation and processing.
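
Several of the ideas above can be illustrated together in a small sketch: a difference feature (SOC gained during charging), a log transform that turns a multiplicative relationship into an additive one, an L2-penalised (ridge) fit, and K-fold cross-validation scored by RMSE. This is a deliberately simplified single-feature version in plain Python, not the team's actual pipeline. All function names and the data are hypothetical.

```python
import math


def fit_log_ridge(xs, ys, alpha=1.0):
    """Fit ln(y) = b0 + b1 * ln(x), with an L2 penalty alpha on b1 only."""
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    mx, my = sum(lx) / len(lx), sum(ly) / len(ly)
    sxy = sum((a - mx) * (b - my) for a, b in zip(lx, ly))
    sxx = sum((a - mx) ** 2 for a in lx)
    b1 = sxy / (sxx + alpha)  # closed-form ridge slope for one feature
    return my - b1 * mx, b1


def predict(model, x):
    """Map a log-space prediction back to the original units (kWh)."""
    b0, b1 = model
    return math.exp(b0 + b1 * math.log(x))


def kfold_rmse(xs, ys, k=3, alpha=1.0):
    """Average held-out RMSE over k interleaved folds."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        held_out = [i for i in range(len(xs)) if i % k == fold]
        model = fit_log_ridge(
            [xs[i] for i in train], [ys[i] for i in train], alpha
        )
        sq = [(predict(model, xs[i]) - ys[i]) ** 2 for i in held_out]
        scores.append(math.sqrt(sum(sq) / len(sq)))
    return sum(scores) / k


# Made-up records: (charge_start_soc, charge_end_soc, charge_energy).
records = [(20, 30, 5.0), (10, 30, 10.0), (40, 70, 15.0),
           (50, 90, 20.0), (30, 80, 25.0), (20, 80, 30.0)]
soc_delta = [end - start for start, end, _ in records]  # difference feature
energy = [e for _, _, e in records]

# Compare penalty strengths by cross-validated RMSE.
for alpha in (0.0, 0.1, 1.0):
    print(alpha, kfold_rmse(soc_delta, energy, k=3, alpha=alpha))
```

In this toy data the energy is exactly proportional to the SOC gain, so the unpenalised fit is essentially perfect and any penalty only adds bias. On noisy, correlated real features the trade-off can go the other way, which is what cross-validation is meant to reveal.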

 

Results

2nd place
Through detailed data analysis and cleaning, careful model design and algorithm selection, and portability optimization for practical application scenarios, the team demonstrated a deep understanding of big data analysis and machine learning applications.
