Started my Project on Energy-Efficient DevOps: Auto-Suspending Idle AWS Resources using ML
Week 2 – ML Modeling Phase (Log Extraction & Model Training)
Objective:
To extract the logged CPU and workload data from the EC2 instance, preprocess it, label idle/active states, and train a machine learning model to detect idle periods.
What I Did Today:
Stopped Data Collection
After 3 full days of continuous logging using cron, I safely downloaded the log files from EC2 to my local machine for analysis:
cpu_log.csv
workload_log.txt
Used SCP or manual download via EC2 session to transfer the files.
Preprocessed the Data
Parsed the CPU logs and converted timestamps. Added engineered features:
Rolling average of CPU %
Hour of day
Minute
Is weekend/weekday
Labeled each row as Idle (1) or Active (0) based on a CPU usage threshold (e.g., 20%).
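A minimal sketch of the feature engineering and labeling steps above, not the exact notebook code. The column names timestamp and cpu_percent, the rolling window size, and the helper name add_features are assumptions made for illustration; the 20% threshold is the example value mentioned above.

```python
import pandas as pd

IDLE_THRESHOLD = 20.0  # percent; example threshold, tune for your workload


def add_features(df: pd.DataFrame, window: int = 5) -> pd.DataFrame:
    """Add time-based features and an idle label to a CPU log frame.

    Assumes columns named 'timestamp' and 'cpu_percent' (hypothetical names).
    """
    out = df.copy()
    out["timestamp"] = pd.to_datetime(out["timestamp"])
    out = out.sort_values("timestamp").reset_index(drop=True)
    # Rolling mean over the last `window` samples; min_periods=1 avoids NaN at the start.
    out["cpu_rolling_mean"] = out["cpu_percent"].rolling(window, min_periods=1).mean()
    out["hour"] = out["timestamp"].dt.hour
    out["minute"] = out["timestamp"].dt.minute
    # dayofweek: Monday=0 ... Sunday=6, so 5 and 6 are the weekend.
    out["is_weekend"] = (out["timestamp"].dt.dayofweek >= 5).astype(int)
    # Idle (1) when CPU is below the threshold, otherwise Active (0).
    out["idle"] = (out["cpu_percent"] < IDLE_THRESHOLD).astype(int)
    return out


# Tiny made-up sample, only to show the shape of the output.
sample = pd.DataFrame({
    "timestamp": ["2024-01-06 09:00", "2024-01-06 09:01", "2024-01-08 14:30"],
    "cpu_percent": [5.0, 45.0, 12.5],
})
features = add_features(sample, window=2)
print(features)
```

With real data, the frame passed to add_features would come from reading cpu_log.csv instead of the hand-written sample.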
Trained an ML Model
Used a Decision Tree Classifier from scikit-learn. Steps included:
Feature scaling (optional; decision trees do not require it)
Train-test split (80/20)
Model fitting
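The training steps could look roughly like this. It is a sketch under assumptions: the data is randomly generated stand-in data (not my real logs), the feature column names are hypothetical, and max_depth=4 is an arbitrary choice to keep the tree small.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the engineered dataset (not real log data).
rng = np.random.default_rng(42)
n = 500
data = pd.DataFrame({
    "cpu_rolling_mean": rng.uniform(0, 100, n),
    "hour": rng.integers(0, 24, n),
    "minute": rng.integers(0, 60, n),
    "is_weekend": rng.integers(0, 2, n),
})
# Idle (1) below the example 20% threshold, otherwise Active (0).
data["idle"] = (data["cpu_rolling_mean"] < 20).astype(int)

X = data[["cpu_rolling_mean", "hour", "minute", "is_weekend"]]
y = data["idle"]

# 80/20 split; stratify keeps the idle/active ratio similar in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# max_depth is an arbitrary choice here to keep the tree small.
model = DecisionTreeClassifier(max_depth=4, random_state=42)
model.fit(X_train, y_train)
print("test accuracy:", model.score(X_test, y_test))
```

No scaler appears in the sketch because tree splits compare one feature against a threshold at a time, so rescaling the features does not change which splits are available.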
Evaluated Performance
Used:
Confusion Matrix
Accuracy Score
Classification Report
Also visualized feature importance to understand what influences predictions most.
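Here is a hedged sketch of the evaluation step using the three scikit-learn metrics listed above, plus the feature importances. The data is synthetic and the column names are assumptions, so the printed numbers say nothing about my actual results.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (not real log data) with hypothetical column names.
rng = np.random.default_rng(0)
n = 400
X = pd.DataFrame({
    "cpu_rolling_mean": rng.uniform(0, 100, n),
    "hour": rng.integers(0, 24, n),
    "is_weekend": rng.integers(0, 2, n),
})
y = (X["cpu_rolling_mean"] < 20).astype(int)  # Idle = 1, Active = 0

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0, stratify=y
)
model = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_train, y_train)
y_pred = model.predict(X_test)

# Rows are true classes, columns are predicted classes, in the order given by `labels`.
cm = confusion_matrix(y_test, y_pred, labels=[0, 1])
print(cm)
print("accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred, labels=[0, 1], target_names=["Active", "Idle"]))

# Impurity-based importances, one value per feature, summing to 1 for a fitted tree with splits.
importances = pd.Series(model.feature_importances_, index=X.columns).sort_values(ascending=False)
print(importances)
```

The sorted importances Series can be passed straight to a bar plot. Because the label here is derived from CPU usage, the CPU-based feature is expected to dominate.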
Google Colab Notebook: ML Summary
Load CPU Log
→ Loaded the cpu_log.csv file using pandas and parsed the timestamp column for time-based analysis.
Visualize CPU Usage
→ Created line plots of CPU usage over time to detect workload and idle patterns visually.
Label Idle States
→ Labeled each row as Idle (1) or Active (0) based on whether CPU usage was below a threshold (e.g., 20%).
Feature Engineering
→ Added features like rolling averages, hour, and minute to enrich the dataset for ML training.
Train ML Model
→ Used a Decision Tree Classifier from scikit-learn to train on the labeled and feature-enhanced dataset.
Evaluate Model
→ Assessed model performance using accuracy, confusion matrix, and classification report.
Feature Importance
→ Visualized the most influential features in the prediction using bar plots.
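The first two steps of this summary (loading the CSV and plotting CPU usage over time) could be sketched as follows. The rows are made up and the column names timestamp and cpu_percent are assumptions; the real file's layout may differ.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-interactive backend; drop this line in Colab to see the plot inline
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows standing in for cpu_log.csv; the real file's columns may differ.
csv_text = """timestamp,cpu_percent
2024-01-06 09:00:00,4.2
2024-01-06 09:01:00,51.0
2024-01-06 09:02:00,12.7
"""

# With a real file, pass its path instead of the StringIO object.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["timestamp"])

fig, ax = plt.subplots(figsize=(8, 3))
ax.plot(df["timestamp"], df["cpu_percent"], label="CPU %")
# Dashed line at the example idle threshold, to make idle stretches easy to spot.
ax.axhline(20, linestyle="--", color="gray", label="idle threshold (20%)")
ax.set_xlabel("time")
ax.set_ylabel("CPU usage (%)")
ax.legend()
fig.tight_layout()
fig.savefig("cpu_usage.png")
```

Drawing the threshold on the same axes makes it easier to judge by eye whether 20% separates idle and busy periods sensibly before committing to it as the label rule.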
Problems Faced:
SSH connection to EC2 timed out temporarily. Two things fixed it: (1) verifying that the public IP address in the SSH command I ran from PowerShell matched the one shown for the instance in the AWS console, and (2) configuring the instance's security group inbound rules to allow SSH (port 22) from my IP.
Crontab logs needed manual checking to verify that logging was consistent.
Had to clean missing/garbled lines in cpu_log.csv before modeling.
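The last two problems (checking that cron logged consistently and cleaning garbled rows) can both be handled in pandas. This is a sketch under assumptions: the rows are made up, the column names are hypothetical, and the one-sample-per-minute cron interval is a guess for the example rather than my actual schedule.

```python
import io

import pandas as pd

# Made-up log with one garbled value (09:01) and one missing minute (09:03).
raw = """timestamp,cpu_percent
2024-01-06 09:00:00,4.2
2024-01-06 09:01:00,??
2024-01-06 09:02:00,12.7
2024-01-06 09:04:00,30.1
"""

df = pd.read_csv(io.StringIO(raw))
# Unparseable values become NaT/NaN instead of raising, so they can be dropped later.
df["timestamp"] = pd.to_datetime(df["timestamp"], errors="coerce")
df["cpu_percent"] = pd.to_numeric(df["cpu_percent"], errors="coerce")
df = df.dropna(subset=["timestamp"]).sort_values("timestamp").reset_index(drop=True)

# Assumed cron interval: one sample per minute. A larger gap between
# consecutive timestamps means at least one run did not write a row.
expected = pd.Timedelta(minutes=1)
gaps = df["timestamp"].diff()
missed = df.loc[gaps > expected, "timestamp"]

# Rows whose CPU value could not be parsed are removed before modeling.
clean = df.dropna(subset=["cpu_percent"])
print(f"dropped {len(df) - len(clean)} garbled rows; gaps end at: {list(missed)}")
```

The gap check runs before the garbled rows are dropped, so a row with a bad CPU value still counts as proof that cron ran at that minute.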
Tags:
#DevOps #MachineLearning #AWS #Python #Colab #CloudOptimization #cron #BuildInPublic
Want to Follow Along?
I’ll be sharing weekly progress — issues, logs, architecture, and ML models.
If you've solved similar problems (like automated cloud optimization), I’d love to hear your insight.