Trong thế giới business ngày nay, một sự thật khó chối cãi: Chi phí acquire customer mới cao gấp 5-7 lần so với retain customer hiện tại. Tuy nhiên, theo khảo sát của Carptech với 80+ doanh nghiệp tại Việt Nam, 68% companies không có hệ thống dự đoán churn - họ chỉ biết khách hàng đã rời đi KHI ĐÃ RỜI ĐI, không có cơ hội can thiệp.
Customer Churn Prediction - dự đoán khách hàng nào sắp rời bỏ - là một trong những ML use cases có ROI rõ ràng nhất và fastest time-to-value (3-6 tháng). Một churn prediction model thành công có thể giảm churn rate 15-30%, tương đương hàng tỷ đồng revenue saved mỗi năm.
Bài viết này là hướng dẫn end-to-end, từ A đến Z, để bạn tự build một churn prediction system: business problem definition, data preparation, feature engineering, model training (với code examples đầy đủ), deployment, và monitoring. Kèm case study thực tế về SaaS company giảm churn từ 8% xuống 5.5% trong 6 tháng.
TL;DR - Key Takeaways
- Churn definition varies by industry: E-commerce (90 days no purchase), SaaS (cancelled subscription), Bank (closed account)
- Feature engineering is 80% of success: RFM, engagement, support tickets, behavioral changes - 20-30 features typical
- Algorithm choice: XGBoost, Random Forest work best for churn (AUC 0.75-0.85 achievable)
- Evaluation metric: Precision/Recall trade-off more important than accuracy (imbalanced classes)
- Deployment: Weekly batch scoring → CRM integration → Automated retention campaigns
- ROI: 20-30% churn reduction typical, 12-24 months to full value
Why Customer Churn Matters: The Business Case
The Economics of Churn
Cost comparison:
Customer Acquisition Cost (CAC): 1,500,000 VND
- Marketing spend: 1,000,000đ
- Sales effort: 300,000đ
- Onboarding: 200,000đ
Customer Retention Cost: 300,000 VND
- Win-back campaigns: 150,000đ
- Customer success: 100,000đ
- Loyalty rewards: 50,000đ
Savings: 1,500,000 - 300,000 = 1,200,000đ per customer (5x cheaper)
Lifetime Value (LTV) impact:
Scenario A: High churn (8% monthly)
- Average customer lifetime: 12.5 months
- LTV (at 500k VND/month ARPU): 6.25M VND
Scenario B: Low churn (5% monthly)
- Average customer lifetime: 20 months
- LTV: 10M VND
Difference: 3.75M VND per customer (+60%!)
Company-level impact:
- SaaS company với 10,000 customers, 8% monthly churn
- Churned customers/month: 800
- Annual churned customers: 9,600
- If reduce churn to 5%: Save 3,600 customers/year
- Revenue saved (at 10M LTV): 36B VND/year
Industry Benchmarks: What's "Good" Churn?
| Industry | Monthly Churn | Annual Churn | Notes |
|---|---|---|---|
| SaaS (B2B) | 3-7% | 32-58% | Enterprise SaaS lower (1-2%), SMB higher (5-10%) |
| SaaS (B2C) | 5-10% | 46-72% | Consumer apps churn higher |
| E-commerce | 15-25% | 84-94% | Based on 90-day definition |
| Telecom | 1-3% | 12-31% | Contractual, high switching costs |
| Banking | 10-20% | 72-89% | Account dormancy |
| Subscription boxes | 8-15% | 62-82% | High initial churn (3 months) |
Vietnam specifics (Carptech data):
- Vietnamese SaaS: 6-9% monthly (higher than global due to price sensitivity)
- E-commerce: 20-30% (90-day definition, competitive market)
- Fintech: 8-12% (regulatory trust issues)
Target: Reduce churn by 20-30% (e.g., 8% → 5.5-6%)
Step 1: Define Churn - Không dễ như nghĩ
What is "Churn"?
Contractual churn (explicit):
- SaaS: Subscription cancelled (clear event, easy to identify)
- Telecom: Contract terminated
- Gym membership: Membership not renewed
Non-contractual churn (implicit):
- E-commerce: No purchase in X days (how many days = churn?)
- Mobile app: No login in X days
- Bank: Account dormant (no transactions)
Challenge with non-contractual: You have to DEFINE churn threshold
Churn Definition Examples by Industry
E-commerce:
# Option 1: Fixed time window
churned = (days_since_last_purchase > 90)
# Option 2: Relative to historical behavior
avg_purchase_interval = df.groupby('customer_id')['days_between_purchases'].mean()
churned = (days_since_last_purchase > avg_purchase_interval * 2)
# Option 3: Percentile-based
p90_days = df['days_since_last_purchase'].quantile(0.9)
churned = (days_since_last_purchase > p90_days)
Recommendation: Start with simple (90 days), refine later based on business context
SaaS:
# Clear for paid subscriptions
churned = (subscription_status == 'cancelled')
# For freemium, define "active user"
active_user = (logins_30d >= 4) & (features_used_30d >= 2)
churned = ~active_user
Mobile app:
# Based on login frequency
churned = (days_since_last_login > 30)
# Or engagement threshold
monthly_active_users = (logins_30d > 0)
churned = ~monthly_active_users
Prediction Window
Not just "has customer churned?" but "will customer churn in next X days?"
Typical windows:
- 30 days: Fast-moving (e-commerce, apps)
- 60-90 days: Medium-term (SaaS)
- 180+ days: Long-term (banking, insurance)
Why this matters for ML:
- 30-day window: Need frequent predictions (weekly), faster action
- 90-day window: Monthly predictions OK, more planning time
Example label creation (SaaS, 60-day window):
from datetime import datetime, timedelta
def create_churn_labels(df, prediction_date, churn_window_days=60):
"""
Label customers as churned if they cancel within churn_window_days of prediction_date
"""
future_date = prediction_date + timedelta(days=churn_window_days)
labels = []
for _, customer in df.iterrows():
# Did customer churn between prediction_date and future_date?
if customer['subscription_status'] == 'active':
# Check future cancellations
cancellations = df[
(df['customer_id'] == customer['customer_id']) &
(df['cancellation_date'] >= prediction_date) &
(df['cancellation_date'] <= future_date)
]
churned = (len(cancellations) > 0)
else:
churned = False # Already churned
labels.append({
'customer_id': customer['customer_id'],
'prediction_date': prediction_date,
'churned_next_60d': churned
})
return pd.DataFrame(labels)
Step 2: Feature Engineering - 80% của Success
ML rule: "Garbage in, garbage out" Corollary: "Great features in, great model out"
Feature Categories
1. RFM - Recency, Frequency, Monetary (Most predictive):
# Recency
features['days_since_last_purchase'] = (
prediction_date - df.groupby('customer_id')['purchase_date'].max()
).dt.days
# Frequency
features['total_purchases_all_time'] = df.groupby('customer_id')['order_id'].count()
features['purchases_last_30d'] = df[df['purchase_date'] >= prediction_date - timedelta(30)] \
.groupby('customer_id')['order_id'].count()
features['purchases_last_90d'] = df[df['purchase_date'] >= prediction_date - timedelta(90)] \
.groupby('customer_id')['order_id'].count()
# Monetary
features['total_spend_all_time'] = df.groupby('customer_id')['amount'].sum()
features['total_spend_last_90d'] = df[df['purchase_date'] >= prediction_date - timedelta(90)] \
.groupby('customer_id')['amount'].sum()
features['avg_order_value'] = df.groupby('customer_id')['amount'].mean()
2. Engagement features (Product usage):
# For SaaS/apps
features['logins_30d'] = login_df.groupby('customer_id')['login_date'].count()
features['days_active_30d'] = login_df.groupby('customer_id')['login_date'].nunique()
features['features_used_30d'] = feature_usage_df.groupby('customer_id')['feature_name'].nunique()
# Feature depth (power users)
features['advanced_features_used'] = feature_usage_df[
feature_usage_df['feature_tier'] == 'advanced'
].groupby('customer_id')['feature_name'].nunique()
# For e-commerce
features['website_visits_30d'] = web_analytics.groupby('customer_id')['session_id'].nunique()
features['product_views_30d'] = web_analytics.groupby('customer_id')['product_id'].nunique()
features['add_to_cart_30d'] = web_analytics[
web_analytics['event'] == 'add_to_cart'
].groupby('customer_id')['event'].count()
3. Customer service features (Warning signals):
# Support tickets
features['support_tickets_90d'] = support_tickets.groupby('customer_id')['ticket_id'].count()
features['complaints_90d'] = support_tickets[
support_tickets['type'] == 'complaint'
].groupby('customer_id')['ticket_id'].count()
# Resolution metrics
features['avg_resolution_time'] = support_tickets.groupby('customer_id')['resolution_hours'].mean()
features['unresolved_tickets'] = support_tickets[
support_tickets['status'] == 'open'
].groupby('customer_id')['ticket_id'].count()
4. Behavioral change features (Trends are powerful):
# Change in frequency
features['purchases_30d_vs_prior_30d'] = (
purchases_last_30d - purchases_prior_30d
)
features['purchases_trend'] = features['purchases_30d_vs_prior_30d'] / (purchases_prior_30d + 1) # Avoid /0
# Change in monetary
features['spend_30d_vs_prior_30d'] = (
spend_last_30d - spend_prior_30d
)
# Change in engagement
features['logins_30d_vs_prior_30d'] = (
logins_last_30d - logins_prior_30d
)
features['login_trend'] = features['logins_30d_vs_prior_30d'] / (logins_prior_30d + 1)
Why trends matter: A customer with 10 purchases/month is good. But if they had 20 purchases last month and dropped to 10 this month → warning signal (even though absolute number still looks OK).
5. Customer attributes:
# Demographics
features['customer_age_days'] = (prediction_date - df['signup_date']).dt.days
features['acquisition_channel'] = df['acquisition_channel'] # Categorical
features['customer_segment'] = df['segment'] # VIP, Regular, etc.
# Plan/pricing
features['plan_type'] = df['plan_type'] # Free, Starter, Pro, Enterprise
features['mrr'] = df['monthly_recurring_revenue'] # For SaaS
features['is_on_discount'] = df['discount_percent'] > 0
6. Product/category preferences (E-commerce specific):
# Category diversity
features['categories_purchased'] = purchase_df.groupby('customer_id')['category'].nunique()
# Favorite category
features['top_category'] = purchase_df.groupby('customer_id')['category'].agg(
lambda x: x.value_counts().index[0]
)
# Brand loyalty
features['brands_purchased'] = purchase_df.groupby('customer_id')['brand'].nunique()
features['repeat_brand_rate'] = (
purchase_df[purchase_df.duplicated(['customer_id', 'brand'], keep=False)]
.groupby('customer_id')['order_id'].count() / features['total_purchases_all_time']
)
Example: Complete Feature Set (30 features)
import pandas as pd
import numpy as np
from datetime import datetime, timedelta
def engineer_churn_features(customer_id, prediction_date, transactions_df, logins_df, support_df):
"""
Create features for a single customer at a specific prediction_date
"""
features = {'customer_id': customer_id, 'prediction_date': prediction_date}
# Filter data up to prediction_date only (no data leakage!)
hist_transactions = transactions_df[
(transactions_df['customer_id'] == customer_id) &
(transactions_df['purchase_date'] < prediction_date)
]
hist_logins = logins_df[
(logins_df['customer_id'] == customer_id) &
(logins_df['login_date'] < prediction_date)
]
# RFM
if len(hist_transactions) > 0:
features['days_since_last_purchase'] = (
prediction_date - hist_transactions['purchase_date'].max()
).days
features['total_purchases'] = len(hist_transactions)
features['total_spend'] = hist_transactions['amount'].sum()
features['avg_order_value'] = hist_transactions['amount'].mean()
# Last 30 days
last_30d = hist_transactions[
hist_transactions['purchase_date'] >= prediction_date - timedelta(30)
]
features['purchases_30d'] = len(last_30d)
features['spend_30d'] = last_30d['amount'].sum() if len(last_30d) > 0 else 0
# Prior 30 days (for trend)
prior_30d = hist_transactions[
(hist_transactions['purchase_date'] >= prediction_date - timedelta(60)) &
(hist_transactions['purchase_date'] < prediction_date - timedelta(30))
]
features['purchases_prior_30d'] = len(prior_30d)
features['spend_prior_30d'] = prior_30d['amount'].sum() if len(prior_30d) > 0 else 0
# Trend
features['purchase_trend'] = (
features['purchases_30d'] - features['purchases_prior_30d']
) / (features['purchases_prior_30d'] + 1)
else:
# No purchase history → Fill with defaults
features.update({
'days_since_last_purchase': 9999,
'total_purchases': 0,
'total_spend': 0,
'avg_order_value': 0,
'purchases_30d': 0,
'spend_30d': 0,
'purchase_trend': 0
})
# Engagement
if len(hist_logins) > 0:
features['days_since_last_login'] = (
prediction_date - hist_logins['login_date'].max()
).days
features['logins_30d'] = len(hist_logins[
hist_logins['login_date'] >= prediction_date - timedelta(30)
])
else:
features['days_since_last_login'] = 9999
features['logins_30d'] = 0
# Support tickets (warning signal)
hist_support = support_df[
(support_df['customer_id'] == customer_id) &
(support_df['created_date'] < prediction_date)
]
features['support_tickets_90d'] = len(hist_support[
hist_support['created_date'] >= prediction_date - timedelta(90)
])
return features
# Apply to all customers
all_features = []
for customer_id in customer_ids:
features = engineer_churn_features(
customer_id, prediction_date, transactions_df, logins_df, support_df
)
all_features.append(features)
features_df = pd.DataFrame(all_features)
Common Feature Engineering Mistakes
1. Data leakage (most dangerous):
# WRONG: Using future information
features['days_until_churn'] = (churn_date - prediction_date).days # THIS IS THE TARGET!
# WRONG: Including post-prediction data
features['purchases_next_30d'] = ... # Future data
# RIGHT: Only use data BEFORE prediction_date
features['purchases_last_30d'] = hist_data[
hist_data['date'] < prediction_date
].count()
2. Not handling missing values:
# Many customers have no support tickets → NaN
# Model will fail or treat NaN strangely
# WRONG: Drop rows with NaN
df = df.dropna() # Lose customers
# RIGHT: Fill with meaningful defaults
features['support_tickets_90d'] = features['support_tickets_90d'].fillna(0)
features['days_since_last_purchase'] = features['days_since_last_purchase'].fillna(9999) # "Never"
3. Ignoring feature scaling (for some algorithms):
# Features have different scales
# days_since_last_purchase: 0-1000
# total_purchases: 1-500
# total_spend: 100,000-50,000,000
# For tree-based models (XGBoost, Random Forest): Scaling not needed
# For linear models (Logistic Regression): Need scaling
from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)
Step 3: Model Training - Algorithms & Evaluation
Dataset Preparation
import pandas as pd
from sklearn.model_selection import train_test_split
# Load features & labels
features_df = pd.read_csv('churn_features.csv')
labels_df = pd.read_csv('churn_labels.csv')
# Merge
df = features_df.merge(labels_df, on=['customer_id', 'prediction_date'])
# Separate features (X) and target (y)
feature_cols = [
'days_since_last_purchase', 'total_purchases', 'total_spend', 'avg_order_value',
'purchases_30d', 'spend_30d', 'purchase_trend',
'days_since_last_login', 'logins_30d',
'support_tickets_90d',
# ... add all 30 features
]
X = df[feature_cols]
y = df['churned_next_60d']
# Train/test split (temporal split preferred)
# Use older data for training, recent for testing
split_date = '2024-10-01'
train_mask = df['prediction_date'] < split_date
test_mask = df['prediction_date'] >= split_date
X_train, X_test = X[train_mask], X[test_mask]
y_train, y_test = y[train_mask], y[test_mask]
print(f"Train size: {len(X_train)}, Test size: {len(X_test)}")
print(f"Train churn rate: {y_train.mean():.2%}")
print(f"Test churn rate: {y_test.mean():.2%}")
Algorithm Comparison
Try multiple algorithms, compare results:
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score, classification_report
models = {
'Logistic Regression': LogisticRegression(max_iter=1000, random_state=42),
'Random Forest': RandomForestClassifier(n_estimators=200, max_depth=10, random_state=42),
'XGBoost': XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1, random_state=42)
}
results = {}
for name, model in models.items():
# Train
model.fit(X_train, y_train)
# Predict probabilities
y_pred_proba = model.predict_proba(X_test)[:, 1]
# Evaluate
auc = roc_auc_score(y_test, y_pred_proba)
results[name] = {'model': model, 'auc': auc, 'predictions': y_pred_proba}
print(f"\n{name}:")
print(f" AUC: {auc:.4f}")
# Select best model
best_model_name = max(results, key=lambda x: results[x]['auc'])
best_model = results[best_model_name]['model']
print(f"\nBest model: {best_model_name} (AUC: {results[best_model_name]['auc']:.4f})")
Typical results:
- Logistic Regression: AUC 0.72-0.76 (good baseline, interpretable)
- Random Forest: AUC 0.76-0.80 (better performance)
- XGBoost: AUC 0.78-0.85 (usually wins, our recommendation)
Hyperparameter Tuning (XGBoost)
from sklearn.model_selection import GridSearchCV
# Parameter grid
param_grid = {
'n_estimators': [100, 200, 300],
'max_depth': [4, 6, 8],
'learning_rate': [0.01, 0.05, 0.1],
'subsample': [0.8, 1.0],
'colsample_bytree': [0.8, 1.0]
}
# Grid search with cross-validation
xgb = XGBClassifier(random_state=42)
grid_search = GridSearchCV(
xgb,
param_grid,
cv=5,
scoring='roc_auc',
n_jobs=-1,
verbose=1
)
grid_search.fit(X_train, y_train)
print(f"Best parameters: {grid_search.best_params_}")
print(f"Best cross-validation AUC: {grid_search.best_score_:.4f}")
# Final model with best parameters
best_xgb = grid_search.best_estimator_
Warning: Grid search is computationally expensive (hours for large datasets). Start with default params, only tune if needed for marginal gains.
Evaluation Metrics: Beyond Accuracy
Imbalanced classes problem:
- Churn rate = 8% → 92% non-churn
- Naive model: Predict everyone as "not churned" → 92% accuracy!
- But this model is useless (doesn't catch any churners)
Better metrics:
1. ROC-AUC (Area Under ROC Curve):
from sklearn.metrics import roc_auc_score, roc_curve
import matplotlib.pyplot as plt
auc = roc_auc_score(y_test, y_pred_proba)
print(f"AUC: {auc:.4f}")
# Plot ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot(fpr, tpr, label=f'AUC = {auc:.4f}')
plt.plot([0, 1], [0, 1], 'k--', label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
- AUC = 0.5: Random guessing
- AUC = 0.7-0.8: Good model
- AUC = 0.8-0.9: Excellent model
- AUC > 0.9: Suspiciously good (check for data leakage!)
2. Precision & Recall (Confusion Matrix):
from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix
# Choose threshold (default 0.5 often not optimal)
threshold = 0.3 # Lower threshold = catch more churners, but more false alarms
y_pred = (y_pred_proba >= threshold).astype(int)
# Metrics
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
print(f"Precision: {precision:.2%}") # Of predicted churners, how many actually churned?
print(f"Recall: {recall:.2%}") # Of actual churners, how many did we catch?
print(f"F1-score: {f1:.4f}")
# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)
Example output:
Confusion Matrix:
Predicted Non-Churn Predicted Churn
Actual Non-Churn 8,500 (TN) 450 (FP)
Actual Churn 200 (FN) 850 (TP)
Precision: 65% (850 / (850+450)) → Of customers we flagged, 65% actually churned
Recall: 81% (850 / (850+200)) → We caught 81% of churners
Trade-off:
- High precision (low FP): Fewer false alarms, but miss some churners
- High recall (low FN): Catch most churners, but many false alarms
Business decision:
- High cost of false positives (expensive retention campaigns): Prefer high precision
- High cost of losing customers (high LTV): Prefer high recall
3. Precision-Recall curve (find optimal threshold):
from sklearn.metrics import precision_recall_curve
precisions, recalls, thresholds = precision_recall_curve(y_test, y_pred_proba)
# Plot
plt.plot(thresholds, precisions[:-1], label='Precision')
plt.plot(thresholds, recalls[:-1], label='Recall')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.title('Precision-Recall vs Threshold')
plt.legend()
plt.grid()
plt.show()
# Find threshold where precision = 0.7 (business requirement)
idx = (precisions >= 0.7).argmax()
optimal_threshold = thresholds[idx]
print(f"Threshold for 70% precision: {optimal_threshold:.4f}")
print(f"Recall at this threshold: {recalls[idx]:.2%}")
Feature Importance: What Drives Churn?
# XGBoost feature importance
feature_importance = pd.DataFrame({
'feature': feature_cols,
'importance': best_xgb.feature_importances_
}).sort_values('importance', ascending=False)
print(feature_importance.head(10))
# Plot
import matplotlib.pyplot as plt
feature_importance.head(15).plot(x='feature', y='importance', kind='barh', figsize=(10, 8))
plt.xlabel('Importance')
plt.title('Top 15 Features for Churn Prediction')
plt.gca().invert_yaxis()
plt.show()
Typical top features:
- days_since_last_purchase (20-25% importance)
- purchase_trend (15-18%)
- total_purchases (10-12%)
- logins_30d (8-10%)
- support_tickets_90d (6-8%)
Insights:
- Behavioral changes (trends) more predictive than absolute values
- Engagement metrics (logins, feature usage) strong signals
- Support tickets = red flag
Step 4: Deployment - From Model to Action
Batch Scoring (Weekly Schedule)
import schedule
import time
def score_customers():
"""
Score all active customers for churn risk
Run weekly (every Monday 6 AM)
"""
print(f"[{datetime.now()}] Starting weekly churn scoring...")
# 1. Get active customers
active_customers = get_active_customers() # Query from database
print(f" Found {len(active_customers)} active customers")
# 2. Engineer features
prediction_date = datetime.now().date()
features = []
for customer_id in active_customers:
feat = engineer_churn_features(customer_id, prediction_date, ...)
features.append(feat)
features_df = pd.DataFrame(features)
X = features_df[feature_cols]
# 3. Score with model
churn_probabilities = best_xgb.predict_proba(X)[:, 1]
# 4. Create results table
results = pd.DataFrame({
'customer_id': features_df['customer_id'],
'prediction_date': prediction_date,
'churn_probability': churn_probabilities,
'risk_tier': pd.cut(churn_probabilities, bins=[0, 0.3, 0.7, 1.0], labels=['Low', 'Medium', 'High'])
})
# 5. Save to database
save_to_database(results, table='churn_predictions')
print(f" Scored {len(results)} customers, saved to database")
# 6. Trigger retention campaigns for high-risk
high_risk = results[results['risk_tier'] == 'High']
trigger_retention_campaigns(high_risk)
print(f" Triggered retention campaigns for {len(high_risk)} high-risk customers")
print("[Completed] Churn scoring finished\n")
# Schedule weekly (every Monday at 6 AM)
schedule.every().monday.at("06:00").do(score_customers)
# Keep running
while True:
schedule.run_pending()
time.sleep(3600) # Check every hour
Integration with CRM / Marketing Automation
Option 1: Database integration:
-- Analysts query churn scores directly
SELECT
c.customer_id,
c.email,
c.name,
cp.churn_probability,
cp.risk_tier,
cp.prediction_date
FROM customers c
JOIN churn_predictions cp ON c.customer_id = cp.customer_id
WHERE cp.prediction_date = CURRENT_DATE
AND cp.risk_tier = 'High'
ORDER BY cp.churn_probability DESC
LIMIT 100;
Option 2: API endpoint (for real-time scoring):
from flask import Flask, request, jsonify
import joblib
app = Flask(__name__)
# Load model on startup
model = joblib.load('churn_model.pkl')
@app.route('/predict_churn', methods=['POST'])
def predict_churn():
"""
API endpoint to score a single customer
"""
data = request.json
customer_id = data['customer_id']
# Engineer features
features = engineer_churn_features(customer_id, datetime.now().date(), ...)
X = pd.DataFrame([features])[feature_cols]
# Predict
churn_prob = model.predict_proba(X)[0, 1]
return jsonify({
'customer_id': customer_id,
'churn_probability': float(churn_prob),
'risk_tier': 'High' if churn_prob > 0.7 else ('Medium' if churn_prob > 0.3 else 'Low')
})
if __name__ == '__main__':
app.run(host='0.0.0.0', port=5000)
Option 3: Reverse ETL (Hightouch, Census):
- Churn scores in data warehouse → Sync to CRM (Salesforce, HubSpot)
- Sales reps see churn risk in CRM interface
Retention Campaign Automation
def trigger_retention_campaigns(high_risk_customers):
"""
Send personalized retention campaigns based on churn reasons
"""
for _, customer in high_risk_customers.iterrows():
customer_id = customer['customer_id']
churn_prob = customer['churn_probability']
# Analyze churn reasons (feature values)
features = get_customer_features(customer_id)
# Decision tree for campaign type
if features['purchase_trend'] < -0.5:
# Decreasing purchase frequency
campaign_type = 'reactivation_discount'
campaign_message = f"Nhớ bạn quá! Voucher 20% cho đơn hàng tiếp theo"
elif features['support_tickets_90d'] > 2:
# Support issues
campaign_type = 'customer_success_call'
campaign_message = "Đội Customer Success sẽ gọi để hỗ trợ bạn"
elif features['logins_30d'] < 2:
# Low engagement
campaign_type = 'feature_education'
campaign_message = "5 tính năng bạn chưa biết trong app"
else:
# General win-back
campaign_type = 'loyalty_reward'
campaign_message = "Ưu đãi đặc biệt cho khách hàng thân thiết"
# Send campaign via marketing automation (Mailchimp, Braze, etc.)
send_campaign(
customer_id=customer_id,
campaign_type=campaign_type,
message=campaign_message,
discount_percent=20 if 'discount' in campaign_type else 0
)
print(f" Sent {campaign_type} to customer {customer_id} (churn prob: {churn_prob:.2%})")
Step 5: Monitoring & Continuous Improvement
Model Performance Monitoring
# Weekly evaluation on new data
def evaluate_model_performance():
"""
Evaluate model on past week's predictions vs actual outcomes
"""
# Get predictions from 60 days ago (when we predicted 60-day churn)
prediction_date = datetime.now().date() - timedelta(days=60)
predictions = get_predictions_from_date(prediction_date)
# Get actual churn outcomes
actual_churn = get_actual_churn_outcomes(prediction_date, prediction_date + timedelta(60))
# Merge
df = predictions.merge(actual_churn, on='customer_id')
# Evaluate
auc = roc_auc_score(df['actually_churned'], df['churn_probability'])
precision = precision_score(df['actually_churned'], df['predicted_high_risk'])
recall = recall_score(df['actually_churned'], df['predicted_high_risk'])
print(f"[{datetime.now()}] Model Performance (last 60 days):")
print(f" AUC: {auc:.4f}")
print(f" Precision: {precision:.2%}")
print(f" Recall: {recall:.2%}")
# Alert if performance degrades
if auc < 0.70: # Threshold
send_alert("CHURN MODEL PERFORMANCE DEGRADED", details=f"AUC dropped to {auc:.4f}")
# Run monthly
schedule.every().month.at("01:00").do(evaluate_model_performance)
Business Impact Tracking
Metrics to track:
-- Overall churn rate (before vs after ML)
WITH monthly_churn AS (
SELECT
DATE_TRUNC('month', date) as month,
COUNT(DISTINCT CASE WHEN churned = TRUE THEN customer_id END) as churned_customers,
COUNT(DISTINCT customer_id) as total_customers,
COUNT(DISTINCT CASE WHEN churned = TRUE THEN customer_id END)::FLOAT /
COUNT(DISTINCT customer_id) * 100 as churn_rate
FROM customer_status_daily
GROUP BY month
)
SELECT * FROM monthly_churn
ORDER BY month DESC
LIMIT 12;
-- Retention campaign effectiveness
SELECT
campaign_type,
COUNT(*) as customers_targeted,
SUM(CASE WHEN churned_60d = FALSE THEN 1 ELSE 0 END) as retained,
SUM(CASE WHEN churned_60d = FALSE THEN 1 ELSE 0 END)::FLOAT / COUNT(*) * 100 as retention_rate
FROM retention_campaigns rc
JOIN customer_outcomes co ON rc.customer_id = co.customer_id
WHERE rc.sent_date >= '2024-01-01'
GROUP BY campaign_type
ORDER BY retention_rate DESC;
Dashboard metrics:
- Monthly churn rate trend (target: -20% vs baseline)
- Customers flagged as high-risk
- Retention campaigns sent
- Campaign success rate (% retained)
- ROI (revenue saved vs campaign cost)
Model Retraining Schedule
When to retrain:
- Regular schedule: Every 3-6 months (to capture new patterns)
- Performance degradation: If AUC drops >5% from baseline
- Concept drift: Customer behavior changes (e.g., COVID impact)
- New features available: Additional data sources connected
Retraining workflow:
def retrain_churn_model():
"""
Retrain model with latest data (last 18 months)
"""
print(f"[{datetime.now()}] Starting model retraining...")
# 1. Get fresh training data (last 18 months)
end_date = datetime.now().date() - timedelta(days=60) # Need 60 days for labels
start_date = end_date - timedelta(days=540) # 18 months
# 2. Generate features & labels for all customers in this period
# (Same feature engineering code as before)
# 3. Train new model
# (Same training code as before)
# 4. Evaluate on holdout set
# Compare new model vs old model
# 5. If new model better → Deploy
if new_model_auc > old_model_auc + 0.02: # At least 2% improvement
save_model(new_model, 'churn_model_v2.pkl')
print(f" New model deployed (AUC: {new_model_auc:.4f} vs {old_model_auc:.4f})")
else:
print(f" New model not better, keeping old model")
# Run quarterly
schedule.every(3).months.do(retrain_churn_model)
Case Study: SaaS Company Giảm Churn 30%
Background:
- Company: B2B SaaS (project management tool), ~5,000 customers
- Baseline churn: 8% monthly (industry average 5-7%)
- Annual revenue: 60B VND
- Problem: Losing ~400 customers/month, no proactive retention
Implementation (4 months):
Month 1: Data preparation
- Connected Stripe (subscriptions), Mixpanel (product usage), Zendesk (support tickets)
- Built Data Warehouse (BigQuery) + dbt models
- Historical data: 24 months
Month 2: Feature engineering & modeling
- 28 features engineered (RFM, engagement, support tickets, behavioral trends)
- Trained XGBoost model
- AUC: 0.81 (excellent for first attempt)
- Top features:
- days_since_last_login (22%)
- logins_30d_trend (18%)
- features_used_30d (15%)
- support_tickets_90d (12%)
Month 3: Deployment
- Weekly batch scoring (every Monday)
- CRM integration (Salesforce) - churn scores visible to CSMs
- Automated email campaigns for high-risk (>70% churn prob)
Month 4: Optimization
- A/B test: 50% control (no intervention), 50% treatment (retention campaigns)
- Campaign types:
- Feature education: Video tutorials for underutilized features
- Customer success call: Proactive check-in for high-risk accounts
- Discount offers: 20% off next 3 months for at-risk customers
Results after 6 months:
| Metric | Before | After | Change |
|---|---|---|---|
| Monthly churn rate | 8.0% | 5.5% | -31% (↓2.5pp) |
| Customers lost/month | 400 | 275 | -125 customers |
| High-risk customers identified | N/A | 600/month | New |
| Retention campaign success rate | N/A | 35% | 210 saved/month |
| Annual revenue saved | N/A | 18B VND | (125 × 12 × 12M LTV) |
| Campaign cost | N/A | 2B VND | Discounts + CS time |
| Net benefit | N/A | 16B VND/year | 8x ROI |
Key learnings:
- Behavioral changes > absolute metrics: Login trend more predictive than absolute login count
- Support tickets = strong signal: Customers with 2+ tickets in 90 days had 3x churn rate
- Discount not always needed: 60% retained with feature education alone (cheaper than discounts)
- Timing matters: Intervene when churn probability 40-70% (still salvageable)
Common Pitfalls & How to Avoid
1. Chasing perfect model, ignoring business impact:
- Spending 3 months to improve AUC from 0.80 → 0.83
- Better: Deploy 0.80 model quickly, measure business impact, iterate
2. Not A/B testing:
- Assume churn reduction due to ML, but could be seasonality, product changes
- Solution: Always have control group (no intervention)
3. Retention campaigns too generic:
- Same "20% discount" email to everyone
- Better: Personalize based on churn reasons (support issues → CS call, low engagement → education)
4. Forgetting about model maintenance:
- Train once in 2023, use in 2025 → Performance degrades
- Solution: Quarterly retraining, monthly performance monitoring
5. Over-reliance on model:
- "Model says low churn risk, ignore this customer complaint"
- Solution: ML assists, humans decide. Customer Success Managers have final say.
Kết Luận: Churn Prediction = Proven ROI
Customer churn prediction là một trong những ML use cases dễ justify ROI nhất:
- Clear metric: Churn rate reduction (%)
- Fast time-to-value: 3-6 months pilot → production
- High impact: 20-30% churn reduction = tens of billions VND for mid-size companies
- Transferable: Learnings apply to other prediction problems (upsell, LTV prediction)
Implementation roadmap:
- Month 1: Data preparation (Data Warehouse, historical data)
- Month 2: Feature engineering, model training
- Month 3: Deployment, CRM integration
- Month 4-6: A/B testing, optimization
- Ongoing: Monitoring, retraining, scaling
Next steps:
- Assess data readiness (have 12+ months history? Clean data?)
- Define churn appropriately for your business
- Start with pilot (top 20% customers by revenue)
- Liên hệ Carptech nếu cần support end-to-end ML project (carptech.vn/contact)
Tài liệu tham khảo:
Bài viết này là phần 2 của series "Advanced Analytics & AI/ML" tháng 5. Đọc thêm về Analytics Maturity, Recommendation Systems, và Demand Forecasting.
Carptech - Data Platform & ML Solutions for Vietnamese Enterprises. Liên hệ tư vấn miễn phí.




