Quay lại Blog
Data PlatformCập nhật: 13 tháng 5, 202521 phút đọc

Customer Churn Prediction: End-to-End ML Project Guide

Hướng dẫn chi tiết xây dựng ML model dự đoán customer churn từ A-Z: định nghĩa churn, feature engineering, model training (XGBoost, Random Forest), deployment, monitoring. Với code examples đầy đủ và case study giảm churn 25%.

Đặng Quỳnh Hương

Đặng Quỳnh Hương

Senior Data Scientist

Biểu đồ ML workflow cho churn prediction với features, model training và retention campaigns
#Machine Learning#Churn Prediction#Customer Retention#Data Science#Python#XGBoost

Trong thế giới business ngày nay, một sự thật khó chối cãi: Chi phí acquire customer mới cao gấp 5-7 lần so với retain customer hiện tại. Tuy nhiên, theo khảo sát của Carptech với 80+ doanh nghiệp tại Việt Nam, 68% companies không có hệ thống dự đoán churn - họ chỉ biết khách hàng đã rời đi KHI ĐÃ RỜI ĐI, không có cơ hội can thiệp.

Customer Churn Prediction - dự đoán khách hàng nào sắp rời bỏ - là một trong những ML use cases có ROI rõ ràng nhất và fastest time-to-value (3-6 tháng). Một churn prediction model thành công có thể giảm churn rate 15-30%, tương đương hàng tỷ đồng revenue saved mỗi năm.

Bài viết này là hướng dẫn end-to-end, từ A đến Z, để bạn tự build một churn prediction system: business problem definition, data preparation, feature engineering, model training (với code examples đầy đủ), deployment, và monitoring. Kèm case study thực tế về SaaS company giảm churn từ 8% xuống 5.5% trong 6 tháng.

TL;DR - Key Takeaways

  • Churn definition varies by industry: E-commerce (90 days no purchase), SaaS (cancelled subscription), Bank (closed account)
  • Feature engineering is 80% of success: RFM, engagement, support tickets, behavioral changes - 20-30 features typical
  • Algorithm choice: XGBoost, Random Forest work best for churn (AUC 0.75-0.85 achievable)
  • Evaluation metric: Precision/Recall trade-off more important than accuracy (imbalanced classes)
  • Deployment: Weekly batch scoring → CRM integration → Automated retention campaigns
  • ROI: 20-30% churn reduction typical, 12-24 months to full value

Why Customer Churn Matters: The Business Case

The Economics of Churn

Cost comparison:

Customer Acquisition Cost (CAC): 1,500,000 VND
  - Marketing spend: 1,000,000đ
  - Sales effort: 300,000đ
  - Onboarding: 200,000đ

Customer Retention Cost: 300,000 VND
  - Win-back campaigns: 150,000đ
  - Customer success: 100,000đ
  - Loyalty rewards: 50,000đ

Savings: 1,500,000 - 300,000 = 1,200,000đ per customer (5x cheaper)

Lifetime Value (LTV) impact:

Scenario A: High churn (8% monthly)
  - Average customer lifetime: 12.5 months
  - LTV (at 500k VND/month ARPU): 6.25M VND

Scenario B: Low churn (5% monthly)
  - Average customer lifetime: 20 months
  - LTV: 10M VND

Difference: 3.75M VND per customer (+60%!)

Company-level impact:

  • SaaS company với 10,000 customers, 8% monthly churn
  • Churned customers/month: 800
  • Annual churned customers: 9,600
  • If reduce churn to 5%: Save 3,600 customers/year
  • Revenue saved (at 10M LTV): 36B VND/year

Industry Benchmarks: What's "Good" Churn?

IndustryMonthly ChurnAnnual ChurnNotes
SaaS (B2B)3-7%32-58%Enterprise SaaS lower (1-2%), SMB higher (5-10%)
SaaS (B2C)5-10%46-72%Consumer apps churn higher
E-commerce15-25%84-94%Based on 90-day definition
Telecom1-3%12-31%Contractual, high switching costs
Banking10-20%72-89%Account dormancy
Subscription boxes8-15%62-82%High initial churn (3 months)

Vietnam specifics (Carptech data):

  • Vietnamese SaaS: 6-9% monthly (higher than global due to price sensitivity)
  • E-commerce: 20-30% (90-day definition, competitive market)
  • Fintech: 8-12% (regulatory trust issues)

Target: Reduce churn by 20-30% (e.g., 8% → 5.5-6%)

Step 1: Define Churn - Không dễ như nghĩ

What is "Churn"?

Contractual churn (explicit):

  • SaaS: Subscription cancelled (clear event, easy to identify)
  • Telecom: Contract terminated
  • Gym membership: Membership not renewed

Non-contractual churn (implicit):

  • E-commerce: No purchase in X days (how many days = churn?)
  • Mobile app: No login in X days
  • Bank: Account dormant (no transactions)

Challenge with non-contractual: You have to DEFINE churn threshold

Churn Definition Examples by Industry

E-commerce:

# Option 1: Fixed time window
churned = (days_since_last_purchase > 90)

# Option 2: Relative to historical behavior
avg_purchase_interval = df.groupby('customer_id')['days_between_purchases'].mean()
churned = (days_since_last_purchase > avg_purchase_interval * 2)

# Option 3: Percentile-based
p90_days = df['days_since_last_purchase'].quantile(0.9)
churned = (days_since_last_purchase > p90_days)

Recommendation: Start with simple (90 days), refine later based on business context

SaaS:

# Clear for paid subscriptions
churned = (subscription_status == 'cancelled')

# For freemium, define "active user"
active_user = (logins_30d >= 4) & (features_used_30d >= 2)
churned = ~active_user

Mobile app:

# Based on login frequency
churned = (days_since_last_login > 30)

# Or engagement threshold
monthly_active_users = (logins_30d > 0)
churned = ~monthly_active_users

Prediction Window

Not just "has customer churned?" but "will customer churn in next X days?"

Typical windows:

  • 30 days: Fast-moving (e-commerce, apps)
  • 60-90 days: Medium-term (SaaS)
  • 180+ days: Long-term (banking, insurance)

Why this matters for ML:

  • 30-day window: Need frequent predictions (weekly), faster action
  • 90-day window: Monthly predictions OK, more planning time

Example label creation (SaaS, 60-day window):

from datetime import datetime, timedelta

def create_churn_labels(df, prediction_date, churn_window_days=60):
    """
    Label customers as churned if they cancel within churn_window_days of prediction_date
    """
    future_date = prediction_date + timedelta(days=churn_window_days)

    labels = []
    for _, customer in df.iterrows():
        # Did customer churn between prediction_date and future_date?
        if customer['subscription_status'] == 'active':
            # Check future cancellations
            cancellations = df[
                (df['customer_id'] == customer['customer_id']) &
                (df['cancellation_date'] >= prediction_date) &
                (df['cancellation_date'] <= future_date)
            ]
            churned = (len(cancellations) > 0)
        else:
            churned = False  # Already churned

        labels.append({
            'customer_id': customer['customer_id'],
            'prediction_date': prediction_date,
            'churned_next_60d': churned
        })

    return pd.DataFrame(labels)

Step 2: Feature Engineering - 80% của Success

ML rule: "Garbage in, garbage out" Corollary: "Great features in, great model out"

Feature Categories

1. RFM - Recency, Frequency, Monetary (Most predictive):

# Recency
features['days_since_last_purchase'] = (
    prediction_date - df.groupby('customer_id')['purchase_date'].max()
).dt.days

# Frequency
features['total_purchases_all_time'] = df.groupby('customer_id')['order_id'].count()
features['purchases_last_30d'] = df[df['purchase_date'] >= prediction_date - timedelta(30)] \
    .groupby('customer_id')['order_id'].count()
features['purchases_last_90d'] = df[df['purchase_date'] >= prediction_date - timedelta(90)] \
    .groupby('customer_id')['order_id'].count()

# Monetary
features['total_spend_all_time'] = df.groupby('customer_id')['amount'].sum()
features['total_spend_last_90d'] = df[df['purchase_date'] >= prediction_date - timedelta(90)] \
    .groupby('customer_id')['amount'].sum()
features['avg_order_value'] = df.groupby('customer_id')['amount'].mean()

2. Engagement features (Product usage):

# For SaaS/apps
features['logins_30d'] = login_df.groupby('customer_id')['login_date'].count()
features['days_active_30d'] = login_df.groupby('customer_id')['login_date'].nunique()
features['features_used_30d'] = feature_usage_df.groupby('customer_id')['feature_name'].nunique()

# Feature depth (power users)
features['advanced_features_used'] = feature_usage_df[
    feature_usage_df['feature_tier'] == 'advanced'
].groupby('customer_id')['feature_name'].nunique()

# For e-commerce
features['website_visits_30d'] = web_analytics.groupby('customer_id')['session_id'].nunique()
features['product_views_30d'] = web_analytics.groupby('customer_id')['product_id'].nunique()
features['add_to_cart_30d'] = web_analytics[
    web_analytics['event'] == 'add_to_cart'
].groupby('customer_id')['event'].count()

3. Customer service features (Warning signals):

# Support tickets
features['support_tickets_90d'] = support_tickets.groupby('customer_id')['ticket_id'].count()
features['complaints_90d'] = support_tickets[
    support_tickets['type'] == 'complaint'
].groupby('customer_id')['ticket_id'].count()

# Resolution metrics
features['avg_resolution_time'] = support_tickets.groupby('customer_id')['resolution_hours'].mean()
features['unresolved_tickets'] = support_tickets[
    support_tickets['status'] == 'open'
].groupby('customer_id')['ticket_id'].count()

4. Behavioral change features (Trends are powerful):

# Change in frequency
features['purchases_30d_vs_prior_30d'] = (
    purchases_last_30d - purchases_prior_30d
)
features['purchases_trend'] = features['purchases_30d_vs_prior_30d'] / (purchases_prior_30d + 1)  # Avoid /0

# Change in monetary
features['spend_30d_vs_prior_30d'] = (
    spend_last_30d - spend_prior_30d
)

# Change in engagement
features['logins_30d_vs_prior_30d'] = (
    logins_last_30d - logins_prior_30d
)
features['login_trend'] = features['logins_30d_vs_prior_30d'] / (logins_prior_30d + 1)

Why trends matter: A customer with 10 purchases/month is good. But if they had 20 purchases last month and dropped to 10 this month → warning signal (even though absolute number still looks OK).

5. Customer attributes:

# Demographics
features['customer_age_days'] = (prediction_date - df['signup_date']).dt.days
features['acquisition_channel'] = df['acquisition_channel']  # Categorical
features['customer_segment'] = df['segment']  # VIP, Regular, etc.

# Plan/pricing
features['plan_type'] = df['plan_type']  # Free, Starter, Pro, Enterprise
features['mrr'] = df['monthly_recurring_revenue']  # For SaaS
features['is_on_discount'] = df['discount_percent'] > 0

6. Product/category preferences (E-commerce specific):

# Category diversity
features['categories_purchased'] = purchase_df.groupby('customer_id')['category'].nunique()

# Favorite category
features['top_category'] = purchase_df.groupby('customer_id')['category'].agg(
    lambda x: x.value_counts().index[0]
)

# Brand loyalty
features['brands_purchased'] = purchase_df.groupby('customer_id')['brand'].nunique()
features['repeat_brand_rate'] = (
    purchase_df[purchase_df.duplicated(['customer_id', 'brand'], keep=False)]
    .groupby('customer_id')['order_id'].count() / features['total_purchases_all_time']
)

Example: Complete Feature Set (30 features)

import pandas as pd
import numpy as np
from datetime import datetime, timedelta

def engineer_churn_features(customer_id, prediction_date, transactions_df, logins_df, support_df):
    """
    Create features for a single customer at a specific prediction_date
    """
    features = {'customer_id': customer_id, 'prediction_date': prediction_date}

    # Filter data up to prediction_date only (no data leakage!)
    hist_transactions = transactions_df[
        (transactions_df['customer_id'] == customer_id) &
        (transactions_df['purchase_date'] < prediction_date)
    ]
    hist_logins = logins_df[
        (logins_df['customer_id'] == customer_id) &
        (logins_df['login_date'] < prediction_date)
    ]

    # RFM
    if len(hist_transactions) > 0:
        features['days_since_last_purchase'] = (
            prediction_date - hist_transactions['purchase_date'].max()
        ).days
        features['total_purchases'] = len(hist_transactions)
        features['total_spend'] = hist_transactions['amount'].sum()
        features['avg_order_value'] = hist_transactions['amount'].mean()

        # Last 30 days
        last_30d = hist_transactions[
            hist_transactions['purchase_date'] >= prediction_date - timedelta(30)
        ]
        features['purchases_30d'] = len(last_30d)
        features['spend_30d'] = last_30d['amount'].sum() if len(last_30d) > 0 else 0

        # Prior 30 days (for trend)
        prior_30d = hist_transactions[
            (hist_transactions['purchase_date'] >= prediction_date - timedelta(60)) &
            (hist_transactions['purchase_date'] < prediction_date - timedelta(30))
        ]
        features['purchases_prior_30d'] = len(prior_30d)
        features['spend_prior_30d'] = prior_30d['amount'].sum() if len(prior_30d) > 0 else 0

        # Trend
        features['purchase_trend'] = (
            features['purchases_30d'] - features['purchases_prior_30d']
        ) / (features['purchases_prior_30d'] + 1)

    else:
        # No purchase history → Fill with defaults
        features.update({
            'days_since_last_purchase': 9999,
            'total_purchases': 0,
            'total_spend': 0,
            'avg_order_value': 0,
            'purchases_30d': 0,
            'spend_30d': 0,
            'purchase_trend': 0
        })

    # Engagement
    if len(hist_logins) > 0:
        features['days_since_last_login'] = (
            prediction_date - hist_logins['login_date'].max()
        ).days
        features['logins_30d'] = len(hist_logins[
            hist_logins['login_date'] >= prediction_date - timedelta(30)
        ])
    else:
        features['days_since_last_login'] = 9999
        features['logins_30d'] = 0

    # Support tickets (warning signal)
    hist_support = support_df[
        (support_df['customer_id'] == customer_id) &
        (support_df['created_date'] < prediction_date)
    ]
    features['support_tickets_90d'] = len(hist_support[
        hist_support['created_date'] >= prediction_date - timedelta(90)
    ])

    return features

# Apply to all customers
all_features = []
for customer_id in customer_ids:
    features = engineer_churn_features(
        customer_id, prediction_date, transactions_df, logins_df, support_df
    )
    all_features.append(features)

features_df = pd.DataFrame(all_features)

Common Feature Engineering Mistakes

1. Data leakage (most dangerous):

# WRONG: Using future information
features['days_until_churn'] = (churn_date - prediction_date).days  # THIS IS THE TARGET!

# WRONG: Including post-prediction data
features['purchases_next_30d'] = ...  # Future data

# RIGHT: Only use data BEFORE prediction_date
features['purchases_last_30d'] = hist_data[
    hist_data['date'] < prediction_date
].count()

2. Not handling missing values:

# Many customers have no support tickets → NaN
# Model will fail or treat NaN strangely

# WRONG: Drop rows with NaN
df = df.dropna()  # Lose customers

# RIGHT: Fill with meaningful defaults
features['support_tickets_90d'] = features['support_tickets_90d'].fillna(0)
features['days_since_last_purchase'] = features['days_since_last_purchase'].fillna(9999)  # "Never"

3. Ignoring feature scaling (for some algorithms):

# Features have different scales
# days_since_last_purchase: 0-1000
# total_purchases: 1-500
# total_spend: 100,000-50,000,000

# For tree-based models (XGBoost, Random Forest): Scaling not needed
# For linear models (Logistic Regression): Need scaling

from sklearn.preprocessing import StandardScaler
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

Step 3: Model Training - Algorithms & Evaluation

Dataset Preparation

import pandas as pd
from sklearn.model_selection import train_test_split

# Load features & labels
features_df = pd.read_csv('churn_features.csv')
labels_df = pd.read_csv('churn_labels.csv')

# Merge
df = features_df.merge(labels_df, on=['customer_id', 'prediction_date'])

# Separate features (X) and target (y)
feature_cols = [
    'days_since_last_purchase', 'total_purchases', 'total_spend', 'avg_order_value',
    'purchases_30d', 'spend_30d', 'purchase_trend',
    'days_since_last_login', 'logins_30d',
    'support_tickets_90d',
    # ... add all 30 features
]
X = df[feature_cols]
y = df['churned_next_60d']

# Train/test split (temporal split preferred)
# Use older data for training, recent for testing
split_date = '2024-10-01'
train_mask = df['prediction_date'] < split_date
test_mask = df['prediction_date'] >= split_date

X_train, X_test = X[train_mask], X[test_mask]
y_train, y_test = y[train_mask], y[test_mask]

print(f"Train size: {len(X_train)}, Test size: {len(X_test)}")
print(f"Train churn rate: {y_train.mean():.2%}")
print(f"Test churn rate: {y_test.mean():.2%}")

Algorithm Comparison

Try multiple algorithms, compare results:

from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from xgboost import XGBClassifier
from sklearn.metrics import roc_auc_score, classification_report

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000, random_state=42),
    'Random Forest': RandomForestClassifier(n_estimators=200, max_depth=10, random_state=42),
    'XGBoost': XGBClassifier(n_estimators=200, max_depth=6, learning_rate=0.1, random_state=42)
}

results = {}
for name, model in models.items():
    # Train
    model.fit(X_train, y_train)

    # Predict probabilities
    y_pred_proba = model.predict_proba(X_test)[:, 1]

    # Evaluate
    auc = roc_auc_score(y_test, y_pred_proba)
    results[name] = {'model': model, 'auc': auc, 'predictions': y_pred_proba}

    print(f"\n{name}:")
    print(f"  AUC: {auc:.4f}")

# Select best model
best_model_name = max(results, key=lambda x: results[x]['auc'])
best_model = results[best_model_name]['model']
print(f"\nBest model: {best_model_name} (AUC: {results[best_model_name]['auc']:.4f})")

Typical results:

  • Logistic Regression: AUC 0.72-0.76 (good baseline, interpretable)
  • Random Forest: AUC 0.76-0.80 (better performance)
  • XGBoost: AUC 0.78-0.85 (usually wins, our recommendation)

Hyperparameter Tuning (XGBoost)

from sklearn.model_selection import GridSearchCV

# Parameter grid
param_grid = {
    'n_estimators': [100, 200, 300],
    'max_depth': [4, 6, 8],
    'learning_rate': [0.01, 0.05, 0.1],
    'subsample': [0.8, 1.0],
    'colsample_bytree': [0.8, 1.0]
}

# Grid search with cross-validation
xgb = XGBClassifier(random_state=42)
grid_search = GridSearchCV(
    xgb,
    param_grid,
    cv=5,
    scoring='roc_auc',
    n_jobs=-1,
    verbose=1
)

grid_search.fit(X_train, y_train)

print(f"Best parameters: {grid_search.best_params_}")
print(f"Best cross-validation AUC: {grid_search.best_score_:.4f}")

# Final model with best parameters
best_xgb = grid_search.best_estimator_

Warning: Grid search is computationally expensive (hours for large datasets). Start with default params, only tune if needed for marginal gains.

Evaluation Metrics: Beyond Accuracy

Imbalanced classes problem:

  • Churn rate = 8% → 92% non-churn
  • Naive model: Predict everyone as "not churned" → 92% accuracy!
  • But this model is useless (doesn't catch any churners)

Better metrics:

1. ROC-AUC (Area Under ROC Curve):

from sklearn.metrics import roc_auc_score, roc_curve
import matplotlib.pyplot as plt

auc = roc_auc_score(y_test, y_pred_proba)
print(f"AUC: {auc:.4f}")

# Plot ROC curve
fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)
plt.plot(fpr, tpr, label=f'AUC = {auc:.4f}')
plt.plot([0, 1], [0, 1], 'k--', label='Random')
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('ROC Curve')
plt.legend()
plt.show()
  • AUC = 0.5: Random guessing
  • AUC = 0.7-0.8: Good model
  • AUC = 0.8-0.9: Excellent model
  • AUC > 0.9: Suspiciously good (check for data leakage!)

2. Precision & Recall (Confusion Matrix):

from sklearn.metrics import precision_score, recall_score, f1_score, confusion_matrix

# Choose threshold (default 0.5 often not optimal)
threshold = 0.3  # Lower threshold = catch more churners, but more false alarms
y_pred = (y_pred_proba >= threshold).astype(int)

# Metrics
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)

print(f"Precision: {precision:.2%}")  # Of predicted churners, how many actually churned?
print(f"Recall: {recall:.2%}")  # Of actual churners, how many did we catch?
print(f"F1-score: {f1:.4f}")

# Confusion matrix
cm = confusion_matrix(y_test, y_pred)
print("\nConfusion Matrix:")
print(cm)

Example output:

Confusion Matrix:
                Predicted Non-Churn    Predicted Churn
Actual Non-Churn      8,500 (TN)          450 (FP)
Actual Churn            200 (FN)          850 (TP)

Precision: 65% (850 / (850+450)) → Of customers we flagged, 65% actually churned
Recall: 81% (850 / (850+200)) → We caught 81% of churners

Trade-off:

  • High precision (low FP): Fewer false alarms, but miss some churners
  • High recall (low FN): Catch most churners, but many false alarms

Business decision:

  • High cost of false positives (expensive retention campaigns): Prefer high precision
  • High cost of losing customers (high LTV): Prefer high recall

3. Precision-Recall curve (find optimal threshold):

from sklearn.metrics import precision_recall_curve

precisions, recalls, thresholds = precision_recall_curve(y_test, y_pred_proba)

# Plot
plt.plot(thresholds, precisions[:-1], label='Precision')
plt.plot(thresholds, recalls[:-1], label='Recall')
plt.xlabel('Threshold')
plt.ylabel('Score')
plt.title('Precision-Recall vs Threshold')
plt.legend()
plt.grid()
plt.show()

# Find threshold where precision = 0.7 (business requirement)
idx = (precisions >= 0.7).argmax()
optimal_threshold = thresholds[idx]
print(f"Threshold for 70% precision: {optimal_threshold:.4f}")
print(f"Recall at this threshold: {recalls[idx]:.2%}")

Feature Importance: What Drives Churn?

# XGBoost feature importance
feature_importance = pd.DataFrame({
    'feature': feature_cols,
    'importance': best_xgb.feature_importances_
}).sort_values('importance', ascending=False)

print(feature_importance.head(10))

# Plot
import matplotlib.pyplot as plt

feature_importance.head(15).plot(x='feature', y='importance', kind='barh', figsize=(10, 8))
plt.xlabel('Importance')
plt.title('Top 15 Features for Churn Prediction')
plt.gca().invert_yaxis()
plt.show()

Typical top features:

  1. days_since_last_purchase (20-25% importance)
  2. purchase_trend (15-18%)
  3. total_purchases (10-12%)
  4. logins_30d (8-10%)
  5. support_tickets_90d (6-8%)

Insights:

  • Behavioral changes (trends) more predictive than absolute values
  • Engagement metrics (logins, feature usage) strong signals
  • Support tickets = red flag

Step 4: Deployment - From Model to Action

Batch Scoring (Weekly Schedule)

import schedule
import time

def score_customers():
    """
    Score all active customers for churn risk
    Run weekly (every Monday 6 AM)
    """
    print(f"[{datetime.now()}] Starting weekly churn scoring...")

    # 1. Get active customers
    active_customers = get_active_customers()  # Query from database
    print(f"  Found {len(active_customers)} active customers")

    # 2. Engineer features
    prediction_date = datetime.now().date()
    features = []
    for customer_id in active_customers:
        feat = engineer_churn_features(customer_id, prediction_date, ...)
        features.append(feat)

    features_df = pd.DataFrame(features)
    X = features_df[feature_cols]

    # 3. Score with model
    churn_probabilities = best_xgb.predict_proba(X)[:, 1]

    # 4. Create results table
    results = pd.DataFrame({
        'customer_id': features_df['customer_id'],
        'prediction_date': prediction_date,
        'churn_probability': churn_probabilities,
        'risk_tier': pd.cut(churn_probabilities, bins=[0, 0.3, 0.7, 1.0], labels=['Low', 'Medium', 'High'])
    })

    # 5. Save to database
    save_to_database(results, table='churn_predictions')
    print(f"  Scored {len(results)} customers, saved to database")

    # 6. Trigger retention campaigns for high-risk
    high_risk = results[results['risk_tier'] == 'High']
    trigger_retention_campaigns(high_risk)
    print(f"  Triggered retention campaigns for {len(high_risk)} high-risk customers")

    print("[Completed] Churn scoring finished\n")

# Schedule weekly (every Monday at 6 AM)
schedule.every().monday.at("06:00").do(score_customers)

# Keep running
while True:
    schedule.run_pending()
    time.sleep(3600)  # Check every hour

Integration with CRM / Marketing Automation

Option 1: Database integration:

-- Analysts query churn scores directly
SELECT
    c.customer_id,
    c.email,
    c.name,
    cp.churn_probability,
    cp.risk_tier,
    cp.prediction_date
FROM customers c
JOIN churn_predictions cp ON c.customer_id = cp.customer_id
WHERE cp.prediction_date = CURRENT_DATE
    AND cp.risk_tier = 'High'
ORDER BY cp.churn_probability DESC
LIMIT 100;

Option 2: API endpoint (for real-time scoring):

from flask import Flask, request, jsonify
import joblib

app = Flask(__name__)

# Load model on startup
model = joblib.load('churn_model.pkl')

@app.route('/predict_churn', methods=['POST'])
def predict_churn():
    """
    API endpoint to score a single customer
    """
    data = request.json
    customer_id = data['customer_id']

    # Engineer features
    features = engineer_churn_features(customer_id, datetime.now().date(), ...)
    X = pd.DataFrame([features])[feature_cols]

    # Predict
    churn_prob = model.predict_proba(X)[0, 1]

    return jsonify({
        'customer_id': customer_id,
        'churn_probability': float(churn_prob),
        'risk_tier': 'High' if churn_prob > 0.7 else ('Medium' if churn_prob > 0.3 else 'Low')
    })

if __name__ == '__main__':
    app.run(host='0.0.0.0', port=5000)

Option 3: Reverse ETL (Hightouch, Census):

  • Churn scores in data warehouse → Sync to CRM (Salesforce, HubSpot)
  • Sales reps see churn risk in CRM interface

Retention Campaign Automation

def trigger_retention_campaigns(high_risk_customers):
    """
    Send personalized retention campaigns based on churn reasons
    """
    for _, customer in high_risk_customers.iterrows():
        customer_id = customer['customer_id']
        churn_prob = customer['churn_probability']

        # Analyze churn reasons (feature values)
        features = get_customer_features(customer_id)

        # Decision tree for campaign type
        if features['purchase_trend'] < -0.5:
            # Decreasing purchase frequency
            campaign_type = 'reactivation_discount'
            campaign_message = f"Nhớ bạn quá! Voucher 20% cho đơn hàng tiếp theo"

        elif features['support_tickets_90d'] > 2:
            # Support issues
            campaign_type = 'customer_success_call'
            campaign_message = "Đội Customer Success sẽ gọi để hỗ trợ bạn"

        elif features['logins_30d'] < 2:
            # Low engagement
            campaign_type = 'feature_education'
            campaign_message = "5 tính năng bạn chưa biết trong app"

        else:
            # General win-back
            campaign_type = 'loyalty_reward'
            campaign_message = "Ưu đãi đặc biệt cho khách hàng thân thiết"

        # Send campaign via marketing automation (Mailchimp, Braze, etc.)
        send_campaign(
            customer_id=customer_id,
            campaign_type=campaign_type,
            message=campaign_message,
            discount_percent=20 if 'discount' in campaign_type else 0
        )

        print(f"  Sent {campaign_type} to customer {customer_id} (churn prob: {churn_prob:.2%})")

Step 5: Monitoring & Continuous Improvement

Model Performance Monitoring

# Weekly evaluation on new data
def evaluate_model_performance():
    """
    Evaluate model on past week's predictions vs actual outcomes
    """
    # Get predictions from 60 days ago (when we predicted 60-day churn)
    prediction_date = datetime.now().date() - timedelta(days=60)
    predictions = get_predictions_from_date(prediction_date)

    # Get actual churn outcomes
    actual_churn = get_actual_churn_outcomes(prediction_date, prediction_date + timedelta(60))

    # Merge
    df = predictions.merge(actual_churn, on='customer_id')

    # Evaluate
    auc = roc_auc_score(df['actually_churned'], df['churn_probability'])
    precision = precision_score(df['actually_churned'], df['predicted_high_risk'])
    recall = recall_score(df['actually_churned'], df['predicted_high_risk'])

    print(f"[{datetime.now()}] Model Performance (last 60 days):")
    print(f"  AUC: {auc:.4f}")
    print(f"  Precision: {precision:.2%}")
    print(f"  Recall: {recall:.2%}")

    # Alert if performance degrades
    if auc < 0.70:  # Threshold
        send_alert("CHURN MODEL PERFORMANCE DEGRADED", details=f"AUC dropped to {auc:.4f}")

# Run monthly
schedule.every().month.at("01:00").do(evaluate_model_performance)

Business Impact Tracking

Metrics to track:

-- Overall churn rate (before vs after ML)
WITH monthly_churn AS (
  SELECT
    DATE_TRUNC('month', date) as month,
    COUNT(DISTINCT CASE WHEN churned = TRUE THEN customer_id END) as churned_customers,
    COUNT(DISTINCT customer_id) as total_customers,
    COUNT(DISTINCT CASE WHEN churned = TRUE THEN customer_id END)::FLOAT /
      COUNT(DISTINCT customer_id) * 100 as churn_rate
  FROM customer_status_daily
  GROUP BY month
)
SELECT * FROM monthly_churn
ORDER BY month DESC
LIMIT 12;

-- Retention campaign effectiveness
SELECT
    campaign_type,
    COUNT(*) as customers_targeted,
    SUM(CASE WHEN churned_60d = FALSE THEN 1 ELSE 0 END) as retained,
    SUM(CASE WHEN churned_60d = FALSE THEN 1 ELSE 0 END)::FLOAT / COUNT(*) * 100 as retention_rate
FROM retention_campaigns rc
JOIN customer_outcomes co ON rc.customer_id = co.customer_id
WHERE rc.sent_date >= '2024-01-01'
GROUP BY campaign_type
ORDER BY retention_rate DESC;

Dashboard metrics:

  • Monthly churn rate trend (target: -20% vs baseline)
  • Customers flagged as high-risk
  • Retention campaigns sent
  • Campaign success rate (% retained)
  • ROI (revenue saved vs campaign cost)

Model Retraining Schedule

When to retrain:

  1. Regular schedule: Every 3-6 months (to capture new patterns)
  2. Performance degradation: If AUC drops >5% from baseline
  3. Concept drift: Customer behavior changes (e.g., COVID impact)
  4. New features available: Additional data sources connected

Retraining workflow:

def retrain_churn_model():
    """
    Retrain model with latest data (last 18 months)
    """
    print(f"[{datetime.now()}] Starting model retraining...")

    # 1. Get fresh training data (last 18 months)
    end_date = datetime.now().date() - timedelta(days=60)  # Need 60 days for labels
    start_date = end_date - timedelta(days=540)  # 18 months

    # 2. Generate features & labels for all customers in this period
    # (Same feature engineering code as before)

    # 3. Train new model
    # (Same training code as before)

    # 4. Evaluate on holdout set
    # Compare new model vs old model

    # 5. If new model better → Deploy
    if new_model_auc > old_model_auc + 0.02:  # At least 2% improvement
        save_model(new_model, 'churn_model_v2.pkl')
        print(f"  New model deployed (AUC: {new_model_auc:.4f} vs {old_model_auc:.4f})")
    else:
        print(f"  New model not better, keeping old model")

# Run quarterly
schedule.every(3).months.do(retrain_churn_model)

Case Study: SaaS Company Giảm Churn 30%

Background:

  • Company: B2B SaaS (project management tool), ~5,000 customers
  • Baseline churn: 8% monthly (industry average 5-7%)
  • Annual revenue: 60B VND
  • Problem: Losing ~400 customers/month, no proactive retention

Implementation (4 months):

Month 1: Data preparation

  • Connected Stripe (subscriptions), Mixpanel (product usage), Zendesk (support tickets)
  • Built Data Warehouse (BigQuery) + dbt models
  • Historical data: 24 months

Month 2: Feature engineering & modeling

  • 28 features engineered (RFM, engagement, support tickets, behavioral trends)
  • Trained XGBoost model
  • AUC: 0.81 (excellent for first attempt)
  • Top features:
    1. days_since_last_login (22%)
    2. logins_30d_trend (18%)
    3. features_used_30d (15%)
    4. support_tickets_90d (12%)

Month 3: Deployment

  • Weekly batch scoring (every Monday)
  • CRM integration (Salesforce) - churn scores visible to CSMs
  • Automated email campaigns for high-risk (>70% churn prob)

Month 4: Optimization

  • A/B test: 50% control (no intervention), 50% treatment (retention campaigns)
  • Campaign types:
    • Feature education: Video tutorials for underutilized features
    • Customer success call: Proactive check-in for high-risk accounts
    • Discount offers: 20% off next 3 months for at-risk customers

Results after 6 months:

MetricBeforeAfterChange
Monthly churn rate8.0%5.5%-31% (↓2.5pp)
Customers lost/month400275-125 customers
High-risk customers identifiedN/A600/monthNew
Retention campaign success rateN/A35%210 saved/month
Annual revenue savedN/A18B VND(125 × 12 × 12M LTV)
Campaign costN/A2B VNDDiscounts + CS time
Net benefitN/A16B VND/year8x ROI

Key learnings:

  • Behavioral changes > absolute metrics: Login trend more predictive than absolute login count
  • Support tickets = strong signal: Customers with 2+ tickets in 90 days had 3x churn rate
  • Discount not always needed: 60% retained with feature education alone (cheaper than discounts)
  • Timing matters: Intervene when churn probability 40-70% (still salvageable)

Common Pitfalls & How to Avoid

1. Chasing perfect model, ignoring business impact:

  • Spending 3 months to improve AUC from 0.80 → 0.83
  • Better: Deploy 0.80 model quickly, measure business impact, iterate

2. Not A/B testing:

  • Assume churn reduction due to ML, but could be seasonality, product changes
  • Solution: Always have control group (no intervention)

3. Retention campaigns too generic:

  • Same "20% discount" email to everyone
  • Better: Personalize based on churn reasons (support issues → CS call, low engagement → education)

4. Forgetting about model maintenance:

  • Train once in 2023, use in 2025 → Performance degrades
  • Solution: Quarterly retraining, monthly performance monitoring

5. Over-reliance on model:

  • "Model says low churn risk, ignore this customer complaint"
  • Solution: ML assists, humans decide. Customer Success Managers have final say.

Kết Luận: Churn Prediction = Proven ROI

Customer churn prediction là một trong những ML use cases dễ justify ROI nhất:

  • Clear metric: Churn rate reduction (%)
  • Fast time-to-value: 3-6 months pilot → production
  • High impact: 20-30% churn reduction = tens of billions VND for mid-size companies
  • Transferable: Learnings apply to other prediction problems (upsell, LTV prediction)

Implementation roadmap:

  1. Month 1: Data preparation (Data Warehouse, historical data)
  2. Month 2: Feature engineering, model training
  3. Month 3: Deployment, CRM integration
  4. Month 4-6: A/B testing, optimization
  5. Ongoing: Monitoring, retraining, scaling

Next steps:

  • Assess data readiness (have 12+ months history? Clean data?)
  • Define churn appropriately for your business
  • Start with pilot (top 20% customers by revenue)
  • Liên hệ Carptech nếu cần support end-to-end ML project (carptech.vn/contact)

Tài liệu tham khảo:


Bài viết này là phần 2 của series "Advanced Analytics & AI/ML" tháng 5. Đọc thêm về Analytics Maturity, Recommendation Systems, và Demand Forecasting.

Carptech - Data Platform & ML Solutions for Vietnamese Enterprises. Liên hệ tư vấn miễn phí.

Có câu hỏi về Data Platform?

Đội ngũ chuyên gia của Carptech sẵn sàng tư vấn miễn phí về giải pháp phù hợp nhất cho doanh nghiệp của bạn. Đặt lịch tư vấn 60 phút qua Microsoft Teams hoặc gửi form liên hệ.

✓ Miễn phí 100% • ✓ Microsoft Teams • ✓ Không cam kết dài hạn