Customer Segmentation: Advanced Techniques for Personalization & Targeting
TL;DR
Customer segmentation divides customers into groups (segments) with similar characteristics, behaviors, or needs, so that marketing, product, and customer experience can be personalized.
Why does it matter?
- Personalization: a "blast email to all" typically gets a CTR of about 2%, while a segmented email can reach about 15%
- ROI: marketing to the right segment can cut CAC by 30-50%
- Product: build features for the segments with the highest LTV
Common segmentation methods:
| Method | Based On | Use Case | Difficulty |
|---|---|---|---|
| Demographic | Age, gender, income | B2C marketing | Easy |
| Geographic | Location, timezone | Localization, logistics | Easy |
| Firmographic | Company size, industry | B2B sales | Medium |
| Behavioral | Actions, usage patterns | Product personalization | Medium |
| RFM | Recency, Frequency, Monetary | E-commerce retention | Medium |
| Psychographic | Values, lifestyle, interests | Brand positioning | Hard |
| ML Clustering | All of the above | Advanced personalization | Hard |
RFM Analysis (the most common approach in e-commerce):
- Recency: when was the most recent purchase?
- Frequency: how many purchases were made?
- Monetary: how much was spent in total?
→ 11 segments: Champions, Loyal Customers, At Risk, Lost, ...
Example: Amazon uses behavioral segmentation to recommend products; an often-cited estimate attributes about 35% of its revenue to recommendations.
What Is Customer Segmentation?
Definition
Customer segmentation is the process of dividing a customer base into smaller groups (segments) based on shared characteristics, behaviors, or needs.
Purpose:
- Personalization: tailor messaging, offers, and product to each segment
- Prioritization: focus resources on high-value segments
- Understanding: get to know each customer group in depth
One-Size-Fits-All vs Segmentation
The problem with one-size-fits-all:
Email campaign: "50% off all products!"
Results:
- New customers: 5% conversion (low prices attract them)
- High-value customers: 1% conversion (they do not need a discount and may even be put off by it)
- At-risk customers: 2% conversion (they need re-engagement, not a discount)
Overall: ~2.7% conversion (simple average of the three segments)
With segmentation:
Segment 1: New customers
→ Email: "Welcome! 20% off first purchase"
→ Conversion: 15%
Segment 2: High-value customers
→ Email: "Exclusive early access to new collection"
→ Conversion: 12%
Segment 3: At-risk customers
→ Email: "We miss you! Here's what's new + personalized recommendations"
→ Conversion: 8%
Overall: ~11.7% conversion (simple average, about 4.4x higher)
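The overall figure in a comparison like this depends on how many customers sit in each segment, so a size-weighted average is safer than a simple one. Here is a minimal sketch of that calculation. The per-segment conversion rates are the ones quoted above; the segment sizes are hypothetical and only there to make the arithmetic concrete.

```python
# Conversion rates come from the example above; segment sizes are hypothetical.
segments = {
    "new":        {"size": 2000, "blast": 0.05, "targeted": 0.15},
    "high_value": {"size": 500,  "blast": 0.01, "targeted": 0.12},
    "at_risk":    {"size": 800,  "blast": 0.02, "targeted": 0.08},
}

def blended_rate(segments, key):
    """Overall conversion = total conversions / total recipients."""
    total = sum(s["size"] for s in segments.values())
    converted = sum(s["size"] * s[key] for s in segments.values())
    return converted / total

blast = blended_rate(segments, "blast")
targeted = blended_rate(segments, "targeted")
print(f"blast: {blast:.1%}, targeted: {targeted:.1%}, lift: {targeted / blast:.1f}x")
```

With a different segment mix the lift changes, which is why it helps to report it alongside the segment sizes.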
When to Segment
Segment when:
- ✅ You have enough customers (>1,000 active customers)
- ✅ The customer base is diverse (customers are not all alike)
- ✅ You have the resources to personalize for each segment
- ✅ You have data to segment on (behavioral, demographic, ...)
Skip segmentation when:
- ❌ You have too few customers (<500)
- ❌ Customers are very homogeneous
- ❌ There is no way to personalize (the same product and messaging for everyone)
Segmentation Methods
1. Demographic Segmentation (Simplest)
Attributes:
- Age, gender, income
- Education, occupation
- Marital status, family size
SQL Example:
SELECT
    CASE
        WHEN age BETWEEN 18 AND 24 THEN 'Gen Z'
        WHEN age BETWEEN 25 AND 40 THEN 'Millennial'
        WHEN age BETWEEN 41 AND 56 THEN 'Gen X'
        WHEN age > 56 THEN 'Boomer'
        ELSE 'Unknown'  -- under 18 or missing age
    END AS age_segment,
    gender,
    COUNT(*) AS customer_count,
    AVG(lifetime_value) AS avg_ltv
FROM customers
GROUP BY age_segment, gender
ORDER BY avg_ltv DESC;
Use Cases:
- Fashion e-commerce: Segment by gender, age → Personalize homepage
- Banking: Segment by income → Product recommendations (savings vs investment)
Limitations: demographics do not predict behavior well (two people of the same age and gender can behave completely differently).
2. Geographic Segmentation
Attributes:
- Country, city, state
- Timezone
- Urban vs Rural
SQL Example:
SELECT
    c.country,
    c.city,
    -- The join yields one row per order, so count distinct customers
    COUNT(DISTINCT c.customer_id) AS customer_count,
    SUM(o.order_value) AS total_revenue,
    AVG(o.order_value) AS avg_aov
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.country, c.city
ORDER BY total_revenue DESC
LIMIT 20;
Use Cases:
- E-commerce: Shipping zones, localized pricing
- Food delivery: restaurants and delivery times by location
- SaaS: language and timezone for support
3. Firmographic Segmentation (B2B)
Attributes:
- Company size (# employees)
- Industry
- Revenue
- Tech stack
SQL Example:
SELECT
    CASE
        WHEN employee_count < 10 THEN 'Micro (1-9)'
        WHEN employee_count BETWEEN 10 AND 50 THEN 'Small (10-50)'
        WHEN employee_count BETWEEN 51 AND 200 THEN 'Medium (51-200)'
        ELSE 'Enterprise (201+)'
    END AS company_size,
    industry,
    COUNT(*) AS company_count,
    AVG(mrr) AS avg_mrr,
    AVG(churn_rate) AS avg_churn
FROM companies
GROUP BY company_size, industry;
Use Cases:
- SaaS: pricing tiers and features by company size
- Sales: prioritize enterprise leads
- Onboarding: tailored flows for SMB vs. Enterprise
4. Behavioral Segmentation (Most Powerful)
Attributes:
- Purchase history
- Product usage patterns
- Feature adoption
- Engagement level
Example: Engagement-based Segmentation:
WITH user_activity AS (
    SELECT
        user_id,
        COUNT(DISTINCT DATE(event_timestamp)) AS active_days_last_30,
        COUNT(*) AS total_events_last_30
    FROM events
    WHERE event_timestamp >= CURRENT_DATE - INTERVAL '30 days'
    GROUP BY user_id
)
SELECT
    CASE
        WHEN a.active_days_last_30 >= 20 THEN 'Power User'
        WHEN a.active_days_last_30 BETWEEN 10 AND 19 THEN 'Regular User'
        WHEN a.active_days_last_30 BETWEEN 1 AND 9 THEN 'Casual User'
        ELSE 'Inactive'  -- no events in the window (NULL after the LEFT JOIN)
    END AS engagement_segment,
    COUNT(*) AS user_count,
    AVG(COALESCE(a.total_events_last_30, 0)) AS avg_events
-- Start from users; otherwise users with no recent events never appear as 'Inactive'
FROM users u
LEFT JOIN user_activity a ON a.user_id = u.user_id
GROUP BY engagement_segment;
Use Cases:
- SaaS: Power users → Upsell premium features
- E-commerce: Frequent buyers → VIP program
- Mobile app: Casual users → Re-engagement campaigns
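The same engagement tiers can be applied in application code, for example when tagging users before a campaign. This is a minimal sketch under that assumption; the function name and the sample `activity` data are hypothetical, and the thresholds are the ones used in the query above.

```python
def engagement_segment(active_days_last_30: int) -> str:
    """Map distinct active days in the last 30 days to an engagement tier."""
    if not 0 <= active_days_last_30 <= 30:
        raise ValueError("active days must be between 0 and 30")
    if active_days_last_30 >= 20:
        return "Power User"
    if active_days_last_30 >= 10:
        return "Regular User"
    if active_days_last_30 >= 1:
        return "Casual User"
    return "Inactive"

# Hypothetical sample: user_id -> distinct active days in the last 30 days.
# Users with no events must be listed explicitly with 0, otherwise the
# "Inactive" tier stays silently empty.
activity = {"u1": 24, "u2": 12, "u3": 3, "u4": 0}
tiers = {user: engagement_segment(days) for user, days in activity.items()}
print(tiers)
```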
RFM Analysis (Recency, Frequency, Monetary)
Introduction to RFM
RFM segments customers on three metrics:
- Recency (R): when was the most recent purchase? (days since last purchase)
- Frequency (F): how many purchases were made? (total orders)
- Monetary (M): how much was spent in total? (total revenue)
Why does RFM work well?
- Simple and interpretable
- Needs only transaction data (no ML required)
- Proven in practice (decades of use in direct marketing)
SQL for Computing RFM
WITH rfm_calc AS (
    SELECT
        customer_id,
        -- Recency: days since the last purchase
        -- (PostgreSQL syntax; in MySQL use DATEDIFF(CURRENT_DATE, MAX(order_date)))
        CURRENT_DATE - MAX(order_date)::date AS recency,
        -- Frequency: total number of orders
        COUNT(DISTINCT order_id) AS frequency,
        -- Monetary: total revenue
        SUM(order_value) AS monetary
    FROM orders
    WHERE order_date >= CURRENT_DATE - INTERVAL '365 days'  -- last 12 months
    GROUP BY customer_id
),
rfm_scores AS (
    SELECT
        customer_id,
        recency,
        frequency,
        monetary,
        -- RFM scores (1-5, 5 = best)
        -- Recency: the more recent the purchase, the higher the score
        NTILE(5) OVER (ORDER BY recency DESC) AS r_score,
        -- Frequency: the more orders, the higher the score
        NTILE(5) OVER (ORDER BY frequency ASC) AS f_score,
        -- Monetary: the more spent, the higher the score
        NTILE(5) OVER (ORDER BY monetary ASC) AS m_score
    FROM rfm_calc
)
SELECT
    customer_id,
    recency,
    frequency,
    monetary,
    r_score,
    f_score,
    m_score,
    CONCAT(r_score, f_score, m_score) AS rfm_score  -- e.g., '555' = best customers
FROM rfm_scores;
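To see what NTILE(5) does to each metric, the scoring step can be reproduced in a few lines of plain Python. This is a sketch only: the `ntile` helper is a hypothetical re-implementation of the SQL window function (ties are broken by input order, which a database does not guarantee), and the ten customers are made-up data.

```python
def ntile(values, n=5, reverse=False):
    """Emulate SQL NTILE(n): sort the rows, split them into n buckets numbered
    1..n, and give the earlier buckets one extra row when the rows do not
    divide evenly. Returns bucket numbers in the original input order."""
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    base, extra = divmod(len(values), n)
    tiles = [0] * len(values)
    pos = 0
    for bucket in range(1, n + 1):
        size = base + (1 if bucket <= extra else 0)
        for i in order[pos:pos + size]:
            tiles[i] = bucket
        pos += size
    return tiles

# Hypothetical data: customer_id -> (recency_days, frequency, monetary)
customers = {
    "c01": (5, 30, 5200.0),  "c02": (12, 22, 4100.0),
    "c03": (40, 15, 2600.0), "c04": (55, 12, 1900.0),
    "c05": (90, 9, 1500.0),  "c06": (120, 7, 900.0),
    "c07": (160, 5, 700.0),  "c08": (210, 3, 400.0),
    "c09": (280, 2, 150.0),  "c10": (340, 1, 60.0),
}
ids = list(customers)
r = ntile([customers[c][0] for c in ids], reverse=True)  # recent buyers score high
f = ntile([customers[c][1] for c in ids])                # frequent buyers score high
m = ntile([customers[c][2] for c in ids])                # big spenders score high
rfm = {c: f"{r[i]}{f[i]}{m[i]}" for i, c in enumerate(ids)}
print(rfm)
```

Sorting recency in descending order is what makes the most recent buyers land in bucket 5, mirroring `ORDER BY recency DESC` in the query.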
RFM Segments
Eleven commonly used segments (ten named rules plus a catch-all 'Others'):
-- Continues the WITH clause above: rfm_scores must be defined earlier in the same query.
-- CASE returns the first matching branch, so each narrower rule is placed
-- before the broader rule that would otherwise shadow it.
WITH rfm_segments AS (
    SELECT
        *,
        -- Assign an RFM segment based on the scores
        CASE
            WHEN r_score >= 4 AND f_score >= 4 AND m_score >= 4 THEN 'Champions'
            WHEN r_score >= 3 AND f_score >= 3 AND m_score >= 3 THEN 'Loyal Customers'
            WHEN r_score >= 4 AND f_score <= 1 AND m_score <= 1 THEN 'About to Sleep'
            WHEN r_score >= 4 AND f_score <= 2 AND m_score <= 2 THEN 'New Customers'
            WHEN r_score >= 3 AND f_score <= 2 AND m_score <= 2 THEN 'Promising'
            WHEN r_score >= 3 AND f_score >= 3 AND m_score <= 2 THEN 'Need Attention'
            WHEN r_score <= 2 AND f_score >= 4 AND m_score >= 4 THEN 'Cant Lose Them'
            WHEN r_score <= 2 AND f_score >= 3 AND m_score >= 3 THEN 'At Risk'
            WHEN r_score <= 1 AND f_score >= 2 AND m_score >= 2 THEN 'Hibernating'
            WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN 'Lost'
            ELSE 'Others'
        END AS rfm_segment
    FROM rfm_scores
)
SELECT
rfm_segment,
COUNT(*) AS customer_count,
AVG(recency) AS avg_recency,
AVG(frequency) AS avg_frequency,
AVG(monetary) AS avg_monetary,
SUM(monetary) AS total_revenue
FROM rfm_segments
GROUP BY rfm_segment
ORDER BY total_revenue DESC;
Example output (sorted by total revenue, as in the query):
| Segment | Count | Avg Recency | Avg Frequency | Avg Monetary | Total Revenue |
|---|---|---|---|---|---|
| Lost | 3,000 | 300 days | 8 orders | $1,200 | $3.6M |
| Loyal Customers | 1,200 | 45 days | 15 orders | $2,500 | $3M |
| Champions | 500 | 15 days | 25 orders | $5,000 | $2.5M |
| At Risk | 800 | 150 days | 18 orders | $3,000 | $2.4M |
| New Customers | 2,000 | 10 days | 1 order | $100 | $200K |
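Because a CASE expression returns the first matching branch, rule order decides which segment a customer lands in. The sketch below restates the same score thresholds in Python, with each narrower rule placed before the broader rule that would otherwise absorb it. That ordering is an assumption about the intended behavior, and the function name is hypothetical.

```python
from itertools import product

def rfm_segment(r: int, f: int, m: int) -> str:
    """Assign a segment from 1-5 RFM scores; rules are checked top-down."""
    rules = [
        ("Champions",       r >= 4 and f >= 4 and m >= 4),
        ("Loyal Customers", r >= 3 and f >= 3 and m >= 3),
        ("About to Sleep",  r >= 4 and f <= 1 and m <= 1),  # before New Customers
        ("New Customers",   r >= 4 and f <= 2 and m <= 2),
        ("Promising",       r >= 3 and f <= 2 and m <= 2),
        ("Need Attention",  r >= 3 and f >= 3 and m <= 2),
        ("Cant Lose Them",  r <= 2 and f >= 4 and m >= 4),  # before At Risk
        ("At Risk",         r <= 2 and f >= 3 and m >= 3),
        ("Hibernating",     r <= 1 and f >= 2 and m >= 2),
        ("Lost",            r <= 2 and f <= 2 and m <= 2),
    ]
    for name, matches in rules:
        if matches:
            return name
    return "Others"

# Check that every segment is reachable across all 125 score combinations
reachable = {rfm_segment(*scores) for scores in product(range(1, 6), repeat=3)}
print(len(reachable), sorted(reachable))
```

If 'At Risk' were checked before 'Cant Lose Them', a customer scored (1, 5, 5) would be labeled 'At Risk' and the narrower segment would stay empty.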
Actions for Each RFM Segment
Champions (R=4-5, F=4-5, M=4-5):
- ✅ Action: Reward loyalty, early access, VIP treatment
- 📧 Email: "Thank you! Exclusive sneak peek at our new collection"
- 🎁 Offer: Free shipping forever, beta access
Loyal Customers (R=3-5, F=3-5, M=3-5):
- ✅ Action: Upsell, cross-sell
- 📧 Email: "You might also like..." recommendations
- 🎁 Offer: Bundle deals, loyalty points
At Risk (R=1-2, F=3-5, M=3-5):
- ⚠️ Action: Re-engage ASAP! (high-value customers leaving)
- 📧 Email: "We miss you! What can we do better?" + personalized offers
- 🎁 Offer: Win-back discount (15-25%), survey incentive
New Customers (R=4-5, F=1-2, M=1-2):
- 🎯 Action: Onboarding, second purchase
- 📧 Email: "Welcome! Here's how to get the most from [product]"
- 🎁 Offer: Second purchase discount (10%)
Lost (R=1-2, F=1-2, M=1-2):
- 🗑️ Action: Aggressive win-back or let go
- 📧 Email: "Last chance! 30% off to come back"
- 🎁 Offer: Deep discount (30-50%) or remove from list
K-Means Clustering (Machine Learning Approach)
When to Use K-Means?
RFM works very well for e-commerce, but K-Means is the stronger choice when:
- There are many features (>3): demographic + behavioral + firmographic
- Patterns are complex and not obvious
- You want to discover hidden segments
Python Implementation
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Load customer data
# (conn is assumed to be an open database connection or SQLAlchemy engine created elsewhere)
df = pd.read_sql("""
    SELECT
        customer_id,
        recency,
        frequency,
        monetary,
        avg_order_value,
        total_sessions,
        avg_session_duration,
        products_viewed,
        cart_abandonment_rate
    FROM customer_features
""", conn)

# Features to cluster on
features = ['recency', 'frequency', 'monetary', 'avg_order_value',
            'total_sessions', 'avg_session_duration', 'products_viewed',
            'cart_abandonment_rate']
X = df[features]

# Standardize features (important for K-Means, which is distance-based)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Elbow method to find a reasonable K
inertias = []
K_range = range(2, 11)
for k in K_range:
    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
    kmeans.fit(X_scaled)
    inertias.append(kmeans.inertia_)

# Plot elbow curve
plt.figure(figsize=(10, 6))
plt.plot(K_range, inertias, 'bo-')
plt.xlabel('Number of Clusters (K)')
plt.ylabel('Inertia')
plt.title('Elbow Method for Optimal K')
plt.savefig('elbow_curve.png')

# Based on the elbow, choose K=5
optimal_k = 5
kmeans = KMeans(n_clusters=optimal_k, random_state=42, n_init=10)
df['cluster'] = kmeans.fit_predict(X_scaled)

# Analyze clusters
cluster_summary = df.groupby('cluster')[features].mean()
print(cluster_summary)

# Cluster sizes
print("\nCluster sizes:")
print(df['cluster'].value_counts().sort_index())
Example output:
Cluster Summary:
         recency  frequency  monetary  avg_order_value  total_sessions
cluster
0             15         25      5000              200             150
1             45         15      2500              170             100
2            150         18      3000              165              80
3             10          1       100              100              20
4            300          8      1200              150              30
Cluster sizes:
0     500   # High-value, engaged
1    1200   # Regular customers
2     800   # At-risk
3    2000   # New/low-value
4    3000   # Churned
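The standardization step matters because K-Means compares customers by Euclidean distance, and a feature measured in dollars can drown out one measured in days. The following sketch illustrates this with made-up customers and a hand-written z-score helper (a stand-in for `StandardScaler`, which by default also uses the population standard deviation).

```python
import math
import statistics

# Hypothetical customers: (recency_days, monetary)
a = (10, 5000.0)
b = (300, 5100.0)   # very different recency, similar spend
c = (12, 5400.0)    # similar recency, somewhat higher spend
others = [(150, 200.0), (200, 9000.0)]
points = [a, b, c] + others

def zscore_columns(rows):
    """Standardize each column to mean 0 and unit population standard deviation."""
    cols = list(zip(*rows))
    means = [statistics.fmean(col) for col in cols]
    stds = [statistics.pstdev(col) for col in cols]
    return [tuple((v - mu) / sd for v, mu, sd in zip(row, means, stds))
            for row in rows]

# Raw distances: the dollar column dominates
raw_ab, raw_ac = math.dist(a, b), math.dist(a, c)

# Scaled distances: both columns contribute on a comparable scale
sa, sb, sc = zscore_columns(points)[:3]
scaled_ab, scaled_ac = math.dist(sa, sb), math.dist(sa, sc)

print(f"raw:    a-b={raw_ab:.1f}  a-c={raw_ac:.1f}")
print(f"scaled: a-b={scaled_ab:.2f}  a-c={scaled_ac:.2f}")
```

On raw values, customer a looks closer to b despite a 290-day gap in recency; after scaling, a is closest to c, which matches the intuition that they behave alike.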
Cluster Profiling & Naming
# Profile each cluster
for cluster_id in range(optimal_k):
    cluster_data = df[df['cluster'] == cluster_id]
    avg_recency = cluster_data['recency'].mean()
    avg_frequency = cluster_data['frequency'].mean()
    print(f"\n{'='*60}")
    print(f"Cluster {cluster_id}")
    print(f"{'='*60}")
    print(f"Size: {len(cluster_data):,} customers ({len(cluster_data)/len(df)*100:.1f}%)")
    print(f"Avg Recency: {avg_recency:.0f} days")
    print(f"Avg Frequency: {avg_frequency:.1f} orders")
    print(f"Avg Monetary: ${cluster_data['monetary'].mean():,.0f}")
    print(f"Total Revenue: ${cluster_data['monetary'].sum():,.0f}")
    # Assign meaningful names based on characteristics
    # (thresholds are illustrative; tune them to your own data)
    if avg_recency < 30 and avg_frequency > 20:
        cluster_name = "Champions"
    elif avg_recency < 60 and avg_frequency > 10:
        cluster_name = "Loyal Customers"
    elif avg_recency > 120 and avg_frequency > 10:
        cluster_name = "At Risk"
    elif avg_recency < 30 and avg_frequency <= 2:
        cluster_name = "New Customers"
    else:
        cluster_name = "Lost/Churned"
    print(f"Suggested Name: {cluster_name}")
Visualize Clusters
from sklearn.decomposition import PCA
# Reduce to 2D for visualization
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)
# Plot
plt.figure(figsize=(12, 8))
scatter = plt.scatter(X_pca[:, 0], X_pca[:, 1], c=df['cluster'],
cmap='viridis', alpha=0.6, s=50)
plt.xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.1%} variance)')
plt.ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.1%} variance)')
plt.title('Customer Segments (K-Means Clustering)')
plt.colorbar(scatter, label='Cluster')
plt.savefig('clusters_visualization.png', dpi=300)
Case Studies
Case Study 1: Shopee Vietnam - RFM Segmentation
Context:
- E-commerce platform with 10M+ users
- Previously: blast promotions to all users
- Problem: low engagement (2% email open rate) and a high unsubscribe rate
Implementation:
Step 1: RFM Calculation
-- Compute RFM for 10M users
WITH rfm AS (
    SELECT
        user_id,
        -- PostgreSQL syntax; in MySQL use DATEDIFF(CURRENT_DATE, MAX(order_date))
        CURRENT_DATE - MAX(order_date)::date AS recency,
        COUNT(DISTINCT order_id) AS frequency,
        SUM(order_value) AS monetary
    FROM orders
    WHERE order_date >= '2024-01-01'
    GROUP BY user_id
)
-- ... (scoring as shown earlier)
Step 2: Segment Actions
| Segment | Size | Action | Channel | Offer |
|---|---|---|---|---|
| Champions | 500K (5%) | Reward | Email, App push | Free shipping voucher, early access |
| Loyal | 1.5M (15%) | Upsell | | Bundle deals, 10% off |
| At Risk | 800K (8%) | Win-back | Email, SMS, Ads | 20% discount, personalized recs |
| New | 3M (30%) | Onboarding | App, Email | Second purchase 15% off |
| Lost | 4.2M (42%) | Aggressive win-back or suppress | SMS (one-time) | 40% off or remove |
Step 3: Personalized Campaigns
Champions Email:
Subject: [VIP] You get first pick, 24 hours early! 🌟
Body: Thank you for being a loyal customer!
New products go on sale tomorrow, but you can buy them RIGHT NOW.
[CTA: Buy Now]
At Risk Email:
Subject: We miss you! 💔 A special 20% voucher
Body: It has been 3 months since your last purchase.
Is there something we could be doing better?
Here is a 20% voucher to give us another try.
[CTA: Shop Now] [CTA: Give Feedback]
Results (3 months later):
| Metric | Before | After | Change |
|---|---|---|---|
| Email Open Rate | 2% | 18% | +800% |
| Click Rate | 0.5% | 4.2% | +740% |
| Conversion Rate | 1.2% | 6.8% | +467% |
| Unsubscribe Rate | 0.8% | 0.2% | -75% |
| Revenue/Email | $0.15 | $1.20 | +700% |
Impact: revenue increased by $12M/year from email marketing alone.
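The Change column in the results table is a relative change, (after - before) / before. A short sketch that recomputes it from the Before and After values in the table (the helper name is hypothetical):

```python
def pct_change(before: float, after: float) -> float:
    """Relative change from before to after, in percent."""
    return (after - before) / before * 100

# (before, after) pairs taken from the results table
metrics = {
    "Email Open Rate": (2.0, 18.0),
    "Click Rate": (0.5, 4.2),
    "Conversion Rate": (1.2, 6.8),
    "Unsubscribe Rate": (0.8, 0.2),
    "Revenue/Email": (0.15, 1.20),
}
changes = {name: round(pct_change(b, a)) for name, (b, a) in metrics.items()}
for name, change in changes.items():
    print(f"{name}: {change:+d}%")
```

Note that an open rate going from 2% to 18% is a 16-percentage-point gain but an 800% relative change; it helps to state which one a report uses.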
Case Study 2: SaaS Startup - Behavioral Clustering
Context:
- B2B SaaS (project management tool)
- 50K users; the goal is to raise the Free → Paid upgrade rate
Analysis: K-Means clustering with these features:
features = [
'projects_created',
'tasks_created',
'team_members_invited',
'integrations_used',
'api_calls',
'support_tickets',
'days_active_last_30'
]
Discovered 4 Segments:
Cluster 0: "Power Users" (5%, 2.5K users)
- Many projects (avg 15)
- Many team members (avg 20)
- Uses integrations (avg 5)
- Already on Paid plan → Action: Upsell Enterprise
Cluster 1: "Growing Teams" (15%, 7.5K users)
- Medium number of projects (avg 5)
- Medium team size (avg 8)
- Few integrations (avg 1)
- Free plan, hitting limits → Action: Upgrade campaign
Cluster 2: "Solo Users" (60%, 30K users)
- Few projects (avg 2)
- Invites no team members
- Uses no integrations
- Low engagement → Action: Educational content, onboarding
Cluster 3: "Churned" (20%, 10K users)
- Created an account but never used it
- Days active = 0
- Never activated → Action: Re-onboarding campaign
Targeted Actions:
Growing Teams (high-potential):
- Email: "Your team is growing! Upgrade to unlock unlimited projects"
- In-app: "You're using 4/5 projects. Upgrade now to avoid limits"
- Offer: 20% off first year
- Result: 25% upgrade rate (vs 2% baseline)
Solo Users (education):
- Drip email series: "How to collaborate with your team"
- Webinar: "Project management best practices"
- Result: 12% invited first team member → 40% of those upgraded
Impact:
- MRR grew from $200K to $350K (+75%) in 6 months
- Efforts were focused on "Growing Teams" (highest ROI)
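To compare the two campaigns on the same footing, the quoted rates can be turned into upgrade counts. This is a back-of-the-envelope sketch: the segment sizes and rates come from the case study, while the assumption that every user in a segment received the campaign is mine.

```python
# Segment sizes and rates as quoted in the case study
growing_teams = 7_500
solo_users = 30_000

# Assumption: every user in the segment received the campaign.
growing_upgrades = growing_teams * 0.25          # 25% upgrade rate
solo_invited = solo_users * 0.12                 # 12% invited a first team member
solo_upgrades = solo_invited * 0.40              # 40% of those upgraded
solo_upgrade_rate = solo_upgrades / solo_users   # effective rate across all solo users

print(f"Growing Teams upgrades: {growing_upgrades:,.0f}")
print(f"Solo Users upgrades:    {solo_upgrades:,.0f} "
      f"({solo_upgrade_rate:.1%} of the segment)")
```

Under these assumptions the smaller "Growing Teams" segment yields more upgrades than the four-times-larger "Solo Users" segment, which is consistent with the decision to focus on it.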
Case Study 3: Netflix - Taste Clusters
Context (based on public info):
- Netflix reportedly has 2,000+ "taste clusters" (micro-segments)
- Each user belongs to multiple clusters
Example Clusters:
- "Romantic Comedy Enthusiasts"
- "Dark European Crime Dramas"
- "Family-Friendly Animated Films"
- "Sci-Fi Action Blockbusters"
How it works:
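# Simplified illustration (the real system is far more complex)
import numpy as np

# User viewing history
user_watched = ["Stranger Things", "Breaking Bad", "Narcos", "Ozark"]  # ...

# Each show has a genre vector: [scifi, drama, horror, crime, thriller]
# (illustrative weights only)
show_vectors = {
    "Stranger Things": np.array([0.8, 0.6, 0.4, 0.0, 0.0]),
    "Breaking Bad": np.array([0.0, 0.9, 0.0, 0.7, 0.3]),
    # ...
}

# Aggregate to a user taste profile: mean vector of the watched shows that have vectors
user_profile = np.mean([show_vectors[s] for s in user_watched if s in show_vectors], axis=0)

# Hypothetical taste profiles and viewing histories of other users
other_profiles = {
    "user_a": np.array([0.7, 0.7, 0.3, 0.2, 0.1]),
    "user_b": np.array([0.0, 0.2, 0.0, 0.1, 0.0]),
    "user_c": np.array([0.3, 0.8, 0.1, 0.5, 0.2]),
}
other_watched = {
    "user_a": {"Stranger Things", "Dark"},
    "user_b": {"The Crown"},
    "user_c": {"Breaking Bad", "Better Call Saul"},
}

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Find similar users (k nearest neighbors by cosine similarity; k would be ~100 at scale)
k = 2
similar_users = sorted(other_profiles, key=lambda u: cosine(user_profile, other_profiles[u]), reverse=True)[:k]

# Recommend what similar users watched that this user has not
recommendations = sorted(set().union(*(other_watched[u] for u in similar_users)) - set(user_watched))
print(recommendations)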
Impact:
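- An often-cited figure is that about 75% of watched content comes from recommendations
- Segmentation helps increase retention and reduce churn
- Netflix has estimated that personalization and recommendations save it more than $1B per year through reduced churn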
Advanced Techniques
1. Hierarchical Clustering
When to use: you want nested segments (e.g., "Loyal Customers" → "High Spenders" + "Frequent Buyers").
from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering
# Hierarchical clustering
linkage_matrix = linkage(X_scaled, method='ward')
# Plot dendrogram
plt.figure(figsize=(15, 8))
dendrogram(linkage_matrix)
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Customer')
plt.ylabel('Distance')
plt.savefig('dendrogram.png')
# Cut at specific height or number of clusters
clustering = AgglomerativeClustering(n_clusters=5, linkage='ward')
df['h_cluster'] = clustering.fit_predict(X_scaled)
2. DBSCAN (Density-Based)
Advantage: finds the number of clusters automatically and handles noise/outliers well. The results are sensitive to eps and min_samples, so both need tuning.
from sklearn.cluster import DBSCAN
# DBSCAN clustering
dbscan = DBSCAN(eps=0.5, min_samples=10)
df['db_cluster'] = dbscan.fit_predict(X_scaled)
# -1 = noise/outliers
print(f"Outliers: {sum(df['db_cluster'] == -1)} ({sum(df['db_cluster'] == -1)/len(df)*100:.1f}%)")
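# Count clusters excluding the noise label; subtracting 1 would undercount when there is no noise
n_clusters = df.loc[df['db_cluster'] != -1, 'db_cluster'].nunique()
print(f"Clusters found: {n_clusters}")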
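A common heuristic for choosing eps is the k-distance curve: compute each point's distance to its k-th nearest neighbor, sort the values, and look for the "knee" where they jump. Here is a minimal pure-Python sketch of that idea; the helper name and the toy points are hypothetical, and k is usually tied to the min_samples setting.

```python
import math

def k_distances(points, k):
    """For each point, the distance to its k-th nearest neighbor (excluding
    itself), returned in ascending order."""
    result = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        result.append(dists[k - 1])
    return sorted(result)

# Hypothetical scaled 2-D data: two tight groups and one outlier
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1),
          (3.0, 3.0), (3.1, 3.0), (3.0, 3.1), (3.1, 3.1),
          (10.0, 10.0)]
curve = k_distances(points, k=3)
print([round(d, 3) for d in curve])
```

In this toy data the curve is flat for the eight clustered points and jumps for the outlier, so an eps slightly above the flat part would keep both groups intact and mark the last point as noise. The pairwise loop is quadratic, so on real data a sample or a nearest-neighbor index is more practical.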
3. Cohort + Segment Analysis
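Combine cohort analysis with segmentation:
WITH user_cohorts AS (
    SELECT
        user_id,
        created_at,
        DATE_TRUNC('month', created_at) AS cohort_month
    FROM users
),
user_segments AS (
    -- RFM or cluster segments (assumes a user_id → segment mapping)
    SELECT user_id, segment
    FROM rfm_segments
),
month_3_active AS (
    -- Assumed definition of month-3 retention:
    -- at least one event in the third month after signup
    SELECT DISTINCT uc.user_id
    FROM user_cohorts uc
    JOIN events e ON e.user_id = uc.user_id
    WHERE e.event_timestamp >= uc.created_at + INTERVAL '2 months'
      AND e.event_timestamp <  uc.created_at + INTERVAL '3 months'
)
SELECT
    uc.cohort_month,
    us.segment,
    COUNT(DISTINCT uc.user_id) AS users,
    AVG(CASE WHEN m.user_id IS NOT NULL THEN 1.0 ELSE 0.0 END) AS avg_retention
FROM user_cohorts uc
JOIN user_segments us ON uc.user_id = us.user_id
LEFT JOIN month_3_active m ON m.user_id = uc.user_id
GROUP BY uc.cohort_month, us.segment
ORDER BY uc.cohort_month, avg_retention DESC;
Insight: in the Jan 2025 cohort, "Champions" retain at 85% while "New Customers" retain at only 40%.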
Activation & Personalization
1. Email Personalization
# Segment-specific email templates
email_templates = {
    'Champions': {
        'subject': '[VIP] Exclusive Early Access for Our Best Customers',
        'body': 'Hi {name}, as one of our top customers, you deserve the best...',
        'offer': 'free_shipping_lifetime',
    },
    'At Risk': {
        'subject': 'We Miss You! 25% Off to Welcome You Back',
        'body': 'Hi {name}, it\'s been a while since your last purchase...',
        'offer': 'discount_25',
    },
    'New Customers': {
        'subject': 'Welcome! Here\'s 10% Off Your Second Order',
        'body': 'Hi {name}, thanks for your first purchase! Here\'s a gift...',
        'offer': 'discount_10_second_purchase',
    },
}
# Send emails
# Assumes `segments` maps segment name -> list of customers, and that
# `send_email` is your own helper (hypothetical, not a library function).
for segment, customers in segments.items():
    template = email_templates.get(segment)
    if template is None:
        continue  # no template defined for this segment
    for customer in customers:
        send_email(
            to=customer.email,
            subject=template['subject'],
            body=template['body'].format(name=customer.name),
            offer=template['offer']
        )
2. Product Recommendations
# Segment-based recommendations
# (the get_* helpers are placeholders for your own catalog queries)
if user.segment == 'Champions':
    # Show premium/new products
    recommendations = get_new_arrivals() + get_premium_products()
elif user.segment == 'At Risk':
    # Show products they previously viewed
    recommendations = get_previously_viewed(user) + get_similar_products(user)
elif user.segment == 'New Customers':
    # Show bestsellers
    recommendations = get_bestsellers()
else:
    # Fallback so that recommendations is always defined
    recommendations = get_bestsellers()
3. Dynamic Pricing (Controversial)
Ethical approach: Discounts for at-risk, not price hikes for loyal.
# Segment-based offers
if user.segment == 'At Risk':
    offer = '25% off'  # Win-back
elif user.segment == 'New Customers':
    offer = '10% off second purchase'  # Encourage repeat
elif user.segment == 'Champions':
    offer = 'Free express shipping'  # Non-price benefit
else:
    offer = None  # Regular pricing
Best Practices
1. Start Simple, Iterate
Phase 1: Demographic/Geographic (Week 1)
Phase 2: RFM Analysis (Week 2-3)
Phase 3: Behavioral segments (Month 2)
Phase 4: ML Clustering (Month 3+)
2. Actionable Segments
Bad Segment: "Users who like both cats and dogs"
- Cute, but not actionable (what would you do with this insight?)
Good Segment: "At-Risk High-Value Customers"
- Actionable: a win-back campaign with personalized offers
3. Segment Size Balance
Too many segments: with 50 segments, there are not enough resources to personalize for each one.
Too few segments: with 2 segments, you miss the nuances.
Rule of thumb: 5-10 segments for most businesses.
4. Regular Re-Segmentation
Customer behavior changes, so re-segment on a schedule: at least quarterly, and monthly if your data moves quickly.
-- Scheduled job: recalculate RFM scores
-- Move customers between segments
-- Update in CRM/Email marketing tool
5. Privacy & Ethics
- ✅ Disclose segmentation in your Privacy Policy
- ✅ Allow opt-out from personalized marketing
- ❌ Do not discriminate (e.g., higher prices for certain demographics)
- ❌ Do not use sensitive data (health, religion, ...)
Tools & Stack
1. Analysis & Segmentation
SQL + Python:
- BigQuery/Snowflake: Compute RFM, segments
- Python (Pandas, Scikit-learn): Clustering
- Jupyter Notebooks: Exploratory analysis
BI Tools:
- Looker, Tableau, Metabase: Visualize segments
- Dashboards: Monitor segment sizes, metrics over time
2. Activation
Email/Marketing Automation:
- Braze ($$$): Sophisticated segmentation, multi-channel
- Customer.io ($$): Behavioral triggers, segment-based campaigns
- Mailchimp ($): Basic segmentation
CDP (Customer Data Platform):
- Segment ($$): Collect data, sync segments to tools
- RudderStack ($): Open-source alternative
- mParticle ($$$): Enterprise CDP
3. Recommendations
Build in-house:
- Build with collaborative filtering (Python)
- Use dbt for segment-based rec logic
Tools:
- Algolia Recommend ($$)
- Dynamic Yield ($$$)
- Amazon Personalize (Pay-as-you-go)
Conclusion
Customer segmentation is the foundation of personalization and targeted marketing. Instead of treating all customers the same, segmentation lets you tailor messaging, offers, and product to each group.
Key Takeaways
- Start with RFM (simple and effective for e-commerce)
- Layer in behavioral data once you have the resources
- Use ML clustering for advanced personalization (once you have enough data and a team)
- Make segments actionable → every segment needs a clear action
- Re-segment regularly → customers change, so segments must be updated
- Measure impact → track metrics by segment (LTV, retention, conversion)
Metrics to Track
| Metric | Definition | Goal |
|---|---|---|
| Segment Conversion Rate | Conversion rate per segment | Increase over time |
| Segment LTV | Lifetime value per segment | Grow high-value segments |
| Segment Movement | % customers moving up/down | More moving up |
| Campaign ROI by Segment | ROI of campaigns per segment | Focus on high-ROI |
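The Segment Movement metric can be computed by comparing two segmentation snapshots. This is a minimal sketch: the ranking of segments from least to most valuable is an assumption you would adapt to your own segments, and the function name and sample snapshots are hypothetical.

```python
# Assumed ordering from least to most valuable; adjust to your own segments.
SEGMENT_RANK = {"Lost": 0, "At Risk": 1, "New Customers": 2,
                "Loyal Customers": 3, "Champions": 4}

def segment_movement(previous, current):
    """Share of customers present in both snapshots that moved up, moved down,
    or stayed put, given two {customer_id: segment} mappings."""
    common = previous.keys() & current.keys()
    if not common:
        return {"up": 0.0, "down": 0.0, "same": 0.0}
    counts = {"up": 0, "down": 0, "same": 0}
    for cid in common:
        delta = SEGMENT_RANK[current[cid]] - SEGMENT_RANK[previous[cid]]
        counts["up" if delta > 0 else "down" if delta < 0 else "same"] += 1
    return {k: v / len(common) for k, v in counts.items()}

# Hypothetical snapshots from two consecutive re-segmentation runs
last_run = {"c1": "New Customers", "c2": "Loyal Customers",
            "c3": "Champions", "c4": "At Risk"}
this_run = {"c1": "Loyal Customers", "c2": "Champions",
            "c3": "At Risk", "c4": "At Risk", "c5": "New Customers"}
movement = segment_movement(last_run, this_run)
print(movement)
```

Customers that appear in only one snapshot (such as new signups) are left out here; whether to count them is a design choice worth making explicit in the dashboard.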
Next Steps
After mastering customer segmentation:
- Customer Churn Prediction: predict churn and LTV per segment
- Cohort Analysis: analyze behavior by group
- Attribution Modeling: multi-touch marketing attribution
Carptech - Customer Segmentation Solutions for Vietnamese Businesses
At Carptech, we help Vietnamese businesses build their segmentation strategy:
Our Services
- Segmentation Analysis: RFM, behavioral, ML clustering
- Data Pipeline: automated segment calculation, daily updates
- Activation Support: integration with email tools, CRM, ad platforms
- Dashboards: real-time segment monitoring (Looker, Metabase)
Case Studies
- E-commerce: RFM segmentation, personalized campaigns → Revenue +40%
- SaaS: Behavioral clustering → Upgrade rate up 3x
- Fintech: Segment-based product recommendations → Cross-sell +60%
Contact: https://carptech.vn
This article was written by the Carptech Team, specialists in data platforms and analytics in Vietnam.




