Customer Segmentation: Advanced Techniques for Personalization & Targeting

TL;DR

Customer segmentation means dividing customers into groups (segments) with similar characteristics, behaviors, or needs, so that you can personalize marketing, product, and customer experience.

Why does it matter?

Personalization: a "blast email to all" typically gets a CTR of ~2%, while a segmented email can reach ~15%
ROI: marketing to the right segment can cut CAC by 30-50%
Product: build features for the segments with the highest LTV

Common segmentation methods:

Method	Based On	Use Case	Difficulty
Demographic	Age, gender, income	B2C marketing	Easy
Geographic	Location, timezone	Localization, logistics	Easy
Firmographic	Company size, industry	B2B sales	Medium
Behavioral	Actions, usage patterns	Product personalization	Medium
RFM	Recency, Frequency, Monetary	E-commerce retention	Medium
Psychographic	Values, lifestyle, interests	Brand positioning	Hard
ML Clustering	All of above	Advanced personalization	Hard

RFM Analysis (the most common method for e-commerce):

Recency: When was the most recent purchase?
Frequency: How many purchases were made?
Monetary: How much was spent in total?

→ 11 segments: Champions, Loyal Customers, At Risk, Lost, ...

Example: Amazon uses behavioral segmentation to recommend products; recommendations reportedly drive about 35% of its revenue.

What Is Customer Segmentation?

Definition

Customer segmentation is the process of dividing the customer base into smaller groups (segments) based on shared characteristics, behaviors, or needs.

Goals:

Personalization: Tailor messaging, offers, and product to each segment
Prioritization: Focus resources on high-value segments
Understanding: Understand each customer group in depth

One-Size-Fits-All vs Segmentation

The problem with one-size-fits-all:

Email campaign: "50% off all products!"

Results:
- New customers: 5% conversion (low prices attract them)
- High-value customers: 1% conversion (they don't need a discount and may even be put off by it)
- At-risk customers: 2% conversion (they need re-engagement, not a discount)

Overall: 2.5% conversion

With segmentation:

Segment 1: New customers
→ Email: "Welcome! 20% off first purchase"
→ Conversion: 15%

Segment 2: High-value customers
→ Email: "Exclusive early access to new collection"
→ Conversion: 12%

Segment 3: At-risk customers
→ Email: "We miss you! Here's what's new + personalized recommendations"
→ Conversion: 8%

Overall: 11.7% conversion (simple average of the three segments, about 4.7x the blast result)
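The "overall" figures depend on how many recipients fall into each segment, which the example does not state. A minimal sketch of the arithmetic, assuming equal-sized segments (so the overall rate is a simple mean); the function name `overall_conversion` is illustrative:

```python
def overall_conversion(rates, sizes=None):
    """Size-weighted conversion rate across segments (equal sizes if none are given)."""
    if sizes is None:
        sizes = [1] * len(rates)
    converted = sum(rate * size for rate, size in zip(rates, sizes))
    return converted / sum(sizes)


# Per-segment rates from the segmented campaign: New, High-value, At-risk
segmented = overall_conversion([0.15, 0.12, 0.08])
blast = 0.025  # overall rate quoted for the one-size-fits-all email

print(f"Segmented overall: {segmented:.1%}")
print(f"Lift vs. blast: {segmented / blast:.1f}x")
```

Passing real segment sizes as `sizes` gives the true weighted rate, which can differ noticeably from the simple mean when one segment dominates the list.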

When to Segment

Segment when:

✅ You have enough customers (>1,000 active customers)
✅ The customer base is diverse (not everyone looks alike)
✅ You have the resources to personalize for each segment
✅ You have data to segment on (behavioral, demographic, ...)

Segmentation is NOT needed when:

❌ You have too few customers (<500)
❌ Customers are very homogeneous
❌ There is no way to personalize (same product/messaging for everyone)

Segmentation Methods

1. Demographic Segmentation (the simplest)

Attributes:

Age, gender, income
Education, occupation
Marital status, family size

SQL Example:

SELECT
  CASE
    WHEN age BETWEEN 18 AND 24 THEN 'Gen Z'
    WHEN age BETWEEN 25 AND 40 THEN 'Millennial'
    WHEN age BETWEEN 41 AND 56 THEN 'Gen X'
    WHEN age > 56 THEN 'Boomer'
  END AS age_segment,
  gender,
  COUNT(*) AS customer_count,
  AVG(lifetime_value) AS avg_ltv
FROM customers
GROUP BY age_segment, gender
ORDER BY avg_ltv DESC;

Use Cases:

Fashion e-commerce: Segment by gender, age → Personalize homepage
Banking: Segment by income → Product recommendations (savings vs investment)

Limitations: Demographics predict behavior poorly (two people of the same age and gender can behave completely differently).

2. Geographic Segmentation

Attributes:

Country, city, state
Timezone
Urban vs Rural

SQL Example:

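SELECT
  c.country,
  c.city,
  -- The join yields one row per order, so count customers with DISTINCT
  COUNT(DISTINCT c.customer_id) AS customer_count,
  -- Aggregate order-level values to avoid double-counting customer totals
  SUM(o.order_value) AS total_revenue,
  AVG(o.order_value) AS avg_aov
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
GROUP BY c.country, c.city
ORDER BY total_revenue DESC
LIMIT 20;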

Use Cases:

E-commerce: Shipping zones, localized pricing
Food delivery: Restaurants and delivery times by location
SaaS: Language and timezone for support

3. Firmographic Segmentation (B2B)

Attributes:

Company size (# employees)
Industry
Revenue
Tech stack

SQL Example:

SELECT
  CASE
    WHEN employee_count < 10 THEN 'Micro (1-9)'
    WHEN employee_count BETWEEN 10 AND 50 THEN 'Small (10-50)'
    WHEN employee_count BETWEEN 51 AND 200 THEN 'Medium (51-200)'
    ELSE 'Enterprise (201+)'
  END AS company_size,
  industry,
  COUNT(*) AS company_count,
  AVG(mrr) AS avg_mrr,
  AVG(churn_rate) AS avg_churn
FROM companies
GROUP BY company_size, industry;

Use Cases:

SaaS: Pricing tiers and features by company size
Sales: Prioritize enterprise leads
Onboarding: Tailored flow for SMB vs Enterprise

4. Behavioral Segmentation (the most powerful)

Attributes:

Purchase history
Product usage patterns
Feature adoption
Engagement level

Example: Engagement-based Segmentation:

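WITH user_activity AS (
  SELECT
    u.user_id,
    COUNT(DISTINCT DATE(e.event_timestamp)) AS active_days_last_30,
    COUNT(e.user_id) AS total_events_last_30
  -- Start from users (assumed table with one row per user) so that users
  -- with no recent events are kept and fall into the 'Inactive' branch
  FROM users u
  LEFT JOIN events e
    ON e.user_id = u.user_id
   AND e.event_timestamp >= CURRENT_DATE - INTERVAL '30 days'
  GROUP BY u.user_id
)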

SELECT
  CASE
    WHEN active_days_last_30 >= 20 THEN 'Power User'
    WHEN active_days_last_30 BETWEEN 10 AND 19 THEN 'Regular User'
    WHEN active_days_last_30 BETWEEN 1 AND 9 THEN 'Casual User'
    ELSE 'Inactive'
  END AS engagement_segment,
  COUNT(*) AS user_count,
  AVG(total_events_last_30) AS avg_events
FROM user_activity
GROUP BY engagement_segment;
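The same engagement tiers can be computed in pandas. This is a minimal sketch, assuming an events frame with `user_id` and `event_timestamp` columns and a separate list of all user ids, so that users without recent events are labelled Inactive. The function name and the toy data are illustrative.

```python
import pandas as pd


def engagement_segments(events: pd.DataFrame, all_user_ids, as_of: str) -> pd.Series:
    """Label every user by distinct active days in the 30 days up to `as_of`."""
    as_of_ts = pd.Timestamp(as_of)
    in_window = (events["event_timestamp"] > as_of_ts - pd.Timedelta(days=30)) & (
        events["event_timestamp"] <= as_of_ts
    )
    recent = events[in_window]
    active_days = (
        recent.assign(day=recent["event_timestamp"].dt.normalize())
        .groupby("user_id")["day"]
        .nunique()
    )
    # Users with no events in the window get 0 active days
    active_days = active_days.reindex(all_user_ids, fill_value=0)
    # Bins are right-inclusive: 0 | 1-9 | 10-19 | 20+
    return pd.cut(
        active_days,
        bins=[-1, 0, 9, 19, 31],
        labels=["Inactive", "Casual User", "Regular User", "Power User"],
    )


# Toy data: user 1 is active on 20 days, user 2 on 3 days, user 3 never
events = pd.DataFrame({
    "user_id": [1] * 20 + [2] * 3,
    "event_timestamp": list(pd.date_range("2025-01-01", periods=20, freq="D"))
    + list(pd.date_range("2025-01-10", periods=3, freq="D")),
})
segments = engagement_segments(events, all_user_ids=[1, 2, 3], as_of="2025-01-25")
print(segments)
```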

Use Cases:

SaaS: Power users → Upsell premium features
E-commerce: Frequent buyers → VIP program
Mobile app: Casual users → Re-engagement campaigns

RFM Analysis (Recency, Frequency, Monetary)

Introduction to RFM

RFM segments customers on three metrics:

Recency (R): When was the most recent purchase? (days since last purchase)
Frequency (F): How many purchases were made? (total orders)
Monetary (M): How much was spent in total? (total revenue)

Why is RFM effective?

Simple, interpretable
Needs only transaction data (no ML required)
Proven effectiveness (decades of use in direct marketing)

SQL to Compute RFM

WITH rfm_calc AS (
  SELECT
    customer_id,
    -- Recency: days since the last purchase
    -- (PostgreSQL syntax; in MySQL use DATEDIFF(CURRENT_DATE, MAX(order_date)))
    (CURRENT_DATE - MAX(order_date)::date) AS recency,
    -- Frequency: total number of orders
    COUNT(DISTINCT order_id) AS frequency,
    -- Monetary: total revenue
    SUM(order_value) AS monetary
  FROM orders
  WHERE order_date >= CURRENT_DATE - INTERVAL '365 days'  -- Last 1 year
  GROUP BY customer_id
),

rfm_scores AS (
  SELECT
    customer_id,
    recency,
    frequency,
    monetary,
    -- RFM Scores (1-5, 5 = best)
    -- Recency: the more recent the purchase, the higher the score
    NTILE(5) OVER (ORDER BY recency DESC) AS r_score,
    -- Frequency: the more orders, the higher the score
    NTILE(5) OVER (ORDER BY frequency ASC) AS f_score,
    -- Monetary: the more spent, the higher the score
    NTILE(5) OVER (ORDER BY monetary ASC) AS m_score
  FROM rfm_calc
)

SELECT
  customer_id,
  recency,
  frequency,
  monetary,
  r_score,
  f_score,
  m_score,
  CONCAT(r_score, f_score, m_score) AS rfm_score  -- e.g., '555' = best customers
FROM rfm_scores;
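For teams that prefer to score in Python, the quintile logic can be sketched with pandas. This assumes an orders frame with `customer_id`, `order_id`, `order_date`, and `order_value` columns. Ranking with `method="first"` before `pd.qcut` is a choice made here so that ties (for example many customers with exactly one order) still split into five equal-sized bins, much like NTILE. The function name and toy data are illustrative.

```python
import pandas as pd


def rfm_scores(orders: pd.DataFrame, as_of: str) -> pd.DataFrame:
    """Compute recency/frequency/monetary and 1-5 quintile scores per customer."""
    as_of_ts = pd.Timestamp(as_of)
    grouped = orders.groupby("customer_id")
    rfm = pd.DataFrame({
        "recency": (as_of_ts - grouped["order_date"].max()).dt.days,
        "frequency": grouped["order_id"].nunique(),
        "monetary": grouped["order_value"].sum(),
    })

    def quintile(series: pd.Series, ascending: bool) -> pd.Series:
        # Ranking first breaks ties, so qcut always sees distinct bin edges
        ranks = series.rank(method="first", ascending=ascending)
        return pd.qcut(ranks, 5, labels=[1, 2, 3, 4, 5]).astype(int)

    rfm["r_score"] = quintile(rfm["recency"], ascending=False)  # recent = high score
    rfm["f_score"] = quintile(rfm["frequency"], ascending=True)
    rfm["m_score"] = quintile(rfm["monetary"], ascending=True)
    rfm["rfm_score"] = (
        rfm["r_score"].astype(str) + rfm["f_score"].astype(str) + rfm["m_score"].astype(str)
    )
    return rfm


# Toy data: customer N has N orders of value 10*N; higher N also bought more recently
rows = []
for cid in range(1, 11):
    last_order = pd.Timestamp("2025-06-30") - pd.Timedelta(days=(11 - cid) * 10)
    for n in range(cid):
        rows.append({
            "customer_id": cid,
            "order_id": f"{cid}-{n}",
            "order_date": last_order - pd.Timedelta(days=n),
            "order_value": 10.0 * cid,
        })
orders = pd.DataFrame(rows)

scores = rfm_scores(orders, as_of="2025-06-30")
print(scores[["recency", "frequency", "monetary", "rfm_score"]])
```

The function needs at least five customers, since each score splits the base into five bins.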

RFM Segments

11 common segments (10 named rules plus a catch-all 'Others'):

-- Append this CTE to the WITH clause of the previous query (after rfm_scores)
WITH rfm_segments AS (
  SELECT
    *,
    -- Calculate RFM segment based on scores.
    -- CASE returns the first matching branch, so stricter rules come first.
    CASE
      WHEN r_score >= 4 AND f_score >= 4 AND m_score >= 4 THEN 'Champions'
      WHEN r_score <= 2 AND f_score >= 4 AND m_score >= 4 THEN 'Cant Lose Them'
      WHEN r_score <= 2 AND f_score >= 3 AND m_score >= 3 THEN 'At Risk'
      WHEN r_score >= 3 AND f_score >= 3 AND m_score >= 3 THEN 'Loyal Customers'
      WHEN r_score >= 4 AND f_score <= 2 AND m_score <= 2 THEN 'New Customers'
      WHEN r_score >= 3 AND f_score <= 2 AND m_score <= 2 THEN 'Promising'
      WHEN r_score >= 3 AND f_score >= 3 AND m_score <= 2 THEN 'Need Attention'
      WHEN r_score <= 1 AND f_score >= 2 AND m_score >= 2 THEN 'Hibernating'
      -- Assumed threshold: low-value customers who are slipping away (R = 2)
      WHEN r_score = 2 AND f_score <= 2 AND m_score <= 2 THEN 'About to Sleep'
      WHEN r_score <= 2 AND f_score <= 2 AND m_score <= 2 THEN 'Lost'
      ELSE 'Others'
    END AS rfm_segment
  FROM rfm_scores
)

SELECT
  rfm_segment,
  COUNT(*) AS customer_count,
  AVG(recency) AS avg_recency,
  AVG(frequency) AS avg_frequency,
  AVG(monetary) AS avg_monetary,
  SUM(monetary) AS total_revenue
FROM rfm_segments
GROUP BY rfm_segment
ORDER BY total_revenue DESC;

Example results:

Segment	Count	Avg Recency	Avg Frequency	Avg Monetary	Total Revenue
Champions	500	15 days	25 orders	$5,000	$2.5M
Loyal Customers	1,200	45 days	15 orders	$2,500	$3M
At Risk	800	150 days	18 orders	$3,000	$2.4M
New Customers	2,000	10 days	1 order	$100	$200K
Lost	3,000	300 days	8 orders	$1,200	$3.6M
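Because a rule list like the CASE expression above assigns the first matching label, rule order matters: a broad rule placed before a stricter one can make the stricter label unreachable. One way to sanity-check a rule list is to enumerate all 125 score combinations and see which labels can actually be assigned. The sketch below uses a subset of the rules; the function names are illustrative.

```python
from itertools import product

# (label, predicate) pairs, checked top to bottom; the first match wins
RULES = [
    ("Champions",       lambda r, f, m: r >= 4 and f >= 4 and m >= 4),
    ("Cant Lose Them",  lambda r, f, m: r <= 2 and f >= 4 and m >= 4),
    ("At Risk",         lambda r, f, m: r <= 2 and f >= 3 and m >= 3),
    ("Loyal Customers", lambda r, f, m: r >= 3 and f >= 3 and m >= 3),
    ("New Customers",   lambda r, f, m: r >= 4 and f <= 2 and m <= 2),
    ("Lost",            lambda r, f, m: r <= 2 and f <= 2 and m <= 2),
]


def rfm_segment(r, f, m, rules=RULES):
    """Return the first label whose rule matches, or 'Others'."""
    for label, matches in rules:
        if matches(r, f, m):
            return label
    return "Others"


def unreachable_labels(rules):
    """Labels that no (r, f, m) combination in 1..5 can ever receive."""
    assigned = {rfm_segment(r, f, m, rules) for r, f, m in product(range(1, 6), repeat=3)}
    return [label for label, _ in rules if label not in assigned]


print(unreachable_labels(RULES))  # stricter rules first

# Put the broader 'At Risk' rule ahead of 'Cant Lose Them'
swapped = [RULES[0], RULES[2], RULES[1]] + RULES[3:]
print(unreachable_labels(swapped))
```

With 'At Risk' listed first, every customer who qualifies for 'Cant Lose Them' is already captured, so the second label is never assigned.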

Actions for Each RFM Segment

Champions (R=4-5, F=4-5, M=4-5):

✅ Action: Reward loyalty, early access, VIP treatment
📧 Email: "Thank you! Exclusive sneak peek at our new collection"
🎁 Offer: Free shipping forever, beta access

Loyal Customers (R=3-5, F=3-5, M=3-5):

✅ Action: Upsell, cross-sell
📧 Email: "You might also like..." recommendations
🎁 Offer: Bundle deals, loyalty points

At Risk (R=1-2, F=3-5, M=3-5):

⚠️ Action: Re-engage ASAP! (high-value customers leaving)
📧 Email: "We miss you! What can we do better?" + personalized offers
🎁 Offer: Win-back discount (15-25%), survey incentive

New Customers (R=4-5, F=1-2, M=1-2):

🎯 Action: Onboarding, second purchase
📧 Email: "Welcome! Here's how to get the most from [product]"
🎁 Offer: Second purchase discount (10%)

Lost (R=1, F=1-2, M=1-2):

🗑️ Action: Aggressive win-back or let go
📧 Email: "Last chance! 30% off to come back"
🎁 Offer: Deep discount (30-50%) or remove from list

K-Means Clustering (Machine Learning Approach)

When to Use K-Means?

RFM works very well for e-commerce, but K-Means is more powerful when:

You have many features (>3): Demographics + Behavioral + Firmographic
Patterns are complex and not obvious
You want to discover hidden segments

Python Implementation

import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler
import matplotlib.pyplot as plt

# Load customer data
# `conn` is assumed to be an open database connection (e.g. a SQLAlchemy engine)
df = pd.read_sql("""
  SELECT
    customer_id,
    recency,
    frequency,
    monetary,
    avg_order_value,
    total_sessions,
    avg_session_duration,
    products_viewed,
    cart_abandonment_rate
  FROM customer_features
""", conn)

# Features to cluster on
features = ['recency', 'frequency', 'monetary', 'avg_order_value',
            'total_sessions', 'avg_session_duration', 'products_viewed',
            'cart_abandonment_rate']

X = df[features]

# Standardize features (important for K-Means!)
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Elbow method to find optimal K
inertias = []
K_range = range(2, 11)
for k in K_range:
    kmeans = KMeans(n_clusters=k, random_state=42, n_init=10)
    kmeans.fit(X_scaled)
    inertias.append(kmeans.inertia_)

# Plot elbow curve
plt.figure(figsize=(10, 6))
plt.plot(K_range, inertias, 'bo-')
plt.xlabel('Number of Clusters (K)')
plt.ylabel('Inertia')
plt.title('Elbow Method for Optimal K')
plt.savefig('elbow_curve.png')

# Based on elbow, choose K=5
optimal_k = 5
kmeans = KMeans(n_clusters=optimal_k, random_state=42, n_init=10)
df['cluster'] = kmeans.fit_predict(X_scaled)

# Analyze clusters
cluster_summary = df.groupby('cluster')[features].mean()
print(cluster_summary)

# Cluster sizes
print("\nCluster sizes:")
print(df['cluster'].value_counts().sort_index())

Example output (illustrative; only the first five feature columns are shown):

Cluster Summary:
         recency  frequency  monetary  avg_order_value  total_sessions
cluster
0           15          25      5000              200             150
1           45          15      2500              170             100
2          150          18      3000              165              80
3           10           1       100              100              20
4          300           8      1200              150              30

Cluster sizes:
0     500  # High-value, engaged
1    1200  # Regular customers
2     800  # At-risk
3    2000  # New/low-value
4    3000  # Churned
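The elbow is often ambiguous to read by eye, so a second criterion can help. The sketch below uses the silhouette score from scikit-learn, which is a technique added here and not part of the pipeline above. It runs on synthetic blobs standing in for `X_scaled`, so the chosen K says nothing about real customer data.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the customer feature matrix: 4 well-separated groups
X, _ = make_blobs(
    n_samples=600,
    centers=[[0, 0], [10, 0], [0, 10], [10, 10]],
    cluster_std=0.6,
    random_state=42,
)
X_scaled = StandardScaler().fit_transform(X)

# Silhouette ranges from -1 to 1; higher means tighter, better-separated clusters
silhouettes = {}
for k in range(2, 9):
    labels = KMeans(n_clusters=k, random_state=42, n_init=10).fit_predict(X_scaled)
    silhouettes[k] = silhouette_score(X_scaled, labels)

best_k = max(silhouettes, key=silhouettes.get)
for k, score in silhouettes.items():
    print(f"K={k}: silhouette={score:.3f}")
print(f"Best K by silhouette: {best_k}")
```

On real data the two criteria may disagree; in that case the K whose clusters are easiest to act on is usually the better pick.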

Cluster Profiling & Naming

# Profile each cluster
for cluster_id in range(optimal_k):
    cluster_data = df[df['cluster'] == cluster_id]

    print(f"\n{'='*60}")
    print(f"Cluster {cluster_id}")
    print(f"{'='*60}")
    print(f"Size: {len(cluster_data):,} customers ({len(cluster_data)/len(df)*100:.1f}%)")
    print(f"Avg Recency: {cluster_data['recency'].mean():.0f} days")
    print(f"Avg Frequency: {cluster_data['frequency'].mean():.1f} orders")
    print(f"Avg Monetary: ${cluster_data['monetary'].mean():,.0f}")
    print(f"Total Revenue: ${cluster_data['monetary'].sum():,.0f}")

    # Assign meaningful names based on characteristics
    if cluster_data['recency'].mean() < 30 and cluster_data['frequency'].mean() > 20:
        cluster_name = "Champions"
    elif cluster_data['recency'].mean() < 60 and cluster_data['frequency'].mean() > 10:
        cluster_name = "Loyal Customers"
    elif cluster_data['recency'].mean() > 120 and cluster_data['frequency'].mean() > 10:
        cluster_name = "At Risk"
    elif cluster_data['recency'].mean() < 30 and cluster_data['frequency'].mean() <= 2:
        cluster_name = "New Customers"
    else:
        cluster_name = "Lost/Churned"

    print(f"Suggested Name: {cluster_name}")

Visualize Clusters

from sklearn.decomposition import PCA

# Reduce to 2D for visualization
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_scaled)

# Plot
plt.figure(figsize=(12, 8))
scatter = plt.scatter(X_pca[:, 0], X_pca[:, 1], c=df['cluster'],
                     cmap='viridis', alpha=0.6, s=50)
plt.xlabel(f'PC1 ({pca.explained_variance_ratio_[0]:.1%} variance)')
plt.ylabel(f'PC2 ({pca.explained_variance_ratio_[1]:.1%} variance)')
plt.title('Customer Segments (K-Means Clustering)')
plt.colorbar(scatter, label='Cluster')
plt.savefig('clusters_visualization.png', dpi=300)

Case Studies

Case Study 1: Shopee Vietnam - RFM Segmentation

Context:

E-commerce platform with 10M+ users
Before: Blast promotions to all users
Problem: Low engagement (2% email open rate), high unsubscribe rate

Implementation:

Step 1: RFM Calculation

-- Compute RFM for 10M users
WITH rfm AS (
  SELECT
    user_id,
    -- PostgreSQL syntax; in MySQL use DATEDIFF(CURRENT_DATE, MAX(order_date))
    (CURRENT_DATE - MAX(order_date)::date) AS recency,
    COUNT(DISTINCT order_id) AS frequency,
    SUM(order_value) AS monetary
  FROM orders
  WHERE order_date >= '2024-01-01'
  GROUP BY user_id
)
-- ... (scoring as shown above)

Step 2: Segment Actions

Segment	Size	Action	Channel	Offer
Champions	500K (5%)	Reward	Email, App push	Free shipping voucher, early access
Loyal	1.5M (15%)	Upsell	Email	Bundle deals, 10% off
At Risk	800K (8%)	Win-back	Email, SMS, Ads	20% discount, personalized recs
New	3M (30%)	Onboarding	App, Email	Second purchase 15% off
Lost	4.2M (42%)	Aggressive win-back or suppress	SMS (one-time)	40% off or remove

Step 3: Personalized Campaigns

Champions Email:

Subject: [VIP] You get 24-hour early access! 🌟
Body: Thank you for being a loyal customer!
The new products go on sale tomorrow, but you can buy them RIGHT NOW.
[CTA: Buy Now]

At Risk Email:

Subject: We miss you! 💔 A special 20% voucher
Body: It has been 3 months since your last purchase.
Is there anything we could have done better?
Here is a 20% voucher to give us another try.
[CTA: Shop Now] [CTA: Give Feedback]

Results (after 3 months):

Metric	Before	After	Change
Email Open Rate	2%	18%	+800%
Click Rate	0.5%	4.2%	+740%
Conversion Rate	1.2%	6.8%	+467%
Unsubscribe Rate	0.8%	0.2%	-75%
Revenue/Email	$0.15	$1.20	+700%

Impact: Revenue increased by $12M/year from email marketing alone.

Case Study 2: SaaS Startup - Behavioral Clustering

Context:

B2B SaaS (project management tool)
50K users; the goal is to raise the Free → Paid upgrade rate

Analysis: K-Means clustering with these features:

features = [
    'projects_created',
    'tasks_created',
    'team_members_invited',
    'integrations_used',
    'api_calls',
    'support_tickets',
    'days_active_last_30'
]

Discovered 4 Segments:

Cluster 0: "Power Users" (5%, 2.5K users)

Many projects (avg 15)
Large teams (avg 20 members)
Uses integrations (avg 5)
Already on Paid plan → Action: Upsell Enterprise

Cluster 1: "Growing Teams" (15%, 7.5K users)

Medium projects (avg 5)
Medium team (avg 8)
Few integrations (avg 1)
Free plan, hitting limits → Action: Upgrade campaign

Cluster 2: "Solo Users" (60%, 30K users)

Few projects (avg 2)
No team members invited
No integrations used
Low engagement → Action: Educational content, onboarding

Cluster 3: "Churned" (20%, 10K users)

Created an account but never used it
Days active = 0
Never activated → Action: Re-onboarding campaign

Targeted Actions:

Growing Teams (high-potential):

Email: "Your team is growing! Upgrade to unlock unlimited projects"
In-app: "You're using 4/5 projects. Upgrade now to avoid limits"
Offer: 20% off first year
Result: 25% upgrade rate (vs 2% baseline)

Solo Users (education):

Drip email series: "How to collaborate with your team"
Webinar: "Project management best practices"
Result: 12% invited first team member → 40% of those upgraded

Impact:

MRR grew from $200K to $350K (+75%) in 6 months
Efforts focused on "Growing Teams" (highest ROI)

Case Study 3: Netflix - Taste Clusters

Context (based on public info):

Netflix has 2,000+ "taste clusters" (micro-segments)
Each user belongs to multiple clusters

Example Clusters:

"Romantic Comedy Enthusiasts"
"Dark European Crime Dramas"
"Family-Friendly Animated Films"
"Sci-Fi Action Blockbusters"

How it works:

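# Simplified example (the real system is much more complex).
# Genre weights and the other users' histories are made-up toy values.
import numpy as np

# Each show has a genre vector: [scifi, drama, horror, crime, thriller]
show_vectors = {
    "Stranger Things": np.array([0.8, 0.6, 0.4, 0.0, 0.0]),
    "Breaking Bad":    np.array([0.0, 0.9, 0.0, 0.7, 0.3]),
    "Narcos":          np.array([0.0, 0.7, 0.0, 0.9, 0.4]),
    "Ozark":           np.array([0.0, 0.8, 0.0, 0.8, 0.5]),
    "Mindhunter":      np.array([0.0, 0.7, 0.2, 0.9, 0.6]),
    "Dark":            np.array([0.9, 0.6, 0.3, 0.1, 0.4]),
}

# Viewing histories: the target user plus two other users
user_watched = ["Stranger Things", "Breaking Bad", "Narcos", "Ozark"]
other_users = {
    "user_a": ["Breaking Bad", "Narcos", "Mindhunter"],
    "user_b": ["Stranger Things", "Dark"],
}

def taste_profile(watched):
    """Aggregate to a taste profile: the mean genre vector of the watched shows."""
    return np.mean([show_vectors[title] for title in watched], axis=0)

def cosine_similarity(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Find similar users (k nearest neighbours by cosine similarity)
user_profile = taste_profile(user_watched)
k = 1
similar_users = sorted(
    other_users,
    key=lambda u: cosine_similarity(user_profile, taste_profile(other_users[u])),
    reverse=True,
)[:k]

# Recommend what similar users watched that this user has not
recommendations = sorted(
    {title for u in similar_users for title in other_users[u]} - set(user_watched)
)
print(recommendations)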

Impact:

About 75% of watched content reportedly comes from recommendations
Segmentation helps increase retention and reduce churn
Netflix has estimated that personalization and recommendations save it about $1B/year, mainly through reduced churn

Advanced Techniques

1. Hierarchical Clustering

When to use: You want nested segments (e.g., "Loyal Customers" → "High Spenders" + "Frequent Buyers").

from scipy.cluster.hierarchy import dendrogram, linkage
from sklearn.cluster import AgglomerativeClustering

# Hierarchical clustering
linkage_matrix = linkage(X_scaled, method='ward')

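# Plot dendrogram (truncated to the last 30 merges so it stays readable
# when there are thousands of customers)
plt.figure(figsize=(15, 8))
dendrogram(linkage_matrix, truncate_mode='lastp', p=30)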
plt.title('Hierarchical Clustering Dendrogram')
plt.xlabel('Customer')
plt.ylabel('Distance')
plt.savefig('dendrogram.png')

# Cut at specific height or number of clusters
clustering = AgglomerativeClustering(n_clusters=5, linkage='ward')
df['h_cluster'] = clustering.fit_predict(X_scaled)
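To get the nested structure described above, the same linkage matrix can be cut at two levels with SciPy's `fcluster`. The sketch below runs on synthetic points standing in for `X_scaled` (four tight groups arranged as two pairs); the group layout and variable names are assumptions for illustration.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage

rng = np.random.default_rng(0)

# Synthetic stand-in for scaled customer features: 4 groups forming 2 pairs
centers = np.array([[0.0, 0.0], [0.0, 3.0], [20.0, 0.0], [20.0, 3.0]])
X = np.vstack([c + rng.normal(scale=0.3, size=(50, 2)) for c in centers])

Z = linkage(X, method="ward")

# Cut the same tree twice: 2 top-level segments and 4 sub-segments
coarse = fcluster(Z, t=2, criterion="maxclust")
fine = fcluster(Z, t=4, criterion="maxclust")

# Map each sub-segment to the top-level segment(s) it falls into
nesting = {
    int(f): sorted({int(c) for c in coarse[fine == f]})
    for f in np.unique(fine)
}
print(nesting)
```

Because both cuts come from one tree, every sub-segment sits inside exactly one top-level segment, which is what makes a "Loyal Customers → High Spenders + Frequent Buyers" style breakdown possible.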

2. DBSCAN (Density-Based)

Advantage: The number of clusters is not fixed in advance (it follows from eps and min_samples), and noise/outliers are handled well.

from sklearn.cluster import DBSCAN

# DBSCAN clustering
dbscan = DBSCAN(eps=0.5, min_samples=10)
df['db_cluster'] = dbscan.fit_predict(X_scaled)

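# -1 = noise/outliers
n_noise = (df['db_cluster'] == -1).sum()
print(f"Outliers: {n_noise} ({n_noise / len(df) * 100:.1f}%)")
# Subtract the noise label only when noise points actually exist
n_clusters = df['db_cluster'].nunique() - (1 if n_noise > 0 else 0)
print(f"Clusters found: {n_clusters}")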
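DBSCAN results are sensitive to `eps`, and a fixed value such as 0.5 rarely suits every dataset. A common heuristic is to look at each point's distance to its `min_samples`-th nearest neighbour and pick `eps` near the upper end of that distribution. The sketch below does this on synthetic blobs; the 95th-percentile cutoff is an assumption made for illustration, not a rule.

```python
import numpy as np
from sklearn.cluster import DBSCAN
from sklearn.datasets import make_blobs
from sklearn.neighbors import NearestNeighbors

# Synthetic stand-in for scaled customer features: 3 well-separated groups
X, _ = make_blobs(
    n_samples=500,
    centers=[[0, 0], [8, 8], [0, 8]],
    cluster_std=0.5,
    random_state=0,
)

min_samples = 10

# Distance from each point to its min_samples-th nearest neighbour.
# The query point itself counts as the first neighbour (distance 0),
# which matches how DBSCAN counts min_samples.
nn = NearestNeighbors(n_neighbors=min_samples).fit(X)
distances, _ = nn.kneighbors(X)
k_dist = np.sort(distances[:, -1])

# Heuristic (assumption): use a high percentile of the k-distances as eps
eps = float(np.percentile(k_dist, 95))

labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(X)
n_clusters = len(set(labels)) - (1 if -1 in labels else 0)
noise_share = float(np.mean(labels == -1))

print(f"eps={eps:.3f}, clusters={n_clusters}, noise={noise_share:.1%}")
```

Plotting `k_dist` and looking for the "knee" is the usual manual version of the same idea.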

3. Cohort + Segment Analysis

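Combine cohort analysis with segmentation:

WITH user_cohorts AS (
  SELECT
    user_id,
    DATE_TRUNC('month', created_at) AS cohort_month
  FROM users
),

user_segments AS (
  -- RFM or cluster segments (assumes rfm_segments is stored as a table
  -- and that customer_id and user_id refer to the same id)
  SELECT customer_id AS user_id, rfm_segment AS segment
  FROM rfm_segments
),

month_3_active AS (
  -- Users with at least one event in the third month after their cohort month
  SELECT DISTINCT uc.user_id
  FROM user_cohorts uc
  JOIN events e ON e.user_id = uc.user_id
  WHERE e.event_timestamp >= uc.cohort_month + INTERVAL '3 months'
    AND e.event_timestamp <  uc.cohort_month + INTERVAL '4 months'
)

SELECT
  uc.cohort_month,
  us.segment,
  COUNT(DISTINCT uc.user_id) AS users,
  AVG(CASE WHEN m3.user_id IS NOT NULL THEN 1.0 ELSE 0.0 END) AS avg_retention
FROM user_cohorts uc
JOIN user_segments us ON uc.user_id = us.user_id
LEFT JOIN month_3_active m3 ON m3.user_id = uc.user_id
GROUP BY uc.cohort_month, us.segment
ORDER BY uc.cohort_month, avg_retention DESC;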

Insights: For example, the "Champions" segment in the Jan 2025 cohort has 85% retention, while "New Customers" has only 40%.

Activation & Personalization

1. Email Personalization

# Segment-specific email templates ({name} is filled in per customer)
email_templates = {
    'Champions': {
        'subject': '[VIP] Exclusive Early Access for Our Best Customers',
        'body': 'Hi {name}, as one of our top customers, you deserve the best...',
        'offer': 'free_shipping_lifetime',
    },
    'At Risk': {
        'subject': 'We Miss You! 25% Off to Welcome You Back',
        'body': 'Hi {name}, it\'s been a while since your last purchase...',
        'offer': 'discount_25',
    },
    'New Customers': {
        'subject': 'Welcome! Here\'s 10% Off Your Second Order',
        'body': 'Hi {name}, thanks for your first purchase! Here\'s a gift...',
        'offer': 'discount_10_second_purchase',
    },
}

# Send emails
# `segments` (segment name -> list of customers) and `send_email` are assumed
# to come from your own CRM / email integration.
for segment, customers in segments.items():
    template = email_templates.get(segment)
    if template is None:
        continue  # no campaign defined for this segment
    for customer in customers:
        send_email(
            to=customer.email,
            subject=template['subject'],
            body=template['body'].format(name=customer.name),
            offer=template['offer']
        )

2. Product Recommendations

# Segment-based recommendations
if user.segment == 'Champions':
    # Show premium/new products
    recommendations = get_new_arrivals() + get_premium_products()
elif user.segment == 'At Risk':
    # Show products they previously viewed
    recommendations = get_previously_viewed(user) + get_similar_products(user)
elif user.segment == 'New Customers':
    # Show bestsellers
    recommendations = get_bestsellers()
else:
    # Fallback for every other segment
    recommendations = get_bestsellers()

3. Dynamic Pricing (Controversial)

Ethical approach: Discounts for at-risk, not price hikes for loyal.

# Segment-based offers
if user.segment == 'At Risk':
    offer = '25% off'  # Win-back
elif user.segment == 'New Customers':
    offer = '10% off second purchase'  # Encourage repeat
elif user.segment == 'Champions':
    offer = 'Free express shipping'  # Non-price benefit
else:
    offer = None  # Regular pricing

Best Practices

1. Start Simple, Iterate

Phase 1: Demographic/Geographic (Week 1)
Phase 2: RFM Analysis (Week 2-3)
Phase 3: Behavioral segments (Month 2)
Phase 4: ML Clustering (Month 3+)

2. Actionable Segments

Bad Segment: "Users who like both cats and dogs"

Cute, but not actionable (what would you do with this insight?)

Good Segment: "At-Risk High-Value Customers"

Actionable: Win-back campaign with personalized offers

3. Segment Size Balance

Too many segments: 50 segments → Not enough resources to personalize for each
Too few segments: 2 segments → Nuances get lost

Rule of thumb: 5-10 segments for most businesses.

4. Regular Re-Segmentation

Customer behavior changes, so recompute segments on a fixed schedule: at least quarterly, and monthly for fast-moving signals such as RFM scores.

-- Monthly job: Recalculate RFM scores
-- Move customers between segments
-- Update in CRM/Email marketing tool
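Once segments are recalculated on a schedule, the movement between two snapshots can be summarised as a transition matrix. This is a minimal pandas sketch, assuming each snapshot is a Series of segment labels indexed by customer id; the function name and the toy snapshots are illustrative.

```python
import pandas as pd


def segment_movement(previous: pd.Series, current: pd.Series) -> pd.DataFrame:
    """Share of each previous segment that ended up in each current segment.

    Only customers present in both snapshots are compared.
    """
    both = pd.concat(
        [previous.rename("previous"), current.rename("current")],
        axis=1,
        join="inner",
    )
    # normalize="index" makes every row sum to 1
    return pd.crosstab(both["previous"], both["current"], normalize="index")


# Toy snapshots keyed by customer id
prev = pd.Series({1: "Champions", 2: "Champions", 3: "At Risk", 4: "At Risk", 5: "New Customers"})
curr = pd.Series({1: "Champions", 2: "At Risk", 3: "Lost", 4: "Champions", 5: "Loyal Customers"})

movement = segment_movement(prev, curr)
print(movement.round(2))
```

Reading along a row shows where a segment's customers went, for example what share of last period's "At Risk" customers were won back versus lost.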

5. Privacy & Ethics

✅ Disclose segmentation in your Privacy Policy
✅ Allow opt-out from personalized marketing
❌ Do not discriminate (e.g., higher prices for certain demographics)
❌ Do not use sensitive data (health, religion, ...)

Tools & Stack

1. Analysis & Segmentation

SQL + Python:

BigQuery/Snowflake: Compute RFM, segments
Python (Pandas, Scikit-learn): Clustering
Jupyter Notebooks: Exploratory analysis

BI Tools:

Looker, Tableau, Metabase: Visualize segments
Dashboards: Monitor segment sizes, metrics over time

2. Activation

Email/Marketing Automation:

Braze ($$$): Sophisticated segmentation, multi-channel
Customer.io ($$): Behavioral triggers, segment-based campaigns
Mailchimp ($): Basic segmentation

CDP (Customer Data Platform):

Segment ($$): Collect data, sync segments to tools
RudderStack ($): Open-source alternative
mParticle ($$$): Enterprise CDP

3. Recommendations

Build in-house:

Build with collaborative filtering (Python)
Use dbt for segment-based rec logic

Tools:

Algolia Recommend ($$)
Dynamic Yield ($$$)
Amazon Personalize (Pay-as-you-go)

Conclusion

Customer segmentation is the foundation of personalization and targeted marketing. Instead of treating all customers the same, it lets you tailor messaging, offers, and product decisions to each group.

Key Takeaways

Start with RFM (simplest, most effective for e-commerce)
Layer in behavioral data once you have the resources
Use ML clustering for advanced personalization (once you have enough data and a team)
Make segments actionable → Every segment needs a clear action
Re-segment regularly → Customers change, so segments must be updated
Measure impact → Track metrics by segment (LTV, retention, conversion)

Metrics to Track

Metric	Definition	Goal
Segment Conversion Rate	Conversion rate per segment	Increase over time
Segment LTV	Lifetime value per segment	Grow high-value segments
Segment Movement	% customers moving up/down	More moving up
Campaign ROI by Segment	ROI of campaigns per segment	Focus on high-ROI

Next Steps

After mastering customer segmentation:

Customer Churn Prediction: Predict churn and LTV per segment
Cohort Analysis: Analyze behavior by cohort
Attribution Modeling: Multi-touch marketing attribution

Carptech - Customer Segmentation Solutions for Vietnamese Businesses

At Carptech, we help Vietnamese businesses build their segmentation strategy:

Our services

Segmentation Analysis: RFM, behavioral, ML clustering
Data Pipeline: Automated segment calculation, daily updates
Activation Support: Integrate with email tools, CRM, ad platforms
Dashboards: Real-time segment monitoring (Looker, Metabase)

Case Studies

E-commerce: RFM segmentation, personalized campaigns → Revenue +40%
SaaS: Behavioral clustering → 3x upgrade rate
Fintech: Segment-based product recommendations → Cross-sell +60%

Contact: https://carptech.vn

Written by the Carptech Team, specialists in Data Platform & Analytics in Vietnam.
