Data Platform · Updated: May 20, 2025 · 20 min read

Recommendation Systems: Netflix-style Personalization for Vietnamese Enterprises

A guide to building a recommendation engine, from collaborative filtering and content-based methods to hybrid approaches. Includes code examples with BigQuery ML and Python, plus an e-commerce case study with an 18% conversion-rate lift from personalized recommendations.

Đặng Quỳnh Hương

Senior Data Scientist

Diagram of a recommendation system showing a user-item matrix, collaborative filtering, and personalized recommendations
#Recommendation Systems #Personalization #Collaborative Filtering #Machine Learning #BigQuery ML #E-commerce

When you open Netflix, 80% of what you watch comes from recommendations. When you shop on Amazon, 35% of revenue comes from "Customers who bought this also bought...". Personalization is not just a nice-to-have - it is a competitive advantage in the age of information overload.

However, according to a Carptech survey of 50+ e-commerce and content platforms in Vietnam, only 18% have deployed recommendation systems; the other 82% still show the same products to every user or rely solely on "trending/popular" items. The result: billions of VND in missed opportunity every year.

This article demystifies recommendation systems - from algorithms (Collaborative Filtering, Content-Based, Hybrid) and implementation options (DIY vs BigQuery ML vs managed services) to evaluation metrics and production deployment. It includes a real case study of a Vietnamese e-commerce company that increased conversion rate by 18% and cart value by 12% with personalized recommendations.

TL;DR - Key Takeaways

  • Recommendation types: Collaborative Filtering (user behavior), Content-Based (item attributes), Hybrid (combine both)
  • Cold start problem: New users/items need special handling (popularity-based, content features)
  • Algorithms: Matrix Factorization, ALS (Alternating Least Squares), Neural Collaborative Filtering
  • Implementation: BigQuery ML (easiest), Python libraries (Surprise, LightFM), or managed services (AWS Personalize)
  • Evaluation: Precision@K, Recall@K, NDCG - plus A/B testing in production
  • ROI: Typically a 10-20% increase in conversion rate and a 12-18% increase in average order value

Why Recommendations Matter: The Business Case

Netflix: 80% of Watch Time from Recommendations

Netflix estimates that their recommendation system saves $1 billion per year in customer retention:

  • Without recommendations: Users overwhelmed by 100K+ titles → Frustrated → Churn
  • With recommendations: Personalized suggestions → Users find content they love → Stay subscribed

Key insight: Recommendations don't just increase engagement - they also reduce churn (customers feel understood and valued).

Amazon: 35% of Revenue from Recommendations

"Customers who bought this also bought..." là một trong những successful product features của Amazon:

  • Increase cart size: Users discover complementary products
  • Increase discovery: Long-tail products get visibility (not just bestsellers)
  • Increase retention: Personalized experience → Loyalty

Vietnamese Market: Untapped Opportunity

Current state (Carptech survey, 80+ companies):

  • 82% show same products to everyone (homepage featured items)
  • 12% use rule-based ("Recently viewed", "Trending")
  • Only 6% use ML-powered recommendations

Opportunity:

  • E-commerce: 15-25% revenue from recommendations (global benchmark)
  • Vietnamese e-commerce ~$15B/year → Potential $2-3B from better recommendations
  • Content platforms (news, video): 30-50% increase in engagement

Types of Recommendation Systems

1. Collaborative Filtering - "People Like You Also Liked"

Core idea: If User A and User B have similar tastes (liked same items in past), recommend items that B liked to A.

Example:

  • Alice likes Movies: Inception, Interstellar, The Matrix
  • Bob likes Movies: Inception, Interstellar, The Prestige
  • Alice and Bob have similar taste (overlap: Inception, Interstellar)
  • Recommendation: Show "The Prestige" to Alice (Bob liked it, Alice likely to like it too)

Two variants:

User-based Collaborative Filtering:

  1. Find users similar to you
  2. Recommend items those similar users liked

Item-based Collaborative Filtering:

  1. Find items similar to items you liked (based on who else liked them)
  2. Recommend those similar items

Example (Item-based):

  • Many users who bought "iPhone 15" also bought "AirPods Pro"
  • User just bought "iPhone 15" → Recommend "AirPods Pro"
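
A minimal sketch of item-based collaborative filtering on a toy purchase matrix (the item order and the cosine-similarity scoring are illustrative assumptions, not a specific library's API):

import numpy as np

# Rows = users, columns = items (1 = purchased)
# Column order (illustrative): iPhone 15, AirPods Pro, MacBook, iPad
R = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 1],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
], dtype=float)

# Item-item cosine similarity: items bought by the same users score high
norms = np.linalg.norm(R, axis=0)
item_sim = (R.T @ R) / np.outer(norms, norms)

# User at index 2 owns iPhone 15 (column 0) and MacBook (column 2):
# recommend the most similar item they don't already own
owned = R[2] > 0
scores = item_sim[0].copy()   # similarity of every item to iPhone 15
scores[owned] = -1            # never re-recommend owned items
print("recommend item index:", int(scores.argmax()))   # -> 1 (AirPods Pro)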

Pros:

  • ✅ No need for item features (just user-item interactions)
  • ✅ Captures complex patterns (user tastes)
  • ✅ Serendipity: Discover unexpected items

Cons:

  • ❌ Cold start: Can't recommend to new users (no history)
  • ❌ Sparsity: Most users only interact with small % of items (sparse matrix)
  • ❌ Popularity bias: Popular items get recommended more

When to use: E-commerce, streaming (movies, music), content platforms with rich interaction data.

2. Content-Based Filtering - "Similar to What You Liked"

Core idea: Recommend items similar to items you liked in the past (based on item attributes/features).

Example:

  • User likes Action movies starring Tom Cruise
  • Recommendation: Other Tom Cruise action movies (Mission Impossible series)

Features used:

  • Movies: Genre, director, actors, year, keywords
  • Products: Category, brand, price range, color, specifications
  • News articles: Topics, keywords, author, publication

Algorithm:

# Simplified example: item_features maps item -> numeric feature vector
# (e.g. one-hot genres + director + rating bucket); liked_items is the set of items the user liked
import numpy as np

def cosine_similarity(a, b):
    return np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# User liked: [Inception, Interstellar, The Matrix]
# User profile: [genre: sci-fi, director: Nolan (2/3), rating: high]
user_profile = np.mean([item_features[i] for i in liked_items], axis=0)

# Score each unseen item by its similarity to the user profile
scores = {
    item: cosine_similarity(user_profile, features)
    for item, features in item_features.items()
    if item not in liked_items
}
# Recommend the items with the highest similarity
recommendations = sorted(scores, key=scores.get, reverse=True)[:10]

Pros:

  • ✅ No cold start for new users (can recommend based on first interaction)
  • ✅ Transparency: Easy to explain ("Because you liked X")
  • ✅ Works with few users (doesn't need collaborative data)

Cons:

  • ❌ Requires rich item features (manual curation or NLP)
  • ❌ Over-specialization: Only recommends similar items (filter bubble)
  • ❌ No serendipity: Won't recommend outside user's known tastes

When to use: Content-rich platforms (news, blogs, podcasts) or new products without interaction history.

3. Hybrid Approaches - Best of Both Worlds

Combine collaborative + content-based to overcome limitations:

Approach 1: Weighted hybrid

final_score = 0.7 * collaborative_score + 0.3 * content_score

Approach 2: Switching

  • New users: Use content-based (no interaction history)
  • Established users: Use collaborative filtering (rich history)

Approach 3: Feature augmentation

  • Use content features as additional features in collaborative filtering model

Pros: Overcomes cold start, reduces over-specialization, best accuracy

Cons: More complex, harder to implement
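
A rough sketch combining the weighted and switching approaches above (the 0.7/0.3 weights mirror the formula above; the 5-interaction threshold is an assumed cutoff, and the two score inputs are placeholders for whatever collaborative and content-based models you already have):

def hybrid_score(collab_score, content_score, n_interactions,
                 w_collab=0.7, w_content=0.3, min_history=5):
    # Switching: brand-new users fall back to content-based only
    if n_interactions < min_history:
        return content_score
    # Weighted hybrid for users with enough history
    return w_collab * collab_score + w_content * content_score

# Example: a user with 12 past interactions
score = hybrid_score(collab_score=4.1, content_score=3.2, n_interactions=12)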

Comparison Table

Type          | Data Needed            | Cold Start? | Serendipity? | Best For
Collaborative | User-item interactions | ❌ Poor     | ✅ High      | E-commerce, Streaming
Content-Based | Item features          | ✅ Good     | ❌ Low       | News, Blogs, New products
Hybrid        | Both                   | ✅ Good     | ✅ Medium    | Large platforms (Amazon, Netflix)

Collaborative Filtering: Deep Dive

Matrix Factorization - The Core Algorithm

User-Item Matrix (ratings/interactions):

      | iPhone | AirPods | MacBook | iPad | Watch
Alice |   5    |    ?    |    4    |  ?   |   3
Bob   |   ?    |    5    |    ?    |  4   |   ?
Carol |   4    |    4    |    ?    |  ?   |   ?
David |   ?    |    ?    |    5    |  5   |   4

Problem: The matrix is sparse (most cells are ?). Goal: Fill in the ? cells (predict the missing ratings).

Matrix Factorization idea:

  • Decompose matrix into two smaller matrices: User Factors × Item Factors
  • User Factors: Each user represented by K latent features (e.g., "how much this user likes tech products", "price sensitivity", etc.)
  • Item Factors: Each item represented by K latent features (e.g., "how premium this product is", "tech-forward", etc.)

Mathematical formulation:

R ≈ U × V^T

R: n_users × n_items (sparse)
U: n_users × k (user latent factors)
V: n_items × k (item latent factors)

Rating(user_i, item_j) ≈ U[i, :] · V[j, :] (dot product)

Training: Minimize squared error on known ratings

minimize: Σ (actual_rating - predicted_rating)^2 + regularization

Algorithm: Alternating Least Squares (ALS)

  1. Initialize U, V randomly
  2. Fix V, optimize U (least squares problem)
  3. Fix U, optimize V
  4. Repeat until convergence
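
A minimal numpy sketch of these ALS steps on the toy matrix above (k=2, 20 iterations, and the regularization value are arbitrary choices for the illustration, not tuned settings):

import numpy as np

# Toy user-item rating matrix from the table above (0 = unknown)
R = np.array([
    [5, 0, 4, 0, 3],   # Alice
    [0, 5, 0, 4, 0],   # Bob
    [4, 4, 0, 0, 0],   # Carol
    [0, 0, 5, 5, 4],   # David
], dtype=float)
known = R > 0                      # mask of observed ratings
n_users, n_items = R.shape
k, reg = 2, 0.1                    # latent dimensions, regularization

rng = np.random.default_rng(42)
U = rng.normal(scale=0.1, size=(n_users, k))
V = rng.normal(scale=0.1, size=(n_items, k))

for _ in range(20):
    # Fix V, solve a small least-squares problem per user
    for u in range(n_users):
        idx = known[u]
        A = V[idx].T @ V[idx] + reg * np.eye(k)
        U[u] = np.linalg.solve(A, V[idx].T @ R[u, idx])
    # Fix U, solve per item
    for i in range(n_items):
        idx = known[:, i]
        A = U[idx].T @ U[idx] + reg * np.eye(k)
        V[i] = np.linalg.solve(A, U[idx].T @ R[idx, i])

pred = U @ V.T                     # fills in the "?" cells
print(np.round(pred, 2))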

Implementation with BigQuery ML (Easiest!)

BigQuery ML has built-in collaborative filtering via Matrix Factorization.

Step 1: Prepare data

-- User-item interactions (ratings, purchases, clicks)
CREATE OR REPLACE TABLE `project.dataset.user_item_interactions` AS
SELECT
  user_id,
  item_id,
  -- Implicit feedback: combine multiple signals
  1 * viewed +
  2 * added_to_cart +
  5 * purchased AS rating  -- Purchased = 5x weight
FROM (
  SELECT
    user_id,
    product_id AS item_id,
    COUNTIF(event = 'product_viewed') AS viewed,
    COUNTIF(event = 'add_to_cart') AS added_to_cart,
    COUNTIF(event = 'purchase') AS purchased
  FROM events
  WHERE event_date >= '2024-01-01'
  GROUP BY user_id, product_id
);

Step 2: Train model

CREATE OR REPLACE MODEL `project.dataset.product_recommendation_model`
OPTIONS(
  model_type='MATRIX_FACTORIZATION',
  user_col='user_id',
  item_col='item_id',
  rating_col='rating',
  feedback_type='implicit',  -- or 'explicit' for actual ratings (1-5 stars)
  num_factors=20,  -- Latent dimensions (more = more expressive, but overfitting risk)
  l2_reg=0.1
) AS
SELECT user_id, item_id, rating
FROM `project.dataset.user_item_interactions`;

Training time: 5-20 minutes for 1M interactions

Step 3: Generate recommendations

-- For a specific user
SELECT *
FROM ML.RECOMMEND(
  MODEL `project.dataset.product_recommendation_model`,
  (SELECT 'user_12345' AS user_id)
)
ORDER BY predicted_rating DESC
LIMIT 10;

Output:

user_id    | item_id     | predicted_rating
user_12345 | product_789 | 4.23
user_12345 | product_456 | 3.87
user_12345 | product_234 | 3.65

Step 4: Batch recommendations for all users

-- Generate top 10 recommendations for each user
CREATE OR REPLACE TABLE `project.dataset.recommendations` AS
SELECT user_id, ARRAY_AGG(item_id ORDER BY predicted_rating DESC LIMIT 10) AS recommended_items
FROM ML.RECOMMEND(
  MODEL `project.dataset.product_recommendation_model`,
  (SELECT DISTINCT user_id FROM `project.dataset.users` WHERE active = TRUE)
)
GROUP BY user_id;

Pros of BigQuery ML:

  • ✅ No code (pure SQL)
  • ✅ Scales automatically (Google infrastructure)
  • ✅ Fast (distributed training)
  • ✅ Integrated with data warehouse

Cons:

  • ❌ Less flexible (can't customize algorithm deeply)
  • ❌ Limited to matrix factorization (no deep learning options)

Implementation with Python (More Flexible)

Using Surprise library (simple, scikit-learn-like API):

from surprise import SVD, Dataset, Reader
from surprise.model_selection import cross_validate

# Step 1: Prepare data
import pandas as pd

interactions = pd.read_csv('user_item_interactions.csv')
# Columns: user_id, item_id, rating

# Convert to Surprise format
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(interactions[['user_id', 'item_id', 'rating']], reader)

# Step 2: Train model (SVD = Matrix Factorization)
algo = SVD(n_factors=20, n_epochs=20, lr_all=0.005, reg_all=0.02, random_state=42)

# Cross-validation
cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

# Train on full dataset
trainset = data.build_full_trainset()
algo.fit(trainset)

# Step 3: Generate recommendations for a user
def get_recommendations(user_id, n=10):
    # Get all items
    all_items = interactions['item_id'].unique()

    # Get items user already interacted with
    user_items = set(interactions[interactions['user_id'] == user_id]['item_id'])

    # Predict ratings for items user hasn't seen
    predictions = []
    for item_id in all_items:
        if item_id not in user_items:
            pred = algo.predict(user_id, item_id)
            predictions.append((item_id, pred.est))

    # Sort by predicted rating
    predictions.sort(key=lambda x: x[1], reverse=True)

    return predictions[:n]

# Get top 10 recommendations for user
recs = get_recommendations(user_id='user_12345', n=10)
print(recs)

Advanced: Deep Learning with Neural Collaborative Filtering

import numpy as np
import tensorflow as tf
from tensorflow import keras

# Step 1: Encode users and items as integers (keep the mappings for inference later)
user_cat = interactions['user_id'].astype('category')
item_cat = interactions['item_id'].astype('category')
user_ids = user_cat.cat.codes.values
item_ids = item_cat.cat.codes.values
ratings = interactions['rating'].values

user_to_idx = {u: i for i, u in enumerate(user_cat.cat.categories)}
item_to_idx = {it: i for i, it in enumerate(item_cat.cat.categories)}
idx_to_item = {i: it for it, i in item_to_idx.items()}

n_users = len(user_to_idx)
n_items = len(item_to_idx)

# Step 2: Build neural network
def build_ncf_model(n_users, n_items, embedding_dim=50):
    # User input
    user_input = keras.Input(shape=(1,), name='user_input')
    user_embedding = keras.layers.Embedding(n_users, embedding_dim, name='user_embedding')(user_input)
    user_vec = keras.layers.Flatten()(user_embedding)

    # Item input
    item_input = keras.Input(shape=(1,), name='item_input')
    item_embedding = keras.layers.Embedding(n_items, embedding_dim, name='item_embedding')(item_input)
    item_vec = keras.layers.Flatten()(item_embedding)

    # Concatenate user and item embeddings
    concat = keras.layers.Concatenate()([user_vec, item_vec])

    # Dense layers
    dense = keras.layers.Dense(128, activation='relu')(concat)
    dense = keras.layers.Dropout(0.3)(dense)
    dense = keras.layers.Dense(64, activation='relu')(dense)

    # Output: predicted rating
    output = keras.layers.Dense(1, activation='linear')(dense)

    model = keras.Model(inputs=[user_input, item_input], outputs=output)
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])

    return model

# Step 3: Train
model = build_ncf_model(n_users, n_items)
model.fit(
    [user_ids, item_ids],
    ratings,
    batch_size=256,
    epochs=10,
    validation_split=0.2,
    verbose=1
)

# Step 4: Predict
def predict_rating(user_id, item_id):
    # Look up the integer indices built in Step 1
    user_idx = user_to_idx[user_id]
    item_idx = item_to_idx[item_id]
    pred = model.predict([np.array([user_idx]), np.array([item_idx])], verbose=0)
    return pred[0][0]

# Recommend items for user
def recommend_items(user_id, n=10):
    user_idx = user_to_idx[user_id]
    item_indices = np.arange(n_items)
    user_indices = np.full(n_items, user_idx)

    # Score all items for this user in one batched prediction (much faster than a loop)
    preds = model.predict([user_indices, item_indices], verbose=0).flatten()

    top = preds.argsort()[::-1][:n]
    return [(idx_to_item[i], float(preds[i])) for i in top]

When to use deep learning:

  • ✅ Very large datasets (10M+ interactions)
  • ✅ Complex patterns (non-linear relationships)
  • ✅ Additional features (user demographics, item metadata)

When NOT to use:

  • ❌ Small datasets (<100K interactions) → Overfitting
  • ❌ Need explainability (neural networks = black box)

Handling Cold Start Problem

Cold start = No data for new users or new items

Cold Start: New Users

Problem: New user signs up, no interaction history → Can't use collaborative filtering

Solutions:

1. Popularity-based (simplest):

-- Show trending/popular items to new users
SELECT
  item_id,
  COUNT(*) as interaction_count,
  AVG(rating) as avg_rating
FROM user_item_interactions
WHERE interaction_date >= CURRENT_DATE - INTERVAL '7 days'
GROUP BY item_id
ORDER BY interaction_count DESC, avg_rating DESC
LIMIT 10;

2. Ask for preferences (onboarding):

  • "What are you interested in?" (select categories)
  • "Rate these popular items" (collect initial ratings)
  • Use content-based filtering with these preferences

3. Demographic-based:

# Recommend items popular among users with similar demographics
similar_users = users[(users['age_group'] == new_user['age_group']) &
                       (users['gender'] == new_user['gender']) &
                       (users['country'] == new_user['country'])]

popular_among_similar = interactions[
    interactions['user_id'].isin(similar_users['user_id'])
].groupby('item_id').size().nlargest(10)

4. Hybrid approach: Content-based initially, switch to collaborative after 5-10 interactions

Cold Start: New Items

Problem: New product launched, no one bought it yet → Won't be recommended

Solutions:

1. Content-based:

# Find similar items based on attributes
new_item_features = [category, brand, price_range, color]
similar_items = find_items_with_similar_features(new_item_features)

# Recommend new item to users who liked similar items
users_who_liked_similar = get_users_who_liked(similar_items)
recommend_new_item_to(users_who_liked_similar)

2. Editorial/featured:

  • Manually feature new products on homepage
  • Email campaigns to likely interested users

3. Exploration-exploitation:

  • Show new items to random sample of users (exploration)
  • Based on initial feedback, recommend to relevant users (exploitation)
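
A minimal epsilon-greedy sketch of this idea (get_ml_recommendations and get_recently_launched_items are hypothetical helpers standing in for your existing model and catalog code; epsilon=0.1 is an assumed exploration rate):

import random

def pick_recommendations(user_id, n=10, epsilon=0.1):
    # Exploitation: items the model already scores highly for this user
    recs = get_ml_recommendations(user_id, n=n)
    # Exploration: occasionally swap a slot for a new item with no history yet
    new_items = get_recently_launched_items()
    for i in range(len(recs)):
        if new_items and random.random() < epsilon:
            recs[i] = random.choice(new_items)
    return recs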

Evaluation Metrics: Measuring Recommendation Quality

Offline metrics (historical data):

1. Precision@K and Recall@K

Precision@K: Of K recommendations, how many are relevant?

Precision@10 = (Relevant items in top 10) / 10

Recall@K: Of all relevant items, how many did we recommend in top K?

Recall@10 = (Relevant items in top 10) / (Total relevant items)

Example:

  • User has 50 relevant items (items they'll like)
  • We recommend top 10 items
  • 6 of the 10 are relevant

Precision@10 = 6/10 = 60%
Recall@10 = 6/50 = 12%

Trade-off: Higher K → Higher recall, lower precision

Code:

def precision_at_k(recommended_items, relevant_items, k=10):
    recommended_k = recommended_items[:k]
    relevant_in_k = set(recommended_k) & set(relevant_items)
    return len(relevant_in_k) / k

def recall_at_k(recommended_items, relevant_items, k=10):
    recommended_k = recommended_items[:k]
    relevant_in_k = set(recommended_k) & set(relevant_items)
    return len(relevant_in_k) / len(relevant_items) if len(relevant_items) > 0 else 0

2. NDCG (Normalized Discounted Cumulative Gain)

Idea: Not all positions are equal - an item at position 1 matters more than one at position 10

Formula:

DCG@K = Σ(relevance[i] / log2(i+1)) for i in 1..K
NDCG@K = DCG@K / Ideal_DCG@K (normalized to 0-1)

Code:

import numpy as np

def ndcg_at_k(recommended_items, relevance_scores, k=10):
    """
    relevance_scores: dict {item_id: relevance (0-5)}
    """
    recommended_k = recommended_items[:k]

    # DCG
    dcg = sum([
        relevance_scores.get(item, 0) / np.log2(i + 2)
        for i, item in enumerate(recommended_k)
    ])

    # Ideal DCG (if we ranked by relevance perfectly)
    ideal_order = sorted(relevance_scores.values(), reverse=True)[:k]
    idcg = sum([
        rel / np.log2(i + 2)
        for i, rel in enumerate(ideal_order)
    ])

    return dcg / idcg if idcg > 0 else 0

Interpretation:

  • NDCG = 1.0: Perfect ranking
  • NDCG = 0.7-0.9: Good
  • NDCG < 0.5: Poor

3. Coverage and Diversity

Coverage: % of items that get recommended at least once

coverage = len(set(all_recommendations)) / total_items * 100

  • High coverage (>50%): Good (long-tail items get exposure)
  • Low coverage (<20%): Popular items dominate (filter bubble)

Diversity: How different are recommended items from each other?

# Average pairwise dissimilarity (dissimilarity() is a placeholder, e.g. 1 - cosine similarity of item feature vectors)
from itertools import combinations
from statistics import mean

diversity = mean(
    dissimilarity(item_i, item_j)
    for item_i, item_j in combinations(recommended_items, 2)
)

Trade-off: Accuracy vs Diversity

  • Highly accurate recommendations may be too similar (boring)
  • Diverse recommendations may include less relevant items (interesting)

Online Metrics (A/B Testing in Production)

Business metrics (most important!):

1. Click-through rate (CTR):

CTR = (Clicks on recommendations) / (Impressions) * 100

  • Baseline (no personalization): 1-2%
  • Good recommendations: 3-8%

2. Conversion rate:

Conversion = (Purchases from recommendations) / (Clicks on recommendations) * 100

3. Revenue per user (RPU):

RPU = Total revenue from recommendations / Total users

4. Average order value (AOV):

  • Recommendations increase cart size (cross-sell, upsell)

5. Engagement metrics:

  • Time on site
  • Pages per session
  • Repeat visits

A/B testing setup:

import hashlib

# Control group: No personalization (popular items)
# Treatment group: ML recommendations

# Deterministic 50/50 split based on a hash of the user ID
bucket = int(hashlib.md5(user_id.encode()).hexdigest(), 16) % 2
if bucket == 0:
    recommendations = get_popular_items()
    group = 'control'
else:
    recommendations = get_ml_recommendations(user_id)
    group = 'treatment'

# Log for analysis
log_recommendation_shown(user_id, recommendations, group)

Analyze results:

SELECT
  test_group,
  COUNT(DISTINCT user_id) as users,
  SUM(clicked) / COUNT(*) * 100 as ctr,
  SUM(purchased) / SUM(clicked) * 100 as conversion_rate,
  SUM(revenue) / COUNT(DISTINCT user_id) as revenue_per_user
FROM recommendation_logs
WHERE experiment_date >= '2025-01-01'
GROUP BY test_group;

Production Deployment: Architecture

Batch Recommendations (Daily/Weekly)

For: E-commerce product recommendations, email campaigns

Architecture:

Daily schedule (2 AM)
    ↓
Run ML model (BigQuery ML or Python)
    ↓
Generate top 20 recommendations per user
    ↓
Store in recommendations table (BigQuery)
    ↓
Application reads from table (fast lookup)

Pros: Simple, cheap, fresh enough for most use cases
Cons: Not real-time (today's purchases not reflected until tomorrow)

Implementation:

-- Scheduled query (runs daily at 2 AM)
CREATE OR REPLACE TABLE `project.dataset.daily_recommendations` AS
SELECT
  user_id,
  ARRAY_AGG(item_id ORDER BY predicted_rating DESC LIMIT 20) as recommended_items
FROM ML.RECOMMEND(
  MODEL `project.dataset.recommendation_model`,
  (SELECT DISTINCT user_id FROM `project.dataset.active_users`)
)
GROUP BY user_id;

-- Application queries this table
SELECT recommended_items
FROM `project.dataset.daily_recommendations`
WHERE user_id = 'user_12345';

Real-time Recommendations

For: News feeds, video streaming (where recency matters)

Architecture:

User action (purchase, view) → Event stream (Kafka)
    ↓
Update user profile (feature store)
    ↓
Trigger recommendation API
    ↓
Model inference (<100ms)
    ↓
Return recommendations

Tech stack:

  • Model serving: TensorFlow Serving, TorchServe, FastAPI
  • Feature store: Feast, Tecton (cache user features)
  • Caching: Redis (cache recommendations for 5-60 minutes)

Example API:

from datetime import datetime

from fastapi import FastAPI
import joblib

app = FastAPI()

# Load model on startup
model = joblib.load('recommendation_model.pkl')

@app.get("/recommend/{user_id}")
async def recommend(user_id: str, n: int = 10):
    # Get user features from feature store (fast)
    user_features = get_user_features(user_id)

    # Generate recommendations (model inference)
    recommendations = model.predict(user_features, n=n)

    return {
        "user_id": user_id,
        "recommendations": recommendations.tolist(),
        "timestamp": datetime.now().isoformat()
    }

Caching layer:

import json

import redis

cache = redis.Redis(host='localhost', port=6379)

@app.get("/recommend/{user_id}")
async def recommend(user_id: str, n: int = 10):
    # Check cache first
    cached = cache.get(f"recs:{user_id}")
    if cached:
        return json.loads(cached)

    # Generate recommendations
    recommendations = model.predict(...)

    # Cache for 30 minutes
    cache.setex(
        f"recs:{user_id}",
        1800,  # 30 min TTL
        json.dumps(recommendations)
    )

    return recommendations

Case Study: Vietnamese E-commerce - 18% Conversion Increase

Background:

  • Company: Fashion e-commerce, ~500K users, ~10K SKUs
  • Baseline: Homepage shows "Trending" products (same for everyone)
  • Problem:
    • Low engagement (users browse, don't buy)
    • High bounce rate (60%)
    • Low repeat purchase rate (25%)

Implementation (12 weeks):

Week 1-4: Data preparation

  • Collected historical data:
    • 2M product views (last 6 months)
    • 300K add-to-carts
    • 80K purchases
  • Built user-item interaction matrix:
    • View = 1 point
    • Add-to-cart = 2 points
    • Purchase = 5 points

Week 5-8: Model training

  • Algorithm: Matrix Factorization (BigQuery ML)
  • Training data: 6 months interactions
  • Evaluation (offline):
    • Precision@10: 32% (baseline random: 1%)
    • NDCG@10: 0.68 (good)
    • Coverage: 65% of SKUs (vs 10% with trending)

Week 9-10: A/B testing

  • Control (50% users): Homepage trending + category pages
  • Treatment (50% users): Personalized recommendations
    • Homepage: "Recommended for you"
    • Product pages: "You may also like"
    • Cart: "Complete your look"

Week 11-12: Analysis & scale

  • Analyze A/B test results
  • Scale to 100% traffic

Results after 3 months:

Metric                         | Control (Baseline) | Treatment (ML Recs) | Change
Homepage CTR                   | 2.1%               | 5.8%                | +176%
Conversion rate (overall)      | 3.2%               | 3.8%                | +18%
Average order value            | 850K VND           | 950K VND            | +12%
Items per order                | 1.8                | 2.2                 | +22%
Revenue per user               | 27K VND            | 36K VND             | +33%
Repeat purchase rate (90 days) | 25%                | 31%                 | +24%

Financial impact (annual):

  • Additional revenue: 33% RPU increase × 500K users × 12 months = ~2B VND/year
  • ML project cost: 200M VND (one-time) + 30M VND/year (maintenance)
  • ROI: 900% first year

Key insights:

  • "You may also like" on product pages: Highest CTR (8-12%), drives cross-sell
  • Homepage recommendations: Good for engagement, but conversion lower than category browse
  • Cart recommendations: Impulse purchases, increased AOV significantly
  • Long-tail products: 40% of recommendations were non-trending items (discovery!)

Surprising finding: Diversity matters

  • Initial model optimized purely for accuracy → Recommendations too similar (all same brand)
  • Users found it boring, CTR dropped after 2 weeks
  • Solution: Add diversity constraint (max 3 items from same brand in top 10)
  • Result: CTR recovered, users happier
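
A simple post-ranking step can enforce that kind of brand cap; the sketch below assumes scored_items is a score-sorted list of (item_id, score) pairs and catalog holds item metadata (names are illustrative):

from collections import Counter

def rerank_with_brand_cap(scored_items, catalog, max_per_brand=3, n=10):
    brand_counts = Counter()
    result = []
    for item_id, score in scored_items:
        brand = catalog[item_id]['brand']
        if brand_counts[brand] >= max_per_brand:
            continue                      # skip: this brand already filled its slots
        brand_counts[brand] += 1
        result.append(item_id)
        if len(result) == n:
            break
    return result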

Implementation Options Comparison

Option                    | Difficulty | Cost          | Flexibility  | Best For
BigQuery ML               | ⭐ Easy     | $ Cheap       | ⭐⭐ Limited   | Quick start, e-commerce
Python (Surprise)         | ⭐⭐ Medium  | $$ Medium     | ⭐⭐⭐ Good     | Custom needs, medium scale
Deep Learning             | ⭐⭐⭐⭐ Hard  | $$$ Expensive | ⭐⭐⭐⭐⭐ High   | Large scale, complex patterns
AWS Personalize           | ⭐⭐ Medium  | $$$ Expensive | ⭐⭐⭐ Good     | AWS-based, managed service
Google Recommendations AI | ⭐⭐ Medium  | $$$ Expensive | ⭐⭐⭐ Good     | GCP-based, retail focused

Recommendation for Vietnamese companies:

  • Start: BigQuery ML (if using BigQuery) or Python Surprise
  • Scale: Deep learning if >10M interactions
  • Enterprise: Consider managed services (AWS Personalize) if budget allows

Common Pitfalls & How to Avoid

1. Filter bubble (recommending only similar items):

  • Problem: User likes action movies → Only recommend action → User bored
  • Solution: Inject diversity (10-20% "exploration" - random or different genres)

2. Popularity bias (recommending only popular items):

  • Problem: Long-tail products never recommended
  • Solution: Penalize popular items in scoring, or sample recommendations from different popularity tiers
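
One way to penalize popularity in scoring, as a hedged sketch (the log-based penalty and the alpha value are assumptions for illustration, not a standard formula):

import numpy as np

def popularity_penalized_score(raw_score, item_popularity, alpha=0.1):
    # Subtract a penalty that grows slowly with the item's interaction count,
    # so long-tail items can still surface in the top K
    return raw_score - alpha * np.log1p(item_popularity)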

3. Ignoring business rules:

  • Problem: Recommending out-of-stock products, competitors' brands
  • Solution: Post-filter recommendations (remove out-of-stock, apply business constraints)
def apply_business_rules(recommendations, user, catalog):
    filtered = []
    for item in recommendations:
        # Rule 1: In stock
        if catalog[item]['stock'] == 0:
            continue

        # Rule 2: Not already purchased recently
        if item in user['recent_purchases']:
            continue

        # Rule 3: Price filter (don't recommend 10x more expensive)
        if catalog[item]['price'] > user['avg_purchase_price'] * 10:
            continue

        # Rule 4: Business constraints (e.g., don't recommend competitor brands)
        if catalog[item]['brand'] in COMPETITOR_BRANDS:
            continue

        filtered.append(item)

    return filtered[:10]  # Top 10 after filtering

4. Not A/B testing:

  • Problem: Assume recommendations work, but users don't click
  • Solution: Always A/B test (control vs treatment), measure business metrics

5. Stale recommendations:

  • Problem: Model trained 6 months ago, user tastes changed
  • Solution: Retrain regularly (weekly/monthly), or use online learning

Conclusion: Personalization = Proven Business Impact

Recommendation systems are not magic - they are proven technology with clear ROI:

  • 18-25% increase in conversion rate (typical)
  • 10-15% increase in average order value (cross-sell)
  • 20-30% increase in engagement (time on site, pages/session)

Implementation timeline:

  • Month 1-2: Data preparation, model training
  • Month 3: A/B testing, iteration
  • Month 4-6: Scale, optimize, monitor
  • Ongoing: Retrain monthly, add features, improve

Start small:

  • Begin with 1 placement (e.g., "You may also like" on product pages)
  • Use simple algorithm (Matrix Factorization via BigQuery ML)
  • Measure impact (CTR, conversion, revenue)
  • Expand to more placements (homepage, cart, email)

Next steps:

  • Audit current recommendation approach (random? trending? none?)
  • Assess data readiness (have user-item interactions?)
  • Choose implementation (BigQuery ML for MVP)
  • A/B test (treatment vs control)
  • Contact Carptech if you need support (carptech.vn/contact)

This article is part 3 of the May "Advanced Analytics & AI/ML" series. Read more about Analytics Maturity, Churn Prediction, and Demand Forecasting.

Carptech - Data Platform & ML Solutions for Vietnamese Enterprises. Contact us for a free consultation.
