When you open Netflix, 80% of what you watch comes from recommendations. When you shop on Amazon, 35% of revenue comes from "Customers who bought this also bought...". Personalization is not just a nice-to-have; it is a competitive advantage in an age of information overload.
However, according to a Carptech survey of 50+ e-commerce and content platforms in Vietnam, only 18% have deployed recommendation systems. The remaining 82% still show the same products to every user or rely only on "trending/popular" items. The result: a missed opportunity worth billions of VND each year.
This article demystifies recommendation systems: from algorithms (Collaborative Filtering, Content-Based, Hybrid) and implementation options (DIY vs BigQuery ML vs managed services) to evaluation metrics and production deployment. It includes a real-world case study of a Vietnamese e-commerce company that increased conversion rate by 18% and cart value by 12% with personalized recommendations.
TL;DR - Key Takeaways
- Recommendation types: Collaborative Filtering (user behavior), Content-Based (item attributes), Hybrid (combine both)
- Cold start problem: New users/items need special handling (popularity-based, content features)
- Algorithms: Matrix Factorization, ALS (Alternating Least Squares), Neural Collaborative Filtering
- Implementation: BigQuery ML (easiest), Python libraries (Surprise, LightFM), or managed services (AWS Personalize)
- Evaluation: Precision@K, Recall@K, NDCG - plus A/B testing in production
- ROI: 10-20% increase in conversion rate, 12-18% increase in average order value typical
Why Recommendations Matter: The Business Case
Netflix: 80% of Watch Time Comes from Recommendations
Netflix estimates that their recommendation system saves $1 billion per year in customer retention:
- Without recommendations: Users overwhelmed by thousands of titles → Frustrated → Churn
- With recommendations: Personalized suggestions → Users find content they love → Stay subscribed
Key insight: Recommendations do not just increase engagement; they also reduce churn (customers feel understood and valued).
Amazon: 35% of Revenue Comes from Recommendations
"Customers who bought this also bought..." is one of Amazon's most successful product features:
- Increase cart size: Users discover complementary products
- Increase discovery: Long-tail products get visibility (not just bestsellers)
- Increase retention: Personalized experience → Loyalty
Vietnamese Market: Untapped Opportunity
Current state (Carptech survey, 80+ companies):
- 82% show same products to everyone (homepage featured items)
- 12% use rule-based ("Recently viewed", "Trending")
- Only 6% use ML-powered recommendations
Opportunity:
- E-commerce: 15-25% revenue from recommendations (global benchmark)
- Vietnamese e-commerce ~$15B/year → Potential $2-3B from better recommendations
- Content platforms (news, video): 30-50% increase in engagement
Types of Recommendation Systems
1. Collaborative Filtering - "People Like You Also Liked"
Core idea: If User A and User B have similar tastes (liked same items in past), recommend items that B liked to A.
Example:
- Alice likes Movies: Inception, Interstellar, The Matrix
- Bob likes Movies: Inception, Interstellar, The Prestige
- Alice and Bob have similar taste (overlap: Inception, Interstellar)
- Recommendation: Show "The Prestige" to Alice (Bob liked it, Alice likely to like it too)
Two variants:
User-based Collaborative Filtering:
- Find users similar to you
- Recommend items those similar users liked
Item-based Collaborative Filtering:
- Find items similar to items you liked (based on who else liked them)
- Recommend those similar items
Example (Item-based):
- Many users who bought "iPhone 15" also bought "AirPods Pro"
- User just bought "iPhone 15" → Recommend "AirPods Pro"
Pros:
- ✅ No need for item features (just user-item interactions)
- ✅ Captures complex patterns (user tastes)
- ✅ Serendipity: Discover unexpected items
Cons:
- ❌ Cold start: Can't recommend to new users (no history)
- ❌ Sparsity: Most users only interact with small % of items (sparse matrix)
- ❌ Popularity bias: Popular items get recommended more
When to use: E-commerce, streaming (movies, music), content platforms with rich interaction data.
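The item-based variant described above can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the toy purchase matrix, the item names, and the helper names `item_similarity` and `recommend_item_based` are all assumptions made for the example.

```python
import numpy as np

# Toy interaction matrix (rows = users, columns = items); 1 = purchased.
# Data and item names are illustrative assumptions.
items = ["iPhone 15", "AirPods Pro", "MacBook", "Phone case"]
R = np.array([
    [1, 1, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 0],
    [1, 0, 0, 1],
], dtype=float)

def item_similarity(R):
    """Cosine similarity between item columns (who bought what)."""
    norms = np.linalg.norm(R, axis=0)
    norms[norms == 0] = 1.0          # avoid division by zero for unseen items
    sim = (R.T @ R) / np.outer(norms, norms)
    np.fill_diagonal(sim, 0.0)       # an item should not recommend itself
    return sim

def recommend_item_based(user_idx, R, n=2):
    """Score each item by its summed similarity to the items the user already has."""
    sim = item_similarity(R)
    scores = sim @ R[user_idx]
    scores[R[user_idx] > 0] = -np.inf    # exclude items already purchased
    top = np.argsort(scores)[::-1][:n]
    return [int(i) for i in top if np.isfinite(scores[i])]

# User 1 bought "iPhone 15" and "AirPods Pro"
top_items = [items[i] for i in recommend_item_based(1, R, n=1)]
```

For real catalogs the similarity matrix is usually precomputed offline and stored sparsely, since it grows with the square of the number of items.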
2. Content-Based Filtering - "Similar to What You Liked"
Core idea: Recommend items similar to items you liked in the past (based on item attributes/features).
Example:
- User likes Action movies starring Tom Cruise
- Recommendation: Other Tom Cruise action movies (Mission Impossible series)
Features used:
- Movies: Genre, director, actors, year, keywords
- Products: Category, brand, price range, color, specifications
- News articles: Topics, keywords, author, publication
Algorithm:
# Simplified example (pseudocode)
user_profile = average(features_of_items_user_liked)
# User liked: [Inception, Interstellar, The Matrix]
# User profile: [genre: sci-fi, director: Nolan (2/3), rating: high]
# Score each item by similarity to user profile
for item in all_items:
    similarity_score = cosine_similarity(user_profile, item_features[item])
# Recommend items with high similarity
Pros:
- ✅ Mitigates cold start: new items can be recommended from their features alone, and new users after their first interaction
- ✅ Transparency: Easy to explain ("Because you liked X")
- ✅ Works with few users (doesn't need collaborative data)
Cons:
- ❌ Requires rich item features (manual curation or NLP)
- ❌ Over-specialization: Only recommends similar items (filter bubble)
- ❌ No serendipity: Won't recommend outside user's known tastes
When to use: Content-rich platforms (news, blogs, podcasts) or new products without interaction history.
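To make the profile-and-similarity idea concrete, here is a small runnable sketch. The one-hot feature vectors are hypothetical (hand-assigned for illustration), and `cosine` and `recommend_content_based` are names invented for this example.

```python
import numpy as np

# Hypothetical one-hot features: [sci-fi, action, drama, directed by Nolan]
item_features = {
    "Inception":    np.array([1, 1, 0, 1], dtype=float),
    "Interstellar": np.array([1, 0, 1, 1], dtype=float),
    "The Matrix":   np.array([1, 1, 0, 0], dtype=float),
    "The Prestige": np.array([0, 0, 1, 1], dtype=float),
    "Notting Hill": np.array([0, 0, 1, 0], dtype=float),
}

def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero vectors."""
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def recommend_content_based(liked, item_features, n=2):
    # User profile = mean feature vector of the liked items
    profile = np.mean([item_features[t] for t in liked], axis=0)
    candidates = [t for t in item_features if t not in liked]
    scored = [(t, cosine(profile, item_features[t])) for t in candidates]
    return sorted(scored, key=lambda x: x[1], reverse=True)[:n]

recs = recommend_content_based(
    ["Inception", "Interstellar", "The Matrix"], item_features
)
```

In practice the feature vectors would come from catalog metadata or text embeddings rather than hand-written flags.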
3. Hybrid Approaches - Best of Both Worlds
Combine collaborative + content-based to overcome limitations:
Approach 1: Weighted hybrid
final_score = 0.7 * collaborative_score + 0.3 * content_score
Approach 2: Switching
- New users: Use content-based (no interaction history)
- Established users: Use collaborative filtering (rich history)
Approach 3: Feature augmentation
- Use content features as additional features in collaborative filtering model
Pros: Overcomes cold start, reduces over-specialization, best accuracy
Cons: More complex, harder to implement
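The weighted and switching approaches can be combined in one scoring function. The sketch below assumes both scores are already normalized to the same range (for example 0 to 1); the function name `hybrid_score` and the threshold of 5 interactions are illustrative choices, not fixed rules.

```python
def hybrid_score(collab_score, content_score, n_interactions,
                 min_interactions=5, w_collab=0.7):
    """Switching + weighted hybrid.

    Assumes collab_score and content_score are on the same scale.
    """
    if n_interactions < min_interactions:
        # Cold-start user: the collaborative score is unreliable, use content only
        return content_score
    # Established user: blend both signals
    return w_collab * collab_score + (1 - w_collab) * content_score

new_user = hybrid_score(0.8, 0.4, n_interactions=0)
regular_user = hybrid_score(0.8, 0.4, n_interactions=20)
```

The weights and the switching threshold are tuning parameters; they are best chosen by offline evaluation and then confirmed with an A/B test.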
Comparison Table
| Type | Data Needed | Cold Start? | Serendipity? | Best For |
|---|---|---|---|---|
| Collaborative | User-item interactions | ❌ Poor | ✅ High | E-commerce, Streaming |
| Content-Based | Item features | ✅ Good | ❌ Low | News, Blogs, New products |
| Hybrid | Both | ✅ Good | ✅ Medium | Large platforms (Amazon, Netflix) |
Collaborative Filtering: Deep Dive
Matrix Factorization - The Core Algorithm
User-Item Matrix (ratings/interactions):
| | iPhone | AirPods | MacBook | iPad | Watch |
|---|---|---|---|---|---|
| Alice | 5 | ? | 4 | ? | 3 |
| Bob | ? | 5 | ? | 4 | ? |
| Carol | 4 | 4 | ? | ? | ? |
| David | ? | ? | 5 | 5 | 4 |
Problem: Matrix is sparse (most cells are ?)
Goal: Fill in the ? (predict missing ratings)
Matrix Factorization idea:
- Decompose matrix into two smaller matrices: User Factors × Item Factors
- User Factors: Each user represented by K latent features (e.g., "how much this user likes tech products", "price sensitivity", etc.)
- Item Factors: Each item represented by K latent features (e.g., "how premium this product is", "tech-forward", etc.)
Mathematical formulation:
R ≈ U × V^T
R: n_users × n_items (sparse)
U: n_users × k (user latent factors)
V: n_items × k (item latent factors)
Rating(user_i, item_j) ≈ U[i, :] · V[j, :] (dot product)
Training: Minimize squared error on known ratings
minimize: Σ (actual_rating - predicted_rating)^2 + regularization
Algorithm: Alternating Least Squares (ALS)
- Initialize U, V randomly
- Fix V, optimize U (least squares problem)
- Fix U, optimize V
- Repeat until convergence
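The ALS loop above can be sketched directly in NumPy. This is a didactic version for small explicit-rating matrices, under the assumption that 0 marks an unknown rating; the function name `als` and its default hyperparameters are choices made for this example. Each half-step solves a small ridge regression per user (or per item) over the observed ratings only.

```python
import numpy as np

def als(R, mask, k=2, reg=0.1, n_iters=20, seed=0):
    """Alternating Least Squares on the observed entries of R.

    R:    (n_users, n_items) ratings; values at unobserved cells are ignored
    mask: boolean array of the same shape, True where a rating is known
    """
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    U = rng.normal(scale=0.1, size=(n_users, k))
    V = rng.normal(scale=0.1, size=(n_items, k))
    reg_eye = reg * np.eye(k)
    for _ in range(n_iters):
        # Fix V, solve a regularized least-squares problem for each user
        for i in range(n_users):
            obs = mask[i]
            if obs.any():
                Vi = V[obs]
                U[i] = np.linalg.solve(Vi.T @ Vi + reg_eye, Vi.T @ R[i, obs])
        # Fix U, solve for each item
        for j in range(n_items):
            obs = mask[:, j]
            if obs.any():
                Uj = U[obs]
                V[j] = np.linalg.solve(Uj.T @ Uj + reg_eye, Uj.T @ R[obs, j])
    return U, V

# The Alice/Bob/Carol/David matrix from above; 0 stands for "?"
R = np.array([
    [5, 0, 4, 0, 3],
    [0, 5, 0, 4, 0],
    [4, 4, 0, 0, 0],
    [0, 0, 5, 5, 4],
], dtype=float)
mask = R > 0
U, V = als(R, mask)
predicted = U @ V.T          # filled-in matrix, including the "?" cells
rmse = np.sqrt(np.mean((predicted[mask] - R[mask]) ** 2))
```

With so few ratings the model fits the known cells closely, so the training RMSE says little about how good the filled-in values are; a held-out split is needed for that.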
Implementation with BigQuery ML (Easiest!)
BigQuery ML has built-in collaborative filtering via Matrix Factorization.
Step 1: Prepare data
-- User-item interactions (ratings, purchases, clicks)
CREATE OR REPLACE TABLE `project.dataset.user_item_interactions` AS
SELECT
user_id,
item_id,
-- Implicit feedback: combine multiple signals
1 * viewed +
2 * added_to_cart +
5 * purchased AS rating -- Purchased = 5x weight
FROM (
SELECT
user_id,
product_id AS item_id,
COUNTIF(event = 'product_viewed') AS viewed,
COUNTIF(event = 'add_to_cart') AS added_to_cart,
COUNTIF(event = 'purchase') AS purchased
FROM events
WHERE event_date >= '2024-01-01'
GROUP BY user_id, product_id
);
Step 2: Train model
CREATE OR REPLACE MODEL `project.dataset.product_recommendation_model`
OPTIONS(
  model_type='MATRIX_FACTORIZATION',
  user_col='user_id',
  item_col='item_id',
  rating_col='rating',
  feedback_type='implicit', -- or 'explicit' for actual ratings (1-5 stars)
  num_factors=20, -- Latent dimensions (more = more expressive, but overfitting risk)
  l2_reg=0.1 -- L2 regularization strength
) AS
SELECT user_id, item_id, rating
FROM `project.dataset.user_item_interactions`;
Training time: 5-20 minutes for 1M interactions
Step 3: Generate recommendations
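-- For a specific user
-- With feedback_type='implicit', ML.RECOMMEND returns the score in
-- predicted_rating_confidence (predicted_rating is used for explicit feedback)
SELECT *
FROM ML.RECOMMEND(
  MODEL `project.dataset.product_recommendation_model`,
  (SELECT 'user_12345' AS user_id)
)
ORDER BY predicted_rating_confidence DESC
LIMIT 10;
Output (illustrative values):
| user_id | item_id | predicted_rating_confidence |
|---|---|---|
| user_12345 | product_789 | 4.23 |
| user_12345 | product_456 | 3.87 |
| user_12345 | product_234 | 3.65 |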
Step 4: Batch recommendations for all users
-- Generate top 10 recommendations for each user
-- (implicit-feedback models expose the score as predicted_rating_confidence)
CREATE OR REPLACE TABLE `project.dataset.recommendations` AS
SELECT
  user_id,
  ARRAY_AGG(item_id ORDER BY predicted_rating_confidence DESC LIMIT 10) AS recommended_items
FROM ML.RECOMMEND(
  MODEL `project.dataset.product_recommendation_model`,
  (SELECT DISTINCT user_id FROM `project.dataset.users` WHERE active = TRUE)
)
GROUP BY user_id;
Pros of BigQuery ML:
- ✅ No code (pure SQL)
- ✅ Scales automatically (Google infrastructure)
- ✅ Fast (distributed training)
- ✅ Integrated with data warehouse
Cons:
- ❌ Less flexible (can't customize algorithm deeply)
- ❌ Limited to matrix factorization (no deep learning options)
Implementation with Python (More Flexible)
Using Surprise library (simple, scikit-learn-like API):
import pandas as pd
from surprise import SVD, Dataset, Reader
from surprise.model_selection import cross_validate

# Step 1: Prepare data
interactions = pd.read_csv('user_item_interactions.csv')
# Columns: user_id, item_id, rating

# Convert to Surprise format
reader = Reader(rating_scale=(1, 5))
data = Dataset.load_from_df(interactions[['user_id', 'item_id', 'rating']], reader)

# Step 2: Train model (SVD = Matrix Factorization)
algo = SVD(n_factors=20, n_epochs=20, lr_all=0.005, reg_all=0.02, random_state=42)

# Cross-validation
cross_validate(algo, data, measures=['RMSE', 'MAE'], cv=5, verbose=True)

# Train on full dataset
trainset = data.build_full_trainset()
algo.fit(trainset)

# Step 3: Generate recommendations for a user
def get_recommendations(user_id, n=10):
    # Get all items
    all_items = interactions['item_id'].unique()
    # Get items user already interacted with
    user_items = set(interactions[interactions['user_id'] == user_id]['item_id'])
    # Predict ratings for items user hasn't seen
    predictions = []
    for item_id in all_items:
        if item_id not in user_items:
            pred = algo.predict(user_id, item_id)
            predictions.append((item_id, pred.est))
    # Sort by predicted rating
    predictions.sort(key=lambda x: x[1], reverse=True)
    return predictions[:n]

# Get top 10 recommendations for user
recs = get_recommendations(user_id='user_12345', n=10)
print(recs)
Advanced: Deep Learning with Neural Collaborative Filtering
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Step 1: Encode users and items as integers
user_cat = interactions['user_id'].astype('category')
item_cat = interactions['item_id'].astype('category')
user_ids = user_cat.cat.codes.values
item_ids = item_cat.cat.codes.values
ratings = interactions['rating'].values
n_users = interactions['user_id'].nunique()
n_items = interactions['item_id'].nunique()

# Lookup tables between raw IDs and the integer codes used in training.
# (Re-encoding a filtered DataFrame would always yield code 0.)
user_to_idx = {u: i for i, u in enumerate(user_cat.cat.categories)}
item_to_idx = {it: i for i, it in enumerate(item_cat.cat.categories)}
idx_to_item = dict(enumerate(item_cat.cat.categories))

# Step 2: Build neural network
def build_ncf_model(n_users, n_items, embedding_dim=50):
    # User input
    user_input = keras.Input(shape=(1,), name='user_input')
    user_embedding = keras.layers.Embedding(n_users, embedding_dim, name='user_embedding')(user_input)
    user_vec = keras.layers.Flatten()(user_embedding)
    # Item input
    item_input = keras.Input(shape=(1,), name='item_input')
    item_embedding = keras.layers.Embedding(n_items, embedding_dim, name='item_embedding')(item_input)
    item_vec = keras.layers.Flatten()(item_embedding)
    # Concatenate user and item embeddings
    concat = keras.layers.Concatenate()([user_vec, item_vec])
    # Dense layers
    dense = keras.layers.Dense(128, activation='relu')(concat)
    dense = keras.layers.Dropout(0.3)(dense)
    dense = keras.layers.Dense(64, activation='relu')(dense)
    # Output: predicted rating
    output = keras.layers.Dense(1, activation='linear')(dense)
    model = keras.Model(inputs=[user_input, item_input], outputs=output)
    model.compile(optimizer='adam', loss='mse', metrics=['mae'])
    return model

# Step 3: Train
model = build_ncf_model(n_users, n_items)
model.fit(
    [user_ids, item_ids],
    ratings,
    batch_size=256,
    epochs=10,
    validation_split=0.2,
    verbose=1
)

# Step 4: Predict
def predict_rating(user_id, item_id):
    user_idx = np.array([user_to_idx[user_id]])
    item_idx = np.array([item_to_idx[item_id]])
    pred = model.predict([user_idx, item_idx], verbose=0)
    return pred[0][0]

# Recommend items for user
def recommend_items(user_id, n=10):
    # Score every item for this user in one batched call
    # (one predict() call per item is far too slow)
    item_indices = np.arange(n_items)
    user_indices = np.full(n_items, user_to_idx[user_id])
    scores = model.predict([user_indices, item_indices], verbose=0).ravel()
    top = np.argsort(scores)[::-1][:n]
    return [(idx_to_item[i], scores[i]) for i in top]
When to use deep learning:
- ✅ Very large datasets (10M+ interactions)
- ✅ Complex patterns (non-linear relationships)
- ✅ Additional features (user demographics, item metadata)
When NOT to use:
- ❌ Small datasets (<100K interactions) → Overfitting
- ❌ Need explainability (neural networks = black box)
Handling Cold Start Problem
Cold start = No data for new users or new items
Cold Start: New Users
Problem: New user signs up, no interaction history → Can't use collaborative filtering
Solutions:
1. Popularity-based (simplest):
-- Show trending/popular items to new users (BigQuery syntax)
SELECT
  item_id,
  COUNT(*) AS interaction_count,
  AVG(rating) AS avg_rating
FROM user_item_interactions
WHERE interaction_date >= DATE_SUB(CURRENT_DATE(), INTERVAL 7 DAY)
GROUP BY item_id
ORDER BY interaction_count DESC, avg_rating DESC
LIMIT 10;
2. Ask for preferences (onboarding):
- "What are you interested in?" (select categories)
- "Rate these popular items" (collect initial ratings)
- Use content-based filtering with these preferences
3. Demographic-based:
# Recommend items popular among users with similar demographics
similar_users = users[(users['age_group'] == new_user['age_group']) &
(users['gender'] == new_user['gender']) &
(users['country'] == new_user['country'])]
popular_among_similar = interactions[
interactions['user_id'].isin(similar_users['user_id'])
].groupby('item_id').size().nlargest(10)
4. Hybrid approach: Content-based initially, switch to collaborative after 5-10 interactions
Cold Start: New Items
Problem: New product launched, no one bought it yet → Won't be recommended
Solutions:
1. Content-based:
# Find similar items based on attributes
new_item_features = [category, brand, price_range, color]
similar_items = find_items_with_similar_features(new_item_features)
# Recommend new item to users who liked similar items
users_who_liked_similar = get_users_who_liked(similar_items)
recommend_new_item_to(users_who_liked_similar)
2. Editorial/featured:
- Manually feature new products on homepage
- Email campaigns to likely interested users
3. Exploration-exploitation:
- Show new items to random sample of users (exploration)
- Based on initial feedback, recommend to relevant users (exploitation)
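The exploration half of this strategy can be sketched as a simple epsilon-greedy mix: each slot in the list goes to a new item with a small probability, otherwise to the next model-ranked item. The function name `mix_in_new_items` and the default `epsilon=0.1` are assumptions for illustration; the exploitation half (learning from the feedback these impressions generate) is not covered here.

```python
import random

def mix_in_new_items(ranked_items, new_items, n=10, epsilon=0.1, rng=None):
    """Fill each of n slots with a new item with probability epsilon,
    otherwise with the next item from the model's ranking."""
    rng = rng or random.Random()
    ranked = list(ranked_items)
    already_ranked = set(ranked)
    fresh = [i for i in new_items if i not in already_ranked]
    rng.shuffle(fresh)  # no new item is systematically favored
    result = []
    while len(result) < n and (ranked or fresh):
        explore = bool(fresh) and (not ranked or rng.random() < epsilon)
        result.append(fresh.pop() if explore else ranked.pop(0))
    return result

ranked = [f"p{i}" for i in range(20)]
new = ["n1", "n2", "n3"]
no_exploration = mix_in_new_items(ranked, new, n=10, epsilon=0.0)
full_exploration = mix_in_new_items(ranked, new, n=10, epsilon=1.0,
                                    rng=random.Random(0))
```

Logging which slots were exploration slots makes it possible to measure the feedback on new items separately from the regular recommendations.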
Evaluation Metrics: Measuring Recommendation Quality
Offline metrics (historical data):
1. Precision@K and Recall@K
Precision@K: Of K recommendations, how many are relevant?
Precision@10 = (Relevant items in top 10) / 10
Recall@K: Of all relevant items, how many did we recommend in top K?
Recall@10 = (Relevant items in top 10) / (Total relevant items)
Example:
- User has 50 relevant items (items they'll like)
- We recommend top 10 items
- 6 of the 10 are relevant
Precision@10 = 6/10 = 60%
Recall@10 = 6/50 = 12%
Trade-off: Higher K → higher recall, but usually lower precision
Code:
def precision_at_k(recommended_items, relevant_items, k=10):
    recommended_k = recommended_items[:k]
    relevant_in_k = set(recommended_k) & set(relevant_items)
    return len(relevant_in_k) / k

def recall_at_k(recommended_items, relevant_items, k=10):
    recommended_k = recommended_items[:k]
    relevant_in_k = set(recommended_k) & set(relevant_items)
    return len(relevant_in_k) / len(relevant_items) if len(relevant_items) > 0 else 0
2. NDCG (Normalized Discounted Cumulative Gain)
Idea: Not all positions equal - item at position 1 more important than position 10
Formula:
DCG@K = Σ(relevance[i] / log2(i+1)) for i in 1..K
NDCG@K = DCG@K / Ideal_DCG@K (normalized to 0-1)
Code:
import numpy as np

def ndcg_at_k(recommended_items, relevance_scores, k=10):
    """
    relevance_scores: dict {item_id: relevance (0-5)}
    """
    recommended_k = recommended_items[:k]
    # DCG (i is 0-based here, so log2(i + 2) matches log2(position + 1))
    dcg = sum([
        relevance_scores.get(item, 0) / np.log2(i + 2)
        for i, item in enumerate(recommended_k)
    ])
    # Ideal DCG (if we ranked by relevance perfectly)
    ideal_order = sorted(relevance_scores.values(), reverse=True)[:k]
    idcg = sum([
        rel / np.log2(i + 2)
        for i, rel in enumerate(ideal_order)
    ])
    return dcg / idcg if idcg > 0 else 0
Interpretation:
- NDCG = 1.0: Perfect ranking
- NDCG = 0.7-0.9: Good
- NDCG < 0.5: Poor
3. Coverage and Diversity
Coverage: % of items that get recommended at least once
coverage = len(set(all_recommendations)) / total_items * 100
- High coverage (>50%): Good (long-tail items get exposure)
- Low coverage (<20%): Popular items dominate (popularity bias)
Diversity: How different are recommended items from each other?
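# Average pairwise dissimilarity
# dissimilarity() is a placeholder, e.g. 1 - cosine similarity of item features
from itertools import combinations
from statistics import mean
diversity = mean(
    dissimilarity(item_i, item_j)
    for item_i, item_j in combinations(recommended_items, 2)
)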
Trade-off: Accuracy vs Diversity
- Highly accurate recommendations may be too similar (boring)
- Diverse recommendations may include less relevant items (interesting)
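Both metrics are easy to compute once recommendations are stored per user. The sketch below is one possible implementation: it assumes each item has a set of tags (category, brand, and so on) and uses Jaccard dissimilarity between tag sets as the dissimilarity measure. The function names and toy data are illustrative.

```python
from itertools import combinations

def catalog_coverage(recs_by_user, total_items):
    """Percentage of the catalog that appears in at least one user's list."""
    recommended = set()
    for items in recs_by_user.values():
        recommended.update(items)
    return 100.0 * len(recommended) / total_items

def jaccard_dissimilarity(a, b):
    """1 - |a ∩ b| / |a ∪ b| for two tag sets."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0

def intra_list_diversity(items, item_tags):
    """Average pairwise dissimilarity within one recommendation list."""
    pairs = list(combinations(items, 2))
    if not pairs:
        return 0.0
    total = sum(jaccard_dissimilarity(item_tags[a], item_tags[b]) for a, b in pairs)
    return total / len(pairs)

recs_by_user = {"u1": ["a", "b"], "u2": ["b", "c"]}
item_tags = {"a": {"shoes"}, "b": {"shoes"}, "c": {"bags"}}
coverage = catalog_coverage(recs_by_user, total_items=6)
similar_list = intra_list_diversity(["a", "b"], item_tags)
mixed_list = intra_list_diversity(["a", "c"], item_tags)
```

Tracking these two numbers alongside Precision@K makes the accuracy-versus-diversity trade-off visible during offline evaluation.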
Online Metrics (A/B Testing in Production)
Business metrics (most important!):
1. Click-through rate (CTR):
CTR = (Clicks on recommendations) / (Impressions) * 100
- Baseline (no personalization): 1-2%
- Good recommendations: 3-8%
2. Conversion rate:
Conversion = (Purchases from recommendations) / (Clicks on recommendations) * 100
3. Revenue per user (RPU):
RPU = Total revenue from recommendations / Total users
4. Average order value (AOV):
- Recommendations increase cart size (cross-sell, upsell)
5. Engagement metrics:
- Time on site
- Pages per session
- Repeat visits
A/B testing setup:
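import hashlib

# Control group: No personalization (popular items)
# Treatment group: ML recommendations
# Split users 50/50 with a stable hash: works for string IDs such as
# 'user_12345' and always assigns the same user to the same group
bucket = int(hashlib.md5(str(user_id).encode()).hexdigest(), 16) % 2
if bucket == 0:
    recommendations = get_popular_items()
    group = 'control'
else:
    recommendations = get_ml_recommendations(user_id)
    group = 'treatment'
# Log for analysis
log_recommendation_shown(user_id, recommendations, group)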
Analyze results:
SELECT
  test_group,
  COUNT(DISTINCT user_id) AS users,
  SUM(clicked) / COUNT(*) * 100 AS ctr,
  -- SAFE_DIVIDE returns NULL instead of failing when a group has no clicks
  SAFE_DIVIDE(SUM(purchased), SUM(clicked)) * 100 AS conversion_rate,
  SUM(revenue) / COUNT(DISTINCT user_id) AS revenue_per_user
FROM recommendation_logs
WHERE experiment_date >= '2025-01-01'
GROUP BY test_group;
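Before acting on a difference between the two groups, it helps to check that it is larger than random noise. One common option is a two-proportion z-test on a rate such as CTR or conversion; the sketch below uses only the standard library. The function name and the example counts are assumptions, not figures from any real experiment.

```python
import math

def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Two-sided z-test for the difference between two rates.

    Returns (z, p_value); a small p_value suggests the difference
    is unlikely to be due to chance alone.
    """
    p_a, p_b = successes_a / n_a, successes_b / n_b
    p_pool = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return 0.0, 1.0
    z = (p_b - p_a) / se
    # Two-sided p-value from the standard normal distribution
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical counts: clicks and impressions for control vs treatment
z, p_value = two_proportion_z_test(100, 1000, 200, 1000)
z_same, p_same = two_proportion_z_test(50, 1000, 50, 1000)
```

Revenue-based metrics are not simple proportions, so they usually need a different test (for example a t-test or bootstrap) than the one sketched here.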
Production Deployment: Architecture
Batch Recommendations (Daily/Weekly)
For: E-commerce product recommendations, email campaigns
Architecture:
Daily schedule (2 AM)
↓
Run ML model (BigQuery ML or Python)
↓
Generate top 20 recommendations per user
↓
Store in recommendations table (BigQuery)
↓
Application reads from table (fast lookup)
Pros: Simple, cheap, fresh enough for most use cases
Cons: Not real-time (today's purchases not reflected until tomorrow)
Implementation:
-- Scheduled query (runs daily at 2 AM)
-- Implicit-feedback models expose the score as predicted_rating_confidence
CREATE OR REPLACE TABLE `project.dataset.daily_recommendations` AS
SELECT
  user_id,
  ARRAY_AGG(item_id ORDER BY predicted_rating_confidence DESC LIMIT 20) AS recommended_items
FROM ML.RECOMMEND(
  MODEL `project.dataset.product_recommendation_model`,
  (SELECT DISTINCT user_id FROM `project.dataset.active_users`)
)
GROUP BY user_id;

-- Application queries this table
SELECT recommended_items
FROM `project.dataset.daily_recommendations`
WHERE user_id = 'user_12345';
Real-time Recommendations
For: News feeds, video streaming (where recency matters)
Architecture:
User action (purchase, view) → Event stream (Kafka)
↓
Update user profile (feature store)
↓
Trigger recommendation API
↓
Model inference (<100ms)
↓
Return recommendations
Tech stack:
- Model serving: TensorFlow Serving, TorchServe, FastAPI
- Feature store: Feast, Tecton (cache user features)
- Caching: Redis (cache recommendations for 5-60 minutes)
Example API:
from datetime import datetime

from fastapi import FastAPI
import joblib

app = FastAPI()

# Load model on startup
model = joblib.load('recommendation_model.pkl')

@app.get("/recommend/{user_id}")
async def recommend(user_id: str, n: int = 10):
    # Get user features from feature store (fast)
    # get_user_features() is a placeholder for your feature-store lookup
    user_features = get_user_features(user_id)
    # Generate recommendations (model inference)
    # predict(..., n=n) stands for whatever interface your saved model exposes
    recommendations = model.predict(user_features, n=n)
    return {
        "user_id": user_id,
        "recommendations": recommendations.tolist(),
        "timestamp": datetime.now().isoformat()
    }
Caching layer:
import json

import redis

cache = redis.Redis(host='localhost', port=6379)

# Cached version of the /recommend handler
@app.get("/recommend/{user_id}")
async def recommend(user_id: str, n: int = 10):
    # Check cache first (n is part of the key so different list sizes don't collide)
    cache_key = f"recs:{user_id}:{n}"
    cached = cache.get(cache_key)
    if cached:
        return json.loads(cached)
    # Generate recommendations (must be JSON-serializable, e.g. a list of item IDs)
    recommendations = model.predict(...)
    # Cache for 30 minutes
    cache.setex(
        cache_key,
        1800,  # 30 min TTL
        json.dumps(recommendations)
    )
    return recommendations
Case Study: Vietnamese E-commerce - 18% Conversion Increase
Background:
- Company: Fashion e-commerce, ~500K users, ~10K SKUs
- Baseline: Homepage shows "Trending" products (same for everyone)
- Problem:
- Low engagement (users browse, don't buy)
- High bounce rate (60%)
- Low repeat purchase rate (25%)
Implementation (12 weeks):
Week 1-4: Data preparation
- Collected historical data:
- 2M product views (last 6 months)
- 300K add-to-carts
- 80K purchases
- Built user-item interaction matrix:
- View = 1 point
- Add-to-cart = 2 points
- Purchase = 5 points
Week 5-8: Model training
- Algorithm: Matrix Factorization (BigQuery ML)
- Training data: 6 months interactions
- Evaluation (offline):
- Precision@10: 32% (baseline random: 1%)
- NDCG@10: 0.68 (good)
- Coverage: 65% of SKUs (vs 10% with trending)
Week 9-10: A/B testing
- Control (50% users): Homepage trending + category pages
- Treatment (50% users): Personalized recommendations
- Homepage: "Recommended for you"
- Product pages: "You may also like"
- Cart: "Complete your look"
Week 11-12: Analysis & scale
- Analyze A/B test results
- Scale to 100% traffic
Results after 3 months:
| Metric | Control (Baseline) | Treatment (ML Recs) | Change |
|---|---|---|---|
| Homepage CTR | 2.1% | 5.8% | +176% |
| Conversion rate (overall) | 3.2% | 3.8% | +18% |
| Average order value | 850K VND | 950K VND | +12% |
| Items per order | 1.8 | 2.2 | +22% |
| Revenue per user | 27K VND | 36K VND | +33% |
| Repeat purchase rate (90 days) | 25% | 31% | +24% |
Financial impact (annual):
- Additional revenue: ~2B VND/year, attributed to the 33% RPU increase across ~500K users
- ML project cost: 200M VND (one-time) + 30M VND/year (maintenance)
- ROI: ~900% in the first year ((2B - 200M) / 200M, counting only the one-time cost)
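The percentage changes in the results table and the ROI figure can be reproduced with two small helpers. The helper names are illustrative; the inputs are the figures reported above (amounts in million VND). Note that the 900% figure corresponds to counting only the one-time cost; including the 30M VND/year maintenance gives a somewhat lower first-year ROI.

```python
def pct_change(before, after):
    """Relative change in percent."""
    return (after - before) / before * 100

def first_year_roi(annual_gain, one_time_cost, annual_cost=0.0):
    """Net gain divided by total cost, in percent."""
    total_cost = one_time_cost + annual_cost
    return (annual_gain - total_cost) / total_cost * 100

# Figures from the case study (million VND where applicable)
conversion_uplift = pct_change(3.2, 3.8)          # conversion rate, % -> %
rpu_uplift = pct_change(27, 36)                   # revenue per user, K VND
roi_one_time_only = first_year_roi(2000, 200)     # 2B gain vs 200M project cost
roi_with_maintenance = first_year_roi(2000, 200, annual_cost=30)
```

Keeping these calculations in code makes it easy to rerun them as the A/B test accumulates more data.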
Key insights:
- "You may also like" on product pages: Highest CTR (8-12%), drives cross-sell
- Homepage recommendations: Good for engagement, but conversion lower than category browse
- Cart recommendations: Impulse purchases, increased AOV significantly
- Long-tail products: 40% of recommendations were non-trending items (discovery!)
Surprising finding: Diversity matters
- Initial model optimized purely for accuracy → Recommendations too similar (all same brand)
- Users found it boring, CTR dropped after 2 weeks
- Solution: Add diversity constraint (max 3 items from same brand in top 10)
- Result: CTR recovered, users happier
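A diversity constraint like this is usually applied as a re-ranking step on top of the model's output. Below is a minimal sketch of a greedy brand cap; the function name `diversify_by_brand` and the `brand_of` mapping are assumptions for illustration, not the case-study company's actual code.

```python
from collections import Counter

def diversify_by_brand(ranked_items, brand_of, n=10, max_per_brand=3):
    """Walk the ranking from the top and skip items whose brand
    already fills its quota."""
    counts = Counter()
    selected = []
    for item in ranked_items:
        brand = brand_of[item]
        if counts[brand] >= max_per_brand:
            continue  # quota reached, try the next-best item
        selected.append(item)
        counts[brand] += 1
        if len(selected) == n:
            break
    return selected

# Model ranking dominated by brand "A"
ranked = ["a1", "a2", "a3", "a4", "a5", "a6", "b1", "b2", "c1"]
brand_of = {item: item[0].upper() for item in ranked}
top5 = diversify_by_brand(ranked, brand_of, n=5, max_per_brand=3)
```

Because the pass is greedy, the model needs to supply more candidates than the final list length (for example the top 50) so that enough items remain after the cap is applied.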
Implementation Options Comparison
| Option | Difficulty | Cost | Flexibility | Best For |
|---|---|---|---|---|
| BigQuery ML | ⭐ Easy | $ Cheap | ⭐⭐ Limited | Quick start, e-commerce |
| Python (Surprise) | ⭐⭐ Medium | $$ Medium | ⭐⭐⭐ Good | Custom needs, medium scale |
| Deep Learning | ⭐⭐⭐⭐ Hard | $$$ Expensive | ⭐⭐⭐⭐⭐ High | Large scale, complex patterns |
| AWS Personalize | ⭐⭐ Medium | $$$ Expensive | ⭐⭐⭐ Good | AWS-based, managed service |
| Google Recommendations AI | ⭐⭐ Medium | $$$ Expensive | ⭐⭐⭐ Good | GCP-based, retail focused |
Recommendation for Vietnamese companies:
- Start: BigQuery ML (if using BigQuery) or Python Surprise
- Scale: Deep learning if >10M interactions
- Enterprise: Consider managed services (AWS Personalize) if budget allows
Common Pitfalls & How to Avoid
1. Filter bubble (recommending only similar items):
- Problem: User likes action movies → Only recommend action → User bored
- Solution: Inject diversity (10-20% "exploration" - random or different genres)
2. Popularity bias (recommending only popular items):
- Problem: Long-tail products never recommended
- Solution: Penalize popular items in scoring, or sample recommendations from different popularity tiers
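One simple way to penalize popular items is to divide each model score by a slowly growing function of the item's interaction count. The sketch below uses a logarithmic damping factor; the function name, the formula, and `alpha=0.5` are illustrative assumptions (it also assumes non-negative scores), and the strength of the penalty should be tuned with offline metrics such as coverage.

```python
import math

def penalize_popularity(scores, interaction_counts, alpha=0.5):
    """Re-rank items after dividing each score by log(e + count) ** alpha.

    alpha = 0 leaves the ranking unchanged; larger alpha favors the long tail.
    Items with no recorded interactions are not penalized (log(e) = 1).
    """
    adjusted = {
        item: score / (math.log(math.e + interaction_counts.get(item, 0)) ** alpha)
        for item, score in scores.items()
    }
    return sorted(adjusted, key=adjusted.get, reverse=True)

scores = {"bestseller": 1.0, "niche_item": 0.9}
counts = {"bestseller": 10_000, "niche_item": 10}
reranked = penalize_popularity(scores, counts, alpha=0.5)
unchanged = penalize_popularity(scores, counts, alpha=0.0)
```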
3. Ignoring business rules:
- Problem: Recommending out-of-stock products, competitors' brands
- Solution: Post-filter recommendations (remove out-of-stock, apply business constraints)
# Placeholder: fill in with the brands your business must not recommend
COMPETITOR_BRANDS = set()

def apply_business_rules(recommendations, user, catalog):
    filtered = []
    for item in recommendations:
        # Rule 1: In stock
        if catalog[item]['stock'] == 0:
            continue
        # Rule 2: Not already purchased recently
        if item in user['recent_purchases']:
            continue
        # Rule 3: Price filter (don't recommend 10x more expensive)
        if catalog[item]['price'] > user['avg_purchase_price'] * 10:
            continue
        # Rule 4: Business constraints (e.g., don't recommend competitor brands)
        if catalog[item]['brand'] in COMPETITOR_BRANDS:
            continue
        filtered.append(item)
    return filtered[:10]  # Top 10 after filtering
4. Not A/B testing:
- Problem: Assume recommendations work, but users don't click
- Solution: Always A/B test (control vs treatment), measure business metrics
5. Stale recommendations:
- Problem: Model trained 6 months ago, user tastes changed
- Solution: Retrain regularly (weekly/monthly), or use online learning
Conclusion: Personalization = Proven Business Impact
Recommendation systems are not magic; they are a proven technology with clear ROI:
- 18-25% increase in conversion rate (typical)
- 10-15% increase in average order value (cross-sell)
- 20-30% increase in engagement (time on site, pages/session)
Implementation timeline:
- Month 1-2: Data preparation, model training
- Month 3: A/B testing, iteration
- Month 4-6: Scale, optimize, monitor
- Ongoing: Retrain monthly, add features, improve
Start small:
- Begin with 1 placement (e.g., "You may also like" on product pages)
- Use simple algorithm (Matrix Factorization via BigQuery ML)
- Measure impact (CTR, conversion, revenue)
- Expand to more placements (homepage, cart, email)
Next steps:
- Audit current recommendation approach (random? trending? none?)
- Assess data readiness (have user-item interactions?)
- Choose implementation (BigQuery ML for MVP)
- A/B test (treatment vs control)
- Contact Carptech if you need support (carptech.vn/contact)
This article is part 3 of the May series "Advanced Analytics & AI/ML". Read more about Analytics Maturity, Churn Prediction, and Demand Forecasting.
Carptech - Data Platform & ML Solutions for Vietnamese Enterprises. Contact us for a free consultation.




