[[Why Algorithm-Generated Recommendations Fall Short.pdf]]

### Dell recommendation system

"Customers who bought this item also bought" / "Frequently bought together"

- Phase 1 — Basic filtering: manual choice pool based on popular SNP purchases in a given geography and customer-segment slice.
- Phase 2 — Content filtering: build an ML model on historical data with business validation (e.g. XPS users should only see premium accessories) and compare its performance against Phase 1.
- Phase 3 — Hybrid of content-based filtering and collaborative filtering.

White paper - Dell - [[Personalization at Scale - Final 2.5 1.pdf]]

Amazon has released its **deep-learning software DSSTNE** as open source.

Data sources:
1. Product data - name, description, price, availability, brand, inventory, franchise
2. Customer data - history of user events, shopper persona/profile, device, orders, geo, domain/industry, custom attributes

Perform eligibility checks to include/exclude SKUs by rule (availability, delivery SLA, exclusion based on purchase history), then build the choice pool.

Choice pools are based on:
1. Purchase history (the customer's own, their industry's, or frequently-bought-together combinations)
2. Business promotions (bundles, upsells to increase attach rates)
3. Default software and peripherals recommendations

Recommendation types:
1. Recommended for you
2. Other products you may like
3. Frequently bought together

Objectives:
1. Increase CTR
2. Increase conversion rate
3. Increase revenue per session

Questions:
1. How is the choice pool decided for "frequently bought together"?
2. What kind of data is pre-processed to build the choice pools in real time?
	1. Only available-to-sell data is pre-processed
	2. Data is processed to build a ranking
	3. Business validations are applied to the ranked data to filter it further
	4. The filtered, ranked data is shown in the recommendation aisle
3. Which types of algorithms do we use (clustering, logistic regression, prediction)?

### Stages of a recommendation system

![[Pasted image 20230222174632.png]]
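Before the stage-by-stage walkthrough, here is a minimal sketch of the eligibility-check / choice-pool step described earlier. All SKU fields, rule names, and thresholds are illustrative assumptions, not Dell's actual implementation:

```python
from dataclasses import dataclass

@dataclass
class Sku:
    sku_id: str
    available: bool          # availability rule
    delivery_sla_days: int   # delivery-SLA rule
    category: str

def build_choice_pool(catalog, purchased_ids, max_sla_days=5):
    """Apply include/exclude eligibility rules, then return the choice pool."""
    pool = []
    for sku in catalog:
        if not sku.available:                     # exclude out-of-stock SKUs
            continue
        if sku.delivery_sla_days > max_sla_days:  # exclude slow-delivery SKUs
            continue
        if sku.sku_id in purchased_ids:           # exclude already-purchased items
            continue
        pool.append(sku)
    return pool

catalog = [
    Sku("XPS-9530", True, 3, "laptop"),
    Sku("WD19-DOCK", True, 2, "peripheral"),
    Sku("SLEEVE-15", False, 2, "accessory"),    # out of stock
    Sku("MONITOR-U27", True, 9, "peripheral"),  # SLA too slow
]
pool = build_choice_pool(catalog, purchased_ids={"XPS-9530"})
print([s.sku_id for s in pool])  # → ['WD19-DOCK']
```

In a real pipeline these rules would run against pre-processed available-to-sell data, with ranking and business validation applied afterwards, per the flow in Question 2 above.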
1. **Data retrieval / candidate generation**
	1. Used to first understand what kind of data is relevant to serve as "features" for a recommendation system.
	2. User waiting time (latency) is a tight window (~500 milliseconds).
	3. Data needs to be retrieved based on business sense and product-thinking heuristics.
	- Social media feed example:
		1. Content from the user's geography in the last x hours
		2. Content from the user's friends/followers
		3. Content related to posts the user liked in the past
2. **Filtering**
	1. Data can get very large without filtering out invalid attributes, such as:
		- Social media feed example: spammy links, geo-licensing filters
		- Dating app feed example: location, sexual preferences
	2. Filtering and retrieval can share attributes, so product thinking is required to understand relevance to the user.
3. **Feature extraction**
	1. Feature extraction reduces the number of features in a dataset by creating new features from the existing ones (and then discarding the originals). The reduced feature set should summarize most of the information contained in the original set. Principal component analysis (PCA) is one technique used to extract features based on their relevance to user actions (e.g., liking or commenting on a post in a social media feed).
	2. It is important to optimize features: with 100 features and 1,000 entries in the DB, you have to fetch 100K attribute values within a small time window, which can become challenging.
4. **Scoring**
	1. Assigns a score to each candidate to predict the probability of a user action.
	2. Product thinking is required to decide what to score. Social media example: if only click probability is scored and sorted, clickbaity posts will rank highest; for genuine engagement, we want a mix of clicks, likes, and comments.
5. **Ranking**
	1. Selects the top x data points to show in the recommendation surface.
	2. Sorting can involve a complicated set of business rules to diversify the results a bit. Social media example: sort by author relevance, but don't show posts from the same author back to back.
6. **Feature logging**
	1. Used to log the desired fraction of features, both to train the model and to provide observability.
7. **Training data**
	1. Logged feature data is used to understand whether users acted on the recommendations.
8. **Modeling**
	1. The stage that takes the training dataset and produces a model to be used during the scoring stage. Three sub-stages:
		1. Problem formulation
		2. Data preparation
		3. Algorithms

![[Pasted image 20230222174859.png]]
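A toy sketch of the feature-extraction stage above: projecting a raw feature matrix onto its top principal components with classic PCA via SVD. The data, sizes, and component count are made-up assumptions; a real pipeline would fit the projection offline:

```python
import numpy as np

def pca_reduce(X, n_components):
    """Project X onto its top principal components (PCA via SVD)."""
    X_centered = X - X.mean(axis=0)  # center each feature
    # Rows of Vt are the principal directions, ordered by explained variance.
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T  # reduced feature matrix

rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 100))   # e.g. 1000 entries x 100 raw features
X_reduced = pca_reduce(X, n_components=10)
print(X_reduced.shape)  # → (1000, 10)
```

This is exactly the 100-features-by-1000-entries case mentioned above: reducing 100 raw features to 10 components cuts the per-request fetch from 100K attribute values to 10K.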
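The scoring and ranking stages can be sketched together: blend predicted action probabilities into a score (so click probability alone doesn't reward clickbait), then rank greedily while enforcing the "no same author back to back" diversity rule. The weights and post fields are arbitrary assumptions:

```python
# Assumed blend of action probabilities; tune per product goals.
WEIGHTS = {"click": 0.2, "like": 0.4, "comment": 0.4}

def score(post):
    """Blend predicted action probabilities so clickbait alone doesn't win."""
    return sum(w * post[action] for action, w in WEIGHTS.items())

def rank_with_diversity(posts, top_x):
    """Greedy ranking: best score first, but never the same author twice in a row."""
    remaining = sorted(posts, key=score, reverse=True)
    ranked = []
    while remaining and len(ranked) < top_x:
        prev_author = ranked[-1]["author"] if ranked else None
        # Best-scoring post not by the previous author; relax the rule if unavoidable.
        pick = next((p for p in remaining if p["author"] != prev_author), remaining[0])
        remaining.remove(pick)
        ranked.append(pick)
    return ranked

posts = [
    {"id": 1, "author": "a", "click": 0.9, "like": 0.1, "comment": 0.1},
    {"id": 2, "author": "a", "click": 0.5, "like": 0.6, "comment": 0.5},
    {"id": 3, "author": "b", "click": 0.4, "like": 0.5, "comment": 0.4},
]
feed = rank_with_diversity(posts, top_x=3)
print([p["id"] for p in feed])  # → [2, 3, 1]
```

Note that post 1 has the highest click probability but ranks last: the blended score favors posts 2 and 3, and the diversity rule alternates authors a, b, a.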