Understanding Recommendation Systems: Using Various Filtering Techniques

Last updated on December 23rd, 2023 at 11:23 am

Nearly every B2C website or app we use today, from Amazon, LinkedIn, and Facebook to Netflix and Instagram, uses some form of recommendation engine. In this post we will go over the main types of recommendation systems and their filtering techniques, look at their advantages and drawbacks, and see where each could be used effectively.

I have been fortunate to envision, design, architect, and bring to life recommendation engines that utilize Collaborative Filtering, Content-Based Filtering, Community-based filtering, and a Hybrid System. The recommendation systems we built were used by millions of active users and were also later patented.

Recommendation systems primarily combine existing data points with algorithms or filtering mechanisms to suggest what a user might be interested in.

There are primarily six types of recommendation systems:

Collaborative Filtering Recommender Systems
Content-Based Filtering Recommender Systems
Demographic Recommender Systems
Knowledge-based Recommender Systems
Community-based Recommender Systems
Hybrid Recommender Systems

Collaborative Filtering Based Recommendation Method

Collaborative filtering is a recommendation method in which the system uses the actions of other users to predict what the current user might be interested in.

Figure: Collaborative filtering based recommendation method

Check out the creativity 😛 In the above diagram, the guy in the Red Cape loves Harry Potter, Lord of the Rings, and Fantastic Beasts. Black Cape loves Harry Potter and Lord of the Rings, so the system would recommend Fantastic Beasts to Black Cape. In a real system, the engine would collate data from many people before recommending Fantastic Beasts to Black Cape.

Collaborative filtering methods collect large amounts of user data and analyze patterns in it to predict what a user would like based on the actions of other users. The key advantage of the technique is that it relies on actual user behavior rather than only on attribute data (as content-based filtering does; see below).
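To make the Red Cape / Black Cape example concrete, here is a minimal sketch of user-based collaborative filtering in Python. It is not how any production engine works: the `likes` data, the `green_cape` user, and the choice of Jaccard overlap as the similarity measure are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical "likes" data: user -> set of movies they loved.
likes = {
    "red_cape": {"Harry Potter", "Lord of the Rings", "Fantastic Beasts"},
    "black_cape": {"Harry Potter", "Lord of the Rings"},
    "green_cape": {"The Notebook"},
}


def jaccard(a, b):
    """Overlap between two sets of liked movies, from 0.0 to 1.0."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_collaborative(user, likes, top_n=3):
    """Score unseen movies by how similar the users who liked them are."""
    seen = likes[user]
    scores = defaultdict(float)
    for other, other_likes in likes.items():
        if other == user:
            continue
        similarity = jaccard(seen, other_likes)
        if similarity == 0.0:
            continue  # users with no overlap contribute nothing
        for movie in other_likes - seen:
            scores[movie] += similarity
    # Highest score first; ties broken alphabetically for stable output.
    ranked = sorted(scores.items(), key=lambda item: (-item[1], item[0]))
    return [movie for movie, _ in ranked[:top_n]]


print(recommend_collaborative("black_cape", likes))  # ['Fantastic Beasts']
```

Note that `green_cape` gets an empty list back, because nobody overlaps with that user yet. The sketch has nothing to say about a user it has no shared data for.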

The primary disadvantage of this system is the cold start problem. Suppose you want to launch a new service, such as a movie streaming service: at launch you would not have any usage data from your users. For collaborative filtering to work, you need enough usable user data to make strong recommendations.

To make recommendations without that usage data, you need certain attributes/features of the items themselves, like “Genre”. The genre attribute could hold a single value, e.g. Fantasy, or several: Fantasy, Young Adult, Contemporary Fantasy, Science Fiction. An attribute like genre could also have a hierarchy or a weight associated with it. The better the data the system has about its products or movies, the easier it is to build good recommendations.

Content-Based Recommendation Method (Cognitive Filtering)

The content-based method primarily involves identifying attributes/features that the system can use to find correlations between two or more items. In the above example, if Black Cape needs recommendations, the system would look for movies with genre=“Fiction Fantasy” and recommend those. It does not take into account what other users have done.

You would be pleasantly surprised how effective this system can be if you have very good empirical data. Suppose the system has classified the three movies watched by Black Cape as “Fiction Fantasy”. It would then recommend other movies with genre=“Fiction Fantasy”. While doing this, the system could also consider another attribute/feature, like a movie rating. The result would then be movies that fall in the “Fiction Fantasy” genre, ordered by their IMDb rating, for example. One can keep adding positive or negative attributes/features in combination to make content-based filtering one step better.

Another way to do this would be to associate a weight with “Fiction Fantasy” for each movie. Then you would only recommend movies with a high “Fiction Fantasy” score to the user.

Group A = Movies the user has watched (3 movies, all in the “Fiction Fantasy” genre)

Group B = Recommendations, ranked by IMDb score

Figure: Content-based recommendation method (cognitive filtering)
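The Group A / Group B idea, together with the genre weight, can be sketched in a few lines of Python. Everything in the `catalog` below (titles, genre weights, ratings) is made-up placeholder data, and `min_weight` is a hypothetical cutoff for what counts as a "high" genre score.

```python
# Hypothetical catalog: titles, genre weights, and ratings are placeholders.
catalog = [
    {"title": "Watched 1", "genre_weights": {"Fiction Fantasy": 1.0}, "rating": 7.6},
    {"title": "Watched 2", "genre_weights": {"Fiction Fantasy": 0.9}, "rating": 8.8},
    {"title": "Watched 3", "genre_weights": {"Fiction Fantasy": 0.8}, "rating": 7.3},
    {"title": "Candidate A", "genre_weights": {"Fiction Fantasy": 0.9}, "rating": 7.9},
    {"title": "Candidate B", "genre_weights": {"Fiction Fantasy": 0.7}, "rating": 8.4},
    {"title": "Candidate C", "genre_weights": {"Fiction Fantasy": 0.2, "Romance": 0.8}, "rating": 9.0},
    {"title": "Candidate D", "genre_weights": {"Romance": 1.0}, "rating": 8.0},
]


def recommend_by_content(watched, catalog, min_weight=0.5):
    """Recommend unwatched movies in the user's dominant genre, best rated first."""
    by_title = {movie["title"]: movie for movie in catalog}

    # Group A: add up genre weights across everything the user has watched.
    genre_totals = {}
    for title in watched:
        for genre, weight in by_title[title]["genre_weights"].items():
            genre_totals[genre] = genre_totals.get(genre, 0.0) + weight
    if not genre_totals:
        return []
    favorite = max(genre_totals, key=genre_totals.get)

    # Group B: unwatched movies that score high enough on that genre.
    candidates = [
        movie for movie in catalog
        if movie["title"] not in watched
        and movie["genre_weights"].get(favorite, 0.0) >= min_weight
    ]
    candidates.sort(key=lambda movie: movie["rating"], reverse=True)
    return [movie["title"] for movie in candidates]


watched = ["Watched 1", "Watched 2", "Watched 3"]
print(recommend_by_content(watched, catalog))  # ['Candidate B', 'Candidate A']
```

"Candidate C" has the best rating in the catalog but is left out because its “Fiction Fantasy” weight is too low, which is exactly the effect the weight attribute is meant to have.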

The advantage of content-based recommendation is that you do not have the cold start problem. The disadvantage is that it relies on having accurate data and does not do well when data is sparse. Data sparsity is a pretty acute issue in content-based systems. For example, if a new genre has become increasingly popular but you don’t have any movies tagged with that genre, this would be classified as a data sparsity problem.

The other issue with content-based filtering is that large content data sets usually need a large team of people dedicated to manually curating them. The manpower required is considerable: I have seen content-based recommendation engines that need teams of hundreds of people manually updating attributes/features on a daily basis. The quality of your recommendations in a content-based system is only as good as the data you have.

Demographic Recommendation Systems

Demographic recommendation systems are exactly what you think they are 🙂 They use user demographic information to make recommendations. These systems take into account all available demographic information, like gender, age, education, profession, occupation, race, ethnicity, income level, and location.

Take movies loved by women as an example: the system can cluster and rank these movies and recommend them to other female users. Most often this type of classification is too broad or coarse to be used as is, which is why demographic recommendation systems usually need to be used in conjunction with other systems.
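The cluster-and-rank step can be sketched as follows. This is only an illustration: the `users` and `history` data are invented, and segmenting on `age_band` plus `location` is an assumption. Any of the demographic fields listed above could be used instead.

```python
from collections import Counter

# Hypothetical user profiles and viewing history.
users = {
    "u1": {"age_band": "18-24", "location": "US"},
    "u2": {"age_band": "18-24", "location": "US"},
    "u3": {"age_band": "18-24", "location": "US"},
    "u4": {"age_band": "45-54", "location": "US"},
}
history = {
    "u1": ["Movie A"],
    "u2": ["Movie A", "Movie B"],
    "u3": ["Movie B", "Movie C"],
    "u4": ["Movie D"],
}


def segment_of(profile, keys=("age_band", "location")):
    """Reduce a profile to the demographic segment it belongs to."""
    return tuple(profile[key] for key in keys)


def recommend_demographic(user_id, users, history, top_n=3):
    """Rank movies by popularity among other users in the same segment."""
    segment = segment_of(users[user_id])
    seen = set(history.get(user_id, []))
    counts = Counter()
    for other_id, profile in users.items():
        if other_id != user_id and segment_of(profile) == segment:
            counts.update(set(history.get(other_id, [])) - seen)
    # Most popular first; ties broken alphabetically for stable output.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [movie for movie, _ in ranked[:top_n]]


print(recommend_demographic("u1", users, history))  # ['Movie B', 'Movie C']
```

"Movie D" never shows up for `u1` because its only viewer sits in a different segment, which also shows how coarse the result becomes when segments are this broad.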

Knowledge-Based Recommendation Systems

Knowledge-based recommendation systems use explicit knowledge about user preferences and criteria to make recommendations. This method is used where content-based and collaborative methods fail. Take recommending houses to a user, for example: the system would need to take into account a vast combination of house attributes as well as user preferences to make recommendations.
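For the house example, one simple way to picture this is a filter on hard requirements followed by a ranking on softer preferences. The sketch below assumes that split; the listings, field names, and the `nice_to_have` parameter are all hypothetical.

```python
# Hypothetical listings; every value here is made up.
houses = [
    {"id": "h1", "price": 450_000, "bedrooms": 3, "city": "Springfield", "has_garden": True, "has_garage": False},
    {"id": "h2", "price": 400_000, "bedrooms": 3, "city": "Springfield", "has_garden": False, "has_garage": False},
    {"id": "h3", "price": 600_000, "bedrooms": 4, "city": "Springfield", "has_garden": True, "has_garage": True},
    {"id": "h4", "price": 380_000, "bedrooms": 2, "city": "Springfield", "has_garden": True, "has_garage": True},
    {"id": "h5", "price": 420_000, "bedrooms": 3, "city": "Shelbyville", "has_garden": True, "has_garage": True},
]


def recommend_houses(houses, max_price, min_bedrooms, city, nice_to_have=()):
    """Filter on hard constraints, then rank by matched soft preferences."""
    matches = [
        house for house in houses
        if house["price"] <= max_price
        and house["bedrooms"] >= min_bedrooms
        and house["city"] == city
    ]

    def soft_score(house):
        # Count how many optional features this house actually has.
        return sum(1 for feature in nice_to_have if house.get(feature))

    # Most preferences matched first, cheaper house first on ties.
    matches.sort(key=lambda house: (-soft_score(house), house["price"]))
    return [house["id"] for house in matches]


print(recommend_houses(houses, 500_000, 3, "Springfield", nice_to_have=("has_garden",)))
# ['h1', 'h2']
```

No usage history is involved at all here: the recommendation comes entirely from what the user explicitly asked for and what is known about each house.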

Community-Based Recommendation Systems

Community-based recommendation systems use a user’s group or friends to make recommendations. This technique follows the proverb “Tell me who your friends are, and I will tell you who you are” 🙂 The system gets recommendations from your friends and the people you follow. With the dramatic growth of social networks over the last decade, this type of recommendation system has come to the forefront.
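A rough sketch of the idea: count what the people you follow liked and surface the most common items you have not seen yet. The `follows` graph, the `liked` data, and the plain counting scheme are assumptions for illustration only.

```python
from collections import Counter

# Hypothetical social graph and likes.
follows = {
    "alice": ["bob", "carol"],
    "bob": ["alice"],
    "carol": [],
}
liked = {
    "alice": {"Movie A"},
    "bob": {"Movie A", "Movie B"},
    "carol": {"Movie B", "Movie C"},
    "dave": {"Movie D"},
}


def recommend_from_community(user, follows, liked, top_n=3):
    """Rank unseen movies by how many followed users liked them."""
    seen = liked.get(user, set())
    counts = Counter()
    for friend in follows.get(user, []):
        counts.update(liked.get(friend, set()) - seen)
    # Most liked among friends first; ties broken alphabetically.
    ranked = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [movie for movie, _ in ranked[:top_n]]


print(recommend_from_community("alice", follows, liked))  # ['Movie B', 'Movie C']
```

The difference from collaborative filtering is where the neighbors come from: here they are the people the user chose to follow, not users the system found to be similar. That is why "Movie D" never reaches alice.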

Hybrid Recommendation Systems

This is by far the most commonly used approach in modern recommendation applications. Hybrid recommendation systems combine two or more of the methods discussed above to produce one output, for example by combining content-based and collaborative recommendation methods.

Every situation is kinda unique. Sometimes you carry out the content-based recommendation first, as this data does not change as frequently, and then layer collaborative filtering on top of it, using the output of the content filtering. There are also systems where both methods are intertwined. Hybrid methods provide the best of both worlds and help overcome the cold start and data sparsity problems.
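The layering described above could look roughly like this: the content-based step produces the candidate list, and collaborative scores re-rank it. This is one possible arrangement, not the only one, and the inputs (`content_candidates`, `collaborative_scores`) are hypothetical stand-ins for the outputs of the two underlying systems.

```python
def hybrid_recommend(user, content_candidates, collaborative_scores, top_n=3):
    """Cascade: take content-based candidates, re-rank by collaborative score."""
    user_scores = collaborative_scores.get(user, {})

    def score(title):
        # Titles with no collaborative signal fall back to 0.0, so they
        # still appear, just below titles that other users vouch for.
        return user_scores.get(title, 0.0)

    # sorted() is stable, so ties keep the content-based ordering.
    ranked = sorted(content_candidates, key=lambda title: -score(title))
    return ranked[:top_n]


# Hypothetical outputs of the two underlying systems.
content_candidates = ["Movie B", "Movie A", "Movie E"]
collaborative_scores = {"black_cape": {"Movie A": 0.67, "Movie Z": 0.9}}

print(hybrid_recommend("black_cape", content_candidates, collaborative_scores))
# ['Movie A', 'Movie B', 'Movie E']
print(hybrid_recommend("new_user", content_candidates, collaborative_scores))
# ['Movie B', 'Movie A', 'Movie E']
```

For `new_user` there is no collaborative data at all, so the content-based order passes through unchanged. That is the cold start fallback in miniature.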

Evaluation: Quantitative Analysis

Once you have built your recommendation system, you build an A/B testing framework. You can pit one recommendation system against another to evaluate their effectiveness. This is where you can tweak the algorithms and check whether the change produces a favorable result.
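At its simplest, such a framework needs two pieces: a stable way to assign each user to a variant, and a way to compare outcomes per variant. The sketch below assumes a 50/50 split by hashing the user ID and uses a plain conversion rate as the metric. The experiment name and event data are invented.

```python
import hashlib


def assign_variant(user_id, experiment="recs-v1"):
    """Deterministically place a user in variant A or B (50/50 split)."""
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    return "A" if int(digest, 16) % 2 == 0 else "B"


def conversion_rates(events):
    """events: iterable of (variant, converted) pairs -> rate per variant."""
    shown = {"A": 0, "B": 0}
    converted = {"A": 0, "B": 0}
    for variant, did_convert in events:
        shown[variant] += 1
        converted[variant] += int(did_convert)
    return {v: (converted[v] / shown[v] if shown[v] else 0.0) for v in shown}


# Hypothetical log: did the user act on a recommendation they were shown?
events = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", True), ("B", False), ("B", False),
]
print(conversion_rates(events))  # {'A': 0.25, 'B': 0.5}
```

Hashing keeps a user in the same variant across sessions without storing the assignment. With real traffic you would also want a significance test before declaring a winner, since eight events prove nothing.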

Primarily you are looking for both explicit and implicit user feedback when evaluating one system against the other. Explicit feedback is when the user takes a specific action to rate or give feedback on the recommendation, while implicit feedback is when you monitor the user’s actions/activity to gain insight into a positive or negative bias toward the recommendations.

Explicit feedback in a scenario like Netflix would be when a recommended movie was actually viewed and rated by the user, i.e. the user watched and rated the movie recommended by the system. Implicit feedback would be when the user only watches the movie but does not take any action to rate it. You can also measure things like a recommended movie being watched for less than 15 minutes, which implies that the user tried the movie but did not like it enough to continue watching.
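These rules translate almost directly into code. The sketch below uses the 15 minute cutoff from the example; the `ViewEvent` type, the 1 to 5 rating scale, and treating a rating of 4 or more as positive are assumptions of mine.

```python
from dataclasses import dataclass
from typing import Optional

ABANDON_THRESHOLD_MIN = 15  # from the example: watched for less than 15 minutes
POSITIVE_RATING = 4         # assumed: ratings run 1 to 5, 4+ counts as positive


@dataclass
class ViewEvent:
    minutes_watched: float
    rating: Optional[int] = None  # None when the user did not rate the movie


def classify_feedback(event):
    """Return (kind, signal) for one recommended movie that was opened."""
    if event.rating is not None:
        # The user took a deliberate action: explicit feedback.
        signal = "positive" if event.rating >= POSITIVE_RATING else "negative"
        return ("explicit", signal)
    # No rating: infer the signal from viewing behavior alone.
    if event.minutes_watched < ABANDON_THRESHOLD_MIN:
        return ("implicit", "negative")
    return ("implicit", "positive")


print(classify_feedback(ViewEvent(minutes_watched=110, rating=5)))  # ('explicit', 'positive')
print(classify_feedback(ViewEvent(minutes_watched=95)))             # ('implicit', 'positive')
print(classify_feedback(ViewEvent(minutes_watched=10)))             # ('implicit', 'negative')
```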
