From the course: Machine Learning with Python: Foundations

What is unsupervised learning? - Python Tutorial

From the course: Machine Learning with Python: Foundations

What is unsupervised learning?

- [Instructor] Unsupervised machine learning is a process of building a descriptive model. We can think of descriptive models as machine learning models that help us summarize and group data in new and interesting ways. When we have data that may have patterns that we cannot readily observe, we use a descriptive model to help us identify those patterns. To illustrate how unsupervised learning works, let's assume that we are part of the analytics team at the local credit union. Our task is to figure out better ways to market our products to our bank card customers. Instead of creating a marketing strategy for each customer, we could decide to use machine learning to group or segment our customers based on similarity, and then create a marketing strategy for each segment. Let's also assume that we already have two kinds of information about our customers. The first is historical information about the spending habits of each bank card customer. With this information, we could assign a spending score to each customer, depending on how often they use their bank card and how much money they spent on an annual basis. The second kind of information we have is demographic information. This includes age, gender, annual income, educational level, and so forth. In order to segment customers based on similarity, we pass the historical data to an unsupervised machine learning model. And this type of model is called an unsupervised model because there is no pre-existing rubric on which to evaluate the models output. In other words, there is no external supervisor telling the model whether its output is right or wrong. Unsupervised machine learning models analyze the input they receive using a descriptive approach in order to highlight the patterns or interesting relationships that exist in the data. To add some specifics to the previous example, let's say that we have the data shown here for 10 of our bank card customers. Our objective is to group customers based on spending score and income level alone. On a two dimensional plane, each of our 10 customers could be represented this way. Using just a spending score and income level, an unsupervised machine learning model could automatically evaluate each customer and group them based on how similar they are to other customers. With each customer assigned to a group, we then assign them a segment label. This is customer segmentation. With this information, we can develop a marketing strategy targeted to each segment with the expectation that customers within each group will likely respond to a campaign in a similar way because they share similar characteristics.

Contents