Select your favorite method to visualize the cluster hierarchy.
The default coloring only reflects the depth of the cluster.
You can enable a color overlay to denote cluster compactness or modularity.
Whisper Behavioral Clusters
About This Tool
We built a tool that automatically discovers naturally forming user categories in Whisper, based on user behavior as captured by clickstream data. A user's clickstream is the sequence of click events generated while the user interacts with the app. At a high level, we assume that human behavior naturally forms clusters, because large user populations tend to break into different types of users according to their habits, goals, and personalities. Our tool seeks to discover these natural clusters, where each cluster represents a specific user type or behavioral pattern. Because the resulting clusters are unlabeled and do not necessarily match preconceived categories, we identify, for each cluster, the key behavioral patterns that are primarily responsible for distinguishing its users from others. These key patterns can then serve as cluster labels.
Our tool builds clusters hierarchically: clusters are nested, with larger clusters representing higher-level categories of users and smaller sub-clusters representing more fine-grained user types. The circle size represents the number of users in a cluster.
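As a rough illustration of the nesting described above, the hierarchy can be thought of as a tree in which each node knows how many users it contains. The sketch below is not the tool's actual data model; the `Cluster` class, the `walk` helper, the cluster IDs, and the user counts are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class Cluster:
    """One node in a hypothetical cluster hierarchy."""
    cluster_id: str
    users: int = 0  # users held directly by a leaf cluster
    children: List["Cluster"] = field(default_factory=list)

    def size(self) -> int:
        """Number of users in this cluster, including all sub-clusters."""
        if not self.children:
            return self.users
        return sum(child.size() for child in self.children)


def walk(cluster: Cluster, depth: int = 0) -> Iterator[Tuple[str, int, int]]:
    """Yield (cluster_id, depth, size) for the cluster and its descendants."""
    yield cluster.cluster_id, depth, cluster.size()
    for child in cluster.children:
        yield from walk(child, depth + 1)


# Made-up example: two top-level clusters, one of which has two sub-clusters.
root = Cluster("root", children=[
    Cluster("A", children=[Cluster("A1", users=30), Cluster("A2", users=20)]),
    Cluster("B", users=50),
])

for cluster_id, depth, size in walk(root):
    print("  " * depth + f"{cluster_id}: {size} users")
```

In a view like the one this page describes, `size` would drive the circle size and `depth` the default coloring.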
How to Use this Tool
The clusters displayed on the left are generated from the clickstreams of 100k Whisper users. They represent the major categories of user behavior in Whisper. You can browse the clusters to understand each behavioral pattern in detail. To make the clusters easy to reference, we add labels to the top-level clusters (the text at the bottom of each cluster). For example, the biggest cluster contains users who are likely to read whispers sequentially.
Double-click any cluster to zoom in on it for more focused inspection. To zoom out, double-click the current cluster.
Clicking on a cluster opens a pop-up window with more detailed information about that cluster. Here is how to read it. The top of the window shows the ClusterID and the Number of Users in the cluster. Below that is a list of Action Patterns that characterize the behavior of users in the cluster, one Action Pattern per row. The Action Patterns are ranked by how well they distinguish users in this cluster from other users (more important patterns on top). Only the most important Action Patterns are displayed; the rest are omitted because they are significantly weaker.
- The first column shows the Rank of the Action Pattern. A higher-ranked pattern is more important in classifying users in this cluster.
This is an example Action Pattern:
View Whisper 1M View Whisper 1M View Whisper
This pattern indicates that users tend to "view whispers" one after another, with time gaps of less than a minute between views. A pattern may appear repeatedly in a user's clickstream. Note that time gaps have been discretized into time gap events:
- 1S: < 1 second
- 1M: [1 second, 1 minute)
- 1H: [1 minute, 1 hour)
- 1D: [1 hour, 1 day)
- 1D+: > 1 day
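The discretization of time gaps and the matching of an Action Pattern against a clickstream can be sketched as follows. This is a minimal illustration under stated assumptions, not the tool's implementation: the function names, the `(timestamp, event)` input format, and the choice to count overlapping matches are hypothetical, and a gap of exactly one day is assumed to fall into `1D+`.

```python
from typing import List, Sequence, Tuple

# Bucket boundaries in seconds, following the time gap events listed above.
MINUTE, HOUR, DAY = 60, 3600, 86400


def gap_token(seconds: float) -> str:
    """Map a time gap in seconds to its discretized time gap event."""
    if seconds < 1:
        return "1S"
    if seconds < MINUTE:
        return "1M"
    if seconds < HOUR:
        return "1H"
    if seconds < DAY:
        return "1D"
    return "1D+"  # assumption: a gap of exactly one day lands here


def discretize(clicks: Sequence[Tuple[float, str]]) -> List[str]:
    """Interleave click events with time gap events.

    `clicks` is a time-ordered sequence of (timestamp_in_seconds, event_name).
    """
    tokens: List[str] = []
    for i, (ts, name) in enumerate(clicks):
        if i > 0:
            tokens.append(gap_token(ts - clicks[i - 1][0]))
        tokens.append(name)
    return tokens


def count_pattern(tokens: Sequence[str], pattern: Sequence[str]) -> int:
    """Count contiguous occurrences of `pattern` (overlaps allowed, by assumption)."""
    n, m = len(tokens), len(pattern)
    return sum(1 for i in range(n - m + 1) if list(tokens[i:i + m]) == list(pattern))


# Made-up clickstream: three views in quick succession, then one much later.
clicks = [(0, "View Whisper"), (20, "View Whisper"),
          (45, "View Whisper"), (1000, "View Whisper")]
tokens = discretize(clicks)
pattern = ["View Whisper", "1M", "View Whisper", "1M", "View Whisper"]
print(tokens)
print(count_pattern(tokens, pattern))  # 1
```

The per-user count returned by `count_pattern` is the kind of quantity a frequency distribution could be built from.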
The Frequency Distribution shows how frequently the Action Pattern appears among users in this cluster versus users outside the cluster.
You can see how users in this cluster differ from users outside the cluster for this particular Action Pattern. The red bars show the pattern frequency distribution (PDF) for users within the selected cluster. As a baseline comparison, the green bars show the pattern frequency distribution for users outside the cluster. In this example, the red distribution is skewed further to the right, indicating that users in this cluster perform this activity more frequently than those outside (i.e., they are more likely to read whispers sequentially). The more the two distributions differ, the more useful the pattern is for characterizing users in this cluster.
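One way to picture the two sets of bars is as normalized histograms over per-user pattern counts. The sketch below assumes that "frequency" means the number of times the pattern occurs in a user's clickstream and that counts are grouped into fixed-width bins with an open-ended last bin; the function name, binning scheme, and all numbers are hypothetical and not taken from the tool.

```python
from typing import List, Sequence


def frequency_pdf(counts: Sequence[int], num_bins: int, bin_width: int) -> List[float]:
    """Fraction of users falling into each bin of per-user pattern counts.

    Counts at or beyond the last bin's lower edge are placed in the last bin.
    """
    if not counts:
        return [0.0] * num_bins
    hist = [0] * num_bins
    for c in counts:
        hist[min(c // bin_width, num_bins - 1)] += 1
    return [h / len(counts) for h in hist]


# Made-up per-user pattern counts.
inside = [4, 5, 7, 9, 2, 8]           # users in the selected cluster (red bars)
outside = [0, 1, 0, 2, 3, 1, 0, 5]    # users outside the cluster (green bars)

inside_pdf = frequency_pdf(inside, num_bins=4, bin_width=2)
outside_pdf = frequency_pdf(outside, num_bins=4, bin_width=2)
print(inside_pdf)
print(outside_pdf)   # [0.625, 0.25, 0.125, 0.0]
```

In this toy example the in-cluster mass sits in the higher bins, which corresponds to a distribution skewed to the right.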
- Finally, the Score column shows the Chi-Square score of the Action Pattern. We rely on this score to rank Action Patterns (a higher score means the pattern is more important). Scores are colored from higher to lower.
- Inactive Users: This cluster contains about 20% of the users in the dataset. Users in this cluster usually do nothing but passively receive push notifications. These users are likely to go dormant (quit using the app), and further action is needed to re-engage them.
- Block Users in Chat: Users in this cluster tend to block others during private chats. It is likely that these users are being harassed by others or are having unpleasant conversations.
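The Chi-Square score used to rank Action Patterns can be illustrated with a standard Pearson chi-square statistic. This page does not say exactly how the tool computes the score, so the sketch below is only one plausible reading: it assumes a 2 x k contingency table whose rows are users inside and outside the cluster and whose columns are pattern-frequency bins. The function name, pattern names, and counts are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple


def chi_square_score(inside: Sequence[int], outside: Sequence[int]) -> float:
    """Pearson chi-square statistic for a 2 x k contingency table.

    inside[j] and outside[j] are the numbers of users in frequency bin j
    inside and outside the cluster, respectively.
    """
    if len(inside) != len(outside):
        raise ValueError("both rows need the same number of bins")
    rows: List[List[int]] = [list(inside), list(outside)]
    row_totals = [sum(r) for r in rows]
    col_totals = [a + b for a, b in zip(inside, outside)]
    grand = sum(row_totals)
    if grand == 0:
        return 0.0
    score = 0.0
    for i, row in enumerate(rows):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand
            if expected > 0:  # skip bins that contain no users at all
                score += (observed - expected) ** 2 / expected
    return score


# Made-up bin counts for two hypothetical patterns: (inside, outside).
patterns: Dict[str, Tuple[List[int], List[int]]] = {
    "pattern A": ([10, 20, 70], [60, 30, 10]),   # distributions differ a lot
    "pattern B": ([30, 40, 30], [32, 38, 30]),   # distributions nearly equal
}

ranked = sorted(patterns, key=lambda p: chi_square_score(*patterns[p]), reverse=True)
print(ranked)  # ['pattern A', 'pattern B']
```

The statistic is zero when both groups have the same distribution over bins and grows as they diverge, which matches the idea that a higher score marks a more distinguishing pattern.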