How to Segment EBAY Mobile Buyers?
Author:Li Zhong
Gartner Predicts Mobile Web Beats PC by 2013.
By 2013, mobile phones will overtake PCs as the most common Web access device worldwide. According to Gartner's PC installed base forecast, the total number of PCs in use will reach 1.78 billion units in 2013. By 2013, the combined installed base of smartphones and browser-equipped enhanced phones will exceed 1.82 billion units and will be greater than the installed base for PCs thereafter.
The same trend is reflected among EBAY mobile buyers (see Fig 1-1 and Fig 1-2). Under EBAY's strategy for the next three years, mobile will be the key contributor to the EBAY marketplace. In this case study, I dive into EBAY mobile buyer segmentation, which is the cornerstone of EBAY mobile marketing, and apply the k-means++ algorithm instead of the classical 125 RFM categories to segment the buyers.
Fig 1-1: Global Mobile vs. Desktop Internet User Projection
Fig 1-2: Global TMV by Retail Weeks
1. RFM introduction
RFM is a method for analyzing customer behavior and defining market segments. It is commonly used in database marketing and direct marketing and has received particular attention in retail. The RFM model was proposed by Hughes in 1994 and has been widely used in direct marketing since. The model characterizes customer behavior by three variables, which give RFM its name:
- Recency - How recently did the customer purchase?
- Frequency - How often do they purchase?
- Monetary Value - How much do they spend?
2. Data Preprocessing
For simplicity, this case study segments only EBAY USA mobile buyers. The same approach can be extended to other countries.
2.1 RFM Preparation
Data preparation is one of the most important parts of the data mining process. In this step, the EBAY USA mobile buyers' purchase data for the previous 12 months (2011/07 to 2012/07) is converted to a format suitable for the RFM model (see the source code attached in the appendix). The data set contains 5,430,406 US mobile buyers. The three metrics used by the RFM model are calculated as follows.
- Recency - The interval between the buyer's latest purchase and the present, calculated as the present date minus the latest purchase date. A smaller value means a more recent purchase.
- Frequency - The number of transactions the buyer made within the last 12 months, calculated as a count of the buyer's transactions.
- Monetary - The cumulative amount of money (USD) spent by the buyer, calculated as the sum of the buyer's transaction amounts.

| No. | Buyer id | Recency | Frequency | Monetary |
|---|---|---|---|---|
| 1 | 69336277 | 2 | 2 | 177 |
| 2 | 445960625 | 119 | 1 | 25 |
| 3 | 55812448 | 47 | 6 | 260 |
| 4 | 34443332 | 91 | 19 | 749 |
| 5 | 117080525 | 267 | 1 | 34 |
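The three metrics defined in this section can be sketched in a few lines of Python. This is a minimal illustration, not the appendix source code: the function name `compute_rfm`, the `(buyer_id, purchase_date, amount_usd)` tuple layout, the buyer ids "A" and "B", and the purchase dates are all hypothetical, and recency is assumed to be measured in days.

```python
from datetime import date

def compute_rfm(transactions, present):
    """Aggregate (buyer_id, purchase_date, amount_usd) rows into R, F, M per buyer.

    Assumption: recency is the number of days between the latest purchase
    and `present`, so a smaller recency means a more recent purchase.
    """
    per_buyer = {}
    for buyer_id, purchase_date, amount in transactions:
        latest, count, total = per_buyer.get(buyer_id, (None, 0, 0.0))
        if latest is None or purchase_date > latest:
            latest = purchase_date
        per_buyer[buyer_id] = (latest, count + 1, total + amount)
    return {
        buyer_id: {
            "recency": (present - latest).days,  # days since latest purchase
            "frequency": count,                  # number of transactions
            "monetary": total,                   # total USD spent
        }
        for buyer_id, (latest, count, total) in per_buyer.items()
    }

# Made-up transactions for two hypothetical buyers.
sample = [
    ("A", date(2012, 7, 29), 100.0),
    ("A", date(2012, 6, 1), 77.0),
    ("B", date(2012, 4, 3), 25.0),
]
result = compute_rfm(sample, present=date(2012, 7, 31))
print(result)
```

On real data the same aggregation would normally be done in the warehouse or with a group-by, but the per-buyer logic stays the same: latest date, count, and sum.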
2.2 RFM Metrics Visualization
Before applying the segmentation model to the RFM data set, we need to visualize the distribution of each of the three metrics (R, F, M) to decide how to normalize them.
Fig 2-2-1: Recency Histogram & Density Graph
Fig 2-2-2: Frequency Histogram & Density Graph
Fig 2-2-3: Monetary Histogram & Density Graph
2.3 RFM Normalization
Because recency, frequency and monetary all have skewed distributions, I adopt log normalization, which uses logarithms to better represent highly skewed data. Log normalization is helpful when most values cluster around small values and only a few are large.
Because recency correlates negatively with customer loyalty, log_recy is defined as 1 minus the log-normalized recency value. I choose the natural log rather than log base 10: as Figure 2-3-2 shows, log base 10 compresses the values substantially, whereas the natural log still preserves enough variation to capture meaningful change.
- Normalized recency
- Normalized frequency
- Normalized monetary
Fig 2-3-1: Natural Log Normalization Result
| No. | Buyer ID | Recy | Freq | Monty | Log_recy | Log_freq | Log_monty |
|---|---|---|---|---|---|---|---|
| 1 | 69336277 | 2 | 2 | 177 | 0.81396359 | 0.1860364 | 0.3456098 |
| 2 | 445960625 | 119 | 1 | 25 | 0.18929748 | 0.1173759 | 0.2173055 |
| 3 | 55812448 | 47 | 6 | 260 | 0.34445998 | 0.3295158 | 0.3711372 |
| 4 | 34443332 | 91 | 19 | 749 | 0.23428102 | 0.5072902 | 0.4415395 |
| 5 | 117080525 | 267 | 1 | 34 | 0.0532355 | 0.1173759 | 0.2371313 |
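The exact normalization formulas are not reproduced in this text, so the following is only a sketch under stated assumptions: each metric is transformed as ln(x + 1) and scaled by ln(max + 1), and recency is then flipped by subtracting from 1. The function names and the maximum of 366 are assumptions; 366 was picked because it appears to reproduce the sample Log_recy and Log_freq values in Fig 2-3-1, which does not prove it is the author's formula.

```python
import math

def log_normalize(value, max_value):
    # Assumed form: ln(x + 1) scaled into [0, 1] by ln(max + 1).
    return math.log(value + 1) / math.log(max_value + 1)

def log_recency(recency, max_recency):
    # Recency correlates negatively with loyalty, so flip the scale:
    # a recent purchase (small recency) yields a value close to 1.
    return 1.0 - log_normalize(recency, max_recency)

ASSUMED_MAX = 366  # hypothetical maximum, inferred from the sample rows

for recency, frequency in [(2, 2), (119, 1), (47, 6), (91, 19), (267, 1)]:
    print(
        recency,
        frequency,
        round(log_recency(recency, ASSUMED_MAX), 7),
        round(log_normalize(frequency, ASSUMED_MAX), 7),
    )
```

The + 1 offset keeps the transform defined when a metric is 0, and scaling by the maximum puts all three metrics on a comparable 0 to 1 range before clustering.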
Fig 2-3-2: Monetary, with absolute value and ln and log10 transformations
3. Clustering
3.1 Cluster Number K Parameter Estimation
Given the 8 possible RFM variations (2*2*2, with each of R, F and M being either high or low) and the need to keep marketing operations simple, I choose 8 as the number of clusters. The SSE (sum of squared errors) graph below also shows that 8 is a reasonable number for the model.
Fig 3-1: SSE (Sum of Squared Errors) by Number of Clusters
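An SSE curve like the one in Fig 3-1 can be produced by running k-means for a range of k values and recording the sum of squared distances from each point to its nearest center. The sketch below uses a small pure-Python k-means with plain random seeding; the function names and the nine toy points are hypothetical and stand in for the normalized (R, F, M) vectors.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_sse(points, k, iterations=20, seed=0):
    """Run a basic k-means and return the final sum of squared errors."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # plain random seeding, not k-means++
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: squared_distance(p, centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its members.
        centers = [
            tuple(sum(dim) / len(members) for dim in zip(*members))
            if members else centers[i]
            for i, members in enumerate(clusters)
        ]
    return sum(min(squared_distance(p, c) for c in centers) for p in points)

# Toy data: three tight groups in normalized (R, F, M) space.
points = [
    (0.10, 0.10, 0.10), (0.12, 0.10, 0.15), (0.15, 0.12, 0.10),
    (0.50, 0.50, 0.50), (0.52, 0.55, 0.50), (0.55, 0.50, 0.52),
    (0.90, 0.90, 0.90), (0.92, 0.90, 0.95), (0.95, 0.92, 0.90),
]
sse = {k: kmeans_sse(points, k) for k in range(1, 5)}
for k, value in sse.items():
    print(k, round(value, 4))
```

Plotting these values against k and looking for the point where the curve flattens is the usual way to judge whether a chosen k, such as 8 here, is reasonable.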
3.2 K-means++ Cluster
I use the k-means++ algorithm to segment the mobile customers, for the following reason.
The k-means++ algorithm chooses the initial cluster centers carefully before proceeding with the standard k-means optimization iterations. With the k-means++ initialization, the algorithm is guaranteed, in expectation, to find a solution that is O(log k)-competitive with the optimal k-means solution.
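The initialization step can be sketched as follows: the first center is drawn uniformly at random, and each later center is drawn with probability proportional to its squared distance from the nearest center already chosen. This is a minimal pure-Python illustration, not the code used in this study; the function names and the toy points are hypothetical.

```python
import random

def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers with k-means++ seeding.

    Assumes `points` contains at least k distinct points.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]  # first center: uniform at random
    while len(centers) < k:
        # Weight each point by its squared distance to the nearest chosen center,
        # so far-away points are more likely to become the next center.
        weights = [min(squared_distance(p, c) for c in centers) for p in points]
        centers.append(rng.choices(points, weights=weights, k=1)[0])
    return centers

# Toy normalized (R, F, M) vectors.
points = [
    (0.10, 0.10, 0.10), (0.12, 0.10, 0.15), (0.15, 0.12, 0.10),
    (0.50, 0.50, 0.50), (0.52, 0.55, 0.50), (0.55, 0.50, 0.52),
    (0.90, 0.90, 0.90), (0.92, 0.90, 0.95), (0.95, 0.92, 0.90),
]
centers = kmeans_pp_init(points, k=3)
print(centers)
```

A point that is already a center has weight 0 and is never picked again, which is why the seeds tend to spread across the data instead of piling up in one dense region.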
After applying the k-means++ algorithm to the EBAY mobile buyer purchase data, we get 8 segments, which are visualized in 2D and 3D in the graphs below.
3.3 Customer Segment Summary Table
3.4 2D Customer Segmentation Visualization
3.5 3D Customer Segmentation Visualization
4. Case Study Summary
Figure 3-3 presents the result, listing eight clusters, each with its cluster number, average actual and normalized R, F and M values, cluster size, percentage of total customers, cluster score, cluster rank, RFM pattern, and customer loyalty. The last row shows the overall average for all customers. For the RFM pattern, if a cluster's average R, F or M value exceeds the overall average for that metric, an upward arrow ↑ is shown; otherwise a downward arrow ↓ is shown. Each segment has very clear characteristics.
Fig 4-1: EBAY Mobile Buyer Eight Segments (% of Total Customers & % of Total GMV)
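The arrow rule for the RFM pattern is simple enough to express directly. The sketch below assumes the comparison is made on normalized values, where a higher R means a more recent purchase; the function name and the averages are made up for illustration and are not taken from the summary table.

```python
def rfm_pattern(cluster_avg, overall_avg):
    """Return a pattern such as 'R↑F↓M↑' for one cluster.

    A metric gets an up arrow when the cluster average exceeds the
    overall average, and a down arrow otherwise.
    """
    return "".join(
        f"{name}{'↑' if cluster_avg[name] > overall_avg[name] else '↓'}"
        for name in ("R", "F", "M")
    )

# Hypothetical normalized averages.
overall = {"R": 0.40, "F": 0.20, "M": 0.30}
best = {"R": 0.80, "F": 0.50, "M": 0.45}
spender = {"R": 0.25, "F": 0.15, "M": 0.42}

print(rfm_pattern(best, overall))
print(rfm_pattern(spender, overall))
```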
- C7 (Best, 3.23%, R↑F↑M↑) - The most valuable customer segment. It consists of customers who have purchased recently and who have higher-than-average purchase frequency and purchase amount. A potential marketing action is to initiate a VIP program to keep and nurture this high-value segment.
- C2 (Valuable, 7.49%, R↑F↑M↑) - The next most valuable customer segment, with nearly the same characteristics as C7's customers. The same marketing action as for the Best segment can be applied.
- C5 (Shopper, 5.30%, R↑F↑M↓) - The frequent shopper segment. These customers purchase frequently but with low monetary value, so a potential marketing action is to recommend cross-sell or up-sell items to increase their order size.
- C8 (Churn, 13.19%, R↓F↑M↑) - The churn segment. These customers have made a high number of purchases with high monetary value, but not for a long time, which indicates a likelihood of churn. A potential marketing action is to initiate a customer reactivation program for this segment.
- C4 (Recent Visitor, 11.95%, R↑F↓M↓) - The recent visitor segment. These customers have recently visited the EBAY site, with higher recency and lower purchase frequency and monetary value. A potential marketing action is to recommend discount or promotion programs to them frequently in order to convert them into a high-value segment.
- C3 (Recent Visitor Churn, 18.27%, R↑F↓M↓) - The recent visitor churn segment. These customers visited the EBAY site not long ago, with higher recency and lower purchase frequency and monetary value, but show signs that they will churn. A potential marketing action is to initiate a customer reactivation program for this segment.
- C6 (Spender, 19.91%, R↓F↓M↑) - The spender segment. These customers have not visited the EBAY site recently or frequently, but when they do come they purchase a lot. A potential marketing action is to recommend items they like in order to attract them to visit EBAY more often.
- C1 (Uncertain, 20.64%, R↓F↓M↓) - The least valuable segment for the EBAY business. These customers are generally the least likely to buy again, so no marketing plan is proposed for this segment.