Predictive Analytics in E-commerce Applications
Abstract: This paper outlines the principles of machine learning and predictive analytics and explains the fundamentals of big data and evolving E-commerce technology. Business over the web has grown tremendously in recent times. Organizations around the globe are realizing that E-commerce is no longer just about buying and selling over the internet; it is about how dynamic and interactive the user experience can be. By analyzing customer behaviour, an organization can improve sales and retain customers. Several customer behaviours can be predicted using predictive analytics in an E-commerce organization, so it is important to start with the outcomes that one would like to predict. This paper focuses on tracking customers’ purchasing behaviour to improve sales by recommending items to customers on an E-commerce platform. A case study shows how to build a simple recommender system in Python that suggests items to a customer.
Keywords: E-commerce; technology; predictive analytics; customer purchasing behaviour; shopping pattern; recommender system; predictive modelling
I. Introduction
In today’s era of online commerce, predictive analytics plays a crucial role. Predictive analytics can help an organization grow in several ways, so it is important to identify which use is relevant to the business and to pick the area that creates the greatest opportunity. Possible targets include increasing revenue, detecting fraud, optimizing customer service, reducing costs and gaining insight into customer behavior. Once an appropriate target is selected, predictive analytics can give an online retailer a substantial competitive advantage. There are some limitations: models need to pass quality checks before deployment, and human intervention is still necessary to run and maintain them. However, the advantages outweigh the drawbacks, and benefits can be observed soon after deployment. Several trends are making their way to the forefront of business today. Recommendation engines, similar to those used by Netflix and Amazon, use past purchases and buying behavior to recommend new purchases to consumers. Other examples are risk engines for forecasting market strategy, innovation engines for new product development, customer insight engines, and optimization engines for complex operations and decision making. Today we are at the tip of the iceberg in applying predictive analytics to real-world problems. The predictive analytics approach unlocks the value of data. In short, it allows us to make informed predictions about the future: data science algorithms can estimate who is likely to buy, cheat, lie, or die in the near future.
II. Introduction to Predictive Modelling
Predictive modelling is an ensemble of statistical algorithms, coded in a statistical tool, which, when applied to historical data, outputs a mathematical function or equation. This function can in turn be used to predict outcomes from future inputs (on which the model operates) in order to serve a business context or enable better decision making in general. Predictive modelling has continued to generate a great deal of interest in recent years (Konnie L. Wescott, R. Joe Brandon, 1999, 6). To understand what predictive modelling is, let us look at each of the terms in this definition.
A. Ensemble of statistical algorithms
Statistics are important for understanding data and reveal a great deal about it. How is the data distributed? Is it centered with little variance, or does it vary widely? Statistics helps us answer these questions. Algorithms, on the other hand, are the blueprints of a model. They are responsible for creating mathematical equations from historical data: they analyze the data, quantify the relations between the variables and convert them into a mathematical equation. There is a variety of them: linear regression, logistic regression, clustering, decision trees, natural language processing and so on. These algorithms fall into two classes: supervised algorithms and unsupervised algorithms.
Supervised algorithms: These are algorithms in which the historical data contains an output variable in addition to the input variables. The model makes use of this known output, together with the input variables, when it is built. Examples include linear regression, logistic regression and decision trees.
Unsupervised algorithms: These algorithms work without an output variable in the historical data. Clustering is an example.
B. Historical data
In general, a model is built on historical data and applied to future data. Additionally, a predictive model can be used to fill in missing values in historical data by interpolating over sparse records. Because future data is unavailable during modelling, part of the historical data is sampled and held out to act as future data.
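The idea of holding out part of the historical data to stand in for future data can be sketched in a few lines of Python. This is only an illustration: the function name holdout_split, the 20 percent holdout fraction and the placeholder records are assumptions, not part of the case study.

```python
import random


def holdout_split(records, test_fraction=0.2, seed=0):
    """Shuffle historical records and set a fraction aside to act as 'future' data."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    n_test = int(round(len(shuffled) * test_fraction))
    # The first n_test shuffled records become the holdout set, the rest train the model.
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical placeholder records standing in for rows of historical data.
history = list(range(10))
train, holdout = holdout_split(history)
print(len(train), "training records,", len(holdout), "held-out records")
```

The model is then fitted on the training records only, and its predictions are checked against the held-out records.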
C. Mathematical function
Most data science algorithms have underlying mathematics. In many algorithms, such as regression, the form of the equation is assumed and its parameters are derived by fitting the equation to the data.
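As a minimal sketch of deriving parameters by fitting, the following Python code assumes a straight-line equation y = slope * x + intercept and estimates the two parameters by ordinary least squares. The function name fit_line and the sample points are illustrative assumptions.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both without the 1/n factor, which cancels).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data generated from y = 2x + 1.
xs = [1, 2, 3, 4, 5]
ys = [3, 5, 7, 9, 11]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)  # 2.0 1.0
```

Once the parameters are known, the fitted equation can be evaluated on inputs that were not in the historical data.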
D. Business context
All the effort that goes into predictive analytics, and all the worth that accrues to data, exists because it solves a business problem. Business problems can be anything and vary widely.
As discussed above, predictive modelling is an interdisciplinary field that sits at the interface of, and requires knowledge of, four disciplines: statistics, algorithms, tools and techniques, and business sense.
III. Recommender System
Recommender systems are widely used in the E-commerce market to give each customer personalized recommendations of other products. “In a world where a site’s competitors are only a click or two away, gaining customer loyalty is an essential business strategy” (Reichheld and Sasser, 1990; Reichheld, 1993). The recommended products can be anything, for example physical goods, films, music, articles, social tags and services. The system enriches the online experience, increases the conversion rate and affects revenues positively (Schafer, Konstan and Riedl, 1999). Theoretically, recommender systems are a “spectrum of systems describing any system that provides individualization of the recommendation results and leads to a procedure that helps users in a personalized way to interesting or useful objects in a large space of possible options” (Lampropoulos and Tsihrintzis 2015, p.1). A recommender system helps its user by filtering an overload of information and providing the most appropriate and valuable information for that specific user. To make recommendations, information about the user’s preferences is required in order to predict the user’s rating for items they have not encountered before. There are three methods of collecting knowledge about user preferences: implicit, explicit and mixed. The implicit approach does not require any active involvement from the user and is based on recording user behavior; a typical example of an implicit rating is historical purchase data. The explicit approach is based on asking the user to specify their preference for particular items. The mixed approach combines the previous two. There are two main approaches to designing a recommender system: content-based methods and collaborative methods. Content-based methods assume that a user’s preferences remain unchanged over time, so that future actions can be predicted from past behavior.
In other words, all the information stored about the user is used to customize the services offered. The main assumption of collaborative filtering, by contrast, is that similar users prefer similar items. This method relies entirely on interest ratings from users and can be divided into two branches: model-based and memory-based. Model-based algorithms use statistical and machine-learning techniques to make predictions from the underlying data. Memory-based methods can be further divided into two classes: user-based and item-based. User-based collaborative systems calculate user-to-user similarity by matching the user against a database of other users who have similar interests. Items that those users have bought, but that are unknown to the specific user, are offered as recommendations. The item-based collaborative system, on the other hand, matches a specific item against a database of other items. This approach is therefore based on item relations rather than user relations, and it makes the final prediction from similarities between items that have been rated by a common user.
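The user-based, memory-based variant can be illustrated with a short Python sketch. It assumes implicit ratings in the form of purchase sets and uses Jaccard similarity between users; the similarity measure, the function names and the sample purchases are all assumptions chosen for illustration, and other measures such as cosine similarity are equally common.

```python
def jaccard(a, b):
    """Similarity of two purchase sets: shared items divided by all items."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_user_based(target, purchases):
    """Rank items bought by similar users that the target user has not bought."""
    target_items = purchases[target]
    scores = {}
    for other, items in purchases.items():
        if other == target:
            continue
        similarity = jaccard(target_items, items)
        if similarity == 0:
            continue  # users with nothing in common contribute no recommendations
        for item in items - target_items:
            scores[item] = scores.get(item, 0.0) + similarity
    # Highest accumulated similarity first; ties broken by item name.
    return sorted(scores, key=lambda item: (-scores[item], item))


# Hypothetical purchase history.
purchases = {
    "u1": {"laptop", "mouse"},
    "u2": {"laptop", "mouse", "router"},
    "u3": {"novel"},
}
recs = recommend_user_based("u1", purchases)
print(recs)  # ['router']
```

An item-based system would apply the same kind of similarity calculation to pairs of items instead of pairs of users.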
To build a recommender system that recommends products to customers, we will use collaborative filtering. Collaborative filtering works on just three pieces of data: a user (or customer), an item, and an affinity score between the user and the item.
IV. Examples of recommender systems
In this section we look at a few well-known E-commerce companies that use one or more variations of recommender system technology on their websites.
A. Amazon.com
Amazon uses recommender systems in many areas: Amazon videos, the Amazon Appstore, Amazon logistics, web page recommendations, and customer and seller services. Book recommendations illustrate the approach.
For books, Amazon uses the “Customers Who Bought” feature, which is found on the information page for each book in the catalog and presents two recommendation lists. The first recommends books frequently purchased by customers who purchased the selected book. The second recommends authors whose books are frequently purchased by customers who purchased works by the author of the selected book.
B. Netflix
More than 80 percent of the TV shows people watch on Netflix are discovered through the platform’s recommendation system. That means the majority of what viewers decide to watch on Netflix is the result of decisions made by machine learning algorithms. Netflix uses these algorithms to help break viewers’ preconceived notions and find shows that they might not have chosen initially.
C. eBay
The Feedback Profile feature at eBay.com™ (www.ebay.com) allows both buyers and sellers to contribute to the feedback profiles of other customers with whom they have done business. The feedback consists of a satisfaction rating (satisfied/neutral/dissatisfied) as well as a specific comment about the other customer. Feedback serves as a recommender system for purchasers, who are able to view the profiles of sellers. A seller profile consists of the historical ratings from sales made in past years, and all seller feedback and reviews are available to the customer.
V. Case study
Let’s take the example of a person purchasing a laptop from an E-commerce website. In addition to the laptop, the customer might need a charging pad, a mouse and an additional warranty against damage. Knowledge of the customer’s purchasing desires and situation creates up-sell and cross-sell opportunities for the company to sell more products and increase profit from the data available.
Up-sell means selling additional items in the same category along with the main purchase. Cross-sell means selling additional items in different categories that the customer might desire.
If a person purchases a high-end laptop, that person might also be interested in purchasing a high-end game, gaming accessories, a hard disk, a router, antivirus software or the Microsoft Office suite. A few factors need to be considered to determine the cross-sell and up-sell opportunities related to a particular customer.
If we can predict such events, related or desired products can be recommended to the customer.
In this case study we see how to implement item recommendations in Python, recommending to a customer the products that similar people bought. We use data about which customer bought which products, build an item-to-item affinity score from it, and then use that score to recommend items to customers. The data file contains two columns, UserId and ItemId.
Each row of the data file pairs a user ID with an item ID. From the data we can see that user 1001 has purchased items 5001, 5002 and 5005.
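A minimal sketch of loading such a file in Python follows. The actual data file is not reproduced in this paper, so the rows below are hypothetical: only the purchases of user 1001 come from the text, and the remaining rows were chosen so that they agree with the affinity scores reported later. The function name load_purchases is also an assumption.

```python
import csv
import io
from collections import defaultdict

# Hypothetical file contents; only user 1001's rows are taken from the text.
RAW = """UserId,ItemId
1001,5001
1001,5002
1001,5005
1002,5001
1002,5002
1002,5004
1003,5001
1003,5005
1004,5003
1004,5004
1005,5002
1005,5003
"""


def load_purchases(text):
    """Group the (UserId, ItemId) rows into a set of items per user."""
    items_by_user = defaultdict(set)
    for row in csv.DictReader(io.StringIO(text)):
        items_by_user[int(row["UserId"])].add(int(row["ItemId"]))
    return dict(items_by_user)


purchases = load_purchases(RAW)
print(sorted(purchases[1001]))  # [5001, 5002, 5005]
```

With a real file, io.StringIO(text) would be replaced by an open file handle.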
To extract information, we load the file in a Jupyter notebook and build an affinity score between items based on the users who purchased them.
We find the affinity of every item to every other item by counting how many customers have bought both products. The more customers who have bought both items, the higher the affinity score.
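The counting step can be sketched as follows. The text does not state how the count is normalized, so this sketch assumes that the affinity score is the number of customers who bought both items divided by the total number of customers. The purchase data is hypothetical (only user 1001's items come from the text) and the function name item_affinity is an assumption.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchases per user; only user 1001's items are taken from the text.
purchases = {
    1001: {5001, 5002, 5005},
    1002: {5001, 5002, 5004},
    1003: {5001, 5005},
    1004: {5003, 5004},
    1005: {5002, 5003},
}


def item_affinity(purchases):
    """Score each item pair by the share of users who bought both items."""
    pair_counts = Counter()
    for items in purchases.values():
        # Every unordered pair of items in one user's basket counts once.
        for a, b in combinations(sorted(items), 2):
            pair_counts[(a, b)] += 1
    n_users = len(purchases)
    # Assumed normalization: co-purchasing users divided by all users.
    return {pair: count / n_users for pair, count in pair_counts.items()}


affinity = item_affinity(purchases)
print(affinity[(5001, 5002)])          # 0.4
print(affinity.get((5001, 5003), 0))   # 0, no user bought both
```

Pairs that were never bought together do not appear in the result, which is equivalent to an affinity of zero.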
Once the affinity scores between each pair of items have been computed and printed, we see that items 5001 and 5002 have a high affinity score of 0.4, whereas items 5001 and 5003 have no affinity at all.
To recommend items to a customer from this list of affinity scores, we go back to the affinity table, select all the records that have the given item in the first column, and collect the items in the second column together with their scores, sorted in descending order of score. The items at the top of that list are the ones to recommend. Let’s see how the affinity scores tell us which products can be recommended to a customer who has purchased item 5001.
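The lookup and sort can be sketched in Python as follows. The affinity table here holds only the four scores quoted in this paper for item 5001; the dictionary layout and the function name related_items are assumptions.

```python
# Affinity scores for item 5001 as quoted in the text, keyed by (item one, item two).
affinity = {
    (5001, 5002): 0.4,
    (5001, 5003): 0.0,
    (5001, 5004): 0.2,
    (5001, 5005): 0.4,
}


def related_items(item, affinity):
    """Return (other item, score) pairs for the given item, highest score first."""
    related = []
    for (first, second), score in affinity.items():
        if first == item:
            related.append((second, score))
        elif second == item:
            related.append((first, score))  # the table may list the pair either way round
    # Sort by descending score; ties are ordered by item ID for a stable result.
    return sorted(related, key=lambda pair: (-pair[1], pair[0]))


ranked = related_items(5001, affinity)
print(ranked)  # [(5002, 0.4), (5005, 0.4), (5004, 0.2), (5003, 0.0)]
```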
VI. Results
In this case study, we constructed a simple recommender system based on customers’ purchasing behavior. We used the item and user data to compute affinity scores so that products can be recommended to customers. For item 5001, items 5002 and 5005 each have a score of 0.4, item 5004 has 0.2, and item 5003 has zero. We can further set a threshold above which items are recommended. For example, if we recommend only those items whose score is above 0.25, then we would recommend products 5002 and 5005 to the customer.
Ryan Aminollahi
https://towardsdatascience.com/predictive-customer-analytics-part-iv-ab15843c8c63
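The threshold rule described in the results can be written as a short filter. The scores are the ones reported for item 5001; the function name recommend is an assumption made for illustration.

```python
# Affinity of each other item to item 5001, as reported in the results.
scores_for_5001 = {5002: 0.4, 5003: 0.0, 5004: 0.2, 5005: 0.4}


def recommend(scores, threshold=0.25):
    """Keep only the items whose affinity score is strictly above the threshold."""
    return [item for item, score in sorted(scores.items()) if score > threshold]


recommended = recommend(scores_for_5001)
print(recommended)  # [5002, 5005]
```

Raising the threshold yields fewer but stronger recommendations, while lowering it to 0.1 would also admit item 5004.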
VII. Conclusion and future of recommender systems
Current state of technology
The industry is trying to integrate various recommender systems that work on points of interest, metadata or group recommendations. Every system is built according to the requirements of the organization. Sofiya Mujawar, former Data Scientist at Big Data Solutions
In my opinion, recommender systems can be applied to ever broader areas, including everyday life. They can recommend a course for the day covering day-to-day activities and food habits, providing functionality to keep track of nutritional consumption and to persuade users to change their eating behavior in positive ways. Web services in particular struggle to produce recommendations of millions of items for millions of users; the time and computational power required can limit the performance of even the best hybrid systems. For larger datasets, future work can address the scalability problems of recommendation systems.
References
[1] K. L. Wescott and R. J. Brandon, Practical Applications of GIS for Archaeologists: A Predictive Modelling Toolkit, 1999, p. 6.
[2] F. F. Reichheld and W. E. Sasser, Jr., “Zero defections: Quality comes to services,” Harvard Business Review, no. 5, pp. 105-111, 1990.