Machine vs. Operational Learning

Today’s economy, shaped by shifting economic cycles and advances in communication, relies increasingly on data science. Big data techniques, a significant component of data science and business intelligence, are used to harness vast amounts of data quickly for analysis. The growing volume and availability of granular data, coupled with powerful analytical tools such as R and SAS, drive organizations toward more accurate predictions, with the prospect of increasing sales and generating organizational efficiencies. These predictions enable efficient supply chains, driving down costs for producers and speeding the delivery of products and services to consumers.

Machine learning, now a widely acclaimed global phenomenon, pushes the boundaries of analysis and decision making. Only in the past decade has it become part of the daily conversation. Machine learning, in essence, is the ability of machines and computer-based systems to learn without requiring human guidance. Deep learning is a branch of machine learning in which algorithms use artificial neural networks to form complex models. Deep learning algorithms can surface patterns beyond the reach of human insight, helping organizations forecast pitfalls in their product demand curves and reveal prospective areas of customer demand not yet recognized by either consumers or producers. Companies such as Netflix use machine learning and deep learning to gauge demand, set price levels effectively, and even create in-house productions based on latent (hidden) viewer demand. It is no coincidence that Netflix has produced shows tied to current events and cultural trends. Netflix’s viewing suggestions are generated from each user’s preferences and are quite accurate; many of these recommendations would never have been communicated directly to Netflix by consumers.
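One common way recommendation systems surface this kind of latent demand is matrix factorization, which learns hidden "taste" factors from the ratings users do provide and uses them to predict titles a user never rated. The sketch below uses a tiny made-up ratings matrix; it is an illustration of the general latent-factor idea, not Netflix's actual system, which is far more sophisticated.

```python
import numpy as np

# Toy user x title ratings matrix (0 = unrated). All numbers are
# illustrative -- a real service would have millions of entries.
R = np.array([
    [5, 4, 0, 1],
    [4, 0, 0, 1],
    [1, 1, 0, 5],
    [0, 1, 5, 4],
], dtype=float)

def factorize(R, k=2, steps=3000, lr=0.01, reg=0.02, seed=0):
    """Learn k latent factors per user and per title by gradient
    descent on the observed (non-zero) ratings only."""
    rng = np.random.default_rng(seed)
    n_users, n_items = R.shape
    P = rng.normal(scale=0.1, size=(n_users, k))  # user factors
    Q = rng.normal(scale=0.1, size=(n_items, k))  # title factors
    mask = R > 0
    for _ in range(steps):
        E = (R - P @ Q.T) * mask           # error on observed cells only
        P += lr * (E @ Q - reg * P)        # gradient step for user factors
        Q += lr * (E.T @ P - reg * Q)      # gradient step for title factors
    return P, Q

P, Q = factorize(R)
pred = P @ Q.T
# pred[0, 2] is a score for a title user 0 never rated -- a piece of
# "latent demand" that the user never communicated directly.
```

The key point is that the learned factors are never named by anyone; they emerge from viewing behavior alone, which is exactly what makes such demand latent.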

There needs to be a medley of human intuition and analytical insight in order to attain relevant outcomes.

Machine learning and deep learning help present the larger picture. However, the efficacy of the analysis depends on the ability of the teams involved to derive actionable insights that align with the organization’s goals. Machine learning needs to be paired with operational learning (OL) to form the right analysis. Operational learning is an innate understanding of the industry on two levels. On the macro level, it means knowing the product or service offering and competitors’ offerings, and being aware of consumers’ needs and of industry trends and standards. On the micro level, OL means familiarity with the organization’s goals, its implementation structure, its work culture, and the individuals involved.

“The intuitive recognition of the instant, thus reality… is the highest act of wisdom.” – D. T. Suzuki

Often, textbook applications of techniques fail to achieve the desired results, giving data analytics a bad reputation. There is no substitute for human insight and experience, which can be loosely termed wisdom. There needs to be a medley of human intuition and analytical insight to attain relevant outcomes. The discussions between individuals and across teams surrounding a data science project are arguably as valuable as the analysis itself. The data presents everything as it is and leaves latitude for interpretation by everyone involved. These discussions help unearth truly latent variables, which are not included in models because they do not exist directly in the data.

Discovering latent yet important variables provides an understanding that transcends the initial project scope and leads to insights beyond the team’s initial goals. For example, we undertook a data science initiative to understand user search behavior for one of our clients. The analysis aimed to present users with promotional links relevant to their search queries. We segmented user-behavior patterns with k-means clustering, an unsupervised technique that groups similar observations, and further analyzed individual interactions with random forests, an ensemble of decision trees. The project kept expanding in scope, and multiple business units (product development, marketing, and finance) became increasingly involved. What began with the single aim of categorizing search queries resulted in a process that increased paid clicks, enhanced user engagement, and optimized marketing spend based on each consumer’s relative value. The project fostered communication between teams and deepened their understanding of one another’s goals.
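The clustering step above can be sketched with a minimal k-means implementation. The two features, the cluster count, and the synthetic data below are hypothetical stand-ins for real search-log attributes, and the random-forest stage of the project is omitted for brevity.

```python
import numpy as np

# Hypothetical per-user features derived from search logs:
# [queries per session, clicks per query]. Purely illustrative.
rng = np.random.default_rng(42)
browsers = rng.normal([2.0, 0.5], 0.3, size=(50, 2))  # low-intent users
buyers   = rng.normal([6.0, 3.0], 0.3, size=(50, 2))  # high-intent users
X = np.vstack([browsers, buyers])

def kmeans(X, k, iters=100, seed=0):
    """Plain k-means: alternately assign each point to its nearest
    centroid, then move each centroid to its cluster's mean."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None] - centroids[None], axis=2)
        labels = dists.argmin(axis=1)
        new = centroids.copy()
        for j in range(k):
            members = X[labels == j]
            if len(members):              # keep old centroid if cluster empties
                new[j] = members.mean(axis=0)
        if np.allclose(new, centroids):   # converged
            break
        centroids = new
    return labels, centroids

labels, centroids = kmeans(X, k=2)
```

With clusters in hand, each segment can then be profiled separately, which is what opened the door to the marketing and finance questions described above.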

Conducting a complete, well-rounded analysis is paramount for relevant and successful outcomes, and machine learning needs to be paired with operational learning to generate value. A balanced implementation lets the mathematical models be interpreted and applied in the appropriate context, and leads to actionable results, which should be the goal of any data science or analytics project. Machine learning projects with a clearly defined business objective from an operations perspective tend to have a high success rate because they are designed, from the very beginning, around the most relevant needs of the organization.

United Airlines Passenger Incident Causing a Twitter Storm

Dark clouds surround United Airlines amid its ongoing PR debacle. The recent passenger incident has caused quite an uproar in the media and created a major firestorm on Twitter, with numerous passengers and customers voicing their displeasure. Continue reading United Airlines Passenger Incident Causing a Twitter Storm

There’s No Free Lunch, Stupid

“Tea is an act complete in its simplicity.
When I drink tea, there is only me and the tea.
The rest of the world dissolves” – Thich Nhat Hanh

A picture is worth a thousand words, and numbers can summarize a picture with just a few statistics, especially in today’s data-driven world. The right perspective is necessary for the right kind of analysis. It is not just the technique employed but its implementation that determines the efficacy of the analysis and the relevance of the insight. Continue reading There’s No Free Lunch, Stupid

Why Lifetime Value (LTV) Calculations Need Data Science

Lifetime Value (LTV), sometimes referred to as Customer Lifetime Value (CLTV), is a technique businesses use to predict the net profit of the entire future relationship with a customer. At a high level, LTV is best thought of simply as Total Customer Revenue minus Total Customer Costs. Two key points to understand about LTV are that some customers hold more value than others, and that a customer is not a single transaction but a relationship far more valuable than a one-time deal. Continue reading Why Lifetime Value (LTV) Calculations Need Data Science
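The high-level revenue-minus-costs formula can be sketched in a few lines. The churn-based expected lifetime and every number below are illustrative assumptions for a subscription-style business, not a prescribed model.

```python
def lifetime_value(avg_revenue_per_month, gross_margin,
                   monthly_churn_rate, acquisition_cost):
    """Back-of-the-envelope LTV: Total Customer Revenue minus
    Total Customer Costs, with expected lifetime approximated
    as 1 / churn (a geometric-retention assumption)."""
    expected_lifetime_months = 1.0 / monthly_churn_rate
    total_margin = avg_revenue_per_month * gross_margin * expected_lifetime_months
    return total_margin - acquisition_cost

# Illustrative numbers only: $30/month subscription, 60% gross margin,
# 5% monthly churn, $100 to acquire the customer.
ltv = lifetime_value(30.0, 0.60, 0.05, 100.0)
# 30 * 0.60 / 0.05 - 100 = 360 - 100 = 260
```

The data science comes in precisely where this sketch is weakest: churn, margin, and revenue are not constants but per-customer predictions, which is why some customers are worth far more than others.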

The Importance of ‘1’: A Different Perspective on Data Science

As data scientists, we are always looking for more data, different tools, or new techniques. We develop models that let us find areas of higher crime and make our society safer, or that help companies increase their profits and find efficiencies. Data science can identify patterns that determine what customers will buy, when they will buy it, and where. It can even suggest cross-selling and up-selling opportunities and anticipate what customers will buy before they buy it. The capabilities and opportunities of data science are endless, and its uses are boundless.

Data scientists can easily forget the true nature of the data, since the sheer volume available and the complexity of the techniques can cloud each observation. Depending on the dataset, every single observation may represent a human being, or another living being. Statisticians and data scientists have always referred to the size of the sample as ‘n’; for example, n = 100 means 100 observations. When looking at large amounts of data, however, the scale obscures the most important ‘n’: n = 1. N = 1 could be you, your spouse, your friend, a sister or brother, a child or parent. It could be someone you know, or a friend of a friend. It is not uncommon for data scientists working with a dataset to realize that one of the observations refers to themselves.

When we analyze data, we of course analyze the numbers as they are, but we should inspect and respect the data not as numbers but as human beings, as members of our community, as precious lives. Even when we de-identify the data to protect privacy, the fact that each record represents a fellow human, or even another life such as a dog, cat, or other animal, cannot be ignored, and honoring that fact should not be considered contrary to our mission as data scientists.

Data scientists must strive to conduct their analysis under a strong ethical code

When we apply this consideration to data science, I believe we are embarking on a new moral and ethical branch of the field, which can be called neohumanist data science. As data scientists, we are given an awesome responsibility to see the environment through a different lens. We are entrusted with the knowledge to find the proverbial needle in a haystack and to seek truth in the cloud of information. Decisions made from data science impact society as a whole and can greatly help our community, our country, or our planet. Understanding the importance of the findings we uncover and their impact on the lives of others therefore becomes an entrusted gift, when we work with an unbiased perspective and a goal of finding the truth, wherever it may lead.

Data scientists, statisticians, and business analysts should always strive to learn new techniques and perform the analyses requested. However, they should always maintain a moral compass that grounds them in their responsibility to n = 1. They must conduct their analysis under an ethical code that prevents them from bending findings toward a preconceived conclusion to further a cause, regardless of the cause. They must never allow themselves to fall victim to Mark Twain’s “lies, damned lies, and statistics.” Becoming a neohumanist data scientist means always trying to hold oneself to a standard unparalleled in our society. The knowledge, the data, and the tools are a gift of sorts, entrusted to data scientists to ensure that their work causes no harm to any person or living thing.