AI Algorithm Has the Potential to Bring Great Changes in Digital Marketing | Prof. WANG Xiaoyi in UTD

A mall that runs promotions day and night may be hurting its own marketing, because advertising without intervals is thought to reduce a customer’s intent to shop. Similarly, an online shopping platform that continuously recommends the same category of products may see engagement fall as the customer gets bombarded by too many products that are alike.

However, artificial intelligence can help address these problems. Just as AI methods already power speech recognition, image processing, self-driving cars, and medical robots, marketing can draw on deep reinforcement learning (DRL), which has the potential to bring unprecedented changes to this field.

After discovering the potential of "deep reinforcement learning" to revolutionize the field of digital marketing, Professor WANG Xiaoyi of the Department of Marketing at Zhejiang University’s School of Management and his team embarked on a series of in-depth studies. Their objective was to develop a personalized target positioning strategy using DRL as the foundation.

WANG and his team achieved a significant milestone when they published their paper titled "Deep Reinforcement Learning for Sequential Targeting" in MANAGEMENT SCIENCE, an internationally renowned journal in the field of management science. The study shows how deep reinforcement learning can enable dynamic and continuous target marketing.

You can access the original research paper  here 

WANG Xiaoyi  |  王小毅

School of Management, Zhejiang University


Academic Background: Professor WANG is a professor of marketing and doctoral supervisor at the School of Management, Zhejiang University, and an expert in interdisciplinary research on digital transformation and brain-computer intelligent marketing. He has won a number of provincial and ministerial awards, including the first prize for scientific and technological progress awarded by the China General Chamber of Commerce. He has published papers in Management World, Management Science, Marketing Science, Journal of Marketing Research, Information Systems Research, and other journals, and has a Google Scholar h-index of 20.

You can learn more about Prof. WANG Xiaoyi’s academic background  here 

Why do they focus on machine learning to study target marketing strategy?

Target marketing conventionally involved enterprises identifying different groups of buyers, choosing one or more of them as target markets, and using appropriate marketing strategies to meet their demands. However, with the advent of the digital era, enterprise marketing has shifted towards high-frequency interactions with consumers and rapid adjustments in marketing approaches. Consequently, classic marketing theory is increasingly challenged, as its conventional ideas and solutions struggle to keep up with the evolving landscape. The dynamic nature of the digital era calls for innovative approaches that can address the changing needs and expectations of consumers. Traditional target marketing strategies are usually designed for one-off deals, rely on advance planning, and require heavy advertising spending. As a result, they ignore the influence of time on consumer behavior and the continuity of promotional activities, and they carry huge costs and uncertainty.

In practice, enterprises need to determine the target demographics for promotional activities along with a suitable interval between consecutive activities, while also keeping up with the rapid changes in consumers’ preferences.

Therefore, both academia and industry urgently require an adaptive strategy of target marketing that regulates and updates itself along with the changes that occur in consumer behavior.

Common Dynamic and Continuous Promotion Strategy Adopted by Enterprises

The deep reinforcement learning (DRL) algorithm is a novel AI technique that learns by itself to improve performance without external influence. It is a reward-based learning method that can customize marketing strategies over time, adapt to changing consumer behaviors that are complex and multi-dimensional, and evaluate the effect of strategies.

Building upon this foundation, Professor WANG and his team put forward a personalized target positioning strategy and used a quantitative and heuristic method of uncertainty learning to adapt DRL to complex dimensions of consumer behavior.

Double-Dueling Network Architecture with Two-Stream Computations
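
The caption above names a "double-dueling" architecture with two streams. The article does not spell out the network, so the following is only a minimal sketch of the standard formulations from the reinforcement learning literature, not the team's actual model. A dueling network splits into a state-value stream and an advantage stream and recombines them, and double Q-learning lets the online network pick the next action while a target network scores it. The function names, the two-action setup, and all numbers are assumptions made for illustration.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine the two streams: Q(s, a) = V(s) + A(s, a) - mean(A(s, .))."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_q_target(reward: float, gamma: float,
                    online_next_q: List[float],
                    target_next_q: List[float]) -> float:
    """Double Q-learning target: the online network selects the next action,
    the target network evaluates it."""
    best_action = max(range(len(online_next_q)), key=online_next_q.__getitem__)
    return reward + gamma * target_next_q[best_action]


# Hypothetical numbers: action 0 = no promotion, action 1 = promotion.
q = dueling_q_values(state_value=2.0, advantages=[0.5, -0.5])
target = double_q_target(reward=1.0, gamma=0.9,
                         online_next_q=[0.2, 0.7],
                         target_next_q=[1.0, 0.5])
print(q, target)
```

Subtracting the mean advantage keeps the two streams identifiable, and separating action selection from action evaluation is the usual remedy for over-optimistic value estimates.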

What attempts have the research team made?

Artificial intelligence has made significant advancements in "autonomous learning" through the integration of "deep learning" and "reinforcement learning". Deep learning primarily focuses on enhancing AI’s perception capabilities, while reinforcement learning complements it by enabling AI to make decisive actions. The combination of these two approaches offers a new solution to the challenges of perception and decision-making in complex systems, known as "deep reinforcement learning (DRL)". This method not only bridges the gap between perception and decision-making but also aligns more closely with human thinking patterns, making it a promising AI approach.

Deep reinforcement learning (DRL) has advanced remarkably in recent years. Driven by a mechanism of reward and punishment, this AI algorithm interacts continuously with the marketing environment through trial and error in order to discover the "best strategy". By receiving rewards for favorable outcomes and punishments for unfavorable ones over many iterations, the DRL algorithm adapts and improves its decision-making, leading to more effective marketing strategies. This dynamic learning approach holds considerable potential for optimizing marketing processes.
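
The reward-and-punishment loop described above can be illustrated with tabular Q-learning, a simpler relative of DRL in which a lookup table stands in for the deep network. This is a sketch under assumptions: the state names ("rested", "tired"), the two actions, and the reward values are hypothetical and do not come from the paper.

```python
import random
from collections import defaultdict
from typing import DefaultDict, List, Tuple

ACTIONS: List[str] = ["wait", "promote"]  # hypothetical action set
QTable = DefaultDict[Tuple[str, str], float]


def choose_action(q_table: QTable, state: str, epsilon: float,
                  rng: random.Random) -> str:
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.choice(ACTIONS)
    return max(ACTIONS, key=lambda a: q_table[(state, a)])


def q_update(q_table: QTable, state: str, action: str, reward: float,
             next_state: str, alpha: float = 0.5, gamma: float = 0.9) -> None:
    """Move Q(state, action) towards reward + gamma * max_a Q(next_state, a)."""
    best_next = max(q_table[(next_state, a)] for a in ACTIONS)
    td_target = reward + gamma * best_next
    q_table[(state, action)] += alpha * (td_target - q_table[(state, action)])


rng = random.Random(0)
q_table: QTable = defaultdict(float)
# A rewarded promotion to a rested customer, a punished one to a tired customer.
q_update(q_table, "rested", "promote", reward=1.0, next_state="tired")
q_update(q_table, "tired", "promote", reward=-1.0, next_state="tired")
print(choose_action(q_table, "rested", 0.0, rng))
print(choose_action(q_table, "tired", 0.0, rng))
```

After one reward and one punishment, the greedy choice already differs by state: promote to the rested customer, wait for the tired one.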

Utilizing the DRL algorithm as a foundation, Professor WANG’s team devised a personalized target marketing strategy specifically tailored for sequential target marketing scenarios. The strategy employs a series of consecutive promotions to capture customers’ immediate attention and engage their interest. However, to prevent saturation and maintain customer engagement, a non-promotion period, also known as a cooling-off period, is introduced between every two promotional periods. Importantly, the duration of the cooling-off period gradually increases over time, allowing for changes in customers’ price expectations. This approach acknowledges the dynamic nature of customer preferences and aims to optimize the effectiveness of promotional activities by strategically managing customer expectations and fostering ongoing engagement.
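
The shape of such a schedule, promotion periods separated by cooling-off periods that grow over time, can be sketched as follows. This is a fixed rule written only to illustrate the pattern. The strategy described in the article is personalized and learned, and the function name, parameters, and period lengths here are all hypothetical.

```python
from typing import List


def promotion_schedule(n_rounds: int, promo_len: int,
                       first_cooloff: int, growth: int) -> List[str]:
    """Return a per-period plan of 'promo' / 'rest' labels."""
    plan: List[str] = []
    cooloff = first_cooloff
    for round_idx in range(n_rounds):
        plan.extend(["promo"] * promo_len)
        if round_idx < n_rounds - 1:      # no cooling-off after the last round
            plan.extend(["rest"] * cooloff)
            cooloff += growth             # each cooling-off period is longer
    return plan


plan = promotion_schedule(n_rounds=3, promo_len=2, first_cooloff=1, growth=1)
print(plan)
```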

The research shows that DRL can solve the three major challenges currently faced by the continuous strategy of target marketing:

- Balancing the company’s current and future income
- Earning returns in the market while exploring and learning, that is, maximizing profits while learning through exploration and exploitation
- Dealing with a high-dimensional state and marketing policy space
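
The first challenge, balancing current and future income, is usually expressed in reinforcement learning as a discounted return, where a discount factor controls how much future income counts. The sketch below uses made-up revenue streams (an aggressive "blast" plan and a steadier "paced" plan) purely for illustration. None of the numbers come from the paper.

```python
from typing import List


def discounted_return(rewards: List[float], gamma: float) -> float:
    """Sum of gamma**t * rewards[t], computed from the last period backwards."""
    total = 0.0
    for r in reversed(rewards):
        total = r + gamma * total
    return total


# Hypothetical per-period revenues for two plans.
blast = [6.0, 2.0, 1.0, 1.0]   # heavy promotion now, fatigue later
paced = [4.0, 4.0, 4.0, 4.0]   # lower immediate revenue, sustained over time

myopic = (discounted_return(blast, 0.0), discounted_return(paced, 0.0))
forward_looking = (discounted_return(blast, 0.9), discounted_return(paced, 0.9))
print(myopic, forward_looking)
```

With a discount factor of 0 only the first period matters and the aggressive plan looks better. With a factor of 0.9 the steadier plan wins, which is the trade-off a forward-looking agent has to weigh.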

©千库网 | Qianku Network

To better adapt DRL to complex consumer behavior dimensions, the research team proposed a quantitative and heuristic algorithm of uncertainty learning to make the trade-off between exploration and exploitation more efficient.

The evaluation results show that an agent using this new algorithm yields an average long-term income 26.75% higher than traditional methods. In addition, it learns 76.92% faster than other commonly used algorithm models across all benchmark tests. Together, these findings indicate that the new algorithm both generates more income and learns faster than conventional approaches.

To better understand the potential mechanism behind the findings, the team has conducted a variety of interpretability analyses, which explain the patterns of the learned optimal strategies at both the individual and group levels.

In addition, Professor WANG and his team members have proposed a "simulated" online test environment, a user behavior simulator for DRL training and testing, which gives the platform a cost-saving way to train DRL agents without running a large number of live tests.
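
To give a feel for what a user behavior simulator does, here is a toy version in which a single simulated customer becomes less likely to buy as promotion fatigue builds up, and candidate policies are compared offline. It is only a sketch: the class, the fatigue model, and every parameter value are hypothetical and are not the team's simulator.

```python
import random
from typing import Callable, Tuple


class ToyCustomerSimulator:
    """Hypothetical customer whose purchase probability drops with fatigue."""

    def __init__(self, base_prob: float = 0.6, fatigue_penalty: float = 0.15,
                 seed: int = 0) -> None:
        self.base_prob = base_prob
        self.fatigue_penalty = fatigue_penalty
        self.rng = random.Random(seed)
        self.fatigue = 0

    def reset(self) -> int:
        self.fatigue = 0
        return self.fatigue

    def step(self, promote: bool) -> Tuple[int, float]:
        """Apply one action; return (new fatigue level, revenue this period)."""
        if not promote:
            self.fatigue = max(0, self.fatigue - 1)   # resting reduces fatigue
            return self.fatigue, 0.0
        buy_prob = max(0.0, self.base_prob - self.fatigue_penalty * self.fatigue)
        revenue = 1.0 if self.rng.random() < buy_prob else 0.0
        self.fatigue += 1                             # each promotion adds fatigue
        return self.fatigue, revenue


def evaluate(policy: Callable[[int], bool], periods: int = 1000,
             seed: int = 0) -> float:
    """Average revenue per period for a policy mapping fatigue -> promote?"""
    sim = ToyCustomerSimulator(seed=seed)
    state = sim.reset()
    total = 0.0
    for _ in range(periods):
        state, revenue = sim.step(policy(state))
        total += revenue
    return total / periods


always = evaluate(lambda fatigue: True)
rested_only = evaluate(lambda fatigue: fatigue == 0)
print(always, rested_only)
```

In this toy setting, promoting in every period exhausts the customer after a few periods, while promoting only to a rested customer keeps earning, so the comparison can be made without touching real users.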


Nevertheless, it is essential to acknowledge that the application of AI algorithms in the marketing field still presents certain challenges. The performance of these algorithms heavily relies on the availability of adequate training data and computational resources. Consequently, numerous experiments and optimizations are required to enhance their efficiency and accuracy. Moreover, the complexity and unpredictability of consumer behavior demand a greater amount of data and more sophisticated models to effectively explain and predict it. Additionally, when employing the DRL algorithm, real-time decision-making and rapid adjustments are crucial, necessitating the establishment of a real-time decision-making system to provide the necessary support. Overcoming these challenges will contribute to the successful application of AI algorithms in marketing and ensure their effectiveness in addressing dynamic marketing scenarios.

To address these challenges, Professor WANG’s team is actively engaged in ongoing research and development endeavors. The team aims to provide a solution that goes beyond the current scope and can be effectively implemented in diverse areas such as location-based services, online streaming media, and sequential recommendation systems for online education. These areas are highly relevant to platforms and marketers in their daily operations, making the research outcomes accessible and applicable in real-world marketing settings. The team’s efforts aim to facilitate the widespread adoption of DRL and unlock its potential for enhancing marketing strategies across different industries.

The framework will be based on the quantitative and heuristic method of uncertainty learning, combined with a real-time decision-making system and a user behavior simulator. It aims to improve the efficiency and accuracy of the DRL algorithm, help enterprises better understand and predict consumer behavior, and make their marketing strategies more intelligent and effective.

Overall, the research of Professor WANG’s team has shown the great potential of the DRL method for optimizing target marketing strategy to maximize enterprises’ income in the long run, and it indicates that this method can have a revolutionary impact on the field of digital marketing.


What does their research mean to brand value and digital marketing?

What is the role of marketing strategy based on "DRL" in improving the efficiency of digital marketing? What is the significance of helping enterprises develop digital marketing strategies?

This seemingly abstruse algorithm-driven marketing strategy is closely related to brands and enterprises and has important strategic significance for their digital transformation. Specifically, it can help enterprises carry out market positioning and target marketing more accurately, improve the efficiency and accuracy of digital marketing, and achieve business growth and higher brand value.


Better understanding the needs and preferences of consumers

"DRL" plays a crucial role in enabling enterprises to gain a better understanding of consumers’ needs and preferences, facilitating accurate market positioning and targeted marketing efforts. By conducting an in-depth analysis of consumer data, enterprises can uncover patterns and psychological characteristics that drive consumer behavior, leading to the formulation of more adaptive marketing strategies. For instance, leveraging the insights generated by reinforcement learning algorithms, enterprises can swiftly adjust product pricing, implement personalized product recommendations, optimize advertising approaches, and more. These timely adjustments enable enterprises to attract a larger consumer base and foster consumer loyalty. By harnessing the power of "DRL," enterprises can enhance their understanding of consumers and make informed marketing decisions that maximize their reach and impact.


Optimizing marketing channels and improving efficiency

"Deep reinforcement learning" (DRL) offers valuable capabilities for enterprises to optimize their marketing channels and enhance the efficiency and accuracy of digital marketing efforts. Through the analysis and comparison of data from various marketing channels, enterprises can gain insights into the contribution and effectiveness of each channel, enabling them to formulate more scientific and data-driven channel strategies. For instance, leveraging the insights derived from reinforcement learning algorithms, enterprises can make timely adjustments to the delivery ratio and strategies employed in each channel. This dynamic optimization helps improve digital marketing initiatives’ overall efficiency and accuracy. By harnessing the power of DRL, enterprises can optimize their marketing channels and make informed decisions that lead to enhanced marketing outcomes.


Making digital marketing more refined and intelligent

The deep reinforcement learning algorithm holds the potential to significantly enhance the refinement and intelligence of enterprises’ digital marketing efforts. Through comprehensive research and analysis of consumer data, enterprises gain a deeper understanding of consumer behaviors and preferences, enabling them to deliver personalized marketing and services. Leveraging the insights generated by reinforcement learning algorithms, enterprises can design distinct marketing programs and tailor services to cater to the unique needs of different consumer groups. This personalized approach not only improves consumer satisfaction but also fosters loyalty. By integrating the power of deep reinforcement learning into their digital marketing strategies, enterprises can unlock new levels of refinement and intelligence, leading to more effective and impactful marketing campaigns.


Building on this research, WANG’s team cooperated with Alibaba and China Tobacco to improve digital marketing management methodology, design an intelligent marketing brain, and formulate a private brand strategy for traditional retail businesses. This work provided a new path and problem-solving method for enterprises’ digital intelligence innovation.

A few days ago, Professor WANG, working with the DEEPLINK model released by Alibaba, analyzed in detail the consumption footprint of consumers from product selection to the initial purchase and on to repeat purchases.

If a "personalized target marketing strategy" is used to analyze and interpret the seemingly complex and disordered consumption footprint in the model, the footprint can be broken down into the stages Discover - Engage - Enthuse - Perform - Initial - Numerous - Keen. This model is an upgrade of the earlier sales funnel model, AIPL (Awareness - Interest - Purchase - Loyalty).

Such analysis of footprints can help enterprises to better understand and predict consumer behavior and formulate more intelligent and effective marketing strategies.

- Prof. WANG Xiaoyi and his team have discovered a more intelligent and cost-effective way of analysing markets; we are looking forward to seeing it in action! Keep it up!

- You can read the original article in Chinese  here