Multi-objective Optimization for Guaranteed Delivery in Video Service Platform
By Hang Lei, Yin Zhao, Longjun Cai
This article is based on the paper accepted by KDD 2020 of the same title
Guaranteed-Delivery (GD) is one of the important display strategies for the IP videos in video service platform. Different from the traditional recommendation strategy, GD requires the delivery system to guarantee the exposure amount (also called impressions in some works) for the content, where the amount generally comes from the purchase contract or business consideration of the platform.
In our latest paper, we study the problem of how to maximize certain gains, such as video view (VV) or fairness of different contents (CTR variations between contents) under the GD constraints. We formulate such a problem as a constrained non-linear programming problem, in which the objectives are to maximize the total VVs of contents and the exposure fairness between contents. In order to capture the trends of VV versus the impression number (page views, PV) for each video content, we propose a parameterized ordinary differential equation (ODE) model, and the parameters of the ODE are fitted by the video historical PV and CLICK data.
Challenges Faced by Video Service Platforms
For online video providers such as Youku, there is usually one widget/drawer that needs to distribute the new-released or hot video contents (generally means TV dramas and varieties). The overall user visits of that drawer during one day will not change much due to the fact that the total number of daily active users (DAU) for a video service platform is relatively stable over a period. Therefore, the crucial problem for the widget is how to allocate limited impressions to the given video contents, so as to assure enough impression and fairness for them. The drawer should concern the business requirements or contract requirements for the new-released or hot video, i.e. guarantee the certain number of impressions for each content. This becomes a typical Guaranteed- Delivery system. Thus only relying on recommendation system, which is individual-oriented, is not enough.
To solve this issue, an effective impression resource allocation system which plans the impression resources in a certain period before every operation period is essential. Generally, the operation period can be one day or every several hours, which depends on the specific requirement. Impression resource is first planned for each content at the be- ginning of the period considering all the requirements, then the dispatching system (typically the recommendation system) will take the impression amount allocated for each content as reference and try to find the most suitable users. Thus the whole system can balance both the business requirements and users’ personal requirement.
However, the impression allocation before each operation period is complicated because of many constraints involved. Currently, the impression system is operated manually, which highly depends on the experience of human, and thus is definitely sub-optimal. On the one hand, the actual impression delivered for a video content in the widget mostly is determined by its click-through rate (CTR) empirically, but manual allocation strategy cannot precisely predict the contents’ CLICK under a given impression. On the other hand, it is even hard for humans to design an allocation strategy that can also achieve high click number, fairness (which is a common object for this scenario) without violating all the constraints for each video.
A typical scenario is that different video contents behave differently in terms of page views (PV) and video views (VV) because of the differences of content properties. Some video contents might achieve high clicks using less impressions while the others might consume more. If we assign too many impressions to these con- tents, the overall CTR might be at a lower level. Although several impression allocating algorithms for ads (also called guaranteed delivery algorithm) have been proposed, to the authors’ knowledge, there is no existing algorithm proposed for contents’ impression allocating problem.
Our Proposed Solution
The contents’ GD allocation problem has its speciality in the following two aspects. 1) For video content delivery, especially for IP video, content platform needs to expose those contents repeatedly to the target consumers because of the limited number of contents compared with ads or commercial products. Moreover, the contents generally have large varieties of potential consumers and repeated exposure has much more probability than ads to bring more potential consumers to watch the video. Thus CTR (the effectiveness metric) trends with PV is the key factor that contents delivery should be considered; 2) modeling the impression allocating problem for contents considering CTR trends as constraints poses a new challenge for both model and solution.
In this paper, to address the above challenges, we design a two- stage framework. The first stage is forecasting, and the second is allocating. In the forecasting stage, we seek to develop effective predictive model with the goal of forecasting click behavior of users on each content given their daily historical PV and CLICK records. Specifically, to describe CTR trends with respect to PV, a prediction model (called PV-click-CTR model or P2C) based on ordinal differential equation (ODE) is proposed. And then in the allocating stage, we provide a multiple objective nonlinear programming (NLP) model subject to the CTR trends and other constraints. Accordingly, to solve the NLP with ODE involved, a GA allocating algorithm is developed.
Combining CLICK forecasting model and allocating model provides us with a solution to handle content Guaranteed-Delivery (GD) problem. We carried out extensive offline and online experiments, which show the superior of proposed solution, on both model performance and system efficiency. To the best of our knowledge, this is one of the first industrial solutions that are capable of handling content Guaranteed-Delivery (GD) problem. It should be admitted that there are many other related factors, such as the location impact of impression within the widget, the performance of recommendation system for each content, etc. Currently we do not consider those factors since they will make the problem further intractable and leave them as future work.
The main contributions of our work can be summarized as follows:
- We propose a parameterized PV-click-CTR (P2C) prediction model to describing the CTR trends with PV.
- We design a framework that can maximizes certain objectives, such as the VV of contents, fairness for each content, under the Guaranteed-Delivery (GD) constraints considering the CTR trends of each video content and impression resource limitation.
- Comprehensive offline and online experiments are conducted to verify the effectiveness of the proposed PV-click-CTR model as well as Guaranteed-Delivery strategy.
Detailed Analysis of the P2C Model
For any content, as PV increases, click saturation decreases, and the click increment (preferably click increment) brought by the unit PV shows a downward trend with the current click ratio. In other words, there is a positive correlation between click increment and saturation, which can be expressed by the following formula:
According to equation (2), the ordinary differential equation model where click increases with PV can be obtained.
By integrating the two ends after separating the variables in equation (3), you can get
Where x0 and y0 are the initial PV and click, respectively.
Based on the P2C model established above, the task of this section is to give an approximate optimal exposure for each content under the circumstances that the exposure resources of each scene and drawer are limited.
First, based on the ordinary differential equation (ODE) model predicted by PV-click-CTR, for each content in the content pool, the least squares are used to fit two parameters in the ODE: click saturation value ym and click with the inherent growth rate of PV r. Thus, the PV-click function relationship of each content is given.
Second, based on the given optimization goals and constraints, a multi-objective nonlinear optimization model of PV distribution can be established.
The optimization goals of the above model include two: maximizing vv in multiple scenes and minimizing the variance of the content CTR in the content pool. It should be noted that the minimum CTR variance here is a formal description of exposure fairness to balance “overexposure” and “underexposure”. The constraints represent the exposure PV constraints of the scene, drawer, pit, and content, respectively. Because we use numerical methods to solve the objective function, the above optimization model cannot be solved using traditional gradient-based algorithms. The evolutionary algorithm provides a solution. Here, the genetic algorithm (GA) is selected to solve. It should be noted that the P2C model is used in the calculation of the fitness value function in GA.
Four Key Experimental Results
We select multiple new hot content, and give the prediction effect of the P2C model and the offline effect of the preserving model. The evaluation indicators here are root mean square error (RMSE) and absolute error percentage (APE). Using P2C model and smooth CTR method  to predict the number of hits of new hot content. It can be seen from the table that the P2C model can effectively predict the number of hits, and is superior to the smooth CTR method in terms of RMSE.
In the online experiment part, we established a bucket experiment. The benchmark bucket adopts manual strategy to maintain the quantity; the experiment bucket adopts the strategy proposed in this article. During the experiment, we pay attention to and compare the daily delivery effect of the benchmark bucket and the experimental bucket (CTR variance, overall CTR of the strategy on the scene, etc.). The following is the data of the 30-day and 7-week holding effect. Compared with the results of the manual strategy, it is found that the holding strategy has different degrees of improvement in the CTR variance and the overall CTR of the scene. In particular, in terms of CTR variance, the effect of the hedging strategy is very obvious, with an average relative increase of +50%.
In this paper, we study the model and algorithm for the video content Guaranteed-Delivery system with multiple objectives in video service platform. To solve it, we propose a novel algorithm framework that generates efficient solutions with GD constraints. There are two main components in the proposed framework. Firstly, the CLICK value for each content is forecasted by an ODE model (PV-click-CTR prediction model), which is used for describing the CTR trends with respect to PV. Then the Guaranteed- Delivery problem is formulated as an optimization problem constrained by the learned PV-click-CTR prediction model. The optimization problem can be solved by GA. This framework is easy to implement and has been successfully applied to many scenarios in practice. The results of both offline experiments and online A/B testing demonstrate its effectiveness. Cold-start problem for the PV-click-CTR model and incorporating more related factors into the model require future research. Also for practitioners, we should also investigate more sophisticated acceleration algorithms to solve the complicated optimization problem with ODE constraints. Potential direction might involve sequential quadratic programming etc.
Both the detailed and overall results show that the GD model for contents proposed in this paper can help us to make more informed and effective decisions in practical content GD system, which is superior to the current practical solutions.
 Vijay Bharadwaj, Peiji Chen, Wenjing Ma, Chandrashekhar Nagarajan, John Tomlin, Sergei Vassilvitskii, Erik Vee, and Jian Yang. 2012. SHALE: An Efficient Algorithm for Allocation of Guaranteed Display Advertising. CoRR abs/1203.3619 (2012). arXiv:1203.3619 http://arxiv.org/abs/1203.3619
 Srinivas Bollapragada, Michael R. Bussieck, and Suman Mallik. 2004. Sched-uling Commercial Videotapes in Broadcast Television. Operations Re-search 52, 5 (2004), 679–689. https://doi.org/10.1287/opre.1040.0119 arXiv: https://doi.org/10.1287/opre.1040.0119
 Srinivas Bollapragada, Hong Cheng, Mary Phillips, Marc Garbiras, Michael Scholes, Tim Gibbs, and Mark Humphreville. 2002. NBC’s Optimization Systems Increase Revenues and Productivity. INFORMS Journal on Ap-plied Analytics 32, 1 (2002), 47–60. https://doi.org/10.1287/inte.220.127.116.11 arXiv: https://pubsonline.informs.org/doi/pdf/10.1287/inte.18.104.22.168
 Srinivas Bollapragada and Marc Garbiras. 2004. Scheduling Commercials on Broadcast Television. Operations Research 52, 3 (2004), 337–345. https://doi.org/10.1287/opre.1030.0083 arXiv: https://doi.org/10.1287/opre.1030.0083
 Olivier Chapelle. 2014. Modeling Delayed Feedback in Display Advertising. In Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘14). ACM, New York, NY, USA, 1097–1105. https://doi.org/10.1145/2623330.2623634
 Olivier Chapelle. 2015. Offline Evaluation of Response Prediction in Online Advertising Auctions. In Proceedings of the 24th International Conference on World Wide Web (WWW ’15 Companion). ACM, New York, NY, USA, 919–922. https://doi.org/10.1145/2740908.2742566
 Olivier Chapelle, Eren Manavoglu, and Romer Rosales. 2015. Simple and Scalable Response Prediction for Display Advertising. ACM Trans. Intell. Syst. Technol. 5, 4, Article Article 61 (Dec. 2015), 34 pages. https://doi.org/10.1145/2532128
 David Maxwell Chickering and David Heckerman. 2003. Targeted Adver- tising on the Web with Inventory Management. INFORMS Journal on Ap- plied Analytics 33, 5 (2003), 71–77. https://doi.org/10.1287/inte.22.214.171.12448 arXiv: https://pubsonline.informs.org/doi/pdf/10.1287/inte.126.96.36.19948
 Gila E. Fruchter and Shlomo Kalish. 1998. Dynamic promotional budgeting and media allocation. European Journal of Operational Research 111, 1 (1998), 15–27. https://doi.org/10.1016/S0377-2217(97)00315-9
 Kun Gai, Xiaoqiang Zhu, Han Li, Kai Liu, and Zhe Wang. 2017. Learn-ing Piece-wise Linear Models from Large Scale Data for Ad Click Prediction. arXiv:stat.ML/1704.05194
 Thore Graepel, Joaquin Quinonero Candela, Thomas Borchert, and Ralf Herbrich. 2010. Web-scale bayesian click-through rate prediction for sponsored search advertising in microsoft’s bing search engine. Omnipress.
 Huifeng Guo, Ruiming Tang, Yunming Ye, Zhenguo Li, and Xiuqiang He. 2017. DeepFM: A Factorization-Machine based Neural Network for CTR Prediction. arXiv:cs.IR/1703.04247
 Ferenc Hartung, Tibor Krisztin, Hans-Otto Walther, and Jianhong Wu. 2006. Chap-ter 5 Functional Differential Equations with State-Dependent Delays: Theory and Applications. Handbook of Differential Equations: Ordinary Differential Equa-tions, Vol. 3. North-Holland, 435–545. https://doi.org/10.1016/S1874-5725(06)80009-X
 Xinran He, Junfeng Pan, Ou Jin, Tianbing Xu, Bo Liu, Tao Xu, Yanxin Shi, Antoine Atallah, Ralf Herbrich, Stuart Bowers, and Joaquin Quiñonero Candela. 2014. Practical Lessons from Predicting Clicks on Ads at Facebook. In Proceedings of the Eighth International Workshop on Data Mining for Online Advertising (ADKDD’14). ACM, New York, NY, USA, Article 5, 9 pages. https://doi.org/10.1145/2648584.2648589
 Alf Kimms and Michael Muller-Bungart. 2007. Revenue management for broad-casting commercials: the channel’s problem of selecting and scheduling the advertisements to be aired. International Journal of Revenue Management 1, 1 (2007), 28–44. https://ideas.repec.org/a/ids/ijrevm/v1y2007i1p28-44.html
 Kuang-chih Lee, Burkay Orten, Ali Dasdan, and Wentong Li. 2012. Estimat-ing Conversion Rate in Display Advertising from Past Erformance Data. In Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD ‘12). ACM, New York, NY, USA, 768–776. https://doi.org/10.1145/2339530.2339651
 Qiang Liu, Feng Yu, Shu Wu, and Liang Wang. 2015. A Convolutional Click Prediction Model. In Proceedings of the 24th ACM International on Conference on Information and Knowledge Management (CIKM ‘15). ACM, New York, NY, USA, 1743–1746. https://doi.org/10.1145/2806416.2806603
 H Brendan McMahan. [n. d.]. Follow-the-Regularized-Leader and Mirror Descent: Equivalence Theorems and L1 Regularization. ([n. d.]).
 A Mihiotis and I Tsakiris. 2004. A mathematical programming study of advertising allocation problem. Appl. Math. Comput. 148, 2 (2004), 373–379. https://doi.org/10.1016/S0096-3003(02)00853-6
 H. Mühlenbein, M. Gorges-Schleuter, and O. Krämer. 1988. Evolution algorithms in combinatorial optimization. Parallel Comput. 7, 1 (1988), 65–85. https://doi.org/10.1016/0167-8191(88)90098-1
 Richard J. Oentaryo, Ee-Peng Lim, Jia-Wei Low, David Lo, and Michael Finegold. 2014. Predicting Response in Mobile Advertising with Hierarchical Importance- aware Factorization Machine. In Proceedings of the 7th ACM International Con- ference on Web Search and Data Mining (WSDM ‘14). ACM, New York, NY, USA, 123–132. https://doi.org/10.1145/2556195.2556240
 Paulo A. Pereira, Fernando A. C. C. Fontes, and Dalila B. M. M. Fontes. 2008. A Decision Support System for Planning Promotion Time Slots. In Operations Research Proceedings 2007, Jörg Kalcsics and Stefan Nickel (Eds.). Springer Berlin Heidelberg, Berlin, Heidelberg, 147–152.
 Yanru Qu, Han Cai, Kan Ren, Weinan Zhang, Yong Yu, Ying Wen, and Jun Wang. 2016. Product-based Neural Networks for User Response Prediction. arXiv:cs.LG/1611.00144
 Lili Shan, Lei Lin, Chengjie Sun, and Xiaolong Wang. 2016. Predicting ad click-through rates via feature-based fully coupled interaction tensor factorization. EleCTRonic Commerce Research and Applications 16 (2016), 30–42. https://doi.org/10.1016/j.elerap.2016.01.004
 Nitasha Soni and Dr. Tribhuwan Kumar. 2014. Study of Various Crossover Operators in Genetic Algorithms.
 A. Ta. 2015. Factorization machines with follow-the-regularized-leader for CTR prediction in display advertising. In 2015 IEEE International Conference on Big Data (Big Data). 2889–2891. https://doi.org/10.1109/BigData.2015.7364112
 John Turner. 2012. The Planning of Guaranteed Targeted Display Advertising. Operations Research 60, 1 (2012), 18–33. https://doi.org/10.1287/opre.1110.0996 arXiv: https://doi.org/10.1287/opre.1110.0996
 Flavian Vasile and Damien Lefortier. 2016. Cost-sensitive Learning for Bidding in Online Advertising Auctions. CoRR abs/1603.03713 (2016). arXiv:1603.03713 http://arxiv.org/abs/1603.03713
 Xuerui Wang, Wei Li, Ying Cui, Ruofei Zhang, and Jianchang Mao. 2011. Click-through rate estimation for rare events in online advertising. In Online multimedia advertising: Techniques and technologies. IGI Global, 1–12.
 Weinan Zhang, Tianming Du, and Jun Wang. 2016. Deep Learning over Multi-field Categorical Data: A Case Study on User Response Prediction. arXiv:cs.LG/1601.02376
 Guorui Zhou, Chengru Song, Xiaoqiang Zhu, Ying Fan, Han Zhu, Xiao Ma, Yanghui Yan, Junqi Jin, Han Li, and Kun Gai. 2017. Deep Interest Network for Click-Through Rate Prediction. arXiv:stat.ML/1706.06978
The views expressed herein are for reference only and don’t necessarily represent the official views of Alibaba Cloud.