- Sameer HAKIM – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France
- Ziyan LI – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France
- Yi PAN – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France
- Fabrice ZAUMSEIL – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France
- Huihui CHI – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France / Department of Information & Operations Management, ESCP Europe, Paris, France
- Wei ZHOU – M.Sc. in Big Data & Business Analytics, ESCP Europe, Paris, France / Department of Information & Operations Management, ESCP Europe, Paris, France
Corresponding Author: firstname.lastname@example.org (ZHOU Wei)
Data Management Platforms (DMPs) are centralized systems for collecting and analyzing large sets of structured and unstructured data originating from disparate sources. These platforms analyze, organize and segment first-, second- and third-party data into different customer or audience types to be used for marketing and in advertising campaigns. Given the amount of sensitive personally identifiable information they hold on customers, DMPs fall under the General Data Protection Regulation (GDPR) from 25 May 2018. Based on a comprehensive review of 17 published articles, this paper is among the first to review the current practice of DMPs and the policy implications of the GDPR. We also highlight the challenges of implementing the new regulation and the changes required for the daily operations of DMPs to comply with the GDPR.
The role of a data management platform in the context of media development and online advertising is to segment audiences by integrating data from proprietary and third-party sources. This includes determining the quantity and quality of data to buy and managing all aspects of this data: controlling and restricting access to it, tracking its utilization, and reporting operational changes, attributes and data cost. These processes and techniques are often used by Demand Side Platforms (DSPs) and Supply Side Platforms (SSPs) to leverage custom audience segments (Shah et al., 2011). Data incorporated into a DMP can be first-party data, coming from an organization’s own applications, systems, websites and products, as well as second-party data from partners and other associates. DMPs also often use third-party data to fill in holes in a company’s own data, including partner data. As stated in the GDPR, all data processors and controllers who hold data that can personally identify an individual will have to abide by the new regulation. Since DMPs are in the business of identifying audiences and individuals for better online targeting, they will be directly affected by the emerging data regulation policies.
Programmatic buying, for example, is a business model for online computational advertising in the age of big data. Based on analysis of massive amounts of cookie data generated by Internet users, programmatic buying can identify in real time the characteristics and interests of the target audience in each ad impression, automatically deliver the best-matched ads, and optimize their prices via an auction-based buying scheme. Programmatic buying has significantly changed online advertising, which has evolved from the traditional pattern of media buying and ad-slot buying to target-audience buying. Through cookie analysis, the DMP can identify the interests and characteristics of a user. When this user opens a webpage, an auction is triggered once she inputs the URL and presses the enter key. The publisher sends the user information to the SSP, which forwards it to the Ad Exchange (AdX). The AdX in turn sends the user information to eligible DSPs. These DSPs query DMPs and learn, for instance, that this user is a car enthusiast. Each DSP then sends the user information to its advertisers and starts an auction in which advertisers that sell cars can submit bids for the opportunity to show ads to the user. The winner of each DSP auction enters the second-round auction in the AdX. The highest bidder among all DSPs finally obtains the ad impression, and her ads are fed back to the AdX and SSP and displayed to the user on the publisher's webpages (Yuan et al., 2014). Given the number of private companies involved in this process, including many third-party companies, it is important to understand the impact of the General Data Protection Regulation (GDPR).
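The two-round auction described above can be sketched as follows. This is a minimal illustration, not the mechanism of any particular exchange: the names `Bid`, `highest_bid` and `run_adx_auction`, the bid values, and the simple highest-bid-wins rule (with no second-price adjustment or floor price) are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Bid:
    advertiser: str  # hypothetical advertiser name
    dsp: str         # DSP through which the bid was submitted
    price: float     # bid for one impression (arbitrary units)


def highest_bid(bids: List[Bid]) -> Optional[Bid]:
    """Return the bid with the highest price, or None if there are no bids."""
    return max(bids, key=lambda b: b.price) if bids else None


def run_adx_auction(bids_by_dsp: Dict[str, List[Bid]]) -> Optional[Bid]:
    """Run one auction per DSP, then a second round among the DSP winners."""
    first_round = [highest_bid(bids) for bids in bids_by_dsp.values()]
    dsp_winners = [bid for bid in first_round if bid is not None]
    return highest_bid(dsp_winners)


# Example: two DSPs, three car advertisers (all values are made up).
example_bids = {
    "dsp_a": [Bid("car_brand_1", "dsp_a", 2.10), Bid("car_brand_2", "dsp_a", 1.80)],
    "dsp_b": [Bid("car_brand_3", "dsp_b", 2.40)],
}
winner = run_adx_auction(example_bids)
```

Here `winner` is the bid that would obtain the impression under these assumptions; real exchanges typically add pricing rules and timeouts that this sketch leaves out.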
The General Data Protection Regulation was approved on 14 April 2016 and becomes enforceable on 25 May 2018. The GDPR replaces the Data Protection Directive and is designed to harmonize data privacy laws across Europe (Zhou and Piramuthu, 2013, 2015), to protect and empower all EU citizens’ data privacy and to reshape the way organizations across the region approach data privacy. According to GDPR Article 4, personal data or personally identifiable information (PII), with reference to the online advertising industry, is “any information relating to an identified or identifiable natural person (data subject); an identifiable natural person is one who can be identified, directly or indirectly, in particular by reference to an identifier such as a name, an identification number, location data, an online identifier or to one or more factors specific to the physical, physiological, genetic, mental, economic, cultural or social identity of that natural person”. The GDPR (Regulation 24) further specifies that its rules are not limited to first-party service providers but also cover third-party processors of personal data. In the context of the GDPR, DMPs and programmatic buying, this paper discusses the working of DMPs and their role in programmatic buying, the techniques used by online advertising companies to identify and segment users (in a limited scope), the regulations as defined in the GDPR, and finally how service providers, including third-party processors, can comply with the GDPR.
Motivated by the lack of discussion of the impact of the GDPR on the DMP industry, this paper is among the first to review the current practice of DMPs and the policy implications of the GDPR. We also highlight the challenges of implementing the new regulation and the changes required for the daily operations of DMPs to comply with the GDPR. The remainder of the paper is organized as follows. In Section 2, we review the structure and business model of DMPs. In Section 3, we present the GDPR and its impact on DMPs. In Section 4, we provide a summary of this paper and conclude.
Review of Data Management Platform: Structure & Business Model
Data Management Platform Structure
Data management platforms in the online advertising process collect and store data from various sources, including service providers, proprietary data and third-party companies. Having data from multiple sources adds diversity and reduces corrupt data, since DSPs perform validation checks on the data when processing it. An analytics processor receives this data and provides client companies with analytics on the segmentation and classification of online customers, using runtime profiles, a nesting-aware SQL query language, and a library of data mining methods and machine learning models, all in real time (Chen et al., 2013).
DMPs can be classified by two user categories: marketers and publishers. Marketers use them to market their products, while publishers use them to target specific audiences and improve the efficiency of the marketers’ advertising campaigns running on their sites. DMPs offer many types of services or functionalities to client companies, some of which extend to the management of prospects and advertising audiences in a PRM (prospect relationship management) logic. Part of the ROI related to setting up a DMP can potentially be achieved through data activation (Bathelot, 2017).
For marketers (including agencies), one of the trickiest tasks is improving the accuracy of targeted online advertising and campaigns. DMPs can help marketers tackle one of their most pressing challenges, ad blindness. At their core, Data Management Platforms let advertisers and agencies take control of their own first-party data, process it and compare it to third-party data, in order to make better decisions on media buying and campaign planning. Moreover, they provide insight into the ROI that each campaign contributed for each segment.
DMPs offer a full-featured set of components that allow advertisers and agencies to make better use of their first-party data, gain more insight into their targets and buy media more accurately. They also use first- or third-party data to customize website content for different audiences, and compare third-party data sources to gain additional information about specific customers and increase conversions. These techniques help marketers achieve improved ROI in the long run (Bluekai, 2011).
DMPs offer tag management, user segmentation, media integration, campaign analytics and user analytics.
- Tag management: This feature helps advertisers and agencies set tags in their websites for monitoring and collecting data. Advanced tag management also enables marketers to comply with current privacy regulation and control measures for data access and sharing, and provides the ability to categorize and assign varying levels of rights to advertising.
- User segmentation: DMPs allow marketers to classify their first-party user data by taxonomy, segment, campaign and/or ROI outcomes, compared against third-party data. Advanced segmentation can create a large number of highly relevant clusters within the DMP to reach users with the right promotion at different stages of the purchase funnel.
- Media integration: DMPs help marketers share their user segments with ad networks, trading platforms, portals and DSPs to serve targeted ads, perform programmatic buying and reach users in real time. Consequently, one of the most critical functions of DMPs is integrating media resources across different channels.
- Campaign analytics: DMPs provide easy-to-use tools that enable marketers to measure and compare campaign performance across segments and channels. This helps marketers adjust their campaign strategy more accurately.
- User analytics: Apart from analyzing campaign performance, DMPs also analyze user behavior. They can measure how users interact with campaigns on each channel, so marketers gain insight into which channels deliver the highest ROI for specific segments.
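The segmentation and campaign-analytics functions listed above can be illustrated with a small rule-based sketch. The profile fields (`pages_viewed`, `purchases`), the segment labels and the thresholds are hypothetical; production DMPs typically rely on much richer taxonomies and models.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def assign_segment(profile: Dict[str, int]) -> str:
    """Map a user profile to a purchase-funnel segment (illustrative rules)."""
    if profile.get("purchases", 0) > 0:
        return "customer"
    if profile.get("pages_viewed", 0) >= 5:
        return "in_market"
    return "prospect"


def roi_by_segment(rows: Iterable[Tuple[str, float, float]]) -> Dict[str, float]:
    """Compute ROI = (revenue - cost) / cost for each segment.

    Each row is (segment, cost, revenue) for one campaign.
    Segments with zero total cost are skipped to avoid division by zero.
    """
    cost: Dict[str, float] = defaultdict(float)
    revenue: Dict[str, float] = defaultdict(float)
    for segment, campaign_cost, campaign_revenue in rows:
        cost[segment] += campaign_cost
        revenue[segment] += campaign_revenue
    return {s: (revenue[s] - cost[s]) / cost[s] for s in cost if cost[s] > 0}
```

A marketer could, for example, call `roi_by_segment` on campaign results grouped by the labels returned from `assign_segment` to see which segment repays its media spend.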
Many technological platforms assist marketers in carrying out advertising and media campaigns (Figure 1). The buyers are advertisers or media agencies, while the sellers are publishers. An ad exchange is a technological platform where buyers bid for ad placements for their media campaigns. An ad network is a middleman that buys ad positions from publishers and sells them directly to advertisers. A DSP is an interface that integrates multiple ad exchanges (i.e. ad position bidding) in a single system. With this interface, advertisers can make programmatic decisions on ad positions, buying across several ad exchanges in real time. An SSP (Supply-Side Platform) takes inventory and connects to as many ad exchanges and ad networks as possible. It is designed to give publishers a platform to manage and optimize their media advertising placements from a single interface. Multiple players are thus involved in the online advertising ecosystem, and DMPs build strong connectivity among these parties.
Figure 1. Visual representation of the online display advertising landscape and the key players
Data collection by DMPs
Figure 2 shows how DMPs operate to create actionable insights. Generally, a DMP first collects data from different sources, which can be first-party, second-party and third-party data; it then normalizes, enriches and analyzes this data using different models and technologies to add value; finally, it deploys the processed data to the outside business world to generate profit.
As DMPs create value by processing data, data collection and analysis are the main part of their business. They work in three main phases: the aggregation phase, the integration and management phase, and the deployment phase. In the aggregation phase, data collection is the very first step and the basis for the whole process. Since DMPs are data experts and communicate with their clients, including marketers and publishers, through data, they handle different types of data and use different methods to collect it (Zawadziński, 2016).
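The three phases can be pictured as a simple pipeline. The sketch below is an illustration under stated assumptions: the record fields (`user_id`, `interest`), the normalization rule (trimming and lower-casing) and the function names are invented for the example and do not describe any specific DMP.

```python
from typing import Dict, Iterable, List, Set

Record = Dict[str, str]


def aggregate(*sources: Iterable[Record]) -> List[Record]:
    """Aggregation phase: pool first-, second- and third-party records."""
    return [record for source in sources for record in source]


def integrate(records: List[Record]) -> Dict[str, Set[str]]:
    """Integration and management phase: normalize and merge per user."""
    profiles: Dict[str, Set[str]] = {}
    for record in records:
        user_id = record["user_id"].strip().lower()    # normalize identifiers
        interest = record["interest"].strip().lower()  # normalize attributes
        profiles.setdefault(user_id, set()).add(interest)
    return profiles


def deploy(profiles: Dict[str, Set[str]], interest: str) -> List[str]:
    """Deployment phase: export the audience that matches one interest."""
    return sorted(uid for uid, interests in profiles.items() if interest in interests)
```

For instance, `deploy(integrate(aggregate(first_party, third_party)), "cars")` would return the identifiers of users whose merged profile contains the hypothetical interest `cars`.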
Figure 2. The Data Management Platform Framework
General Data Protection Regulation (GDPR) & Its Impact on DMPs
The General Data Protection Regulation is bringing about a massive change in how companies operate, store and process data. Most private companies, including those involved in online advertising, are not equipped and ready to comply with the GDPR. According to the GDPR, personal data is any information related to a person, irrespective of their nationality or residence (Regulation 2), who can be identified directly or indirectly using identifiers (Article 4, 1). Such identifiers include those provided by devices (MAC address, Apple ID, Advertising ID) and by applications, tools and protocols, such as internet protocol addresses, location data, cookie identifiers, radio frequency identification tags and other online identifiers (Regulation 30).
As defined by the GDPR, the regulation applies to all controllers or processors that provide the means for processing personal data (Regulation 18). Controllers are those who determine the purposes and means of processing personal data (Article 4, 7), while processors are entities that process personal data on behalf of controllers (Article 4, 8). Both controllers and processors can be public or private, and can be a person, an agency or another body. Furthermore, processing means any operation performed on personal data (Article 4, 2). The regulation is location agnostic: it applies to all organizations, including branches or subsidiaries, that offer goods or services to persons in the EU, irrespective of whether they are physically located in the EU (Regulation 22, 23). Controllers or processors that are not established in the EU but process personal data of persons in the EU need to designate a representative, unless the processing is occasional and does not involve large-scale processing of special categories of personal data (Regulation 80).
With regard to data processing, both controllers and processors are subject to the GDPR when they monitor a person’s behavior on the internet or profile a person in order to take decisions concerning them or to analyze or predict their personal preferences, behaviors and attitudes, when that person is physically located in the EU (Regulation 24).
The GDPR distinguishes data used for processing by two criteria: pseudonymous and anonymous data. Pseudonymous data is personal data that can no longer be linked to a person without the use of additional data, irrespective of the tools and techniques used. The process of converting identifiable data to pseudonymous data is called pseudonymisation (Article 4, 5). Anonymous data is data that can no longer be linked to a person even with the help of external data, and it does not come under the purview of the GDPR (Regulation 26). Data collected from a person should be adequate, relevant and limited to what is necessary for the purposes for which it is processed, and it should be kept for the minimum time period necessary (Regulation 39). Where data is processed by controllers, including third parties, for their own legitimate interests, including direct marketing, it is important that the rights of a person who is a client or in the service of the controller take precedence (Regulation 47). The person should be aware of the existence of such processing operations and their purposes, specifically when data is used for profiling (Regulation 60), including the duration for which the data is processed, the logic involved in any automatic data processing and the consequences (Regulation 63). When the controller intends to process personal data for purposes other than those for which it was collected, it should notify the person before any further processing of this data (Regulation 61). Where personal data is processed for direct marketing, the person should have the right to object to such processing, including profiling, whether it concerns initial or further data processing (Regulation 70). Processors, including third parties, that process personal data on behalf of a controller should, at the choice of the controller, return or delete the personal data, unless storage is required by law to which the processor is subject (Regulation 81).
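One common way to pseudonymise an online identifier is a keyed hash, with the key stored separately from the data so that re-linking a record to a person requires that additional information. The sketch below uses HMAC-SHA-256 from the Python standard library; the function name and the key handling are assumptions for illustration, and whether a given scheme satisfies the GDPR definition is a legal question that this example does not settle.

```python
import hashlib
import hmac


def pseudonymise(identifier: str, secret_key: bytes) -> str:
    """Return a keyed hash of an identifier such as a cookie ID.

    The same identifier and key always give the same pseudonym, so
    records can still be joined, but the mapping cannot be recomputed
    without the key, which should be stored apart from the data.
    """
    digest = hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()
```

Note that the output remains pseudonymous rather than anonymous in the sense used above, because whoever holds `secret_key` can recompute the mapping for a known identifier.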
The GDPR also specifies how consent for data processing can be gathered from a person. Consent for each data processing activity should be given at the time the data is collected, by a clear affirmative act that is informed and unambiguous about how the person’s information is going to be used. This consent can be given by a written statement, including by electronic means, or an oral statement, for example by ticking a box when visiting an internet website, choosing technical settings, or other means. Silence, pre-ticked boxes or inactivity should not be considered consent, and the given consent should cover only activities carried out for the purpose agreed to by the person and no other activities that could use the data (Regulation 32). Further communications concerning the activities for which the data is collected should be easily accessible and easy to understand, using simple and plain language, at the time the data is collected (Regulation 39). The person should also be informed whether they are obliged to provide personal data (Regulation 60). The controller, if asked, should be able to demonstrate that the person has given consent to specific processing operations and was made aware of at least the identity of the controller (Regulation 42). With regard to data processing in direct marketing, the right to object to data processing must be explicitly brought to the attention of the person (Regulation 70).
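These consent requirements suggest keeping one record per user and purpose, created only after an affirmative act and retaining enough detail to demonstrate the consent later. The following is a minimal sketch under those assumptions; the class and field names are hypothetical and the example is not a compliance tool.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class ConsentRecord:
    user_id: str
    purpose: str         # one record per processing purpose
    controller: str      # controller identity shown to the user
    given_at: datetime   # time of the affirmative act (UTC)


class ConsentLedger:
    """Stores purpose-specific consent so that it can be demonstrated later."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], ConsentRecord] = {}

    def record(self, user_id: str, purpose: str, controller: str,
               affirmative_act: bool) -> Optional[ConsentRecord]:
        # Silence, inactivity or a pre-ticked box is not an affirmative act.
        if not affirmative_act:
            return None
        entry = ConsentRecord(user_id, purpose, controller,
                              datetime.now(timezone.utc))
        self._records[(user_id, purpose)] = entry
        return entry

    def is_allowed(self, user_id: str, purpose: str) -> bool:
        """Consent for one purpose does not extend to any other purpose."""
        return (user_id, purpose) in self._records
```

Keying the ledger by `(user_id, purpose)` is a design choice that makes the purpose limitation explicit: a lookup for a purpose the user never agreed to simply fails.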
With respect to security, pseudonymisation can reduce the risks to personal data and help controllers and processors meet their data-protection obligations (Regulation 28). Where controllers are legally obliged to provide additional security measures to comply with the GDPR, they should not refuse to collect additional information about the person, including login information (Regulation 57). This additional information should not be retained for the sole purpose of being able to react to potential future requests and must be treated as temporary information for that specific activity (Regulation 64).
With regard to collected personal data, controllers should provide a mechanism for the person to request, access, rectify and delete such data. Requests should be answered without delay and within the stipulated period of one month, with appropriate reasons (Regulation 59). If additional personal data about the person is collected from external sources or disclosed to other recipients, the person should be notified within a reasonable period (Regulation 61). If the origin of such data cannot be provided, general information should be provided (Regulation 61). Where the person requests to be forgotten, controllers are responsible for informing processors and third parties to erase any links to, copies of or replications of that personal data (Regulation 66). The person should also have the right not to be subject to a decision that evaluates personal aspects, including profiling to analyze or predict their characteristics, economic situation, health, preferences, interests, reliability, behavior, location or movements, when that decision is based solely on automated processing and directly affects him or her, unless consent is given (Regulation 71).
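Two of these obligations lend themselves to a short sketch: computing the one-month response deadline and forwarding an erasure request to every recipient of the data. How "one month" is counted is an assumption here (same day of the following month, clamped to the month's last day), as are the function names and the callback interface for recipients.

```python
import calendar
from datetime import date
from typing import Callable, Dict, List


def response_deadline(received: date) -> date:
    """Return the same day of the following month, clamped to month end."""
    year = received.year + (1 if received.month == 12 else 0)
    month = 1 if received.month == 12 else received.month + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(received.day, last_day))


def propagate_erasure(user_id: str,
                      recipients: Dict[str, Callable[[str], bool]]) -> List[str]:
    """Ask each processor or third party to erase the user's data.

    Each callback returns True on success. The result lists the
    recipients that did not confirm, so they can be followed up.
    """
    return [name for name, erase in recipients.items() if not erase(user_id)]
```

In practice each callback would wrap a call to a partner's deletion interface; here it is left abstract because no such interface is described in the text.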
The responsibilities and liabilities of the controller should be established. The controller should be obliged to implement appropriate and effective measures and to demonstrate the compliance of processing activities and the effectiveness of those measures (Regulation 74). Risks to the rights and freedoms of the person, or to the person’s control over their personal data, arise when processing results in discrimination, identity theft or fraud, financial loss, damage to reputation, loss of confidentiality, unauthorized reversal of pseudonymisation, or any other social or economic disadvantage; when it reveals racial or ethnic origin, political opinions, religion or philosophical beliefs, or trade union membership; or when personal aspects are evaluated, particularly to analyze or predict performance at work, economic situation, health, personal preferences or interests, reliability or behavior, location or movements, in order to create or use personal profiles (Regulation 75). These risks and liabilities should be determined and evaluated through an objective assessment that establishes whether data processing involves low, medium or high risk (Regulation 76).
Challenges & Future Agenda
The regulations laid out in the GDPR are more in-depth than the archaic Data Protection Directive, and so are the penalties. Given the complexity of the online advertising sector, it might become difficult for publishers, brands and adtech companies to survive if appropriate actions are not taken. These actions go well beyond the usual gap analysis and security overhaul seen in other information technology industries. Therefore, reviewing some of these deep challenges is as important as finding their solutions if online advertising businesses are to be sustained (Ryan, 2017).
Transparency into what user data has been collected and processed and for what purpose
Despite arguments to the contrary, businesses will need consent from users to capture their personal data (Ryan, 2017) for online tracking (Regulation 32). In other words, users must be given clearly understandable terms for each instance in which their personal data will be used, and each processing activity that uses this data needs to be fully documented and followed precisely, since at the time of consent the user should be aware of how the data from each of his or her activities will be used. Users will also be asked to check a box to permit any data broker to see their data (which would open the door to unsolicited offers and online user tracking across devices). This is rather difficult given how dynamically user data is used. Moreover, companies that want to use this data must get separate permission for each purpose, such as marketing, maintenance, fraud scrutiny and support. Therefore, although the GDPR creates a very strict definition of uniquely identifiable data, consent and other specific situations still apply. Companies also need detailed documentation that records when that consent was given. These exceptions are of great importance for DMPs, as they rely on sensitive data processing (Zarsky, 2016).
Any player involved in data collection and processing must comply with GDPR with no exception
Users want both self-centered and general brand experiences (Simmons, 2008). A core concern of the GDPR is the identification and targeting of individual users based on behavioral patterns (Article 4, 1) (Regulation 24). Self-centered or targeted branding means showing advertisements to a specific segment of relevant users. The aim is to increase the advertiser’s brand presence, recall rates and conversions among that segment of identified users. Consider, for example, a tennis shoe company that wants to advertise on the websites of one or more publishers. For the best returns, the company would be interested in increasing awareness only among people who are interested in tennis shoes. A sports-related website using targeted campaigns (digital advertisements) would be the best fit (Chickering and Heckerman, 2003). Supplementing this strategy, the advertiser would also focus on individuals who have shown an intent to purchase tennis shoes, which is called in-market advertising. In-market audiences are a way to connect with consumers who are in the market and are currently searching for or comparing products and services across different sites. To categorize an individual as in-market for a specific product or service, DSPs and AdXs take into account the behaviors shown by the individual, including clicks on related ads and subsequent conversions, along with the content of the sites and pages they visit and the recency and frequency of visits. This helps DSPs and AdXs accurately categorize users by intent in order to improve offerings. Unless all parties involved in singling out this user conform to the GDPR, adtech companies might have to revert to mass media, which runs against today’s trend.
Unlike mass marketing, one-to-one marketing increases the value of the customer base, increases cross-selling, reduces customer attrition, raises customer satisfaction, and lowers transaction costs and cycle times (Peppers et al., 1999).
Identified Action Items
Implementation of privacy by design
The most important class of technology for implementing privacy by design is Privacy-Enhancing Technologies (PETs), e.g. data encryption, protocols for anonymous communication, attribute-based credentials and private search of databases. They are well proven by prior research and in test environments. Unfortunately, PETs are still not commonly considered when designing systems. Privacy by design also covers organizational processes and business models, not just technologies (Danezis et al., 2015). Classes of privacy-enhancing technologies include anonymity and pseudonymity systems for email, for interactive communication (instant messaging, internet applications, remote logins, VoIP and games) and for other forms of communication. PETs include tools and methods such as type-0 remailers, anonymizer.com, onion routing, the Freedom Network, Java Anon Proxy, Tor, GNU Privacy Guard, SSL, TLS, off-the-record messaging, private payments, private credentials and anti-phishing tools (Goldberg et al., 2007).
Online behavioral targeting without personal data
There are ways to show users personalized advertisements without requiring them to share their personal data. One such method is a browser extension installed on the user’s system. This extension, which processes personal information locally to create a user persona, can also be used to select the type of ads to display. If the ads are not clicked on, the user’s personal data is never communicated outside of their computer and hence no personal data is revealed. Users can still see ads relevant to their interests and based on their behavior (Toubiana et al., 2010). The basic principle behind this is to move the process and not the data (Armstrong, 2014).
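The idea of moving the process rather than the data can be sketched as client-side ad selection: the ad network sends a batch of candidate ads, and the choice is made locally against an interest profile that never leaves the device. This is a simplified illustration inspired by the approach described above, not the implementation from Toubiana et al.; the data structures, the scoring rule and the names are assumptions.

```python
from typing import Dict, List, Optional, Set, Tuple

# A candidate ad is (ad_id, categories); both are supplied by the ad network.
CandidateAd = Tuple[str, Set[str]]


def select_ad_locally(local_interests: Dict[str, float],
                      candidate_ads: List[CandidateAd]) -> Optional[str]:
    """Pick the candidate ad that best matches the on-device profile.

    The profile (category -> weight) stays on the user's machine; only
    the candidate list travels from the network to the client.
    Ties are resolved in favor of the earlier candidate.
    """
    best_id: Optional[str] = None
    best_score = float("-inf")
    for ad_id, categories in candidate_ads:
        score = sum(local_interests.get(c, 0.0) for c in categories)
        if score > best_score:
            best_id, best_score = ad_id, score
    return best_id
```

Because the function returns only an ad identifier for local display, nothing about `local_interests` needs to be sent back to the network in this sketch.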
Increase direct partnership deals with publishers
Web data is the main source of data collection in today’s digital age. The growing number of players and the increasing volume of collected data have made data structures more complex. Additional challenges arise with the scalability of web data management. As the amount of data and the number of relationships between data management companies increase, organizing and locating shared data becomes increasingly complex (Abadi et al., 2007). With this growing complexity and the arrival of the GDPR, companies, and especially DMPs, are concerned with identifying the data that users ask to have deleted or for which they demand insight. For companies to be able to track data, identify its structure and determine where it went, DMPs need to strengthen their relationships with publishers. Better management of these relationships helps all players properly identify and track the use of users’ data across all parties involved, in order to comply with the GDPR (Madhavan et al., 2006).
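Tracking where a user's data went can be illustrated with a simple transfer log that records each hand-off between parties and can later list every party reachable from the point of collection. The class below is a hypothetical sketch; the party names and the in-memory storage are assumptions, and a real deployment would need agreement among partners on how transfers are reported.

```python
from collections import defaultdict, deque
from typing import Dict, List, Set, Tuple


class TransferLog:
    """Records which party passed a user's data to which other party."""

    def __init__(self) -> None:
        self._edges: Dict[Tuple[str, str], Set[str]] = defaultdict(set)

    def record_transfer(self, user_id: str, sender: str, receiver: str) -> None:
        self._edges[(user_id, sender)].add(receiver)

    def holders(self, user_id: str, origin: str) -> List[str]:
        """Return all parties that hold the user's data, starting at origin.

        Uses a breadth-first traversal over the recorded transfers.
        """
        seen = {origin}
        queue = deque([origin])
        while queue:
            party = queue.popleft()
            for receiver in self._edges.get((user_id, party), ()):
                if receiver not in seen:
                    seen.add(receiver)
                    queue.append(receiver)
        return sorted(seen)
```

The list returned by `holders` is the set of parties that would have to be contacted when a user asks for deletion or for insight into the use of their data.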
This paper has reviewed in detail the business model of DMPs from several perspectives. In the ecosystem as a whole, the actors with whom DMPs deal are advertisers, website publishers, DSPs, SSPs and AdXs, all of which play a significant role in holding or exchanging data. Internally, DMPs create value and generate profit by processing and analyzing data to segment users, building a unique profile of every user, and then using these profiles to improve marketing targeting and buying. The whole process involves many technologies to collect, track, identify and match personal data from all sources.
Given how integral the internet has become and the amount of data individuals generate and share in the virtual world every second, a review of privacy is important. With this in mind, the EU plans to implement the new General Data Protection Regulation from next year (2018). This regulation strictly defines personal data and contains many articles to prevent personal data from being identified, collected and used in an improper or unconstitutional manner.
Facing the upcoming GDPR, the entire business community should change its present way of running businesses wherever those businesses use personal data. There is no doubt that the GDPR brings many challenges to DMPs, which can considerably affect a large part of their business. We have identified five main challenges that DMPs must overcome in order to comply with the GDPR. These challenges range from giving users more insight into how their data is used to security and data protection plans that ensure secure storage of personal information. The GDPR becomes enforceable in May 2018, which pressures DMPs and many other companies using personal data to adapt their business models and their techniques for using personal data.
- Abadi, D. J., Marcus, A., Madden, S. R., & Hollenbach, K. (2007, September). Scalable semantic web data management using vertical partitioning. In Proceedings of the 33rd international conference on Very large data bases (pp. 411-422). VLDB Endowment.
- Armstrong, R. (2014, September 23). Move The Process, Not The Data: Accelerating Analytics With In-Database Processing. Retrieved December 19, 2017.
- Bathelot, B. (2017, June 25). DMP. Retrieved December 4, 2017.
- Bluekai. (2011). Data Management Platforms Demystified. [White Paper]. Retrieved November 21, 2017.
- Chen, S., Dasdan, A., Elmeleegy, H., Kolay, S., Li, Y., Qi, Y., … & Wu, M. (2013). U.S. Patent Application No. 13/924,343.
- Chickering, D. M., & Heckerman, D. (2003). Targeted advertising on the web with inventory management. Interfaces, 33(5), 71-77.
- Danezis, G., Domingo-Ferrer, J., Hansen, M., Hoepman, J. H., Metayer, D. L., Tirtea, R., & Schiffner, S. (2015). Privacy and Data Protection by Design-from policy to engineering. arXiv preprint arXiv:1501.03726.
- Goldberg, I., Wagner, D., & Brewer, E. (2007, February). Privacy-enhancing technologies for the Internet III: Ten Years Later. In Compcon’97. Proceedings, IEEE (pp. 103-109). IEEE.
- Madhavan, J., Halevy, A. Y., Cohen, S., Dong, X. L., Jeffery, S. R., Ko, D., & Yu, C. (2006). Structured data meets the Web: a few observations. IEEE Data Eng. Bull., 29(4), 19-26.
- Peppers, D., Rogers, M., & Dorf, B. (1999). Is Your Company Ready for One-to-One Marketing? Harvard Business Review, January-February 1999, 151-160.
- Ryan, J., Dr. (2017, July 19). The 3 biggest challenges in GDPR for online media & advertising. Retrieved December 04, 2017.
- Shah, V., Mao, Y., Chen, S., Bennett, D., & Shao, X. (2011). U.S. Patent Application No. 13/206,416.
- Simmons, G. (2008). Marketing to postmodern consumers: introducing the internet chameleon. European Journal of Marketing, 42(3/4), 299-310.
- Toubiana, V., Narayanan, A., Boneh, D., Nissenbaum, H., & Barocas, S. (2010). Adnostic: Privacy preserving targeted advertising. Proceedings Network and Distributed System Symposium.
- Yuan, Y., Wang, F., Li, J., & Qin, R. (2014, October). A survey on real time bidding advertising. In Service Operations and Logistics, and Informatics (SOLI), 2014 IEEE International Conference on (pp. 418-423). IEEE.
- Zarsky, T. Z. (2016). Incompatible: The GDPR in the Age of Big Data. Seton Hall L. Rev., 47, 995.
- Zawadziński, M. (2016, October 31). How Does Data Collection Work in a DMP? Retrieved December 2, 2017.
- Zhou, W., & Piramuthu, S. (2015). Information relevance model of customized privacy for IoT. Journal of business ethics, 131(1), 19-30.
- Zhou, W., & Piramuthu, S. (2013). Technology regulation policy for business ethics: An example of RFID in supply chain management. Journal of business ethics, 116(2), 327-340.