Availability: the key to data governance at L’Équipe
Data and product team manager Romain Lhote oversees about 15 employees who are in charge of developing data, analyzing user behavior, monitoring the subscriber base and running data science projects to develop L'Équipe's digital product. In this in-depth interview, Romain explains everything L’Équipe has done over the last few years to lay the foundations for data governance, and the road ahead to optimize its data management and democratization processes.
L'Équipe in a few stats...
- 2.5 million unique visitors per day
- 1.5 billion page views per day
- 80% of page views on mobile
- 300,000 subscribers
- 5 app visits per user per day, on average
Can you tell us how L’Équipe is organized internally?
Within the digital division, there are four business units: a sales, acquisition and loyalty team; a social media unit; a technical department; and the product and data team that I have been managing for the last six months. At the group level, there is another data team, made up of data engineers, whose mission is to structure the data model and to ensure that the flows work properly and that the data is stored in the right place. The editorial team is the foundation of our brand. It is a single team across print and digital, divided into soccer and general sports. We work with all these people on a daily basis to develop L'Équipe's data policy. We also work with the management, TV, finance and ASO teams, among others.
How did you first become familiar with the principle of data governance? What was the main issue in the beginning that led you to seek data governance as a solution?
When I joined L’Équipe, I was not particularly well-versed in data governance because I was essentially doing web analysis and I was fairly distant from all the data science issues. So I learned everything late in the game. In 2017, when I took over the data team, I wanted to put web analytics at the center of our data ecosystem because in a media publication like L’Équipe, the browsing paths and consumption trends of our users are, in my opinion, our most valuable assets.
From then on, I noticed that many teams were doing "data" by themselves and that for a given KPI, we could have at least four different results. This situation was not sustainable. It was urgent to make the data more reliable by centralizing it within a single team. We, therefore, quickly put an end to the multiple dashboards that were circulating internally. Now, all L'Équipe's analytics dashboards are developed and "labeled" by the data division with the support of the IT department to structure the data.
Who exactly is data governance for in an organization like L'Équipe?
Quite simply, it's for all the employees we work with daily. The goal of a data governance policy is to make all employees understand that the data team is at their service and that it can always provide them with all data — as far as possible — and that any data that comes from elsewhere has no value, no authenticity. But if data governance concerns all the teams of the digital division from a very operational point of view, it must also involve the management. The latter must be aware that data cannot (and should not) come from several people but from a single unit. This is a fundamental point for the success of the project.
When was the foundation for a data governance policy set at L’Équipe? What were the reasons for implementing it?
Before I arrived, there were the beginnings of data governance at the group level, at a time when the Les Echos and Le Parisien brands were part of the Amaury group. There was overall group data governance but, within each entity, we also had web analysts who worked on traffic analysis. After the sale of the two media companies, we had to reorganize the data teams. Indeed, at the time we had two separate data teams within L’Équipe, one for customer service and sales teams and the other for web analysis. When I became Data Manager, we brought these two teams together.
From the outset, our communication with employees improved and became more fluid because we were getting rid of a lot of duplication. For example, we used to get different results in our analyses because our calculation methods varied from one team to another. There were also several rules that were not widely known such as:
- It is impossible to aggregate visits
- Web visitors are calculated per day
As a result, we were regularly obtaining inaccurate conclusions, with all the consequences that this could have on internal decision-making and on the performance of our digital activity. Management also wanted to instill a group data vision — between advertising, print and digital — with the creation of committees to get the teams to work together and set up common roadmaps.
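The rule that visitors are calculated per day can be illustrated with a short sketch. It assumes a hypothetical visit log of (day, visitor) pairs, not real L'Équipe data, and the function names are illustrative only. It shows why adding up daily unique-visitor counts overstates the audience over a longer period: a returning visitor is counted once per day.

```python
from collections import defaultdict

# Hypothetical visit log: (day, visitor_id). Illustrative data only.
visits = [
    ("2021-03-01", "alice"),
    ("2021-03-01", "bob"),
    ("2021-03-02", "alice"),
    ("2021-03-02", "carol"),
    ("2021-03-03", "alice"),
]


def daily_unique_visitors(rows):
    """Return {day: number of distinct visitors seen on that day}."""
    seen = defaultdict(set)
    for day, visitor in rows:
        seen[day].add(visitor)
    return {day: len(ids) for day, ids in seen.items()}


def unique_visitors(rows):
    """Distinct visitors over the whole period, deduplicated across days."""
    return len({visitor for _, visitor in rows})


daily = daily_unique_visitors(visits)
summed = sum(daily.values())      # naive sum of the daily figures
period = unique_visitors(visits)  # deduplicated figure for the period

print("daily counts:", daily)
print("sum of daily counts:", summed)
print("unique visitors over the period:", period)
```

In this toy log the same visitor comes back every day, so the naive sum is larger than the deduplicated figure. Two teams applying the two methods would report different numbers for the same KPI.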
In practical terms, what were your first data governance actions?
Making data accessible to as many people as possible seems to me to be a key element in establishing a data governance policy within an organization. Between 2017 and 2018, we created and made available to teams several dynamic dashboards, notably on:
- Global digital subscriber base: subscriptions and churn across our entire subscriber base, in real time.
- Direct subscriptions: the share of articles published per sport theme, the volume of subscriptions generated per sport and the details of subscriptions generated for each article.
- Page views: in total, per sport and per article.
These first dashboards were intended for the editorial staff, with the aim of managing all the editorial activity. Later, we created other more detailed dashboards that were more commercially oriented: subscriber base, cancellations and subscriptions, user cohorts, churn rate, etc.
Bringing all the teams together around data also involves monthly routines in which we share key figures, analyze them and make decisions for the product.
Do you design analytics dashboards in collaboration with business teams? Can they create them themselves?
At the beginning, when data was not as democratized as it is today, our employees had fairly basic requirements. They had a vague idea, but they didn't know what they could actually do. So working together was rare. Today, we work more hand in hand with them because they know how we operate, what types of data we store and what we can analyze. So our discussions are more constructive than before.
On the other hand, our employees do not have access to more data than we share with them via dynamic dashboards. The aim of this is to centralize everything and make the data more reliable at the heart of the data center. So, maybe they will need more data in the future, but at this point, we are still in a data adoption and evangelization phase.
The implementation and control of the tagging plan are crucial in a quality data governance approach. How do you organize this critical step within L’Équipe?
Today, the product teams know full well that they must inform the data team and bring them into the loop as soon as they launch a project or a feature idea. My goal would be to have the data team involved from the design phase, so that they can think about KPIs when creating interfaces. On the operational side, we have a web analyst working almost exclusively on tagging. As soon as there is a release of the website or mobile app, they carry out a test to check that there is no regression or side effects on our platforms.
These tests are carried out manually, as the automated tests we tried, especially on the mobile app, were never successful. The bots were unable to reproduce specific scenarios on our screens. And, as 80% of our traffic is on mobile, we decided to check the quality of our tagging manually. We are aware that manual testing is quite tedious, but it is essential for our analysis work. Without reliable tagging and without data flowing to the right place, we can't do anything. The tagging plan ensures that our production processes are as reliable as possible: analysis, scoring, personalization and content management.
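As an illustration only, here is a minimal sketch of how events captured during a manual test session could be compared against a tagging plan. The plan contents, event names, property names and the `check_events` helper are all hypothetical. The interview does not describe how the plan is stored or which tooling the analyst uses.

```python
# Hypothetical tagging plan: event name -> properties that must be sent.
TAGGING_PLAN = {
    "page.display": {"page", "sport", "article_id"},
    "subscription.start": {"offer", "article_id"},
}


def check_events(captured, plan):
    """Compare captured events with the plan and return a list of problems."""
    problems = []
    seen = set()
    for event in captured:
        name = event.get("name")
        if name not in plan:
            problems.append(f"unexpected event: {name}")
            continue
        seen.add(name)
        # Properties required by the plan but absent from the captured event.
        missing = plan[name] - set(event.get("properties", {}))
        if missing:
            problems.append(f"{name}: missing properties {sorted(missing)}")
    # Events the plan expects but that never fired during the session.
    for name in sorted(set(plan) - seen):
        problems.append(f"event never fired: {name}")
    return problems


# Events noted down during a hypothetical manual session after a release.
captured = [
    {"name": "page.display", "properties": {"page": "home", "sport": "football"}},
]

problems = check_events(captured, TAGGING_PLAN)
for problem in problems:
    print(problem)
```

A check like this does not replace the manual walk through the screens. It only makes the comparison with the plan systematic once the events have been captured.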
Do you have a committee or body dedicated to the validation of data projects?
No, there is currently no such body at the group level. We validate the launch of a project, or its delay, with the head of the digital division. Each data project must meet our objectives: generate engagement or subscriptions.
What resources did you mobilize to complete your project?
First, we invested in a data visualization tool and a platform. We have also recruited a lot of people to expand the data team: a person working on tagging, SQL experts and more hybrid SQL and marketing profiles who help us make the data talk. Finally, one person works on data science and has allowed us to make rapid progress on a range of subjects.
Do you have a reference document or resources that formalize all internal processes?
We started this project during last year's lockdowns. Until then, we had no functional documentation of our data flows. Sometimes we even gave them names that were too unusual or not meaningful enough for someone new to the product. For example, we have a flow called "data flow event". If you are new to Piano Analytics, such a name will obviously not mean anything to you.
In the middle of 2020, we set up a glossary on the Confluence shared workspace that allows us to identify and present each of our flows by indicating the data schema, the service provider involved, the time of reception, the creation or not of associated intermediate tables and the definition of the fields of all our data sets, etc. It is a titanic task. We are still very far from having finished it, but we are spending a lot of time working on it.
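The items listed for each flow can be pictured as a small record. The sketch below is one possible shape for a glossary entry, assuming hypothetical field names, provider, reception time and schema. The actual glossary lives in Confluence and its exact structure is not described here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FlowEntry:
    """One glossary entry describing a data flow (hypothetical structure)."""

    name: str
    description: str
    provider: str                      # service provider involved
    reception_time: str                # when the flow is received
    schema: Dict[str, str]             # field name -> data type
    field_definitions: Dict[str, str]  # field name -> plain-language meaning
    intermediate_tables: List[str] = field(default_factory=list)

    def undocumented_fields(self) -> List[str]:
        """Fields present in the schema that still lack a definition."""
        return sorted(f for f in self.schema if not self.field_definitions.get(f))


# Example entry. Every value except the flow name is invented for illustration.
entry = FlowEntry(
    name="data flow event",
    description="Raw analytics events received from the analytics provider",
    provider="example-provider",
    reception_time="06:00",
    schema={
        "event_name": "STRING",
        "event_time": "TIMESTAMP",
        "visitor_id": "STRING",
    },
    field_definitions={
        "event_name": "Name of the tracked event",
        "event_time": "When the event occurred",
    },
)

print(entry.undocumented_fields())
```

A helper such as `undocumented_fields` gives a rough measure of how much of the documentation work remains for each flow.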
How do you ensure the quality of your data on a daily basis?
On the site side, as I explained earlier, we have a person who ensures the quality of the implementation of our tagging plan. We also have people who take care of the reliability of the input data flows on the BigQuery side. Data cleaning is carried out by our DPO, in consultation with the data, marketing and technical teams, and includes a regulatory data purge. In addition, we have monitoring metrics to identify possible bugs or system crashes. For this task, we use New Relic on the web and Crashlytics on mobile apps, but these tools are more for the technical teams.
In this environment, what are the objectives of the data team?
Our team's goals are simply aligned with the publisher's business model, i.e. to generate more page views, thus more display advertising and more subscriptions. All our data projects must be aligned with this goal. Today, data is used to deliver the right article to the right user, but it will not specifically influence production. We can indeed occasionally identify a user's interest in a subject (for example, more recently, MMA or Formula 1) and share this trend directly with editorial teams in the form of recommendations. But data does not drive the editorial line of the newsroom.
What are the upcoming data governance projects?
At L’Équipe, I feel that we are mature enough from a data point of view, but there is still a lot of work to be done to structure ourselves in terms of data governance. We need to set up real processes for creating our data model to make data collection more reliable. We also need to set up separate dev and production environments for the data because, today, anyone can create a table or a view within our ecosystem. Soon, we are also going to take up all the data topics we share with the other divisions (advertising, TV, print) at the group level, to move from a publisher data vision to a global data approach for the brand.