Jenkins is the way to freedom from boring tasks

Xperi's AMA Platform

Submitted By Jenkins User Amit Sharma
Data loss is a catalyst for revenue loss in media sales.  A monitoring framework was needed to test data delivery daily*.
Organization: Xperi, https://www.xperi.com/
Industries: Advertising
Programming Languages: Python
Version Control Systems: GitHub

Automatic monitoring of data input to improve the quality of advertising reports.

Background: I am working as a part of the advertising team at Xperi. We get data from many sources, which we process to generate the reports. As the data comes from different systems, if source data was held up due to either technical or manual issues, reports generated based on that data received could not be used. This led to revenue loss as there was no way to recover the data. To overcome this problem, we decided to have a monitoring framework that would run some tests on processed data every day.

Goals:  We must constantly monitor data pipelines to ensure good data quality.

"Jenkins forms the core of my solution. Before Jenkins, I was running the scripts manually, and on weekends had to login and run it. Now, with Jenkins jobs pre-configured, I am free to live my life while Jenkins takes care of the jobs."
image— Amit Sharma, Sr. Software Development Engineer in Test, Xperi

Solution & Results: To overcome the problem of missing data, I wrote a Python script that runs tests on the data received. If all tests are not passed, we look at the data and flag issues to the concerned team so that remedial actions can be taken immediately. Jenkins was helpful in setting up the daily monitoring job and triggering it daily. Additionally, the reports generated are mailed to the stakeholders so that everyone can take a look at them. For historical purposes, the 'build status' gets captured in Jenkins, so, if needed, historical data can also be referenced. We also integrated Jenkins with our internal communicator to make it easier to check job status at any point in the dev cycle.

The key capabilities we relied on were varied. We used the Shining Panda plugin to integrate the Python code with Jenkins. We also used the Slack plugin to print the status of the job in the Slack channel. And we used Jenkins Build to periodically trigger to schedule the job to be run on a daily basis.

We are happy with the results:

  • Missing data issues are identified early. 

  • Issues with data quality are caught and highlighted 

  • The need for manual intervention to check the reports is eliminated