Faster Reporting: Introducing Automated Statistics Aggregation
Introduction
We are blessed to have a large customer base, many of whom have been with us for a very long time, some for 10 years or even longer. The Revive Adserver systems we host for these customers store delivery statistics at a very precise level of detail: for every combination of a banner and zone, by the hour.
This results in impressively large databases, which come with some drawbacks. First, running reports (even for “today” or “last week”) takes longer when the overall database is massive. Second, we make full backups of all databases multiple times a day, which requires significant time and disk space.
Today, we’re announcing that starting in May 2026, we’ll be optimizing all databases using a new aggregation process.
Aggregation of statistics
Starting in May 2026, we’re going to run a monthly process that will do the following:
- Recent statistics (not older than 12 months) will remain available at an hourly level
- Statistics between 12 and 24 months old will be aggregated to a daily sum
- Any statistics older than 24 months will be aggregated to a monthly sum
For example:
In mid-May 2026, we’ll run the aggregation process (treating May 1st as the cutoff date):
- Statistics for May 2025 and newer will remain available by hour
- Statistics from May 2024 through April 2025 will be aggregated from the hourly to the daily level
- Statistics for April 2024 and older will be aggregated from the daily level to the monthly level
And then, in mid-June 2026, we’ll run it again:
- Statistics for June 2025 and newer will remain available by hour
- Statistics from June 2024 through May 2025 will be aggregated from the hourly to the daily level
- Statistics for May 2024 and older will be aggregated from the daily level to the monthly level
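For readers who want to check which level a given month will end up at, the rule in the examples above can be written out as a small Python sketch. This is only an illustration of the schedule described in this post, not code from the aggregation tool; the function names are made up for the example, and it assumes the cutoff is always the first day of the month in which the process runs.

```python
from datetime import date


def months_before_cutoff(year: int, month: int, cutoff: date) -> int:
    """Number of whole months between a statistics month and the cutoff month."""
    return (cutoff.year - year) * 12 + (cutoff.month - month)


def granularity(year: int, month: int, cutoff: date) -> str:
    """Return the level of detail kept for a given statistics month.

    Assumption: the cutoff is the 1st of the month in which the process runs.
    """
    age = months_before_cutoff(year, month, cutoff)
    if age <= 12:
        return "hourly"   # not older than 12 months
    if age <= 24:
        return "daily"    # older than 12 months, up to 24 months
    return "monthly"      # older than 24 months


# The mid-May 2026 run, with May 1st as the cutoff date
may_run = date(2026, 5, 1)
print(granularity(2025, 5, may_run))  # hourly
print(granularity(2025, 4, may_run))  # daily
print(granularity(2024, 4, may_run))  # monthly
```

Changing the cutoff to `date(2026, 6, 1)` gives the results listed for the mid-June run.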
How the aggregation works
The developers of the Revive Adserver software have created a tool that automates the aggregation process entirely, so that it runs quickly and avoids the risks of manual database operations. Initially, we’ll start the aggregation process manually to monitor execution time and performance effects. Once we’ve established that it works correctly, we will schedule it to run automatically once a month.
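Conceptually, aggregating hourly statistics to daily sums means grouping the rows by banner, zone, and calendar day, and adding up the counters. The Python sketch below shows that idea only. It is not the tool's actual code, and the row layout (banner ID, zone ID, hour, impressions, clicks) is an assumption made for the example rather than the real Revive Adserver schema.

```python
from collections import defaultdict
from datetime import datetime


def aggregate_to_daily(hourly_rows):
    """Sum hourly counters into one total per (banner, zone, day).

    Each row is assumed to be a tuple of
    (banner_id, zone_id, hour_start, impressions, clicks).
    """
    totals = defaultdict(lambda: [0, 0])
    for banner_id, zone_id, hour_start, impressions, clicks in hourly_rows:
        key = (banner_id, zone_id, hour_start.date())
        totals[key][0] += impressions
        totals[key][1] += clicks
    return {key: tuple(counts) for key, counts in totals.items()}


# Hypothetical sample data: two hours on one day, one hour on the next
rows = [
    (1, 10, datetime(2025, 3, 1, 9), 100, 2),
    (1, 10, datetime(2025, 3, 1, 10), 150, 3),
    (1, 10, datetime(2025, 3, 2, 9), 80, 1),
]
daily = aggregate_to_daily(rows)
```

The totals are unchanged by this step. Only the hour-by-hour breakdown within each day is given up, and the same grouping applied by month instead of by day yields the monthly sums.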
What our users think
Before moving forward, we examined server logs to see how our users interact with long-term statistics. We found that users rarely run reports for data older than a year. When they do, they virtually never require the hourly breakdown. We also interviewed a small group of users, and the consensus was that monthly sums are perfectly sufficient for data older than two years, and daily sums are plenty for data older than one year.
Expected benefits
As a result of this aggregation, the overall size of the databases will decrease significantly. This provides several benefits:
- Faster Reporting: Reports should appear even more quickly than they do today. Additionally, other users won’t have to wait as long for resources to clear, improving performance for everyone.
- Efficient Backups: Smaller databases mean faster backups, freeing up system resources for background processes.
- Optimized Performance: While testing the tool (code-named Condens), we found it runs surprisingly fast. It supports “batch processing” to ensure the database server is never overwhelmed during the aggregation.
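To give an idea of what batch processing means in practice, the Python sketch below splits a date range into small windows that can be handled one at a time, so that no single step touches too much data at once. This is one possible approach shown for illustration. How the tool actually divides its work is not described here, and the function name and batch size are assumptions.

```python
from datetime import date, timedelta


def day_batches(start: date, end: date, batch_days: int = 7):
    """Yield (first_day, last_day) windows covering start..end inclusive."""
    current = start
    while current <= end:
        # The last window is shortened so it never runs past the end date
        last = min(current + timedelta(days=batch_days - 1), end)
        yield current, last
        current = last + timedelta(days=1)


# Splitting one month into windows of at most 10 days
batches = list(day_batches(date(2024, 5, 1), date(2024, 5, 31), batch_days=10))
for first_day, last_day in batches:
    print(first_day, "to", last_day)
```

Each window can then be aggregated on its own, with a short pause in between if the database server is busy.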