Data Architecture Optimization Case

AdTech custom development: Simpals Data Architecture Optimization Case

29 June 2022

The Admixer Development Team helps advertisers and publishers overcome technical difficulties in developing their in-house solutions. The most common problem for media companies is handling large amounts of data. Building the right solution in this area can save substantial money and resources and significantly increase operational efficiency. With extensive experience in big data projects, Admixer helps partners with this kind of complex development.

The Big Data Problem

Today, most companies face the problem of storing, processing, and analyzing their data. The amount of data is constantly growing and becoming harder to handle. Some companies solve this problem by reducing the amount of data they store, which risks losing important data points and can hurt overall performance. Others add computing capacity and servers, which inevitably leads to a significant increase in overall costs.

A few years ago, Admixer faced a similar problem: we had to store and manage tens of terabytes of data while keeping access to it fast. Having tried various solutions and approaches, we discovered ClickHouse, a column-oriented database management system (DBMS) for online analytical processing (OLAP). ClickHouse is a high-performance DBMS with an easily scalable, fault-tolerant architecture. On top of it, we built a highly efficient system for storing, processing, and analyzing data at Admixer.

To learn more about our experience with high-load project development, visit https://loadfighters.com and the ClickHouse Blog.

A few months ago, our partners at Simpals approached us with a similar problem. Simpals is a large media holding that runs the largest eCommerce project in Moldova, 999.md. This marketplace generates a huge volume of statistics daily, and the main problem was the constantly increasing amount of data to process.

Implementation

The data architecture of Simpals’ main project was built on Elasticsearch, a RESTful distributed search and analytics engine. It is a suitable store and a good engine if you work only with strings and need proper string search. For many real-world cases, however, that is not enough, and a more optimized storage system is needed.

At some point, Simpals’ existing architecture ceased to be efficient in terms of both storage space and query execution speed. Because query complexity was constantly growing, they needed a more powerful solution.

Admixer’s Data Engineers audited the current architecture and catalogued the main problems the company faced when working with data, along with the typical chains of interaction with that data. ClickHouse was the right choice for this case because the data structure contained many non-text values, which meant storage could be greatly optimized. Simpals also needed to retrieve data quickly.
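To make the point about non-text values concrete, here is a minimal sketch in Go that compares the size of one record as a JSON text document with the size of the same record in a fixed-width binary encoding. The `Event` type and its fields are hypothetical and are not taken from the Simpals schema. The encoding does not reproduce how Elasticsearch or ClickHouse lay out data on disk; it only illustrates why numeric types tend to be smaller than their textual form.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"encoding/json"
	"fmt"
)

// Event is a hypothetical statistics record used only for illustration.
type Event struct {
	Timestamp uint32  `json:"timestamp"` // seconds since the Unix epoch
	AdID      uint64  `json:"ad_id"`
	Views     uint16  `json:"views"`
	Price     float32 `json:"price"`
}

// jsonSize returns how many bytes the event occupies as a JSON document.
func jsonSize(e Event) (int, error) {
	b, err := json.Marshal(e)
	if err != nil {
		return 0, err
	}
	return len(b), nil
}

// binarySize returns how many bytes the event occupies when every field
// is written with its fixed width (4 + 8 + 2 + 4 bytes, no padding).
func binarySize(e Event) (int, error) {
	var buf bytes.Buffer
	if err := binary.Write(&buf, binary.LittleEndian, e); err != nil {
		return 0, err
	}
	return buf.Len(), nil
}

func main() {
	e := Event{Timestamp: 1656460800, AdID: 123456789, Views: 42, Price: 199.5}
	j, err := jsonSize(e)
	if err != nil {
		panic(err)
	}
	b, err := binarySize(e)
	if err != nil {
		panic(err)
	}
	fmt.Printf("JSON: %d bytes, fixed-width binary: %d bytes\n", j, b)
}
```

The gap comes from field names and digits being stored as characters in the text form. Choosing the narrowest numeric type that fits each value applies the same idea at the column level.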

We migrated the data structure from Elasticsearch to ClickHouse, selected appropriate data types, and created an optimal data processing chain. In general, the data processing chain looks like this:

Optimized data processing chain

This structure can be described as a waterfall. Data first lands in the fastest and most efficient buffers, which reside in RAM. It is then split, if necessary, into several large raw tables (for example, when the data structures are completely or substantially different). Next comes another stage of layering into smaller tables, each configured for a specific type of query to maximize retrieval speed. The layering is done with MaterializedView structures, which let you process data in chains.
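The waterfall can be modeled in a few lines of Go. This is a conceptual in-memory simulation, not ClickHouse code: `Row`, `Buffer`, `RawTable`, `View`, and `countPerDay` are hypothetical names introduced here, and the flush threshold and the split by record kind are assumptions made for illustration. It shows the three stages described above: an in-memory buffer, a split into large raw stores, and derived stores that are updated on every insert in the way a materialized view is.

```go
package main

import "fmt"

// Row is a hypothetical incoming record.
type Row struct {
	Kind string // decides which raw table the row belongs to
	AdID uint64
	Day  string // for example "2022-06-29"
}

// View mimics a materialized view: it is called for every block inserted
// into its source table and maintains a smaller derived table.
type View func(block []Row)

// RawTable stores full rows and notifies the attached views on insert.
type RawTable struct {
	Rows  []Row
	Views []View
}

// Insert appends a block of rows and pushes it down to every view.
func (t *RawTable) Insert(block []Row) {
	t.Rows = append(t.Rows, block...)
	for _, v := range t.Views {
		v(block)
	}
}

// Buffer keeps rows in memory and flushes them once the limit is reached.
type Buffer struct {
	limit  int
	rows   []Row
	tables map[string]*RawTable
}

// NewBuffer creates a buffer that routes rows to tables keyed by Row.Kind.
func NewBuffer(limit int, tables map[string]*RawTable) *Buffer {
	return &Buffer{limit: limit, tables: tables}
}

// Add stores a row and triggers a flush when the buffer is full.
func (b *Buffer) Add(r Row) {
	b.rows = append(b.rows, r)
	if len(b.rows) >= b.limit {
		b.Flush()
	}
}

// Flush splits buffered rows by kind and inserts each block into its raw
// table. Rows of an unknown kind are dropped in this sketch.
func (b *Buffer) Flush() {
	blocks := map[string][]Row{}
	for _, r := range b.rows {
		blocks[r.Kind] = append(blocks[r.Kind], r)
	}
	for kind, block := range blocks {
		if t, ok := b.tables[kind]; ok {
			t.Insert(block)
		}
	}
	b.rows = b.rows[:0]
}

// countPerDay returns a view together with the derived table it maintains:
// the number of rows per day.
func countPerDay() (View, map[string]int) {
	counts := map[string]int{}
	return func(block []Row) {
		for _, r := range block {
			counts[r.Day]++
		}
	}, counts
}

func main() {
	view, perDay := countPerDay()
	views := &RawTable{Views: []View{view}}
	clicks := &RawTable{}
	buf := NewBuffer(2, map[string]*RawTable{"view": views, "click": clicks})

	buf.Add(Row{Kind: "view", AdID: 1, Day: "2022-06-28"})
	buf.Add(Row{Kind: "click", AdID: 1, Day: "2022-06-28"})
	buf.Add(Row{Kind: "view", AdID: 2, Day: "2022-06-29"})
	buf.Flush()

	fmt.Println(len(views.Rows), len(clicks.Rows), perDay)
}
```

Queries that only need daily counts can read the small derived map instead of scanning the raw rows, which is the purpose of the last layer in the chain.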

Our next step was to build a service that receives data and inserts it into ClickHouse, and that also acts as a proxy for reading data directly from the servers. The service is written in Golang, a high-performance and widely used language. All data exchange between ClickHouse and the service happens over plain TCP.
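The article does not describe the internals of the service, so the following is only a sketch of one common shape for an ingestion path, assuming the service groups incoming rows into batches before writing them. `Inserter`, `Batcher`, and `memInserter` are hypothetical names. A real service would implement `Inserter` with a ClickHouse client, while the in-memory implementation here just records what would have been sent.

```go
package main

import (
	"fmt"
	"sync"
)

// Inserter is a hypothetical abstraction over the database client.
type Inserter interface {
	InsertBatch(table string, rows [][]any) error
}

// Batcher collects rows and hands them to the Inserter in batches.
type Batcher struct {
	mu    sync.Mutex
	out   Inserter
	table string
	size  int
	rows  [][]any
}

// NewBatcher creates a Batcher that flushes once size rows are collected.
func NewBatcher(out Inserter, table string, size int) *Batcher {
	return &Batcher{out: out, table: table, size: size}
}

// Add appends one row and flushes when the batch is full.
func (b *Batcher) Add(row ...any) error {
	b.mu.Lock()
	defer b.mu.Unlock()
	b.rows = append(b.rows, row)
	if len(b.rows) < b.size {
		return nil
	}
	return b.flushLocked()
}

// Flush sends whatever is currently buffered, even a partial batch.
func (b *Batcher) Flush() error {
	b.mu.Lock()
	defer b.mu.Unlock()
	return b.flushLocked()
}

// flushLocked must be called with b.mu held. On error the rows are kept
// so that a later flush can retry them.
func (b *Batcher) flushLocked() error {
	if len(b.rows) == 0 {
		return nil
	}
	if err := b.out.InsertBatch(b.table, b.rows); err != nil {
		return err
	}
	b.rows = nil
	return nil
}

// memInserter is an in-memory stand-in that records every batch it gets.
type memInserter struct {
	batches [][][]any
}

func (m *memInserter) InsertBatch(table string, rows [][]any) error {
	m.batches = append(m.batches, rows)
	return nil
}

func main() {
	sink := &memInserter{}
	b := NewBatcher(sink, "events", 2)
	for i := 0; i < 5; i++ {
		if err := b.Add(i, "view"); err != nil {
			panic(err)
		}
	}
	if err := b.Flush(); err != nil {
		panic(err)
	}
	fmt.Println("batches sent:", len(sink.batches))
}
```

Keeping the client behind a small interface lets the batching logic be tested without a running database. A production version would likely also flush on a timer so that rows do not wait indefinitely during quiet periods.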

Business Impact

Having built the new structure, we migrated data from Elasticsearch to ClickHouse to test and analyze the capabilities of the new system. The transition made data retrieval several times faster, and in some cases dozens of times faster, even for complex queries. Simple queries execute within a couple of milliseconds.

The transition also reduced the storage footprint by more than 10 times (it is worth noting that the migration was carried out on a fairly small dataset of several terabytes). The system scales easily to larger volumes, lets Simpals save on infrastructure costs, and speeds up data processing significantly.

All this made it possible to optimize project resources, speed up work with data for both writes and reads, and reduce the number of routine tasks, giving the business the opportunity to focus on more important challenges.

Want to know more about Admixer Custom Development Solutions?

Visit our AdTech Development Page or fill in the form to talk to our experts
