It is often said that data is the new oil, the most lucrative commodity on the market today. Major customer-facing companies collect tremendous amounts of data: big data. Through loyalty programs, retailers know their customers’ purchases, banks know their transactions, telecoms know their travels and internet searches, and so on. Together this forms a vast global repository of first-party data.

This global repository is highly fragmented. Each company has its own limited view: only its own customers. Because this data is highly sensitive, most companies are reluctant to share it for fear of losing control. It is a problem of trust: the participants do not trust each other, or any central storage or hub, with their data assets. As a result, a precious opportunity for joint data collaboration is lost: the opportunity to build a universal multi-sourced customer portrait that benefits all parties involved, the customers most of all.



The result of matching IDs is a joint distributed dataset containing anonymized customer attributes sourced from multiple data suppliers. There are hundreds of attributes available: social demographics, children and pets, interests, events, geotargeting and locations, travels, budgets, preferences, visits to retail, behavioral, retail purchases, and other categories. The combined dataset, the audience segment, is dynamic: it is generated per request through the instance and is not stored on any single central server. The result is the Customer Portrait 360.
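To illustrate what "generated per request and not stored centrally" can mean in practice, here is a minimal sketch. It is not the Aggregion protocol: the `SupplierInstance` class, the `build_segment` function, and the attribute names are all hypothetical. The assumption is that each supplier evaluates a filter locally and returns only the matching hashed IDs, and that the segment is the intersection of those answers.

```python
class SupplierInstance:
    """Hypothetical stand-in for one data supplier's local instance."""

    def __init__(self, name, records):
        self.name = name
        # hashed_id -> attributes; in this sketch the records never leave the object
        self._records = records

    def select(self, predicate):
        """Evaluate the filter locally and return only the matching hashed IDs."""
        return {h for h, attrs in self._records.items() if predicate(attrs)}


def build_segment(requests):
    """Build an audience segment on demand from (instance, predicate) pairs.

    The segment is the intersection of the per-supplier answers and is
    returned to the caller rather than persisted anywhere.
    """
    id_sets = [instance.select(predicate) for instance, predicate in requests]
    return set.intersection(*id_sets) if id_sets else set()


# Illustrative data only; "h1".."h3" stand for hashed customer IDs.
retail = SupplierInstance("retail", {
    "h1": {"pet_owner": True},
    "h2": {"pet_owner": True},
    "h3": {"pet_owner": False},
})
telecom = SupplierInstance("telecom", {
    "h1": {"home_region": "north"},
    "h2": {"home_region": "south"},
    "h3": {"home_region": "north"},
})

segment = build_segment([
    (retail, lambda a: a["pet_owner"]),
    (telecom, lambda a: a["home_region"] == "north"),
])
print(segment)  # {'h1'}
```

Only hashed IDs cross the boundary of each instance in this sketch; the raw attributes stay with their owner, which mirrors the property described above.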

The primary platform contributors and users are:

Data Suppliers — cross-industry (retail, e-commerce, telecom, financial sector, insurance, education, etc.) customer-facing companies that collect first-party data as part of business transactions.

Brands and Advertising Agencies — the main users of the Targeting for Advertising product. Agencies, on behalf of brands, build precisely targeted audience segments through the UI or custom queries in order to connect with their customers through digital advertising.

Advertising Platforms — well-known platforms with large ad inventory: Facebook, Google, Yandex, Mail.ru, and others.

Research Agencies — provide independent verification of attributes derived by the data science algorithms. Paid panellists within selected audience segments answer survey questions to confirm or deny the accuracy of predicted attributes.

Analytics Users — all participants, including the suppliers of actual physical goods sold in retail. They are interested in the analytics sourced from joint retail data.

Roles may overlap: for instance, a single company may be a data supplier, a brand, and an analytics user.


All of the product functionality described below is accessible through the UI, a part of AFS that is also stored in each participant’s instance, or through advanced data modelling.

Targeting for Advertising

The platform allows users to build an audience segment from multiple datasets and then launch targeted advertisements (banners, video) through ad platforms such as Mail.Ru, Yandex, Facebook, Google, and others. Online ad performance data (views, clicks) is matched back to the retail purchase data to build dynamic O2O (Online-to-Offline) reporting that gauges the effectiveness of the advertising campaign and its impact on shopping behavior in retail.
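As a rough illustration of O2O reporting, the sketch below joins ad exposure to offline purchases by hashed customer ID and compares the share of buyers among exposed and unexposed members of a segment. The function name `o2o_report` and the metrics themselves are assumptions made for this example; the platform's actual report definitions are not specified here.

```python
def o2o_report(audience_ids, exposed_ids, purchases):
    """Compare offline conversion for exposed vs. unexposed audience members.

    audience_ids: hashed IDs in the targeted segment
    exposed_ids:  hashed IDs that viewed or clicked the ad
    purchases:    hashed ID -> total retail spend during the campaign
    """
    audience = set(audience_ids)
    exposed = audience & set(exposed_ids)
    unexposed = audience - exposed

    def conversion(group):
        # Share of the group with at least one purchase; 0.0 for an empty group.
        if not group:
            return 0.0
        buyers = sum(1 for cid in group if purchases.get(cid, 0.0) > 0)
        return buyers / len(group)

    exposed_rate = conversion(exposed)
    unexposed_rate = conversion(unexposed)
    return {
        "exposed_conversion": exposed_rate,
        "unexposed_conversion": unexposed_rate,
        "lift": exposed_rate - unexposed_rate,
    }


# Illustrative data only.
audience = ["h1", "h2", "h3", "h4"]
ad_views = ["h1", "h2"]
purchases = {"h1": 20.0, "h2": 12.5, "h3": 5.0}

report = o2o_report(audience, ad_views, purchases)
print(report)
# {'exposed_conversion': 1.0, 'unexposed_conversion': 0.5, 'lift': 0.5}
```

A simple difference in conversion rates is used here for readability; a real campaign report would likely need a properly constructed control group before attributing the lift to the ads.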

Retail Analytics

The analytics are available for all types of participants, both major players and SMBs (small and medium businesses). The real-time analytics in Aggregion DDMP are envisioned as an eventual replacement for traditional survey-based data measurement practices, which are hindered by a 30-40 day lead time due to manual data collection and processing.

Some of the analytical reports are:

Comparative Analysis — sales and positioning of brands and goods categories against competitors within the same category;

Pricing — gauging optimal product pricing policies based on sales and competitors’ pricing in the same category;

Promotions — assessing the success rate of promotions launched through marketing channels; tailoring and customizing promotions to achieve business goals;

Assortment — recommendations on the range of products offered for sale and demand by category; comparisons with competitors from different types of retail;

Geomarketing — use of geotargeting data from telecoms to build correlations between visits, sales, competing stores in the vicinity, and traffic patterns in order to recommend the optimal location for a retail store.

Marketing Channels

The Aggregion platform also enables its participants to use their existing channels (mobile apps, sites, social networks, messengers, email, etc.) more efficiently and to add new ones. For instance:

Coupons — personalized offers (discounts, combos, deals) generated automatically for the audiences based on their Portrait.

Surveys — participants may communicate with the audiences to inquire about purchases (or any other activities and preferences). Customers are rewarded with bonuses for answering the questions.

Complements — additional incentives may be attached to the products: promotional interest-based vouchers at partner businesses; digital products; bonuses, cashback, points, miles, and similar rewards earned in the partner network.


Aggregion has developed a technology to resolve this problem of trust between data owners: the Aggregion DDMP (Distributed Data Management Platform). At the heart of DDMP is a decentralized protocol that enables instances of AFS (Aggregion Full Stack, the core technology deployed within the companies’ own IT landscapes) to interact without the need for any central hub or storage. This ensures that the data never leaves the premises of the owner company. All operations are recorded on a blockchain, and the blockchain nodes are stored within each AFS instance.

The critical component of this technology is the data matching mechanism, which matches hashed customer IDs between datasets. The AFS algorithm, stored within secure enclaves in each company’s own instance, performs the matching securely and produces correlations between datasets. As a result, the client companies (the data users) can run queries on this joint data to produce anonymized audience segments that power the products built on top: targeting, analytics, marketing, ML/AI, and many other possible uses.
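The idea of matching hashed IDs can be sketched as follows. This is an illustration under stated assumptions, not the AFS algorithm: it assumes both parties hash a normalized identifier with SHA-256 and a shared salt, and it does not model the secure enclave at all. The names `SHARED_SALT`, `hash_id`, and `match_hashed_ids` are hypothetical.

```python
import hashlib

# Hypothetical salt agreed between the participants (assumption for this sketch).
SHARED_SALT = b"example-shared-salt"


def hash_id(raw_id, salt=SHARED_SALT):
    """Normalize an identifier and return its salted SHA-256 hex digest."""
    normalized = raw_id.strip().lower().encode("utf-8")
    return hashlib.sha256(salt + normalized).hexdigest()


def match_hashed_ids(left, right):
    """Join two {hashed_id: attributes} datasets on the hashed IDs they share."""
    common = left.keys() & right.keys()
    return {h: {**left[h], **right[h]} for h in common}


# Each company hashes its own customer IDs before any comparison takes place.
retailer = {
    hash_id("alice@example.com"): {"buys_pet_food": True},
    hash_id("bob@example.com"): {"buys_pet_food": False},
}
telecom = {
    hash_id(" Alice@Example.com "): {"frequent_traveler": True},
    hash_id("carol@example.com"): {"frequent_traveler": False},
}

matched = match_hashed_ids(retailer, telecom)
print(len(matched))  # 1
```

Only the customer present in both datasets appears in the result, with attributes from both sources attached to the same hashed ID. Normalizing before hashing matters: without it, the two spellings of the same email address would hash differently and never match.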