With the rise of digitalization and Big Data, the need to manage large volumes of data has grown steadily in recent years. The problem is that human operators are still heavily involved in data processing, and when it comes to describing a product or service, everyone has their own way of putting things, which creates great heterogeneity. That is where the difficulty lies: it becomes very hard to run analyses with computer models if the data that feeds them is of poor quality.
Let's take an example: an insurance company receives, on average, several hundred invoices from garages every day. The same repair can be described very differently depending on the expert in charge of it: where some write "replacement of a rear bumper", others write "installation of a rear shield". It is therefore very hard to automate the processing of these invoices to speed up reimbursements or detect fraud, since a computer that compares the text literally cannot tell that the two describe the same operation.
There are three ways to solve this problem:
- By treating the problem at the source: imposing a single reference list of services on every garage in the world and hoping that they comply. This solution is obviously not realistic.
- By asking people to edit all these invoices by hand so that identical services match. But on the one hand they would need to know about repairs, and on the other hand they would spend far too much time on it given the huge volume of invoices issued every day.
- By addressing the problem downstream with a technology that standardizes all descriptions of the same product or service.
At YZR, we have chosen the third option.
We offer an artificial intelligence tool, delivered as SaaS (Software as a Service), that standardizes entire data catalogs almost automatically in a very short time.
In reality, the accuracy of the algorithm we developed is not 100% but around 90%. This means that about 10% of the data remains that the machine cannot process. This is why we have also developed a platform specifically for business experts to fill these gaps.
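To make the split between automated processing and expert review concrete, here is a minimal sketch in Python. It is not YZR's algorithm: the reference catalog, the `THRESHOLD` value, and the function names `similarity`, `standardize`, and `process` are all hypothetical, and plain string similarity from the standard library stands in for whatever model a real product would use. Descriptions that match a known label closely enough are standardized automatically, and everything else goes to a queue for a business expert.

```python
from difflib import SequenceMatcher

# Hypothetical reference catalog: canonical label -> known variants.
# A real catalog would be far larger and maintained by business experts.
REFERENCE = {
    "rear bumper replacement": [
        "replacement of a rear bumper",
        "installation of a rear shield",
    ],
    "windshield replacement": [
        "replacement of a windshield",
        "new windscreen fitted",
    ],
}

# Assumed cut-off for illustration only; it would need tuning on real data.
THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Character-level similarity in [0, 1], ignoring case and outer spaces."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def standardize(description: str):
    """Return (canonical_label, score), or (None, score) when unsure."""
    best_label, best_score = None, 0.0
    for label, variants in REFERENCE.items():
        for candidate in [label, *variants]:
            score = similarity(description, candidate)
            if score > best_score:
                best_label, best_score = label, score
    if best_score >= THRESHOLD:
        return best_label, best_score
    return None, best_score


def process(descriptions):
    """Split descriptions into automated matches and an expert review queue."""
    automated, review_queue = {}, []
    for description in descriptions:
        label, _ = standardize(description)
        if label is None:
            review_queue.append(description)
        else:
            automated[description] = label
    return automated, review_queue


if __name__ == "__main__":
    invoices = [
        "Replacement of a rear bumper",
        "Installation of a rear shield ",
        "replacment of a rear bumper",  # typo, still close to a known variant
        "brake pads changed",           # unknown: goes to the experts
    ]
    automated, review_queue = process(invoices)
    print(automated)
    print(review_queue)
```

In this sketch, each label an expert assigns from the review queue could be added to the catalog as a new variant, so the same wording would be handled automatically the next time it appears.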
The strengths of our solution are:
- The time saving is huge for these experts, who see their workload divided by 10.
- It applies to any sector in which ungoverned textual data is processed: retail, the pharmaceutical industry, the automotive industry, and more. We are in fact tackling a problem that concerns all industries.
- It even works for data written in different languages. For example, one of our customers is a European used car dealer that sources cars from all over Europe, with vehicle descriptions in Spanish, Italian, English and German. The dealer was able to use our tool to improve its sales through an optimized marketplace.
YZR was created in 2018 by Sébastien Garcin, former Chief Data Officer at L'Oréal, and Jean-Philippe Poisson, former strategy consultant. Throughout their careers, they were frequently confronted with these data heterogeneity issues. For example, at L'Oréal, sales data comes from very different sources, resulting in very different descriptions for the same product ("D M Q L N" for "make-up remover" in particular). Despite their research, they never managed to find a solution that precisely met this need. That's how the company was born.
Six months after raising 2 million euros in January 2021, YZR employs more than twenty people to continue developing the tool.