![duplicate detector salesforce duplicate detector salesforce](https://d3nqfz2gm66yqg.cloudfront.net/images/v1517942599-lab-where-data-comes-from_mdmlrt.png)
Whenever a human determines that a set of records are duplicates (or not), the system will “learn” from these actions and tweak the algorithm with the goal of identifying future duplicates without human interaction.
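To give a rough idea of what “learning” from a reviewer's decisions could look like, here is a minimal sketch. It assumes a logistic-regression-style update over per-field similarity scores; the article does not say which algorithm is actually used, and the names `predict_duplicate`, `learn_from_decision`, and `learning_rate` are hypothetical.

```python
import math


def predict_duplicate(weights, bias, field_sims):
    """Return a probability in (0, 1) that two records are duplicates.

    field_sims holds one similarity score per field (assumed to be in [0, 1]).
    """
    z = bias + sum(w * s for w, s in zip(weights, field_sims))
    return 1.0 / (1.0 + math.exp(-z))


def learn_from_decision(weights, bias, field_sims, is_duplicate, learning_rate=0.5):
    """Nudge the weights toward a human decision (1 = duplicate, 0 = not).

    This is a single gradient step of logistic regression, used here only
    as an illustration of how feedback could adjust field weights.
    """
    error = is_duplicate - predict_duplicate(weights, bias, field_sims)
    new_weights = [w + learning_rate * error * s for w, s in zip(weights, field_sims)]
    new_bias = bias + learning_rate * error
    return new_weights, new_bias


# Hypothetical example: three fields, all weights starting at zero.
weights, bias = [0.0, 0.0, 0.0], 0.0
sims = [0.9, 1.0, 0.8]  # made-up similarity scores for one record pair
before = predict_duplicate(weights, bias, sims)
weights, bias = learn_from_decision(weights, bias, sims, is_duplicate=1)
after = predict_duplicate(weights, bias, sims)
print(before, after)  # the probability moves toward the reviewer's answer
```

Each confirmed or rejected pair shifts the weights a little, so similar pairs are scored more confidently the next time they appear.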
![duplicate detector salesforce duplicate detector salesforce](https://www.datavisor.com/wp-content/uploads/2019/10/4-DataVisor-Customized-Actions-300x263.png)
The Hamming distance counts the number of substitutions that are required to turn one string into another. For example, if you consider the Last Name from the example above, the Hamming distance would only be 1, since you need to change only one letter to convert “Bolton” to “bolton.”

Another variation on this is learnable distance metrics, which take into consideration that different edit operations have varying significance in different domains. For example, substituting a digit makes a huge difference in a street address since it effectively changes the entire address, but a single letter substitution may not be that significant because it is more likely to be caused by a typo or an abbreviation. Therefore, adapting string edit distance to a particular domain requires assigning different weights to different strings. We will drill down into these concepts at a later point in this article. For now, let’s take a look at how all of these metrics are used to dedupe Salesforce.

Deduping Salesforce With Machine Learning Algorithms

There are a couple of ways we can look at a Salesforce record. Let’s start by assuming it is a single block of text (as shown below):

Record 1

Another option is to compare each field individually:

For the “single block” approach, each field string would be treated equally. This makes it less convenient if you want any emphasis placed on a specific field, such as Last Name. The “field by field” approach allows you to do this by assigning a specific weight to each field, starting with the most important fields having the highest weight and so forth. Salesforce deduping tools that use this type of technology will allow you to set the weights for each field and then create a model so that the approach is codified and leveraged in any comparison.

What is the Advantage of Using Machine Learning to Dedupe Salesforce?

Every company’s dataset is unique and has its own challenges when it comes to deduplication.
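The difference between the “single block” and “field by field” comparisons discussed above can be sketched roughly as follows. This is an illustration under stated assumptions, not how any particular Salesforce tool works: the field names, the weights, and the sample records are made up, and `difflib.SequenceMatcher` from the Python standard library stands in for whatever string metric a real tool would use. Note that it returns a similarity (higher means more alike) rather than a distance.

```python
from difflib import SequenceMatcher

# Hypothetical weights: the most important field gets the highest weight.
FIELD_WEIGHTS = {"last_name": 0.5, "email": 0.3, "first_name": 0.2}


def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means identical (case is ignored)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def single_block_score(rec_a: dict, rec_b: dict, fields) -> float:
    """Join all fields into one string, so every field is treated equally."""
    block_a = " ".join(rec_a[f] for f in fields)
    block_b = " ".join(rec_b[f] for f in fields)
    return similarity(block_a, block_b)


def field_by_field_score(rec_a: dict, rec_b: dict, weights: dict) -> float:
    """Compare each field separately and combine the results by weight."""
    total = sum(weights.values())
    weighted = sum(w * similarity(rec_a[f], rec_b[f]) for f, w in weights.items())
    return weighted / total


# Made-up records for illustration only.
record_1 = {"first_name": "Jane", "last_name": "Bolton", "email": "jane@example.com"}
record_2 = {"first_name": "Jane", "last_name": "Smith", "email": "jane@example.com"}

print(single_block_score(record_1, record_2, list(FIELD_WEIGHTS)))
print(field_by_field_score(record_1, record_2, FIELD_WEIGHTS))
```

Because Last Name carries the largest weight here, a mismatch in that field pulls the field-by-field score down more than the same mismatch would in a field with a small weight.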
While pointing out similarities may be a good first step, we would then need to stipulate exactly what we mean by the word “similar.” Is there a range where something may be considered not similar at all to very similar? How would a machine go about identifying these similarities? One of the ways researchers “teach” similarities to machines is through string metrics. This is when you take two strings and return a number that is low if the strings are similar and high if they are dissimilar. There are many string metrics out there, with one of the most well known ones being the Hamming distance.
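As a minimal sketch of a string metric, the Hamming distance can be written in a few lines of Python. The function name `hamming_distance` is our own choice, and the sketch assumes both strings have the same length, which is the case the Hamming distance is defined for.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count the positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance requires strings of equal length")
    return sum(ch_a != ch_b for ch_a, ch_b in zip(a, b))


print(hamming_distance("Bolton", "bolton"))  # prints 1: only the first letter differs
print(hamming_distance("Bolton", "Bolton"))  # prints 0: the strings are identical
```

A low number means the strings are similar, and a high number means they are dissimilar, which matches the definition of a string metric given above.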
![duplicate detector salesforce duplicate detector salesforce](https://datagroomr.com/wp-content/uploads/2021/01/datagroomr_features_header1-768x523.jpg)
However, a machine doesn’t have the experience or background to make the same determination. In fact, it is actually much harder than it might seem. We might start by pointing out all of the similarities. Since there are obviously so many of them, we can conclude that these are duplicates.
![duplicate detector salesforce duplicate detector salesforce](https://s6c7h4e3.stackpathcdn.com/wp-content/uploads/2018/11/Duplicate-Detection-Rules-750x386.png)
When we think of machine learning, we tend to think about robotic process automation, virtual assistants, and self-driving cars. While these are all common applications that garner headlines, machine learning has the capability to simplify many other activities. In our case, we are using this technology to identify duplicates in your Salesforce environment. Just like with autonomous vehicles and other examples, the algorithms that power these products need to be trained to produce the desired outcome. In this article, we will discuss how machine learning algorithms are trained to dedupe not only Salesforce, but any unstructured data.

How Does Machine Learning Match Two Records?

If we take a look at the two records shown below, it is pretty clear that these are duplicates:

Name