
This is probably the most intuitively obvious one.

People’s names are certainly not globally unique.

30,000 people in the US are named “James Smith”, the most common name. Source

The company I worked for at the time of writing this had an employee named Michael Jackson, who had never released any hit singles, or at least, none I was aware of.

From this alone, you cannot assume that if two names match, they’re the same person. This is of course why we generally use email addresses, not names, as user IDs.

Beyond that, a single person can have multiple names. Even multiple legal names. Sometimes this happens during a transition from one name to another; sometimes a person has multiple legal names for life. So it’s very possible that two names that don’t match are in fact referring to the same person.

Even in the case of the “same name” being input into a computer twice, there are cases where a person might have their name in different styles in different systems.

A case we should all be familiar with: a canonical representation of a Western name will usually have some or all parts capitalised, but in some systems the name will have been input in uppercase, because paper forms told the person to write everything in print capitals.
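As a rough sketch of how the uppercase-versus-capitalised case might be handled, the Python snippet below compares two name strings while ignoring letter case. The helper name `same_ignoring_case` is made up for this illustration, and a match here only says the two strings differ by case at most, not that they belong to the same person.

```python
def same_ignoring_case(a: str, b: str) -> bool:
    """Compare two name strings while ignoring letter case (hypothetical helper)."""
    # str.casefold() is a more aggressive form of lower(), intended for
    # caseless matching (for example, it maps the German "ß" to "ss").
    return a.casefold() == b.casefold()


print(same_ignoring_case("Michael Jackson", "MICHAEL JACKSON"))  # True
print(same_ignoring_case("Michael Jackson", "Michael Jaxon"))    # False
```

It is probably wise to keep the name exactly as the person entered it for display, and only use the case-folded form for comparisons.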

A more advanced case: When the name contains non-ASCII characters, these may have been input as pre-composed Unicode characters, or using combining marks, depending on the software being used to enter them. To cover this, it is always recommended to normalise any Unicode strings which come from outside your system, including from the user, and including names.
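To make the pre-composed versus combining-mark difference concrete, here is a minimal Python sketch using the standard library's `unicodedata.normalize`. The function name `normalise_name` and the choice of NFC as the target form are assumptions made for this example; the important part is picking one form and applying it consistently at the boundary of your system.

```python
import unicodedata


def normalise_name(raw: str) -> str:
    """Return the name in Unicode Normalization Form C (hypothetical helper)."""
    # NFC composes base letters and combining marks into pre-composed
    # characters wherever such a character exists.
    return unicodedata.normalize("NFC", raw)


precomposed = "Zo\u00eb"   # "ë" as a single code point (U+00EB)
combining = "Zoe\u0308"    # "e" followed by COMBINING DIAERESIS (U+0308)

# Both render the same on screen, but the raw strings are not equal.
print(precomposed == combining)                                   # False
print(normalise_name(precomposed) == normalise_name(combining))   # True
```

Without this step, the same name typed in two different programs could fail an equality check, or a uniqueness constraint in a database, purely because of how the accent was encoded.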

So in general, a system will not be able to use names alone to determine conclusively whether two names refer to the same person. If the names, along with other information, are shown to an operator, they may be able to make the call on whether the two are really the same person. For example, if the names are slightly different in their presentation, but the two people were born on the same day, and born in the same small town, you can be fairly certain the names refer to the same person.
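One way this could look in practice is sketched below in Python: the system never merges records by itself, it only flags pairs that share a birth date and birthplace so that an operator can look at them. Everything here (`PersonRecord`, `comparison_key`, `needs_operator_review`, the sample data) is invented for illustration, and the heuristic leans on the assumption that the birthplace is small enough for a shared birth date to be meaningful.

```python
import unicodedata
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class PersonRecord:
    """Hypothetical record shape, invented for this sketch."""
    name: str
    date_of_birth: date
    birthplace: str


def comparison_key(text: str) -> str:
    # Normalise Unicode, ignore case, and collapse runs of whitespace.
    # NFC is applied again after casefold() because case folding can
    # produce sequences that are no longer normalised.
    folded = unicodedata.normalize("NFC", text).casefold()
    return " ".join(unicodedata.normalize("NFC", folded).split())


def needs_operator_review(a: PersonRecord, b: PersonRecord) -> bool:
    """Flag two records as a possible duplicate for a human to check."""
    # The names are deliberately not compared: they may differ in
    # presentation even when both records describe the same person.
    return (
        a.date_of_birth == b.date_of_birth
        and comparison_key(a.birthplace) == comparison_key(b.birthplace)
    )


first = PersonRecord("JAMES SMITH", date(1980, 5, 17), "Smallville")
second = PersonRecord("James A. Smith", date(1980, 5, 17), "smallville")
third = PersonRecord("James Smith", date(1975, 1, 2), "Smallville")

print(needs_operator_review(first, second))  # True
print(needs_operator_review(first, third))   # False
```

The function returns a flag rather than performing a merge on purpose: the final decision stays with the operator, who can see both names and the surrounding information side by side.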

Perhaps an advanced system could even keep track of all the official names it has seen for the same person, which could also help for potential future deduplications.