Sometimes repeating yourself is redundant. The same is true for your data.
After working in several industries, I have found that any company is only as good as its records. According to CIO Magazine, a duplicate rate of around two percent is within the realm of acceptable; however, once a database reaches five percent, it is in the danger zone. Once your database is in the "danger zone," reports become misleading and data updates get lost.
One might ask: how do duplicates happen? To name a few causes: people registering under several different email addresses, people with formal full names going by a nickname (e.g. Steve vs. Steven), spelling variations, and other quirks that have the same effect. You can read our blog Duplicating the Duplicated Duplicates for more on this topic.
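To make this concrete, here is a minimal sketch of rule-based de-duplication that catches exactly the cases above: casing differences in emails and nickname variants of the same name. The field names, nickname map, and sample records are illustrative assumptions, not taken from any particular product.

```python
# Illustrative nickname map; a real one would be far larger.
NICKNAMES = {"steve": "steven", "bob": "robert", "liz": "elizabeth"}

def normalize(record):
    """Reduce a record to a comparable key: canonical name + lowercased email."""
    first = record["first_name"].strip().lower()
    first = NICKNAMES.get(first, first)  # map nickname to formal name
    last = record["last_name"].strip().lower()
    email = record["email"].strip().lower()
    return (first, last, email)

def dedupe(records):
    """Keep the first record seen for each normalized key."""
    seen = {}
    for rec in records:
        seen.setdefault(normalize(rec), rec)
    return list(seen.values())

records = [
    {"first_name": "Steve", "last_name": "Jones", "email": "SJ@example.com"},
    {"first_name": "Steven", "last_name": "Jones", "email": "sj@example.com"},
]
print(len(dedupe(records)))  # the two variants collapse to one record
```

Real de-duping software goes much further (fuzzy matching, phonetic codes, merge rules), but the core idea is the same: normalize, then compare.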
CIO Magazine recommends using de-duping software and zeroing in on exactly where the extra copies of that good data are coming from.
Some companies still rely on humans to do their de-duping. For example, a company I worked for that handled medical records relied solely on humans, who filled out paperwork every time a duplicate was found. That paperwork then entered a bureaucratic pipeline, in the hope that the duplicate got fixed and the process did not need to be repeated.
I believe there is a happy medium between these two solutions. Not every company can afford "cutting edge software," and leaving a paper trail just to remove a duplicate from a computer database is clearly impractical. Stay under five percent, my friends.
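Whatever approach you take, it helps to know where you stand against that five-percent line. Here is a quick way to estimate a duplicate rate; the normalization key is an assumption and should be swapped for whatever fits your records.

```python
def duplicate_rate(records, key=lambda r: r.strip().lower()):
    """Fraction of records whose normalized key already appeared."""
    keys = [key(r) for r in records]
    return (len(keys) - len(set(keys))) / len(keys)

# "a@x.com" and "A@x.com " are the same address once normalized.
emails = ["a@x.com", "A@x.com ", "b@x.com", "c@x.com"]
print(f"{duplicate_rate(emails):.0%}")  # 25% — well into the danger zone
```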