OCT 23, 2012 4:25pm ET

Record-Matching Integrity: An Algorithm Primer


When hospitals share information internally or externally, accurately linking patient records from multiple disparate databases and information systems is critical. Accurate linking ensures that clinicians base their decisions on the correct patient records, and it avoids the creation of system-clogging duplicates and overlays.

One patient can have multiple identifiers within a single organization, such as a medical record number, billing/patient account number, order number or requisition number. More identifiers come into play when an organization has multiple locations offering different types of services. When these identifiers flow into an HIE or ACO, strong algorithms must be in place to identify with pinpoint accuracy which records belong to which patient, so that they can be linked into a single record with a single unique identifier for use across the initiative.

These algorithms are typically embedded in a hospital’s or information exchange’s system. However, they are not all the same. How well an algorithm performs depends upon which of the following three categories it falls into:

* Basic Algorithms:  The simplest technique for matching records, basic algorithms make comparisons based on selected data elements, typically name, birth date, Social Security number and gender. They typically utilize exact match or deterministic matching tools, the latter of which is slightly more sophisticated in that partial matches or matches from phonetic encoding systems may also be used. Basic algorithms also deploy wild-card linking techniques, which return every record that matches a limited number of characters entered into a search string as well as any other data element specified to refine the search.

* Intermediate Algorithms: Intermediate algorithms combine “fuzzy logic” and arbitrary or subjective scoring systems with exact match and deterministic tools. A field match weight is arbitrarily assigned to specific identification attributes, and records must reach a minimum scoring threshold to qualify for consideration. Fuzzy logic utilizes nickname tables and rules to address transposed names, characters or digits and other typographical errors within the database. Intermediate algorithms may also include an automated frequency adjustment, which decreases the field match score across two records if the actual field value (e.g. a common last name or birth date) is present in a significant number of records.

* Advanced Algorithms:  The most sophisticated set of record-matching tools, advanced algorithms rely on mathematical theory--bipartite graph theory, probabilistic theory and mathematical and statistical models--to determine the likelihood of a match. Advanced algorithms also include machine learning and neural networks, which use forms of artificial intelligence that simulate human problem solving. These systems “learn” as more data is processed and automatically redefine field weights based upon that learning.
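
To make the difference between the basic and intermediate categories concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the field names, the tiny nickname table, the weights, the threshold of 80, the 0.8 similarity cutoff and the "halve the weight" frequency rule are all hypothetical, and no particular vendor's algorithm works exactly this way.

```python
import difflib

# Hypothetical nickname table; a production table would be far larger.
NICKNAMES = {"bill": "william", "will": "william", "bob": "robert"}

# Arbitrary field weights and threshold, chosen only for illustration.
WEIGHTS = {"last": 30, "first": 20, "dob": 30, "ssn": 40, "sex": 5}
THRESHOLD = 80


def normalize(value):
    return value.strip().lower()


def exact_match(a, b):
    """Basic tier: every selected field must agree exactly."""
    return all(normalize(a[f]) == normalize(b[f]) for f in WEIGHTS)


def field_agrees(field, a, b):
    """Intermediate tier: tolerate nicknames and small typos in names."""
    x, y = normalize(a[field]), normalize(b[field])
    if field == "first":
        x, y = NICKNAMES.get(x, x), NICKNAMES.get(y, y)
    if x == y:
        return True
    if field in ("first", "last"):
        # A similarity ratio catches transposed or mistyped characters.
        return difflib.SequenceMatcher(None, x, y).ratio() >= 0.8
    return False


def frequency_factor(field, value, value_counts, total_records, cutoff=0.01):
    """Halve a field's weight when its value is shared by more than
    `cutoff` of all records (an assumed rule, not a standard one)."""
    share = value_counts.get((field, normalize(value)), 0) / total_records
    return 0.5 if share > cutoff else 1.0


def score(a, b, value_counts=None, total_records=1):
    """Sum the weights of agreeing fields, optionally frequency-adjusted."""
    total = 0.0
    for field, weight in WEIGHTS.items():
        if field_agrees(field, a, b):
            factor = 1.0
            if value_counts:
                factor = frequency_factor(field, a[field], value_counts, total_records)
            total += weight * factor
    return total


rec_a = {"last": "Johnson", "first": "William", "dob": "1970-01-02",
         "ssn": "123-45-6789", "sex": "M"}
rec_b = {"last": "Jonhson", "first": "Bill", "dob": "1970-01-02",
         "ssn": "123-45-6789", "sex": "M"}

# Made-up frequency data: 50 of 1,000 records carry the last name "johnson".
common = {("last", "johnson"): 50}

print(exact_match(rec_a, rec_b))
print(score(rec_a, rec_b) >= THRESHOLD)
print(score(rec_a, rec_b, common, 1000))
```

In this sketch the exact comparison rejects the pair because of the nickname and the transposed letters in the last name, while the weighted score still clears the threshold. With the frequency data supplied, the common last name contributes less to the total.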

Regardless of the strength of the algorithm used, false positives and false negatives will always occur. Even the most sophisticated algorithms cannot be relied upon alone to make record-matching decisions, as “auto-linking” routines can create errors. Among the most common errors are linking two closely related people with similar names and birth dates who live near each other, or linking two individuals with the same name and birth date who share an address, as can happen in large apartment complexes or other multi-family residential buildings.

This is why results must always be verified using well-established record-matching validity procedures. Skipping this critical step could result in overlaid records, potentially violating privacy laws and, more significantly, impacting care coordination, quality and safety.
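
The two error cases described above can be illustrated with a small probabilistic scorer. The sketch below uses the Fellegi-Sunter style of agreement weights, which is one well-known probabilistic formulation and not necessarily what any given product implements. The m and u probabilities, the two thresholds, the field names and the sample records are all invented for illustration.

```python
import math

# Hypothetical probabilities per field:
#   m = P(field agrees | records belong to the same person)
#   u = P(field agrees | records belong to different people)
MU = {
    "last":    (0.95, 0.01),
    "first":   (0.90, 0.02),
    "dob":     (0.97, 0.003),
    "address": (0.85, 0.005),
    "sex":     (0.99, 0.5),
}

# Assumed cutoffs: at or above AUTO_LINK the pair is linked automatically;
# between REVIEW and AUTO_LINK it is queued for a person to check.
AUTO_LINK = 25.0
REVIEW = 10.0


def match_weight(a, b):
    """Sum log2 likelihood ratios over all fields."""
    total = 0.0
    for field, (m, u) in MU.items():
        if a[field].strip().lower() == b[field].strip().lower():
            total += math.log2(m / u)              # agreement weight
        else:
            total += math.log2((1 - m) / (1 - u))  # disagreement weight
    return total


def decide(a, b):
    weight = match_weight(a, b)
    if weight >= AUTO_LINK:
        return "auto-link"
    if weight >= REVIEW:
        return "manual review"
    return "no link"


# Twins: same last name, birth date, address and sex; first names differ.
twin_a = {"last": "Lopez", "first": "Maria", "dob": "1990-04-12",
          "address": "100 Main St", "sex": "F"}
twin_b = dict(twin_a, first="Marta")

# Two different people in one apartment building with the same name and
# birth date, where the unit number was never captured.
neighbor_a = {"last": "Smith", "first": "John", "dob": "1985-07-30",
              "address": "100 Main St", "sex": "M"}
neighbor_b = dict(neighbor_a)

print(decide(twin_a, twin_b))
print(decide(neighbor_a, neighbor_b))
```

Under these assumed numbers the twins land in the review band, where a person can catch the difference. The two same-named neighbors agree on every field, so the scorer links them automatically even though they are different people, which is the kind of overlay that only a separate validity check would catch.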

Beth Haenke Just (bjust@justassociates.com) is CEO and president of Just Associates, a data integrity consulting firm.



