sql - Querying database to find potential duplicates based on multiple columns -


i have table in database have 100k clients , erroneous system in place quite time coming find out might potentially have thousands of duplicate records, should not have been duplicates.

this basic structure of table...

clientid         fname           lname           dob          sysemid ---------------------------------------------------------------------- 123              janette         marukan         1/2/1990     111111s 478              jannete         maruckan        1/2/1990     111111s 

these not fields in system goal able use columns above in search query. i'm trying take first 2 or 3 letters of first , last names, dob sysemid , find records potential duplicates.

as can seen above in table have 2 clients, names different - in fact 2 clients both same, saved in system two. there way me somehow query whole tables , find that?

i don't have 1 specific client in place - it's more of general kind of query show me potential duplicates of clients same 2 or 3 letters of 1st name, last name , dob starters?

you use window functions:

select t (select t.*,              count(*) on (partition left(fname, 2), left(lname, 2), dob, sysemid) cnt       t      ) t cnt > 1 order sysemid, dob, fname, lname; 

this may 1 of rare situations soundex() useful. after all, designed strings in mind. so:

select t (select t.*,              count(*) on (partition soundex(fname), soundex(lname), dob, sysemid) cnt       t      ) t cnt > 1 order sysemid, dob, fname, lname; 

Comments

Popular posts from this blog

node.js - Node js - Trying to send POST request, but it is not loading javascript content -

javascript - Replicate keyboard event with html button -

javascript - Web audio api 5.1 surround example not working in firefox -