Open In App

How to Find Duplicate Records that Meet Certain Conditions in SQL?

Last Updated : 06 May, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we will understand how to find Duplicate Records that meet certain conditions in SQL. Using the GROUP BY and HAVING clauses we can show the duplicates in table data. The GROUP BY statement in SQL is used to arrange identical data into groups with the help of some functions. i.e.if a particular column has the same values in different rows then it will arrange these rows in a group.

For the purpose of demonstration, we will be creating a Participant table in a database called “GeeksForGeeksDatabase“.

Step 1: Creating the Database

Use the below SQL statement to create a database called GeeksForGeeks Database.

Query:

CREATE DATABASE GeeksForGeeksDatabase;

Output:

 

Step 2: Using the Database.

Use the below SQL statement to switch the database context to GeeksForGeeksDatabase.

Query:

USE GeeksForGeeksDatabase;

 

Step 3: Table Definition

Query:

CREATE TABLE Geeks(
GeekID INTEGER PRIMARY KEY,
GeekName VARCHAR(255) NOT NULL,
GeekRank INTEGER NOT NULL,
GeekSchool VARCHAR(255) NOT NULL
);

Output:

 

Query:

INSERT INTO Geeks VALUES (101, 'Nix',2 ,'Code Valley School');
INSERT INTO Geeks VALUES (102, 'Rutz',4 ,'Blue Chip School');
INSERT INTO Geeks VALUES (103, 'Shrey',1 ,'GCOEA School');
INSERT INTO Geeks VALUES (104, 'Ankx',3 ,'Round Robin Play School');
INSERT INTO Geeks VALUES (105, 'Ridz',7 ,'Dream School');
INSERT INTO Geeks VALUES (106, 'Mayo',6 ,'Silver Shining School');
INSERT INTO Geeks VALUES (107, 'Bugs',5 ,'Twinkle Star Convent');

Step 4: To see the  contents of the created table we use the given below query:

Query:

SELECT * FROM Geeks;

Output:

 

The above geek table has normal records with no duplicate values. 

Step 5: Let’s imagine a situation where a bug occurred in the geek’s table due to some reason a few duplicates are inserted. Ideally, each row should have a unique value for GeekRank but now in our table, duplicate geeks with duplicate ranks are inserted.

Query:

INSERT INTO Geeks VALUES (108, 'Maria', 5 ,'Code Valley School');

Output:

 

In the above newly updated table, we could see there are 2 records with the same geek rank of 5. GeekID 107 and GeekID 108 are having the same rank of 5. Now we need to find this duplication using SQL Query.

Using GROUP BY and HAVING Clause

One way to find duplicate records from the table is the GROUP BY statement. The GROUP BY statement in SQL is used to arrange identical data into groups with the help of some functions. i.e. if a particular column has the same values in different rows then it will arrange these rows in a group.

Query to find the duplicates

Query to find the duplicate records having the same GeekRank. 

Query:

SELECT GeekRank, COUNT(GeekID) AS DuplicateRanks
FROM Geeks
GROUP BY GeekRank
HAVING COUNT(GeekRank)>1;

Using the GROUP BY and HAVING clauses we can show the duplicates in table data. By ‘GROUP BY GeekRank‘ means, to place all the rows with the same value of only that particular column i.e. GeekRank in one group. We use ‘HAVING COUNT(GeekRank)>1′ to select a column having more than 1 rank of geeks in output. The COUNT() function of SQL is used here to count the duplicate rows. Here we are naming our new column as ‘DuplicateRanks‘ which counts duplicate ranks.

Output:

IMG4

 

Explanation:

As we can see, the rows with duplicate GeekRank are grouped under the same GeekRank and their corresponding COUNT is the count of the GeekRank of duplicate rows. GeekID 107 and GeekID 108 are having the same rank 5. Thus in the above output, we could see GeekRank as 5 (because this 5th rank is found duplicated ), and since two GeekIDs were having the same GeekRank 5 so DuplicateRank i.e. count of the duplicate records is 2. Once you found the duplicate rows, you may choose to remove those duplicate rows using the DELETE statement.


Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads