top of page

How to Find and Remove Duplicates in Power BI

person typing on a laptop

Overview

Duplicates in a dataset refer to multiple records that are identical or nearly identical. These can occur due to errors in data entry, merging multiple datasets, or redundant data collection. Duplicate records can lead to misleading analysis, incorrect conclusions, and inefficiencies in processing.

Example of Duplicates

screenshot of sample dataset

(Looking closely at the above dataset, you can see that the highlighted rows have been duplicated – a mistake that could lead to errors in analysis should this not be corrected by removing the duplicate values).

This article explores the different ways in which you can deal with duplicate values.

1. Removing Duplicate Values

To remove duplicate values from your dataset, open the Power Query.

  • Select the columns that contain duplicate values.

  • In the Home tab select Remove Duplicates under Remove Rows option.

options for removing duplicates in power query

(i)                  Removing duplicates from multiple columns

For example, using the above sample dataset, I can remove all duplicate values by highlighting all the columns in the data, then select Remove Duplicates.

options for removing duplicates in power query

Notice, this action will remove all the duplicated values resulting to the sample data below.

sample dataset without  duplicate values

(ii)                Removing duplicates from a single column

Similarly, the same applies when you need to remove duplicate values from a single column. For example, to remove duplicate Names – all I need is to highlight the column “Name” then select Remove Duplicates as shown below.

options for removing duplicates in power query

Notice, this action will remove all duplicated names resulting to the sample data below.

sample dataset without duplicate values

2. Keeping Duplicate Values

On the other hand, you can choose to keep the duplicate values should you need to review them in detail before removing them.

To do so.

  • Select the columns that contain duplicate values.

  • In the Home tab, select Keep Duplicates under Keep Rows option.

options for keeping duplicates in power query

(i)                  Keeping duplicates from multiple columns

For example, to keep duplicate values in this dataset, all I need is to highlight all columns containing duplicate values. Then select Keep Rows as shown below.

options for keeping duplicates in power query

Notice, executing this will retain all the duplicated values which can be reviewed in detail before being removed. See the resulting table of duplicate values.

sample data of duplicate values

(ii)                Keeping duplicates from a single column

Note, a similar approach applies when you need to keep duplicate values from a single column. For example, to see all duplicate values from the column “Name” all I need is to highlight the column “Name” then select Keep Rows as shown below.

options for keeping duplicates in power query

Notice this will keep the duplicated values in the column, resulting in the table shown below.

sample data containing duplicated values in a specific column

Conclusion

Removing duplicate values in Power Query is a crucial step in data cleaning and transformation. It helps ensure accuracy, maintain consistency, and prevent redundant records from affecting analysis. Power Query provides a straightforward method to eliminate duplicates using the Remove Duplicates feature, which identifies and removes identical rows based on selected columns.

By applying this tool effectively:

  • Users can refine their datasets for better decision-making.

  • Improve performance and efficiency by reducing unnecessary records.

  • Preserve data integrity, preventing errors in reports and calculations.

If you like the work we do and would like to work with us, drop us an email on our contacts page and we’ll reach out!

Thank you for reading!

Original.png

We Support You Deliver Business-Focused Solutions That Enable Data-Driven Decision Making.

  • Tableau profile
  • YouTube
  • White LinkedIn Icon
  • Facebook
  • X

QUICK LINKS

CONTACT US

WhatsApp: +254 738 307 495

East Gate Mall, Donholm

3rd Floor Suite No. 3i

Nairobi, Kenya

Join our mailing list

bottom of page