
Cleaning your data

Cleaning data is often overlooked but is a very significant phase in the data mining process. Find out its importance in analytics.
By International Institute of Digital Marketing

Before you dive deeply into your data and mine it for insight and intelligence for all it’s worth, there’s one more step that’s equally important but often overlooked: cleaning your data. Once you have a clean data set, the analysis itself becomes quite straightforward.

Data cleaning refers to the process of removing invalid data points from a dataset. These include duplicated entries, extra characters, and surplus or deficient data.

“The goal of the data cleaning process is to preserve meaningful data while removing elements that may impede our ability to run the analyses effectively or otherwise affect the quality of the resulting statistics,” explained an expert in the Certified Digital Marketer (CDM) Program.

The CDM Program outlines three basic steps: the screening phase, which systematically looks for errors within the data; the diagnostic phase, which identifies the condition of the data; and the treatment phase, which entails deleting, editing, or retaining data.

In the screening phase, you must be able to answer the following questions: Are there blank items in your file (lack of data)? Are there duplicated responses (excess data)? Are there values so far beyond the typical that they seem potentially erroneous (inconsistent data)? Do some data seem counterintuitive or extremely unlikely (suspect analysis results)?
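The first three screening questions can be sketched in code. The example below is a minimal illustration, not the CDM Program's own method: the function name screen, the sample survey rows, and the outlier rule (a modified z-score based on the median, with a cutoff of 3.5) are all assumptions chosen for the sketch. The fourth question, whether results look counterintuitive, still needs human judgment.

```python
from statistics import median

def screen(rows, numeric_field, threshold=3.5):
    """Flag row positions that are blank, duplicated, or unusually extreme.

    rows is a list of dicts with string keys and hashable values.
    The outlier rule and its threshold are assumptions for this sketch.
    """
    # Lack of data: any field that is None or an empty/whitespace string.
    blanks = [
        i for i, row in enumerate(rows)
        if any(v is None or str(v).strip() == "" for v in row.values())
    ]

    # Excess data: rows identical to one already seen (first copy is kept).
    seen = set()
    duplicates = []
    for i, row in enumerate(rows):
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates.append(i)
        else:
            seen.add(key)

    # Inconsistent data: values far from the median, measured with the
    # median absolute deviation (MAD) so one extreme value cannot hide itself.
    present = [
        (i, row[numeric_field]) for i, row in enumerate(rows)
        if isinstance(row.get(numeric_field), (int, float))
    ]
    extremes = []
    if present:
        med = median(v for _, v in present)
        mad = median(abs(v - med) for _, v in present)
        if mad > 0:
            extremes = [
                i for i, v in present
                if 0.6745 * abs(v - med) / mad > threshold
            ]

    return {"blank": blanks, "duplicate": duplicates, "extreme": extremes}

# Hypothetical survey responses, invented for illustration only.
responses = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 29},
    {"id": 2, "age": 29},    # repeated response
    {"id": 3, "age": None},  # skipped answer
    {"id": 4, "age": 31},
    {"id": 5, "age": 340},   # possibly a typo
]
flags = screen(responses, "age")
print(flags)
```

Screening only flags rows; it does not decide what is wrong with them. That decision belongs to the diagnostic phase.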

Once you have identified the discrepancies, you proceed with the diagnostic phase, investigating them further and placing each in its proper context. That context can be any of the following: missing data, meaning answers that respondents omitted or skipped over; errors, meaning typos or answers that indicate the question was misunderstood; and true extremes, meaning items that seem high but may be qualified by two answers.

After these phases, you’ll be able to arrive at a verdict: whether the data is factual or erroneous.
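Once a verdict is reached, the treatment phase applies one of the three actions named earlier: deleting, editing, or retaining. The sketch below shows one way to record and apply those decisions. The function name treat, the shape of the decisions mapping, and the sample rows are hypothetical, and the corrected value in the example is invented purely for illustration.

```python
def treat(rows, decisions):
    """Apply delete / edit / retain decisions, keyed by row position.

    decisions maps a row index to (action, changes). Rows without a
    decision are retained. The input list is left unmodified.
    """
    cleaned = []
    for i, row in enumerate(rows):
        action, changes = decisions.get(i, ("retain", None))
        if action == "delete":
            continue  # drop the row entirely
        if action == "edit":
            row = {**row, **changes}  # overwrite only the corrected fields
        elif action != "retain":
            raise ValueError(f"unknown action for row {i}: {action!r}")
        cleaned.append(dict(row))
    return cleaned

# Hypothetical rows and verdicts, invented for illustration only.
survey_rows = [
    {"id": 1, "age": 34},
    {"id": 2, "age": 29},
    {"id": 2, "age": 29},    # diagnosed as a duplicated response
    {"id": 3, "age": None},  # diagnosed as missing data, kept as is
    {"id": 5, "age": 340},   # diagnosed as a typo
]
verdicts = {
    2: ("delete", None),
    3: ("retain", None),
    4: ("edit", {"age": 34}),
}
cleaned_rows = treat(survey_rows, verdicts)
print(cleaned_rows)
```

Keeping the decisions in a separate mapping, rather than editing the data directly, leaves a record of what was changed and why.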

Gain a clearer understanding of how to mine valuable data assets to benefit your marketing campaigns with the Digital Marketing Analytics Specialist Track of the CDM Program. Request the course syllabus by calling 426-6001 (loc 5679) or 0928-506-5382.
