Data Cleansing -- What Everyone Wants Done, But Nobody Wants To Do -- Until Now!

With Predictable Data’s PDQ service, common-sense and pragmatic data quality is fast, simple, and very affordable.

Are you scheduling a meeting, organizing an event, preparing a report, accessing a REST API or performing data analysis on time series data? If you don’t know how to format dates correctly, you could cause confusion and lose credibility. You may not be aware that dates are written differently in different parts of the world. This is especially a problem when a date is written with numbers only.

For example, take a date written as “01-03-12”. Americans and Filipinos read that as “January 3rd, 2012”. However, Mexicans, Brits, Germans, Indians, and a whole lot of others interpret it as “1st of March, 2012”. The Chinese, Koreans, and Japanese are left confused, because they put the year first and might read it as March 12th, 2001. How fun, right?!
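
To make the ambiguity concrete, here is a minimal sketch using only Python's standard library (it is not PDQ's own interface). The convention names are labels chosen for this illustration, and the two-digit year is expanded using Python's default rule for `%y`.

```python
from datetime import datetime

raw = "01-03-12"

# Three common numeric conventions applied to the same string.
conventions = {
    "month-day-year": "%m-%d-%y",
    "day-month-year": "%d-%m-%y",
    "year-month-day": "%y-%m-%d",
}

# Parse the same text once per convention.
readings = {
    name: datetime.strptime(raw, fmt).date()
    for name, fmt in conventions.items()
}

for name, value in readings.items():
    print(f"{name}: {value.isoformat()}")
```

The same six digits produce three different calendar dates, which is why a numbers-only date needs its convention stated alongside it.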

Applications like Excel, Tableau, and HubSpot, and programming languages like Java, Python, and R, can all format dates and times, but it’s not always easy, and sometimes the data has already left those pipelines.

The great news is that with Predictable Data Quality (PDQ) you can format dates to use words or abbreviations for months or you can split a date into component parts, which also clears up ambiguity. You can also perform sanity checks by validating proper formats and date ranges.
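
For readers who want to see what these operations amount to, the sketch below does each of them with Python's standard library. It illustrates the ideas only and is not how PDQ is implemented; the function name `parse_checked` and the 1900 to 2100 range are assumptions made for the example.

```python
from datetime import date, datetime

def parse_checked(text, fmt="%Y-%m-%d",
                  earliest=date(1900, 1, 1), latest=date(2100, 12, 31)):
    """Return a date if text matches fmt and falls within range, else None."""
    try:
        parsed = datetime.strptime(text, fmt).date()
    except ValueError:
        # Wrong layout or an impossible date such as February 30th.
        return None
    return parsed if earliest <= parsed <= latest else None

d = parse_checked("2018-03-11")

# Split the date into component parts.
components = {"year": d.year, "month": d.month, "day": d.day}

# Spell out the month (%B gives the English name in the default C locale).
spelled = f"{d.strftime('%B')} {d.day}, {d.year}"

print(components)
print(spelled)
print(parse_checked("2018-02-30"))  # fails the format check
print(parse_checked("1776-07-04"))  # fails the range check
```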

Dates are frequently stored like this, but often don't have the format or precision needed by decision makers and data scientists.

The day-of-week form provides new information. Knowing that 3/11/2018 is a Sunday is useful not only for analytics, but also security and marketing reports.
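
As a rough illustration of deriving the day of the week, here is a short standard-library Python sketch. It assumes the stored value uses the month/day/year order; the variable names are made up for the example.

```python
from datetime import datetime

stored = "3/11/2018"  # assumed to be month/day/year

day = datetime.strptime(stored, "%m/%d/%Y").date()
day_name = day.strftime("%A")     # full weekday name in the default C locale
is_weekend = day.weekday() >= 5   # Monday is 0, so 5 and 6 are Saturday and Sunday

print(f"{day_name}, {stored}", "(weekend)" if is_weekend else "(weekday)")
```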

The month-spelled-out form is more readable and removes ambiguity. Some locales use the day/month/year format, so 3/11/2018 would be November 3rd to them.

The week-of-year form helps determine seasonality. Knowing the average number of weeks into a year until profitability or other milestones is important.
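
The sketch below shows one way to compute a week-of-year figure and average it across years, using Python's standard library. It assumes ISO 8601 week numbering (weeks start on Monday), which may differ from the numbering your reports use, and the milestone dates are invented purely for illustration.

```python
from datetime import date
from statistics import mean

# Hypothetical dates on which a milestone was reached in each year.
milestones = [date(2016, 5, 9), date(2017, 4, 17), date(2018, 3, 11)]

# isocalendar() returns (ISO year, ISO week number, ISO weekday).
weeks = [d.isocalendar()[1] for d in milestones]
average_week = mean(weeks)

print(weeks)
print(average_week)
```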

Date/time and time values can be reformatted or parsed into sub-components, adding precision to your data.

Do trends in your data occur in a particular hour of the day? Sometimes averaging data or other modelling unearths these trends.
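
One simple way to look for an hour-of-day trend is to group values by hour and average each group. The following is a minimal sketch in standard-library Python; the timestamps and values are made up, and a real analysis would likely use far more data.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Made-up readings: (timestamp, value).
readings = [
    ("2018-03-11 09:15:00", 12.0),
    ("2018-03-11 09:45:00", 18.0),
    ("2018-03-11 14:05:00", 40.0),
    ("2018-03-12 09:30:00", 15.0),
    ("2018-03-12 14:20:00", 50.0),
]

# Bucket each value under the hour component of its timestamp.
by_hour = defaultdict(list)
for stamp, value in readings:
    hour = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S").hour
    by_hour[hour].append(value)

hourly_average = {hour: mean(values) for hour, values in sorted(by_hour.items())}

for hour, avg in hourly_average.items():
    print(f"{hour:02d}:00  average {avg}")
```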

When are your cash registers or phones at their busiest? This level of detail helps schedule breaks, server maintenance, and more.

At this low level of detail you often find patterns in computer, network, and sensor behavior. We can also split down to the nanosecond level for even more precision.
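
To show what a split down to nanoseconds can look like, here is a sketch in standard-library Python. Python's `datetime` stops at microseconds, so the fractional part is handled as text. The function name `split_timestamp` and the `YYYY-MM-DD HH:MM:SS.fffffffff` layout are assumptions for this example, not a description of PDQ's internals.

```python
from datetime import datetime

def split_timestamp(text):
    """Split 'YYYY-MM-DD HH:MM:SS.fffffffff' into parts, keeping nanoseconds."""
    whole, _, fraction = text.partition(".")
    moment = datetime.strptime(whole, "%Y-%m-%d %H:%M:%S")
    # Right-pad to nine digits so ".5" means 500,000,000 ns.
    nanos = int(fraction.ljust(9, "0")[:9]) if fraction else 0
    return {
        "hour": moment.hour,
        "minute": moment.minute,
        "second": moment.second,
        "millisecond": nanos // 1_000_000,
        "microsecond": nanos // 1_000 % 1_000,
        "nanosecond": nanos % 1_000,
    }

parts = split_timestamp("2018-03-11 14:05:09.123456789")
print(parts)
```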
