How Do I Standardize Addresses?
Differing address formats around the world create several problems for Customer 360 and Single Customer View implementations.
Your customers will want to enter their addresses in the national format that is most comfortable for them. Since most forms and databases are created for specific markets, each market will tend to have a different format than the others.
Global luxury brands that want to create a personal experience in every market need to address these problems.
If your forms do not match what customers expect, customers may put data in the wrong place. For example, in English-speaking countries, the postal or ZIP code usually comes last, sometimes on a separate line. In much of Europe, however, the code precedes the town name.
In many Western countries, addresses are also written from the smallest unit, such as the street name, to the largest. In Hungary, however, and when Chinese, Japanese, and Korean customers write their addresses in their own character set, the opposite order is used. More detailed examples may be found in the Microsoft list of international address formats.
For global luxury brands, this creates two problems:
- Forms should reflect how customers want to enter their address. Therefore, e-commerce systems need to be adaptable to each market.
- Since most country-specific databases sit in data silos, data that may be stored in different formats needs to be brought together.
For online forms, a common technique is to ask for the country first. Once the country is selected, a country-specific form can be displayed. In some countries, multiple languages may need to be supported.
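The country-first technique can be sketched as a lookup from the selected country to an ordered list of form fields. The field names, the three sample countries, and the fallback layout are illustrative assumptions; a production system would load such layouts from maintained reference data.

```python
from typing import Dict, List

# Hypothetical field layouts per ISO country code (illustrative only).
FIELD_ORDER: Dict[str, List[str]] = {
    "US": ["number", "street", "city", "state", "postal_code"],
    "DE": ["street", "number", "postal_code", "city"],
    "JP": ["postal_code", "state", "city", "district", "area_name", "block_sequence"],
}

# Generic fallback for countries without a dedicated layout.
DEFAULT_ORDER: List[str] = ["address_line_1", "address_line_2", "city", "postal_code"]


def form_fields(country_code: str) -> List[str]:
    """Return the ordered address fields to display once a country is chosen."""
    return FIELD_ORDER.get(country_code.upper(), DEFAULT_ORDER)
```

Calling `form_fields("de")` returns the German layout, with the postal code ahead of the city, while an unlisted country falls back to the generic layout.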
To bring together data silos, the best practice is to parse information into its component attributes. Examples of attributes used for addresses are as follows: Unit/Apartment/Flat, Premises, Number, Street, City, District, Town, Postal Code, and Country.
The selection of attributes ought to be defined with international markets in mind. Conventions for addresses and phone numbers vary considerably by market. Consistent attributes make it possible to compare records and to display them properly.
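One way to hold these attributes consistently across markets is a single record type in which every component is optional. This is a minimal sketch; the class name `Address` and the field names are assumptions derived from the attribute list above, not a prescribed schema.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Optional


@dataclass
class Address:
    """Market-neutral address record; every component is optional."""
    unit: Optional[str] = None        # Unit / Apartment / Flat
    premises: Optional[str] = None    # Building or premises name
    number: Optional[str] = None      # Kept as text: "23", "23A", "3-3-1"
    street: Optional[str] = None
    district: Optional[str] = None
    town: Optional[str] = None
    city: Optional[str] = None
    postal_code: Optional[str] = None
    country: Optional[str] = None

    def populated(self) -> Dict[str, str]:
        """Return only the components that are present."""
        return {k: v for k, v in asdict(self).items() if v is not None}
```

Keeping the number as text rather than an integer leaves room for values such as `23A` or a Japanese block sequence.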
Data Parsing Example
Original:
23 DAVID PLACE
ST HELIER JE2 4TE
Parsed:
Number: 23
Street: David Place
City: St Helier
Postal Code: JE2 4TE
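The parse above could be approximated with two regular expressions. This sketch only handles the two-line shape shown here, the postcode pattern is a simplified illustration rather than a full validator, and the function name is hypothetical. Title-casing is also naive and would mishandle names such as "McDonald".

```python
import re
from typing import Dict

# Simplified UK-style postcode pattern (outward code, then inward code).
POSTCODE_RE = re.compile(r"\b([A-Z]{1,2}\d[A-Z\d]?)\s*(\d[A-Z]{2})$")
# House number (optionally with one letter suffix), then the street name.
STREET_RE = re.compile(r"(\d+\w?)\s+(.+)")


def parse_two_line_address(raw: str) -> Dict[str, str]:
    """Parse '<number> <street>' / '<city> <postcode>' into components."""
    lines = [ln.strip() for ln in raw.strip().splitlines() if ln.strip()]
    if len(lines) != 2:
        raise ValueError("expected exactly two non-empty lines")
    street_line, city_line = lines[0], lines[1].upper()

    street_match = STREET_RE.match(street_line)
    postcode_match = POSTCODE_RE.search(city_line)
    if not street_match or not postcode_match:
        raise ValueError("address does not fit the expected pattern")

    number, street = street_match.groups()
    # Everything before the postcode on the second line is the city.
    city = city_line[: postcode_match.start()].strip()
    return {
        "Number": number,
        "Street": street.title(),
        "City": city.title(),
        "Postal Code": f"{postcode_match.group(1)} {postcode_match.group(2)}",
    }
```

Real parsers usually rely on country-specific rules and reference data rather than patterns alone, since free-text addresses rarely follow one shape.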
Data Parsing Example with Non-Roman Character Set
An individual may write an address for billing in Japan as follows:
北海道札幌市東区北二十四条東3-3-1
The source information in the Japanese system would be parsed as follows:
Block Sequence: 3-3-1
Area Name: 北二十四条東
District: 東区
City: 札幌市
State: 北海道
Postal code: 065-0024
The same address would be Romanized for a European copy of the database:
Block Sequence: 3-3-1
Area Name: Kita-24 Johigashi
District: Higashi-Ku
City: Sapporo-Shi
State: Hokkaido
Postal Code: 065-0024
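Once an address is stored as components, the display order becomes a rendering decision rather than a property of the data. The sketch below rebuilds the native form from the largest unit down and the Romanized form from the smallest unit up. The key names, the `render` function, and the choice of separators are assumptions for illustration.

```python
from typing import Dict

# Components of the sample address, keyed by hypothetical attribute names.
NATIVE: Dict[str, str] = {
    "state": "北海道",
    "city": "札幌市",
    "district": "東区",
    "area_name": "北二十四条東",
    "block_sequence": "3-3-1",
}
ROMANIZED: Dict[str, str] = {
    "state": "Hokkaido",
    "city": "Sapporo-Shi",
    "district": "Higashi-Ku",
    "area_name": "Kita-24 Johigashi",
    "block_sequence": "3-3-1",
}

# Largest unit first, as in the native Japanese form.
LARGEST_FIRST = ["state", "city", "district", "area_name", "block_sequence"]


def render(components: Dict[str, str], largest_first: bool, separator: str) -> str:
    """Join the available components in the requested order."""
    order = LARGEST_FIRST if largest_first else LARGEST_FIRST[::-1]
    return separator.join(components[k] for k in order if components.get(k))
```

Here `render(NATIVE, True, "")` reproduces the original single-line Japanese address, and `render(ROMANIZED, False, ", ")` lists the same components from block sequence to state.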
The same address may be entered into a database in a variety of ways, as shown in the example below. Therefore, the next step is to validate and standardize addresses. Validation involves comparing a parsed address with official records and making corrections. In some countries, a location may have multiple designations or even names. Best practice would be to perform the following operations:
- Determine or validate the country
- Match to Postcode Address File (PAF) reference data for the specific country, which includes officially licensed sources such as the country’s postal authority, other government agencies, and third parties.
- Correct and standardize identified components
- Append/insert missing components
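The four operations above can be sketched as one function. The `REFERENCE` dictionary is a hypothetical, one-entry stand-in for licensed reference data, and keying it by country, postcode, and number is an assumption made to keep the example small. Real reference files are far larger and are structured differently per country.

```python
from typing import Dict, Optional, Tuple

# Hypothetical stand-in for licensed reference data.
REFERENCE: Dict[Tuple[str, str, str], Dict[str, str]] = {
    ("JE", "JE2 4TE", "23"): {
        "building": "Avon House",
        "street": "David Place",
        "district": "St. Helier",
        "city": "Jersey",
    },
}


def standardize(parsed: Dict[str, str], country: Optional[str] = None) -> Dict[str, object]:
    """Validate a parsed address against reference data and fill gaps."""
    # 1. Determine or validate the country.
    country = (country or parsed.get("country") or "").upper()
    if not country:
        raise ValueError("country is required before matching")

    # 2. Match to reference data for that country.
    postcode = " ".join(parsed.get("postcode", "").upper().split())
    reference = REFERENCE.get((country, postcode, parsed.get("number", "")))
    if reference is None:
        # No match: return the input unchanged, flagged for review.
        return {**parsed, "country": country, "validated": False}

    # 3. Correct identified components and 4. append missing ones:
    #    reference values override and extend what was parsed.
    return {**parsed, **reference, "country": country,
            "postcode": postcode, "validated": True}
```

An unmatched address is returned flagged rather than altered, so that uncertain records can be reviewed instead of being silently "corrected".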
While licensed and up-to-date PAF data may cost more than other sources, they can vastly improve accuracy and quality. It is a good idea to ask vendors whether they use officially licensed sources.
In dense urban areas with multistory buildings, such as Hong Kong, many people may have the same name within the same premises. Sub-premises detail, such as floor numbers or block information, adds to the confidence in matching. Other less dense locations may use building names or route numbers without more specific address components.
Attribute | Record 1 | Record 2 | Record 3 | Record 4 |
---|---|---|---|---|
First Name | Christiane | Christiane | Chris | Christiane |
Last Name | Gellesch | Gellesch | Gellesch | Gellesch |
Address Line 1 | Avon House | Avon House | Avon House | 43 David Place |
Address Line 2 | 23 David Place. St. Helier | 23 David Place | 23 David Place, | Avon House |
Address Line 3 | St Helier Jersey | St. Helier | Jersey | |
City | Jersey | Channel Islands | St Helier | |
Postcode | JE2 4TE | JE2 2TE | JE2 4TE | JE2 4TE |
The parsed address is validated and standardized as follows:
Building: Avon House
Number: 23
Street: David Place
District: St. Helier
City: Jersey
Postcode: JE2 4TE
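The four records in the table describe the same premises, yet none of them agree field by field. As a rough illustration of that, the sketch below ignores field boundaries and compares the sets of words in each record using Jaccard similarity. This measure is chosen here for simplicity and is not a method prescribed by this article; the record values are copied from the table.

```python
import re
from typing import List, Set

# Address lines 1-3, city, and postcode for each record in the table.
RECORDS = {
    1: ["Avon House", "23 David Place. St. Helier", "St Helier Jersey", "Jersey", "JE2 4TE"],
    2: ["Avon House", "23 David Place", "St. Helier", "Channel Islands", "JE2 2TE"],
    3: ["Avon House", "23 David Place,", "Jersey", "St Helier", "JE2 4TE"],
    4: ["43 David Place", "Avon House", "", "", "JE2 4TE"],
}


def tokens(fields: List[str]) -> Set[str]:
    """Lower-case the fields and split them into alphanumeric tokens."""
    return set(re.findall(r"[a-z0-9]+", " ".join(fields).lower()))


def jaccard(a: List[str], b: List[str]) -> float:
    """Shared tokens divided by all tokens; 1.0 means identical token sets."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0
```

Records 1 and 3 score 1.0 because they contain the same words in different fields. Records 2 and 4 score lower (about 0.62 and 0.55 against record 1) because of the mistyped postcode and house number, which is the kind of discrepancy that validation against reference data is meant to resolve.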