You are here

EIM Data Cleansing : Oracle function – Regular Expression

When loading data into Siebel through EIM, we do perform Data cleansing/massaging as per our 
 
business requirement. The cleansed data from Source is then loaded into the target EIM 
 
Interface tables
  Scenario:
Convert multiple spaces in address related columns to a single space for all records 
 
dynamically. Please refer the example cited below,

SOURCE

TARGET

82  DEVONSHIRE ST

82 DEVONSHIRE ST

2245       OBSERVATORY    PL

2245 OBSERVATORY PL

2245    OBSERVATORY  PL  NW

2245 OBSERVATORY PL NW

 
Oracle built in functions like replace , decode and translate was not yielding the desired 
 
result due to search complexity. Writing a custom procedure / function to do this address 
 
cleansing will be tedious and complex process, as it has to cleanse all records dynamically.
 
Implementation :  
 
We have an OOB solution for this by using the regular expression (REGEXP_REPLACE) function 
 
provided by Oracle from 10g version which improves the ability to search and manipulate 
 
character data.
 
Syntax :  
 
SELECT ADDR AS SOURCE,REGEXP_REPLACE(ADDR,'( ){2,}', ' ') AS TARGET
FROM SIEBEL.S_ADDR_PER WHERE PER_ID IN ('1-23O0-1625','1-G8FU-133')