Accelerate Your Identity

3... 2... 1...

Using sed to clean up an LDIF file for import #Oracle #Identity #UNIX

I needed to import a group of users, into Oracle Internet Directory (OID) with attributes in a variety of backend data stores. I used Oracle Virtual Directory to virtualize the data stores into a single ldap view. I used the OVD adapter configuration to specify which attributes I wanted returned. I then exported using the export control from Apache Directory Studio. This resulted in an ldif file containing all of the records I needed with attributes. There were a few additional attributes as a result of using OVD that I now had to deal with.

I ended up with an ldif file that contained a lot of records like this:

dn: cn=Babs Jensen@ACME.GOV,ou=temp_user_load
objectclass: inetOrgPerson
objectclass: organizationalPerson
objectclass: person
objectclass: top
cn: 1234556677@ACME.GOV
cn: Babs.Jensen@ACME.gov
cn: Jensen, Babs
sn: Jensen
givenName: Babs
vdejoindn: ou=acmeinfo_temp:cn=JENSEN,BABS,ou=acmeinfo_temp
vdejoindn: AD_temp:CN=babs.jensen@ACME.GOV,OU=locations,OU=park,ou=ad_t
fascnDecoded: 1234567890987654321
guid: ABcdedghi1234567890
ssn: 12345678

Note: With the SED command you can make changes directly to the source file but I am creating a new target file with each change I can make so that I can always revert back if the command doesn’t work exactly the way I want it to.

I wanted to get rid of lines that don’t start with an attribute name (In my case I am free to get rid of lines that carry over into the second line … YMMV)

I also wanted to specifically wanted to get rid of all lines that start with “vdejoindn:” and there are also some vdejoindn lines that overrun onto a second line that won’t beremoved if I use sed to remove lines with the pattern matching vdejoindn:.

So, first I want to remove all lines that don’t contain a colon. This removes the overrun lines but also all blank lines.

$ sed ‘/:/!d’ input.ldif > tmp.ldif

this keeps the lines with a colon.

But now we don’t have breaks between the records

$ sed ‘s/^dn:/n&/g’ tmp.ldif > tmp2.ldif

Ok, now I want to get rid of the lines that have “vdejoindn:”.

$ sed ‘/vdejoindn:/d’ tmp2.ldif > tmp3.ldif

Now at some point I ended up with “^M” at the end of each file … I don’t know if this is because I opened with VIM in Windows before moving to Linux … I am going to assume so but either way in this instance I want to remove these characters.

$ dos2unix tmp3.ldif > tmp4.ldif

Alright, Now, for me to import this into Oracle Internet Directory (OID) I’ll need to add the “changetype” directive. I am going to add the string “changetype: add” on a new line after each line with “ou=temp_user_load:” which is the temporary suffix I used in this export.

$ sed ‘/ou=temp_user_load/ achangetype: add’ tmp4.ldif > tmp5.ldif

Now, should be the last step, prior to importing, is to correct the entries “DN” attribute. Essentially, we need to replace “ou=temp_user_load” with the correct suffix for where these users will be created.

$ sed ‘s/ou=temp_user_load/cn=Users,o=icam,dc=acme,dc=local/g’ tmp5.ldif > tmp6.ldif

At this point my ldif file (“tmp6.ldif”) is ready to import into my directory. You can use the ldapmodify command or since I am using OID you can use bulkload (which is recommended for large record sets).