Database Migrations

As part of my freelance work for WMC IT Solutions AG, I developed a tool that executes almost all the steps of a data migration from one storage system to another.

The source data was coming from Excel or different SQL Servers. Each legacy system had its own structure and way of naming columns and tables, so extensive data analysis, mapping and restructuring was necessary before the migration could be started.
The primary keys too were different in each system, and data was often linked in bizarre inconsistent ways. It often felt like solving a puzzle.

Attachments like PDFs, images, and other files were usually stored in a file system or in one case in binary form in a MSSQL db and also had to be processed by the tool.

There were often more than 50 tables to map and hundreds of gigabytes of data to transfer. This meant that efficiency was crucial as well as the ability to pause and resume the migration after an interruption. In one instance, to minimize downtime for the client, the migration was done in two steps where the bulk of the documents were migrated a few days ahead and then the new data synchronized during the final migration.

Migration control panel

During the development, each section had to be tested and migrated separately. This was controlled by a web interface that allowed me to start the migration of specific modules. During the migration, the progress was displayed with a progress bar, and the number of rows migrated as well as a time estimate.

All around tool

Instead of me adapting the migration program to every new customer's old data storage system, I offered to create a migration "framework" with documentation, so new migrations could quickly be set up with existing logic covering most cases.

The mapping is the only thing that would have to be adjusted in the code with the documentation guiding through the process.

Merging databases

Another required variation of the tool was the synchronization of two databases with the same structure.

The main challenge was data that belonged together, in the different databases while having different identifiers, and both versions having critical info that must be preserved.

Another challenge was the number of tables linked to each-other, and the primary key that had to be changed to avoid duplication. It meant that every row dependency had to be inserted in the right order and re-linked (or linked to an existing entry from the other database).