30 Gigabyte Data Migration

Phill Luby
24th March 2020

Data migrations are complex tasks that require careful co-ordination, and large volumes of data make the process harder still.

Our client, Peritus Health Management Limited, had a large volume of clinical and non-clinical records stored on a private file-sharing platform that they were decommissioning. The non-clinical records would go to Dropbox, but the clinical records needed to be cleaned, codified and imported into the clinical records system that we develop and manage for them.

Migrations are especially difficult when the source data is semi-structured and too large to process quickly. The first part of the problem was to produce strict rules that codified the source data and reported exceptions. Over several iterations we analysed the reports, fixed the source data, changed the codification algorithm and re-ran the process. Our client was directly involved in analysing the reports and correcting data, because only their knowledge of their own information would allow correct codification. At the same time we were timing the import process to see whether it needed to be incremental or could be done in one pass in our data centre. After weeks of validation and modification we were satisfied that the data could be loaded into the live platform.
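The rules-and-exceptions loop described above might look something like the following minimal Python sketch. It is not the code used on the project: the file naming rule, the list of record types and the names `RULE`, `KNOWN_TYPES`, `Coded` and `codify` are all assumptions made for illustration. The idea is simply that anything the strict rule cannot codify is reported with a reason rather than guessed at, so a person can fix the source data or the rule and run it again.

```python
import re
from dataclasses import dataclass

# Hypothetical naming rule: "<patient id>_<record type>_<YYYY-MM-DD>.<ext>"
RULE = re.compile(
    r"^(?P<patient_id>\d+)_(?P<record_type>[a-z]+)_(?P<date>\d{4}-\d{2}-\d{2})\.\w+$"
)

# Hypothetical record types that the target system accepts
KNOWN_TYPES = {"audiometry", "spirometry", "referral"}


@dataclass
class Coded:
    source_name: str
    patient_id: str
    record_type: str
    date: str


def codify(names):
    """Split file names into coded records and (name, reason) exceptions."""
    coded, exceptions = [], []
    for name in names:
        match = RULE.match(name)
        if match is None:
            # Strict rule: never guess, report the file for human review
            exceptions.append((name, "name does not match rule"))
            continue
        if match["record_type"] not in KNOWN_TYPES:
            exceptions.append((name, "unknown record type: " + match["record_type"]))
            continue
        coded.append(
            Coded(name, match["patient_id"], match["record_type"], match["date"])
        )
    return coded, exceptions
```

Each run produces an exception list for review. In a loop like this, the process is repeated after every round of fixes until the list is empty or contains only items that have been consciously accepted.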

A common failing in this type of project is to assume the process is correct simply because the data cleansing and analysis reports have been checked. Human error is inevitable in a process of this scale and nature, so we expect some data to be incorrectly coded because unstructured features went unidentified during checking. To defend against this, every record imported into the system is clearly marked with its source, and the original source data is kept in storage so that it can be recovered if needed.
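As an illustration of marking records with their source, here is a small Python sketch. The field names, the batch label and the functions `tag_with_source` and `matches_source` are hypothetical and do not describe the real clinical records system. It assumes each imported record can carry the path of the file it came from, plus a SHA-256 checksum of that file, so a suspect record can later be traced back to the stored original.

```python
import hashlib


def tag_with_source(record, source_path, source_bytes, batch):
    """Return a copy of the record carrying provenance fields (hypothetical names)."""
    tagged = dict(record)  # copy, so the caller's record is left untouched
    tagged["source_path"] = source_path
    tagged["source_sha256"] = hashlib.sha256(source_bytes).hexdigest()
    tagged["import_batch"] = batch
    return tagged


def matches_source(tagged, source_bytes):
    """Check that an archived file is the one this record was coded from."""
    return tagged["source_sha256"] == hashlib.sha256(source_bytes).hexdigest()
```

The checksum is a design choice rather than a requirement: a source path alone identifies where a record came from, while the hash also confirms that the archived file has not changed since the import.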

On the day, the data was imported successfully and the clinical records system handled the information without issue. The source data sits safely in storage, where we can use it if needed.

Picture by Artur Rydzewski.
