Data Lake

A cloud-based infrastructure providing virtually infinite and secure data storage for WHO’s data assets.

Our Data Lake is a modern data hosting environment that allows WHO’s data science teams to store all of their data assets, in any volume and format, in a secure, central and easily accessible environment.

The Data Lake acts as the focal point for all components of the World Health Data Hub, creating a foundation for leveraging the wealth of data it stores through automation and customization built specifically for WHO’s data needs.

Acquisition of data is made simple through the use of a Low Code / No Code solution, allowing data managers to connect to any data source, transform data and store the results in the Data Lake. The drag-drop-configure paradigm is easy to learn, enabling data managers to focus more on data transformations. Once complete, transformations can be scheduled to automate the data acquisition and data refresh processes, eliminating repetitive manual tasks.

Additionally, WHO teams can perform analytics and visualization operations on the data as easily as working with simple, Excel-like tabular sheets.

Features

Secure by design

The hosting environment is built to conform to WHO’s security requirements. Data managers leveraging the platform automatically benefit from this customized implementation, ensuring compliance and security for their data assets.

Made to scale

The data hosting environment can securely store virtually unlimited volumes of data while maintaining high performance.

Automation

WHO’s data teams can automate tedious and time-consuming tasks like data acquisition, cleansing and formatting using simple yet powerful no-code tools. Pipeline scheduling allows transformations to run unattended according to a flexible schedule.

Data landscape

Teams can move directly from acquiring and storing unstructured data sets and files to exploring and understanding them and running analyses and visualizations. Advanced tooling allows WHO’s data science teams to make sense of their data in new ways.

Objective

Secure hosting and collaboration are crucial needs for all WHO teams working on the datasets they collect, consume, create and use. The diversity of data and data sources in WHO requires a highly capable, flexible and agile platform. Security, data protection and governance compliance are also necessary to maintain the trust of Member States and partners, ensuring a robust solution that underpins WHO's role as a steward of health data.

In addition to data storage, the Data Lake streamlines and simplifies data operations, with a toolset that covers data acquisition, transformation, augmentation and storage using a simple workflow paradigm.
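
To make the workflow paradigm concrete, the sequence of acquisition, transformation, augmentation and storage can be pictured as a chain of small steps, each taking the output of the previous one. The Python sketch below is only an illustration under stated assumptions: the Data Lake itself exposes these steps through a Low Code / No Code interface, and the function names, the sample CSV and the "source" field are hypothetical and not part of any WHO tool.

```python
import csv
import io
import json
from typing import Callable

Record = dict[str, str]
Step = Callable[[list[Record]], list[Record]]

# Hypothetical sample data standing in for any connected data source.
RAW_CSV = """country,year,cases
Kenya,2021, 120
Peru,2021,
Kenya,2022,95
"""


def acquire(text: str) -> list[Record]:
    """Acquisition: read rows from a CSV source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows: list[Record]) -> list[Record]:
    """Transformation: trim whitespace and drop rows with no case count."""
    cleaned = [{key: value.strip() for key, value in row.items()} for row in rows]
    return [row for row in cleaned if row["cases"]]


def augment(rows: list[Record]) -> list[Record]:
    """Augmentation: attach a (hypothetical) provenance field to each row."""
    return [{**row, "source": "sample"} for row in rows]


def store(rows: list[Record]) -> str:
    """Storage: serialise to JSON Lines; a real pipeline would write to the lake."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def run_pipeline(text: str, steps: list[Step]) -> list[Record]:
    """Run acquisition, then apply each step in order."""
    rows = acquire(text)
    for step in steps:
        rows = step(rows)
    return rows


result = run_pipeline(RAW_CSV, [transform, augment])
stored = store(result)
print(stored)
```

Because each step has the same shape (rows in, rows out), steps can be reordered, added or removed without touching the others, which mirrors the drag-drop-configure idea in code form.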

 

About data at WHO

WHO ensures the timeliness, reliability and validity of measurements, which keeps data comparable and allows the world to track trends, progress and impact.