On Sharing Scientific Data with Open Source Software

Why sharing data matters: An introduction to open source Delta Sharing and how the world changed in just five days.

Frank Munz
Geek Culture

Big Things / Big Data

How likely is that? Five days before Omicron was discovered, I was invited to give a presentation at Europe’s leading Big Data conference, Big Things Spain. One of my goals was to motivate why it is essential nowadays to be able to share large amounts of scientific data in a secure, cheap, and scalable way — based on an open format and open source software such as Delta Sharing.

Delta Sharing — Frank Munz @ BigThings conference

Delta Sharing is an open source project hosted by the Linux Foundation. It defines an open REST protocol for the secure, real-time exchange of large datasets, which for the first time enables data sharing across products, companies, and clouds. Instead of copying data, Delta Sharing directly leverages modern cloud object stores, such as Amazon Simple Storage Service (Amazon S3), Azure Data Lake Storage (ADLS), or Google Cloud Storage (GCS), to access large datasets reliably.
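To make the REST protocol a bit more concrete, here is a minimal sketch of how a recipient might build a table query request by hand. It assumes a profile with `endpoint` and `bearerToken` fields and a `/shares/{share}/schemas/{schema}/tables/{table}/query` path, which is my reading of the public protocol description. The server URL, token, and the share, schema, and table names are hypothetical placeholders, and the request is only constructed, never sent.

```python
import json
from typing import Optional
from urllib.parse import quote
from urllib.request import Request


def build_table_query(profile: dict, share: str, schema: str, table: str,
                      limit_hint: Optional[int] = None) -> Request:
    """Build (but do not send) a Delta Sharing table query request."""
    endpoint = profile["endpoint"].rstrip("/")
    # Percent-encode each name so it is safe to use as a URL path segment.
    s, c, t = (quote(part, safe="") for part in (share, schema, table))
    url = f"{endpoint}/shares/{s}/schemas/{c}/tables/{t}/query"

    # Optional hint: ask the server for roughly this many rows.
    body = {}
    if limit_hint is not None:
        body["limitHint"] = limit_hint

    return Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={
            # The recipient authenticates with the bearer token from the profile.
            "Authorization": f"Bearer {profile['bearerToken']}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical profile, normally provided by the data provider as a JSON file.
profile = {
    "shareCredentialsVersion": 1,
    "endpoint": "https://sharing.example.com/delta-sharing/",
    "bearerToken": "<token>",
}

request = build_table_query(profile, "science", "genomics", "variants", limit_hint=100)
print(request.get_method(), request.full_url)
```

As far as I understand the protocol, the server answers such a query with short-lived, pre-signed URLs pointing at files in the object store, so the bulk data is read directly from S3, ADLS, or GCS. In practice you would rarely write this yourself: the open source `delta-sharing` Python connector wraps these calls, for example with `delta_sharing.load_as_pandas`.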

Sharing Data -> Scientific Discoveries

In one of my examples, I pointed out how sharing digital data in an open way became the foundation of scientific discoveries such as the development of vaccines (as opposed to sending frozen lung tissue biopsies across the world with UPS next-day delivery, like in the old days).

A more visual example of why sharing data matters is the tree representation of the coronavirus mutations. You can easily spot the Alpha, Beta, and Delta variants on the slide I presented.

Only five days after I explained that visualization, Omicron was detected, and the hope that the pandemic would end quickly was shattered…

Delta Sharing — Frank Munz @ BigThings conference

You can watch the full recording of my presentation at the Big Things conference here.

Please follow me on Medium and clap for this article if you enjoyed reading it. For more on cloud-based data science, data engineering, and AI/ML, follow me on Twitter (or LinkedIn).


Cloudy things, large-scale data & compute. Twitter @frankmunz. Former Tech Evangelist @awscloud, now Principal @Databricks. Personal opinions here. #devrel ❤️.