On Sharing Scientific Data with Open Source Software
Why sharing data matters: An introduction to open source Delta Sharing and how the world changed in just five days.
Big Things / Big Data
How likely is that? Five days before Omicron was discovered, I was invited to give a presentation at Europe’s leading Big Data conference, Big Things Spain. One of my goals was to motivate why it is essential nowadays to be able to share large amounts of scientific data in a secure, cheap, and scalable way — based on an open format and open source software such as Delta Sharing.
Delta Sharing is a Linux Foundation open source framework built on an open REST protocol for the secure, real-time exchange of large datasets. For the first time, it enables secure data sharing across products, companies, and clouds. Delta Sharing directly leverages modern cloud object stores, such as Amazon Simple Storage Service (Amazon S3), Azure Data Lake Storage (ADLS), or Google Cloud Storage (GCS), to access large datasets reliably.
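To make this more concrete, here is a minimal sketch of how a data recipient might read a shared table with the open source delta-sharing Python connector. The profile file name and the share, schema, and table names are hypothetical placeholders, not anything from my talk; the sketch assumes the data provider has sent you a profile file containing the server endpoint and a bearer token.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build the table address the connector expects:
    <profile-file-path>#<share>.<schema>.<table>
    """
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table into a pandas DataFrame.

    Needs the connector (pip install delta-sharing) and a valid profile file.
    The import is kept local so the URL helper works without the package.
    """
    import delta_sharing

    return delta_sharing.load_as_pandas(table_url(profile_path, share, schema, table))


if __name__ == "__main__":
    # All four names below are hypothetical placeholders.
    print(table_url("config.share", "my_share", "my_schema", "my_table"))
```

Calling load_shared_table with a real profile file returns an ordinary pandas DataFrame, so the recipient does not need to run the same platform as the provider.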
Sharing Data -> Scientific Discoveries
In one of my examples, I pointed out how sharing digital data in an open way became the foundation of scientific discoveries such as the development of vaccines (as opposed to sending frozen lung tissue biopsies across the world with UPS next day delivery, like in the old days).
A more visual example of why sharing data matters is the tree representation of the coronavirus mutations. You can easily spot the Alpha, Beta, and Delta variants on the slide I presented.
Only five days after I explained that visualization, Omicron was detected, and the hope that the pandemic would end quickly was shattered…
You can watch the full recording of my presentation at the Big Things conference here.