Databricks has become a favorite platform for many data engineers, data scientists, and ML experts. It combines data, analytics, and AI. It is multi-cloud, and you can now also use it on GCP.

This article will walk you through the main steps to become efficient with Databricks on Google Cloud.

Databricks on GCP: From Zero to Hero

1. Get the Foundation Right — From Subscription to User Creation

To…

I did a webinar session for ODSC about sharing huge amounts of data from a Lakehouse with Delta Sharing. The recording is now available on demand (for free) and includes a number of live, hands-on demos:

  1. Multi-cloud data sharing from AWS with a Google Colab notebook.
  2. OSS only: Jupyter notebook, pyspark, Delta Sharing OSS on EC2.
  3. Preview of Delta Sharing from a Databricks notebook with simple SQL commands to create shares and recipients.

Delta Sharing is the industry’s first open protocol for secure data sharing, making it simple to securely share massive amounts of data with other organizations regardless of which computing platforms or cloud storage they use.

Delta Sharing

Delta Sharing is a Linux Foundation open source framework. Picture it as a modern way of sharing massive amounts of live data from your data lake, whether on-premises, in the cloud, or hybrid. It works with basically any kind of receiver that supports pandas or Spark, so there is no vendor lock-in.
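To make the receiver side a bit more concrete, here is a minimal sketch of reading a shared table into pandas with the open source `delta-sharing` Python client. The profile file name and the share, schema, and table names are hypothetical placeholders, not values from the webinar; the table URL follows the documented `<profile-file>#<share>.<schema>.<table>` form.

```python
def table_url(profile_path: str, share: str, schema: str, table: str) -> str:
    """Build a Delta Sharing table URL: <profile-file>#<share>.<schema>.<table>."""
    return f"{profile_path}#{share}.{schema}.{table}"


def load_shared_table(profile_path: str, share: str, schema: str, table: str):
    """Load a shared table as a pandas DataFrame.

    Assumes `pip install delta-sharing` and a profile file (the credentials
    file a data provider hands to the recipient) at `profile_path`.
    """
    import delta_sharing  # imported lazily so table_url() works without it

    return delta_sharing.load_as_pandas(
        table_url(profile_path, share, schema, table)
    )


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(table_url("config.share", "my_share", "my_schema", "my_table"))
```

With a real profile file in place, `load_shared_table("config.share", "my_share", "my_schema", "my_table")` should return a regular pandas DataFrame, which is why any pandas-capable environment can act as a receiver.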

Delta Sharing 0.2.0

Does predicate pushdown for Databricks on Google Cloud with BigQuery work? It does! And here is how to verify it.

When I tested the features of the recently released Databricks on the Google Cloud platform, I checked out the BigQuery integration. Databricks is using a fork of the open-source Google Spark Connector for BigQuery. So I wondered how to check if a certain predicate of a query is indeed pushed…
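One possible way to check this, sketched here under assumptions and not necessarily the method the full article uses, is to look at the physical plan Spark prints for the query. For many data source scans the plan lists the predicates handed to the source under `PushedFilters`. The helpers below capture the output of `df.explain()` and keep only those lines; the table name in the usage comment is a hypothetical placeholder.

```python
import contextlib
import io
from typing import List


def explain_text(df) -> str:
    """Capture what df.explain() prints (PySpark prints the plan to stdout)."""
    buf = io.StringIO()
    with contextlib.redirect_stdout(buf):
        df.explain()
    return buf.getvalue()


def pushed_filter_lines(plan_text: str) -> List[str]:
    """Return the plan lines that mention PushedFilters, stripped of indentation."""
    return [
        line.strip()
        for line in plan_text.splitlines()
        if "PushedFilters" in line
    ]


# Usage in a notebook where `spark` is available (hypothetical table name):
#
#   df = (spark.read.format("bigquery")
#         .option("table", "my-project.my_dataset.my_table")
#         .load()
#         .filter("year > 2020"))
#   for line in pushed_filter_lines(explain_text(df)):
#       print(line)
```

If the predicate shows up in one of the returned lines, Spark handed it to the connector. An empty result only means the plan text has no `PushedFilters` entry, which can also happen when a connector reports its scan in a different format.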

Without writing Dockerfiles or Kubernetes YAML files.

This article describes a quick and easy way to go from writing a lightweight Java, Kotlin, or Scala application to a running Kubernetes service.

Introduction

Kubernetes (K8s) is an open-source container-orchestration system driven by a large, enthusiastic group of developers. Companies such as Amadeus, Bose, CERN, …, Zalando, and thousands of…

Frank Munz

Cloudy things, large-scale data & compute. Twitter @frankmunz. Former Tech Evangelist @awscloud, Databricks now. Only personal opinions here.
