This is a candidate session. Scala Matsuri will select sessions later, using participants' votes as a reference.

Keeping up with Spark

Spark has come a very long way since its 1.0 release in 2014. The pace at which the industry has adopted it and the number of contributions it receives are unprecedented, and both have allowed it to grow rapidly. Unfortunately, this means it can be hard for developers to keep up with all the changes and best practices. In this talk we will go through relatively new features such as Datasets (and when to use them vs. DataFrames and RDDs) and the improvements introduced in Spark 2.0.

Session length: 40 minutes
Language of the presentation: English
Target audience: Intermediate: Requires a basic knowledge of the area
Speaker: Mateusz Dymczyk (H2O.ai)
