Spark has come a very long way since its initial release in 2014. The pace at which the industry has adopted it and the number of contributions it receives are unprecedented, and both have allowed it to grow rapidly. Unfortunately, this also makes it hard for developers to keep up with all the changes and best practices. In this talk we will go through relatively new features such as Datasets (and when to use them vs. DataFrames and RDDs) and the improvements introduced in Spark 2.0.