Machine learning researcher applying deep learning to transportation and the urban environment.
Creating S2-cell ids by hand.
in Tech · November 28, 2018
pysparkling now supports stream processing with discrete streams, called DStreams. This post walks through a simple example that uses the new API.
in Tech · March 11, 2017
Benchmarks for the latest parallel features in pysparkling. They show good scaling when processing with multiple CPU cores. The example runs only a simple computation, and for this workload hyperthreading is not very effective.
in Tech · December 04, 2015
A pure Python implementation of Apache Spark’s RDD interfaces. pysparkling does not depend on Java and has a small execution overhead. It can be a fast test runner for Spark applications.
in Tech · May 29, 2015