Benchmarks for the latest parallel features in pysparkling. The results show good scaling when processing is spread across multiple CPU cores. The example consists of only a simple computation, and it shows that hyperthreading is not very effective in this case.
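To make the idea of a scaling measurement concrete, here is a minimal sketch using only the Python standard library. It is not the pysparkling benchmark itself: the task `cpu_bound`, the helpers `time_with_workers` and `speedups`, and the workload size are all hypothetical names and values chosen for illustration.

```python
import math
import time
from multiprocessing import Pool


def cpu_bound(n):
    """A deliberately simple CPU-bound task: sum of square roots below n."""
    return sum(math.sqrt(i) for i in range(n))


def time_with_workers(workers, tasks):
    """Return wall-clock seconds to map cpu_bound over tasks with a pool.

    Pool start-up time is included in the measurement.
    """
    start = time.perf_counter()
    with Pool(processes=workers) as pool:
        pool.map(cpu_bound, tasks)
    return time.perf_counter() - start


def speedups(timings):
    """Convert {workers: seconds} into {workers: speedup vs. one worker}."""
    baseline = timings[1]
    return {w: baseline / t for w, t in timings.items()}


if __name__ == "__main__":
    tasks = [200_000] * 16  # hypothetical workload: 16 equal-sized tasks
    timings = {w: time_with_workers(w, tasks) for w in (1, 2, 4)}
    for w, s in speedups(timings).items():
        print(f"{w} worker(s): {s:.2f}x")
```

With a workload like this, the speedup would be expected to grow roughly with the number of workers up to the number of physical cores. Adding workers beyond that, onto hyperthreaded logical cores, may yield little extra gain.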
pysparkling is a pure Python implementation of Apache Spark's RDD interface. It does not depend on Java and has a small execution overhead. It can be a fast test runner for Spark applications.