Getting started with PySpark

What is PySpark? PySpark lets Python users interface with Apache Spark and wrangle data distributed across the nodes of a cluster. PySpark provides users with DataFrames, which are an abstraction over its underlying RDDs (Resilient Distributed Datasets). PySpark supports most Spark features, such as Spark SQL, DataFrame, Streaming, MLlib and Spark Core. Install PySpark …

Enable NTFS write on OS X Yosemite without Fuse

OS X can read NTFS-formatted drives, but does not support writing to them out of the box. Previously I used OS X Fuse to enable write functionality, but it requires installing multiple third-party drivers. Instead, we can use Apple’s own NTFS write functionality, which is not enabled by default. Please note the following steps will have to …

Programming the Arduino Pro Mini using an Arduino Uno

The Arduino Pro Mini is a great microcontroller board based on the ATmega328. The board comes without built-in USB circuitry, so an off-board USB-to-TTL serial converter must be used to upload sketches. This can be an FTDI TTL-232R USB – TTL Level Serial Converter Cable (for the 5V Arduino Pro Mini), or an FTDI TTL-232R-3V3 USB …