Archive for July, 2015

Comparing data storage options in Python

When it comes to numerical computing, I always gave in to the unparalleled convenience of Matlab, which I think is the best IDE for that purpose.  If your data consists of matrices or vectors and fits in main memory, it’s very hard to beat Matlab’s smooth workflow for interactive analysis and quick iteration.  Also, with judicious use of MEX, performance is more than good enough.  However, over the past two years, I’ve been increasingly using Python (with numpy, matplotlib, scipy, ipython, and scikit-learn), for three reasons: (i) I’m already a big Python fan; (ii) it’s open-source, hence it’s easier for others to reuse your code; and (iii) more importantly, it can easily handle non-matrix data types (e.g., text, complex graphs) and has a large collection of libraries for almost anything you can imagine.  In fact, even when using Matlab, I had a separate set of Python scripts to collect and/or parse raw data and then turn it into a matrix.  Juggling both Python and Matlab code can get pretty messy, so why not do everything in Python?

Before I continue, let me say that, yes, I know Matlab has cell arrays and even objects, but still… you wouldn’t really use Matlab for, e.g., text processing or web scraping. Yes, I know Matlab has distributed computing toolboxes, but I’m only considering main memory here; these days 256GB of RAM is not hard to come by, and that’s good enough for 99% of (non-production) data exploration tasks. Finally, yes, I know you can interface Java to Matlab, but that’s still two languages and two codebases.

Storing matrix data in Matlab is easy.  The .MAT format works great: it is pretty efficient, and it can be used with almost any language (including Python).  At the other extreme, arbitrary objects can be stored in Python as pickles (the de-facto Python standard?); however, (i) they are notoriously inefficient (even with cPickle), and (ii) they are not portable.  I could perhaps live with (ii), but (i) is a problem.  At some point, I tried out SQLAlchemy (on top of SQLite), which is quite feature-rich but also quite inefficient, since it does a lot of things I don’t need. I had expected to pay a performance penalty, but hadn’t realized how large it was until I measured it.  So, I decided to do some quick-n-dirty measurements of various options.
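To give a flavor of what such a quick-n-dirty measurement might look like, here is a minimal sketch using only the standard library.  It is not the benchmark from this post: the record layout, the function names, and the choice of plain sqlite3 (rather than SQLAlchemy) are all assumptions made for illustration.  It times a write-then-read round trip for pickle and for a SQLite table, and reports the file size of each.  (In Python 3, the pickle module uses its C implementation automatically, so there is no separate cPickle import.)

```python
import os
import pickle
import sqlite3
import tempfile
import time


def make_records(n):
    """Build n hypothetical (id, label, value) records to store."""
    return [(i, "item-%d" % i, i * 0.5) for i in range(n)]


def roundtrip_pickle(records, path):
    """Write the records to a pickle file, then read them back."""
    with open(path, "wb") as f:
        pickle.dump(records, f, protocol=pickle.HIGHEST_PROTOCOL)
    with open(path, "rb") as f:
        return pickle.load(f)


def roundtrip_sqlite(records, path):
    """Write the records to a SQLite table, then read them back."""
    conn = sqlite3.connect(path)
    try:
        conn.execute("CREATE TABLE records (id INTEGER, label TEXT, value REAL)")
        conn.executemany("INSERT INTO records VALUES (?, ?, ?)", records)
        conn.commit()
        query = "SELECT id, label, value FROM records ORDER BY id"
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def measure(name, func, records, workdir):
    """Time one round trip and record the resulting file size."""
    path = os.path.join(workdir, name + ".bin")
    start = time.perf_counter()
    loaded = func(records, path)
    elapsed = time.perf_counter() - start
    return {
        "name": name,
        "seconds": elapsed,
        "bytes": os.path.getsize(path),
        "ok": loaded == records,  # sanity check: data survived the round trip
    }


def run_benchmark(n=10000):
    """Run both round trips on the same n records in a temporary directory."""
    records = make_records(n)
    with tempfile.TemporaryDirectory() as workdir:
        return [
            measure("pickle", roundtrip_pickle, records, workdir),
            measure("sqlite", roundtrip_sqlite, records, workdir),
        ]


if __name__ == "__main__":
    for result in run_benchmark():
        print(result)
```

A single timed run like this is noisy, so for anything beyond a rough comparison it would make sense to repeat each round trip several times and to use data shaped like your own.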

Read the rest of this entry »


Household hacks with a 3D printer

I’m often asked “what is a 3D printer good for, isn’t it just a novelty?”  So here are some examples of household hacks, in no particular order.  I’ve chosen examples that satisfy two criteria.  First, it didn’t take me more than an hour to whip up the CAD model (and, in many cases, it took just 10-15 minutes), so it qualifies as a “quick hack”.  Second, it’s of general household use, so mechanical assemblies, 3D printer parts, etc., were left out.  Some of these are published on Thingiverse (linked from the post headings).

Eyeglass frame fix

This is one of my favorites.  It was one of the quickest to make, yet it has seen a lot of use.  My mother has her favorite eyeglasses and is loath to change them.  However, over time, the arm loosened and they would constantly slide down her nose. Tightening the screws didn’t do anything anymore. So, I quickly designed a clip that slides over the frame and has a tapered nub to apply pressure to the arm (printed in ABS, so it has some flexibility).  Guess you could call it an “eyeglass arm pretensioner attachment”.  She’s been using them for years, and asked for a pack in case she loses one (printing a set of six takes about 15 minutes; the example in the photo is an early print in black, instead of brown).


Read the rest of this entry »