File Compression Tools

It’s been a long while since I’ve spent much time thinking about file compression tools, but I was running out of space on one of my smaller SSDs, and the culprit was mainly lots of large multi-GB TIFs and PSDs that had accrued in our Lensley Dropbox.

My first thought was xz, since it compresses super well, but I was growing old waiting for files to compress and decided to switch. lrzip seemed like a good option: performance was much better than xz, and it’s threaded (with multi-file support and a nice progress UI to boot). I also happened on another compressor shoot-out page (if you’re interested in this topic, definitely read this; there’s a lot more testing and a nice pros/cons list for the apps), which concluded that lbzip2 was a good choice.

I gave it a spin, and sure enough lbzip2 is great. With defaults, it loses a few percentage points when compressing vs lrzip, but it also compresses on average twice as fast, with the kicker being that its archives are fully cross-compatible with bzip2, so pretty much any system can access the archives by default. I’ll be aliasing bzip2 to lbzip2.

(If speed isn’t as important, as with batch compression, or you want max compression of large files, lrzip is probably better. A lot of people have been using xz, but even with multithreaded support I still didn’t feel performance was great, so for personal use I’ll probably end up mostly using lbzip2 or lrzip from now on.)
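If you want to get a rough feel for the ratio/speed trade-off on your own files, here’s a minimal sketch using Python’s standard library. Big caveat: the bz2 and lzma modules are single-threaded implementations of the bzip2 and xz formats, not lbzip2, lrzip, or the xz CLI, so the timings only hint at relative cost. The sample data and the compare() helper are made up for illustration.

```python
import bz2
import lzma
import time


def compare(data):
    """Return {name: (compressed/original ratio, seconds)} for bzip2 and xz formats."""
    results = {}
    for name, compress, decompress in (
        ("bz2", bz2.compress, bz2.decompress),
        ("xz", lzma.compress, lzma.decompress),
    ):
        start = time.perf_counter()
        packed = compress(data)
        elapsed = time.perf_counter() - start
        # Round-trip check: the archive must decompress back to the input
        assert decompress(packed) == data
        results[name] = (len(packed) / len(data), elapsed)
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive sample; real TIFs/PSDs will behave differently
    sample = b"lensley tif psd " * 200_000
    for name, (ratio, seconds) in compare(sample).items():
        print(f"{name}: {ratio:.4f} of original size in {seconds:.3f}s")
```

Since lbzip2 writes standard bzip2 streams, the same bz2 module should be able to read its archives, which is the cross-compatibility point made above.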

Lackadaisical AWS Migration Notes

We have relatively modest AWS bills w/ Lensley, but they’ve been inching upwards and broke $400/mo at the beginning of the year. Originally, I thought this was mostly just due to ever-increasing S3 usage, but it turns out that wasn’t the case. After going through and trimming things, we’ll end up saving about $250/mo, which isn’t a huge absolute amount ($3K/yr) but was over 60% of our monthly bill.

  • Obviously there were a couple EC2 instances hanging around that we didn’t need. One was a bit pernicious since it kept regenerating (not a Spot Request, but actually due to an old Elastic Beanstalk test. Oops.)
  • Our Reserved Instances had lapsed, so everything was 50% more than it should have been. Double oops.
  • Our instances were running old and busted m1s – with our usage I was able to switch m1.smalls to t2.micros and m1.mediums to t2.smalls, cutting costs by 2/3 (w/ equal Reserved discounts).
  • Note: T2s are HVM only. Migrating from PV to HVM wasn’t too bad, although I also took the opportunity to switch one machine to the new 16.04 LTS (if you don’t have to, don’t – besides some package migrations, there’s also a killer kswapd bug that pegs the CPU)
  • It’s worth noting here that one of the instances was running EBS Magnetic storage. Obviously worth snapshotting and switching over to gp2.
  • We’ve been paying for an ELB and MultiAZ instances but honestly, have never had to use the failover. At the risk of jinxing myself having typed that, I just got rid of them.
  • A few years ago I switched to a Multi-AZ RDS instance for our database. At the time this seemed like a good idea, and while I never had to worry about things, w/ the rollout of gp2 (SSD) EBS w/ plenty of IOPS, it seemed maybe a bit wasteful for minimal convenience. Switching from Multi-AZ db.m1.medium to a t2.medium w/ a 100GB gp2 EBS volume lowers our monthly cost from ~$160/mo to $33/mo.
  • Clearing out unused EBS volumes/snapshots, ofc

A big chunk of our savings was the result of getting rid of some redundancy, but I can count the number of single-AZ outages (we’re in us-west-2 in Oregon) we’ve been affected by over the past 3 or 4 years on a single hand, and there’s nothing super mission-critical that can’t live with a few minutes of downtime.
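For anyone who wants to check the math, here’s the arithmetic behind the figures quoted in this post as a tiny sketch. All inputs are the rounded numbers from above, so the results are approximate; the variable names are just for illustration.

```python
monthly_before = 400  # USD/mo, approximate ("broke $400/mo")
monthly_saved = 250   # USD/mo, approximate

annual_saved = monthly_saved * 12
share_saved = monthly_saved / monthly_before

# RDS change alone: Multi-AZ db.m1.medium (~$160/mo) to t2.medium ($33/mo)
rds_before, rds_after = 160, 33
rds_saved = rds_before - rds_after

print(annual_saved)                  # 3000
print(f"{share_saved:.1%}")          # 62.5%
print(rds_saved)                     # 127
print(f"{rds_saved / monthly_saved:.0%}")  # 51%
```

In other words, dropping the Multi-AZ RDS setup accounts for roughly half of the total monthly savings.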

Firefox Developer Edition

I run a lot of browsers – I’ll usually have 3-4 running, a mix of Chrome, Canary, Safari, and Firefox Nightly. With Mavericks, I switched to Safari as my default browser on my MBA due to its power efficiency.  Unfortunately, Yosemite breaks the SIMBL plugin I was dependent on, so it was time to move on.

Chrome has been getting sluggish and I’ve really been liking what Firefox has been up to, but the latest Nightly builds have been not so dependable (I blame e10s but maybe that’s unfair) and since I’m traveling again, daily 90MB downloads aren’t ideal, so I decided to give Firefox Developer Edition a shot.

Turns out, it’s pretty great! It has a simple, dark theme by default, is pretty snappy, and the developer tools look great (although at this point I’m so used to Chrome’s keybindings that it’s been a bit awkward switching).

The one fly in the ointment was that 1Password wasn’t playing nice. Luckily, there is a solution. Just upgrade to the latest beta extension and the latest beta version of the app and it’ll work.

If you use Evernote, you’ll also want the beta Clipper that brings it to parity w/ Chrome and Safari.

Lastly, one of the things that I really got spoiled by was Chrome’s particularly elegant “hold CMD-Q to quit” option. While ever so slightly less elegant, meta-q-override/warn-before-quit does the trick.

I’m currently using Firefox Developer Edition as my new default browser.

The Future of Social Networking

Ello blew up this week. It’s new and shiny and does some interesting things. That being said, it’s not where social networking or how we use the Internet needs to go.

If you want more reading:

I had posted some of my own initial thoughts, which are that the ideal social network should be end-user controlled, distributed, and decentralized. A natural pre-condition is that there should be an open protocol, but it’d be worth fleshing out the type of functionality that’s required (I’ll have to revisit some relevant thinking I did in the early 2000s on decentralized SNSs, the mid-2000s on permeability/privacy, and the late-2000s on Y!OS).

FWIW, the more interesting social networking-related project I discovered is an open source, decentralized, massively-distributed 3D simulation engine called Lucidscape. It is explicitly designed for an open metaverse. (See also: Open Cobalt née Croquet)

Making Safari Usable

One of the things that Activity Monitor’s “Energy Impact” fields have made obvious is that Safari 7.0 is significantly more energy efficient than either Chrome 30.0 or Firefox 27a1.

After regular usage, Safari has an Average Energy Impact of about 5-6 vs Chrome and Firefox hovering at about 8-9. For comparison: Airmail averages about 3, Spotlight is about 2, and Dropbox 0.75. Playing a 720p H.264 MOV in Quicktime Player is about a 9, and playing a 720p H.264 MKV in VLC is 20+.

Recently I’ve been migrating away from Chrome and back to Firefox, as the former has gotten more sluggish, and the latter has gotten a lot faster (Chrome is still my preferred browser for dev and the only option for SSBs), which actually has left me in a good place to try switching to Safari, as I’ve pared down my “necessary” plugins:

  • 1Password 4 – 1Password 4 is a huge improvement and the new way it works w/ browsers (as a simple frontend that interacts w/ a menubar app) makes all the browser extensions work equally well (previously, the Firefox plugin would constantly freak out). With all the recent hacks, having unique passwords is more important than ever and I can wholeheartedly recommend 1Password.
  • Adblock – Safari only supports Adblock, not Adblock Plus but they both work well enough
  • Lazarus – if you’ve ever lost something you typed into a text box due to a browser close/crash you’ll want this. Available for Chrome, Firefox, and Safari
  • Evernote Clipper – I use Evernote for storing everything. Chrome’s extension is newer/fancier (and has some unique features) while Safari and Firefox are both an older version (but serviceable). I sort of like how the older version works so I’m not really complaining, although it is a bit curious.
  • Pocket – I’ve been using ReadItLater/Pocket for years. All the plugins add a “save to pocket” to the context menu, which is pretty much all I want. The Chrome version is a bit nicer since it has a colored icon in the context menu that actually makes it noticeably easier to use.

I also am using QuickStyle for Safari, which is like Stylish for Firefox or Stylebot for Chrome, but that’s more of a nice-to-have.

The most annoying thing I’ve found so far with using Safari, and probably the biggest reason I’ve never stuck with it, is that CMD 1-9 are mapped to the bookmarks bar and not switching tabs. It’s confounding (especially as I hide and don’t even use the bookmarks bar).

The solution for this is a SIMBL plugin called SafariTabSwitching – there is an installer on the Github page so installing is a snap, and the latest version is updated for Mavericks and is working great.

There are still a couple other niggles (only a single tab unclose; tab-close focusing is different; both Chrome and Firefox have a very useful contextual status bar, i.e., when you mouseover a link, the URL shows up in the bottom left), so we’ll have to see if switching to Safari gives enough battery life to make it worth it. I’ll probably be updating this in a week or two w/ how it turns out.

Dirty Hack of the Day: Python DNS Edition

In Python, you can set most request timeouts w/ socket.setdefaulttimeout(). In recent versions, urllib2 has also added a timeout field to urllib2.urlopen(). So far so good, right?
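For reference, here’s roughly what those two knobs look like. This is a sketch in Python 3 spelling, where urllib2.urlopen() lives at urllib.request.urlopen(); the fetch() helper and the timeout values are just illustrative.

```python
import socket
import urllib.request

# Process-wide default, in seconds, for sockets created after this call
socket.setdefaulttimeout(5.0)


def fetch(url, timeout=2.0):
    """Fetch url, overriding the global default with a per-call timeout."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()
```

Both timeouts apply to the socket operations themselves (connect, read), which is where the trouble starts.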

Unfortunately, while these work fine when looking up IPs or domains in /etc/hosts, this fails miserably when querying a FQDN, as you’re at the mercy of socket.gethostbyname() and your DNS resolver, which does not let you adjust the timeout. On my Mac this defaults to 30 seconds. It’ll ruin your day, really. (A good recent thread, old summary)

This is a somewhat common problem and you can see a lot of various workarounds (using signals didn’t work for me). The proper modern way is probably to use multiprocessing with a join(timeout) (sample) but that seemed awfully wordy, so here’s the simple one-line hack that I ended up with instead:


Just set 1 to the timeout you want. It’s hacky, but it works and it’s much easier and shorter – a one liner in a try block without any other libraries. Another advantage this has is that it works as it should both with DNS and mDNS (zeroconf) without any additional lookups. I’m using this for finding local machines so this is quite useful.

Some extra references:


So, at the end of 2016 I encountered this problem and decided that I could do better, especially because I wanted to do a sub-second query. I decided that of course the way to go would be to use concurrent.futures, but it turns out that was wrong. Whether you call it on the ThreadPoolExecutor or ProcessPoolExecutor version, it still waits for socket.gethostbyname() to finish. Here’s the simplest code that I implemented that worked:

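import socket
from multiprocessing import Process, Queue

def dns_lookup(host, q):
    # Runs in a child process; the result only reaches the queue if the lookup returns
    q.put(socket.gethostbyname(host))

if __name__ == '__main__':
    q = Queue()
    p = Process(target=dns_lookup, args=('', q,))  # '' is where the hostname goes
    p.start()
    p.join(0.5)  # timeout in seconds (example sub-second value)
    if p.is_alive():
        p.terminate()  # kill the lookup that is still blocked
    if q.empty():
        print('dns timeout')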

Using the multiprocessing library turns out to be the way to go because the terminate() function actually works like you expect it to, killing with extreme prejudice and w/o too much extra code. Hope that helps anyone dealing with the same problem.
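If you’re curious why concurrent.futures didn’t cut it, here’s a small sketch that reproduces the behavior without touching DNS. Assumptions: time.sleep() stands in for a blocking socket.gethostbyname() call, and slow_lookup() and timed_out_block() are made-up names. My reading is that future.result(timeout=...) does raise on time, but leaving the with-block shuts the executor down and waits for the worker anyway.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError


def slow_lookup(delay):
    """Stand-in for a blocking socket.gethostbyname() call."""
    time.sleep(delay)
    return "done"


def timed_out_block(delay=0.5, timeout=0.1):
    """Return (timed_out, seconds spent getting through the with-block)."""
    start = time.monotonic()
    timed_out = False
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(slow_lookup, delay)
        try:
            future.result(timeout=timeout)
        except TimeoutError:
            timed_out = True
    # Leaving the with-block calls shutdown(wait=True), which joins the worker,
    # so the elapsed time is at least the full delay even though we "timed out"
    return timed_out, time.monotonic() - start


if __name__ == "__main__":
    print(timed_out_block())
```

The timeout fires, but the total wall time still covers the whole blocking call, which is exactly what you don’t want for a sub-second lookup. A separate process can simply be killed instead.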

How to Install Pida on OS X

For some reason, I got it into my head that I wanted to try out Pida (a Python IDE that embeds Vim or your editor of choice) on my Mac. Well, actually from the description, it sounds pretty cool, right? The screenshots are pretty neat too. Unfortunately, the end result on OS X is somewhat less than compelling.

However, it was a huge fight getting it set up, so I figured I’d write this down for posterity.

There is a PIDA MacPort, however there is no maintainer, it’s for Python 2.6 only, and it didn’t work out of the box for me. You’ll need to fight it enough that you might as well go whole hog. Here’s how I got Pida running w/ MacPorts python27.

First the ports:

sudo port install librsvg py27-gtk py27-gnome dbus-python27 py27-notify-python
sudo port install vte +python27
sudo port install vim +python27 +x11 +gtk2

Then the Python libraries:

sudo easy_install py
sudo easy_install pygtkhelpers
sudo easy_install Logbook
sudo easy_install bpython
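Before moving on, it can save some head-scratching to confirm that everything actually imports from the MacPorts python27. Here’s a minimal sketch; note that the module names in PIDA_DEPS are my assumptions about what the ports and packages above install (for example, that py27-notify-python provides pynotify), and check_imports() is just a hypothetical helper.

```python
def check_imports(names):
    """Return {module_name: True/False} depending on whether each module imports cleanly."""
    status = {}
    for name in names:
        try:
            __import__(name)
            status[name] = True
        except Exception:
            # ImportError if missing; some GUI modules raise other errors without a display
            status[name] = False
    return status


# Assumed import names for the ports and easy_install packages above
PIDA_DEPS = ["gtk", "vte", "dbus", "pynotify", "py", "pygtkhelpers", "logbook", "bpython"]

if __name__ == "__main__":
    for name, ok in sorted(check_imports(PIDA_DEPS).items()):
        print("%-14s %s" % (name, "ok" if ok else "MISSING"))
```

Run it with the same interpreter you’ll use for the build (the MacPorts python2.7, not the system one), otherwise the results won’t mean much.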

Next, after grabbing the source, your build environment:

PKG_CONFIG_PATH="/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/pkgconfig" PATH="/opt/local/Library/Frameworks/Python.framework/Versions/2.7/bin:$PATH" python setup.py build_ext --inplace
running build_ext

Now, you’ll be able to run, but you may get a dbus error (which won’t let you select your editor among other things). Here’s how I made sure that dbus was running:

launchctl list | grep dbus
sudo launchctl load -w /Library/LaunchAgents/org.freedesktop.dbus-system.plist
launchctl load -w /Library/LaunchAgents/org.freedesktop.dbus-session.plist
launchctl list | grep dbus

Note above that dbus-system should be as root, and dbus-session should be loaded as the user.

Once I did this I was able to get up and running; however, the Python shell subprocess throws an exception for me, and the font rendering, the overall look, and of course all the non-vim keyboard shortcuts are painfully alien. Sadly, if you’re looking for a vim-like IDE-ish solution on the Mac, I think Vico, while quite new and still incomplete, is probably a better bet. If you’re looking for better Python introspection/debugging with not-totally-awkward keyboard shortcuts (and incidentally, dead easy OS X installs), Reinteract and iep look to be the best choices I’ve found. (There’s also Spyder, which has a python26 MacPort, but it depends on qt4-mac which may cause your MacPorts to build the world.)

Downtime, Ubuntu Sysadmin Notes

After 511 days of uptime, I decided it was time to bite the bullet and do a version upgrade. The `do-release-upgrade` command did what it said on the tin, and the upgrade from 9.10 to 10.04LTS was pretty straightforward (some downtime waiting for the disk to fsck, and requiring ops to manually reset). Unfortunately, the upgrade made WordPress pretty unhappy. Some combination of WP, APC, and potentially WP Super Cache? Instead of using Ubuntu’s APC (3.1.3p1-2) I switched to a pecl install (3.1.9). This didn’t solve things, so I bumped up the apc.shm_size to 128M…

I’ve been lackadaisical lately w/ my sysadmining, but with the unfriendly waters, I took some time to tighten the ship up a bit. I’ll probably be publishing a little “hardening Ubuntu for really lazy/busy devs” guide soon.

Tech Predictions, Five Years Later

Five years ago, inspired by a Yahoo! Answers question (their top answers), I put on my tech futurist hat and wrote up some quick prognostications about “Which products, used by few today, will be essential in five years?” This was published, incidentally, on Vox (now defunct). Are you getting that mid-2006 vibe yet? Well, it’s been five years (that was quick), so maybe we should take a look.

I won’t reproduce my original article (linked above), but I’ll go through each of the predictions and make some comments:

  • Software as service is standard – My prediction was that social networking, media sharing, and all kinds of apps would be increasingly integrated/prepackaged OOTB. I think that this has been borne out, certainly on the mobile and device front, although this year may be the inflection point for the desktop (iCloud, ChromeOS, etc). Even without that, probably the majority of consumer computing is now service/browser based. I find myself totally dependent on many cloud-based services (Evernote, Checkvist, DropBox, Google Docs, GMail/GApps, Twitter, FB, etc). The majority of my small business’s software is also cloud-based.
  • Global digital identity / reputation / relationship system – my prediction was that online/offline personas, relationships, and physical presence would be tied together, potentially controlled by a single company. I think in mid-2006 I would have guessed Google would end up taking it all, but FB was a strong contender, and they’re on top at the moment. Still, as of mid-2011, this ball is still in play, and there are certain components (location, reputation) that are still almost complete tossups. Note: while FB has been enormously successful and will almost certainly be the first Internet company to hit 1B actives, there are some signs that it may have peaked in its developed markets, so it’s not invincible. There’s also a lot of potential left in terms of social utility that’s still completely unexplored (and only in the most superficial ways in many other cases).
  • Digital media – I predicted streaming/wireless syncing of media from anywhere. While iCloud was only just announced (to compete against Amazon Cloud Drive and Google Music) and music has been lagging a bit (although celestial jukebox services like Spotify and Rdio have been hitting it out of the park, so maybe it’s unfair to dismiss music completely), we’ve seen this come true much more for video. Maybe this is due to the competition traditional TV/Film has faced from the YouTube/Internet video juggernaut (my first YouTube video, uploaded just over 5 years ago). Netflix in particular has overtaken not only web traffic but also BitTorrent. Expect the cord-cutting to accelerate. One last observation: Amazon’s current homepage menu now completely highlights digital goods: Online Shopping for Electronics, Apparel, Computers, Books, DVDs & more
  • Smart phone – I think I hit this one 100%. Not much to say about it. Well, one caveat is that while there were rumors of an iPhone floating around for years, it wouldn’t be announced for another 6 months. Apple gets huge props for single-handedly helping to drag the lagging handset/telecom industry into this future, as well as totally shaking things up with its App Store. I’m sure there are some charts somewhere that show recent numbers on mobile vs fixed Internet use, but if that number hasn’t been crossed, I’m sure it will be soon.
  • RFID – I was totally wrong. At Lensley, we’ve been doing some neat RFID integrations with clients, and RFIDs have had huge adoption in things that touch people’s daily lives, like in supply-chain and public transit (as well as less well thought out ways, like US Passports). On the whole, though, they’ve remained too expensive and too niche to get much consumer love (kits from Sparkfun notwithstanding). While NFC in Android (an RFID-compatible superset) has gotten lots of hullabaloo, there’s pretty much zilch in terms of real world use, much less anything remotely spimey. We’ll have to see how mobile payments pan out over the next couple years. (2012?)
  • Self Monitoring – While the Quantified Self has been getting some traction (a conference! breathless writeups!) and there is a proliferation of services and devices (Runkeeper, FitBit, Gowear Fit, Zeo, Withings, etc), this is still a pretty niche/nascent movement. I have no doubt it’ll keep growing, and there are some pointers (the proliferation of Feltron-like reports for social activity, checkins) that there’s a tipping point approaching. We’ll see.
  • Personal Aggregators – I saw the other day that Flipboard’s at 400M flips/month, and one might argue that Facebook’s news feed algorithms, modern blogs (Gawker, HuffPo, Engadget, etc), or even Twitter have stepped in to fill big roles in terms of filtering the bombardment of crap, but it seems like treading water. I would have expected some smarter/more robust attention management tools to have been developed, but maybe I’m completely wrong on how most people handle infoglut.
  • Shared everything – obviously wrong about fine-grained privacy. Facebook has given us a “mostly private enough sort of for now” model that’s been pretty successful, certainly at moving everyone towards the social-everything model (you win some, you lose some).

Of my long-shots (things that I thought would be awesome), we actually got one of them in a huge way. At the time I wrote this, I had just received my iRex Iliad ($700) after waiting for years for an honest to goodness E-Ink device. Sadly, it was a pretty useless white elephant of a device. However, the display was phenomenal, so I threw it on the list. In late 2007 Amazon released the first Kindle, and a few weeks ago, Amazon announced that it is now selling more Kindle books than print books. The Kindle 3, BTW, was the best-selling product in Amazon’s history.

3D printing/fabrication has gotten a lot more traction (even a recent Stephen Colbert interview), as has the maker movement in general. Although it’s still niche, the pricing is right. At $1300, the Thing-O-Matic is cheaper than most people’s first laser printer.

AR HUDs are, as ever, another 5 years away. (The OVF on my X100 is pretty sweet though.)

OK, that’s all well and good. But how about the things that I missed completely? Here’s a short list:

  • Location – while I tangentially mentioned location, I never listed LBS, mapping and other location services explicitly. Looking back, this is a 100% obvious thing, considering how much usage has exploded since. My only excuse is that being hip-deep/working for so long on local/map/mobile stuff at the time probably blinded me to how ubiquitous it wasn’t for the rest of the world while writing this. (I was working on geocoding/map/checkins at Upcoming, and from ZoneTag to Checkmates, to Yahoo! Maps, I was surrounded by all kinds of crazy LBS/geo/mobile stuff).
  • Twitter – I probably first saw Twitter about a month after I wrote my original post. At the time it was “twttr” and a completely different beast – very SMS focused, like group chat. I passed, and didn’t even bother signing up until a few months later when visiting with friends in the UK (it got a lot of early traction because it was cheaper than texting). It took a while (early 2007?) for me to really get to grips with Twitter (writeup here). Kudos to Jack, Noah, Ev, et al for trying out something new, and then working at it for years to refine it. It’s gone through a lot of transformations (mostly for better)…
  • iPad – I was a close follower of the Mobile+UMPC+Tablet industry at the time, and if you had told me that in a few years Apple would have released a friggin Dynabook with 10 finger multitouch, 10 hour battery life, amazing responsiveness, and a complete App Ecosystem (backed by 10s of millions of sister devices), selling for $500, I would have smacked you. After which, I’d have gone out and bought a lot more Apple stock. Like the iPhone when it launched in 2007, the iPad came from a few years in the future and dragged everyone else, kicking and screaming.
  • Wikileaks – Even during the year of the iPad launch, however, probably the biggest and most unexpected story of 2010 was Wikileaks (some of my favorite writeups). It has literally changed the world, and the most amazing thing is that it’s been a story that’s been in the making for years, if not decades. Wikileaks and many other stories happening right now (the Arab Spring, Anonymous, LulzSec) in many ways epitomize Clay Shirky‘s postulate that “Communications tools don’t get socially interesting until they get technologically boring… It’s when a technology becomes normal, then ubiquitous, and finally so pervasive as to be invisible, that the really profound changes happen.”

OK, in hope of publishing soon, I’ll be wrapping up now. No 2016 predictions from me, but maybe it’ll be worth catching up regardless in a few years. For those that are really interested in the things catching my attention these days, here’s a spring graph I made early last year:

On My Mind

Update: An editor from the International Business Times dropped a line yesterday with a few questions. Here’s the writeup they did today in the Luxury and Brands section: Blogger Correctly Predicted the Future in 2006 (Mostly)