Snapshots of Modern-Day China

Last week Nathan Myhrvold posted a pair of photo-essays to the Freakonomics blog recounting a recent visit to Shanghai and Beijing. Some strong images (looks like he’s using something like a 12mm aspheric on some of those? shame about the compression), and interesting commentary (and comments!).

For a different perspective, but along the same lines, I really enjoyed Bunny Huang’s Made In China posts from last year.

Python os.walk() vs ls and find

Since I wasn’t able to find a file cataloguer and dupe-finding app that quite fit my needs (for the Mac, DiskTracker came pretty close; I’d definitely recommend it over the other apps I tried), I started to code some stuff up. One of the things I was curious about starting out was how well Python’s os.walk() (and os.lstat()) would perform against ls. I threw in find while I was there. Here are the results for a few hundred thousand files; the relative speeds were consistent over a few runs:

python (44M, 266173 lines)
---
real  0m54.003s
user  0m18.982s
sys 0m19.972s

ls (35M, 724416 lines)
---
real  0m45.994s
user  0m9.316s
sys 0m20.204s

find (36M, 266174 lines)
---
real  1m42.944s
user  0m1.434s
sys 0m9.416s   

The Python code uses the most CPU time but is still I/O bound, and it is only about 8 seconds slower in wall-clock time than ls. The options I used for ls were -aAlR, which apparently produces output with lots of line breaks, but that output still ends up being smaller than find’s single-line, full-path output. The find run was really a file-count sanity check (the difference of 1 from the Python script is because find lists the starting directory itself first). Using Python’s os lib has the advantage of returning all the attributes I need without any additional parsing, and since the performance is fine, I’ll be using that. So, I just thought it’d be worth sharing these results for anyone who needs to process a fair number of files. (I’ll be processing, I’m guessing, in the ballpark of 2M files, maybe 3-4TB of data, across about a dozen NAS, DAS, and removable drives.) Obviously, if you’re processing a very large number, you may want a different approach.
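For anyone who wants to try something similar, here is a minimal sketch of the os.walk() plus os.lstat() approach. It is not the script that produced the timings above: the function names (walk_catalog, entry_kind) and the choice of attributes to record are my own assumptions for illustration, and it is written for Python 3.

```python
import os
import stat


def walk_catalog(root):
    """Yield (path, size, mtime, mode) for every entry below root.

    The root directory itself is not yielded, which is why a count from
    this kind of walk comes out one lower than the count from find.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk does not descend into symlinked directories by default,
        # but it still lists them in dirnames, so they are catalogued once.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat describes the link, not its target
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            yield path, st.st_size, st.st_mtime, st.st_mode


def entry_kind(mode):
    """Return a one-letter type code for an st_mode value (hypothetical helper)."""
    if stat.S_ISDIR(mode):
        return "d"
    if stat.S_ISLNK(mode):
        return "l"
    return "f"
```

Looping over walk_catalog("/some/path") and printing one tab-separated line per entry gives output comparable to find’s one-path-per-line listing, with size and modification time already available as numbers rather than text to be parsed.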

New Music, Catching Up Edition

I’ve had a few weeks to decompress/catch-up on life post-election. One of the things I did involved clearing out my music backlog (almost a thousand albums – completely done until a late-night of sample chasing…). Thought I’d share some of the stuff that has caught my ear so far:

Yes we did.

I’ve been thinking about what I wanted to post – I think I’m still at a loss for words, but I wanted to post something just to commemorate. I also wanted to step back and note the bitter irony of the increased African-American participation being a factor in the passing of Prop 8, which was couched in the same language and frame as the anti-miscegenation laws of a scant few decades ago. Lastly, I’m also amazed at many of the razor-thin margins, both in national and local races. It certainly gives me pause, even as I appreciate the celebrations that have been going on around the nation.

We certainly have a lot of work to do.

McCain’s gracious concession:

Obama’s victory speech: