Snapshots of Modern-Day China

Last week Nathan Myhrvold posted a pair of photo-essays to the Freakonomics blog recounting a recent visit to Shanghai and Beijing. Some strong images (looks like he’s using something like a 12mm aspheric on some of those? shame about the compression), and interesting commentary (and comments!).

For a different perspective, but along the same lines, I really enjoyed bunnie Huang’s Made In China posts from last year.

Python os.walk() vs ls and find

Since I wasn’t able to find a file cataloguer and dupe-finding app that quite fit my needs (for the Mac, DiskTracker came pretty close, and I’d definitely recommend it over the other apps I tried), I started to code some stuff up. One of the things I was interested in starting out was how well Python’s os.walk() (and os.lstat()) would perform against ls. I threw in find while I was there. Here are the results for a few hundred thousand files; the relative speeds were consistent over a few runs:

python (44M, 266173 lines)
real  0m54.003s
user  0m18.982s
sys 0m19.972s

ls (35M, 724416 lines)
real  0m45.994s
user  0m9.316s
sys 0m20.204s

find (36M, 266174 lines)
real  1m42.944s
user  0m1.434s
sys 0m9.416s   

The Python code uses the most CPU time but is still I/O bound, and is only about 8 seconds slower in real time than ls (54s vs. 46s). The options I used for ls were -aAlR, which apparently produces output with lots of line breaks, but ends up being smaller than find’s single-line, full-path output. The find was really a file-count sanity check (the difference of 1 from the Python script is because find lists the starting directory itself). Using Python’s os lib has the advantage of returning all the attributes I need without any additional parsing, and since the performance is fine, I’ll be using that. So, just thought it’d be worth sharing these results for anyone who needs to process a fair number of files. I’ll be processing, I’m guessing, in the ballpark of 2M files (3-4TB of data?) across about a dozen NAS, DAS, and removable drives. Obviously, if you’re processing a very large number, you may want a different approach.
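For anyone who wants to try something similar, here is a minimal sketch of the os.walk() plus os.lstat() approach. It is not the exact script I timed above; the function name catalog and the fields it yields are just assumptions for illustration. Like the timed run, it lists everything below the starting directory but not the starting directory itself.

```python
import os
import stat


def catalog(root):
    """Yield (path, size, mtime, kind) for every entry below root.

    Hypothetical helper for illustration; the name and the tuple
    layout are assumptions, not the script used for the timings.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        # os.walk() does not descend into symlinked directories by default,
        # but it still reports them in dirnames, so they get listed here.
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)  # lstat() does not follow symlinks
            except OSError:
                continue  # entry vanished or is unreadable; skip it
            if stat.S_ISLNK(st.st_mode):
                kind = "link"
            elif stat.S_ISDIR(st.st_mode):
                kind = "dir"
            else:
                kind = "file"
            yield path, st.st_size, st.st_mtime, kind
```

Looping over catalog("/some/volume") and writing each tuple out as a tab-separated line gives a listing comparable to the find output, with size and modification time already available as numbers rather than text that needs parsing.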

New Music, Catching Up Edition

I’ve had a few weeks to decompress/catch-up on life post-election. One of the things I did involved clearing out my music backlog (almost a thousand albums – completely done until a late-night of sample chasing…). Thought I’d share some of the stuff that has caught my ear so far:

Yes we did.

I’ve been thinking about what I wanted to post. I think I’m still at a loss for words, but I wanted to post something just to commemorate. I also wanted to step back and note the bitter irony of the increased African-American participation being a factor in the passing of Prop 8, which was couched in the same language and frame as the anti-miscegenation laws of a scant few decades ago. Lastly, I’m also amazed at many of the razor-thin margins, both in national and local races. It certainly gives me pause, even as I appreciate the celebrations that have been going on around the nation.

We certainly have a lot of work to do.

McCain’s gracious concession:

Obama’s victory speech: