Catching Up On Reading

In between the projects and mini-projects, I’ve also begun to do some catching up on reading. Carrying around a book everywhere and taking lots of public transportation has helped tremendously in that regard. I started with Shirky’s Here Comes Everybody a few weeks ago, just finished Weinberger’s Everything Is Miscellaneous (that’s been sitting around since his original book tour when he came to Yahoo!) and polished off Cory’s new YA novel Little Brother in a single sitting (what’s so compelling and chilling is the realization that we’re about half-a-step away from a sort of grim-meathookiness. It gave me more of an appreciation of the fiction coming out of the 80s). And I’m now starting Glut.

One thing that’s interesting is that all these books have related/complementary media (podcasts, talks, etc.) attached to them (and all worth the time spent, I’d say, particularly Alex Wright’s Web That Wasn’t presentation). Now, obviously anyone who’s done much spelunking on Wikipedia (lately, I’ve begun doing a bit of random clicking around on Slideshare as well), or you know, random Internet browsing can tell you there’s a lot of “stuff” out there. However, I thought I’d share a list of some of the more useful/structured resources I’ve found (online video, lectures, etc):

This post sort of got kicked off while I was watching this very engaging recent talk (via) that I can’t favorite because YouTube still has a 500-favorite limit (now below 0.6 favs/day):

May 2008 Mix

It’s been a good couple months for music. Sometimes mixes don’t really have a name or a theme. Here’s some stuff I’ve been digging lately. 43 minutes.

Oh yeah, trying Schiller’s latest SoundManager 2 release (mostly based off of the inline player) and some custom posting code. (Mixwit doesn’t seem to have an API?)

Tech note: for those that are looking to use SoundManager, you might want to check whether your build has the onpause handler. If it doesn’t, you can add it to the SM defaultOptions at the top and then call it in the pause() function around line 727 (just apply the onpause the same way the onstop above it is applied).

Adium Productivity Tip

Recently, in my quest to minimize distractions, I’ve been going through and …fixing things. First was updating my procmail and my spam filtering so that the only mail that ends up in my Inbox is real mail. (Inbox Zero for the past month!) And here’s what I did for Adium so I could leave IM on while I was working:

Adium Productivity Tips

Turning off dock animations (no more bounces) and hiding Adium while in the background (no more window popping) were really huge. I also have my dock hidden, which helps as well. (The menu bar gives me enough cues so that I can see what’s going on, but that isn’t too bothersome while I’m trying to work.)

And of course, I have my work.py script, which lays out my Terminals appropriately. At some point I might start scripting Space switching for that, but at least right now, this seems to be working out OK.

Web 2.0 Expo Presentation Rundown

There were actually a surprising number (to me, at least – most of the people I talked to had low expectations) of very good presentations at Web 2.0 Expo last week. Most of them are now posted on either SlideShare (more presentations here) or, for the keynotes, on Blip.tv. This is, I think, a very exciting and positive development for industry conferences (which I think will only have net-positive effects on attendance; conference proceedings are de rigueur at academic conferences). Here are the ones I thought were most interesting.

Keynotes (overall, I liked the 10min What X Knows format that asks companies to boil down numbers and insights):

Lots of awesome sessions, the quality of the presentations (primarily in terms of prep/interestingness) was higher than usual:

  • A Flickr Approach to Making Sense of the World – my favorite session of the conference. If you’re doing “geo stuff,” you owe it to yourself to take a look at this. The divisive hierarchical agglomerative clustering bit is great (using Morton curves for better pathing, clever). Now there’s not a lot on reverse-geocoding, which I believe I am now doing unique and interesting work on – once I prove it works, I’ll have to publish/present about that. 🙂
  • Capacity Planning for Web Operations – sure you can’t clone Allspaw, but reading what he has to say is probably the next best thing.
  • Website Psychology – linking to an earlier version of Gavin’s talk (with notes, yay) – he does a really great job mapping cognitive psychology concepts onto site usage and development. Well worth reading and thinking about
  • Grasping Social Patterns – by far my favorite Ignite talk this year, all kinds of hooks for thinking about how far social apps and the “social graph” need to go
  • Making Email a Useful Web App – Bots are awesome and underrated. I’ve been working a lot more w/ them recently and this was a good overview (would love an even more comprehensive history of cool bots…)
  • Even Faster Website (PPT) – Steve Souders (now at GOOG, doing the same sorta thing he was doing at YHOO) talks about the current stuff he’s working on, which is optimizing JS (the logical progression). Great new stuff, just as useful as the older stuff
  • Adding “Where” to Mobile and Web Applications – a bit basic, but a good overview of how location stands today. Come to Where 2.0 and Wherecamp to learn more…
  • Polite, Pertinent, and… Pretty: Designing for the New-wave of Personal Informatics – slides not online. Boo-urns
  • Casual Privacy – slides not online. Boo-urns
  • Next Generation Mobile UIs – slides not online. Boo-urns
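For anyone who hasn’t run into Morton curves (mentioned in the Flickr talk above), here’s a minimal sketch of the general idea: quantize each coordinate onto a grid and interleave the bits, so that points that are close in 2D tend to end up close in the 1D ordering. This is just the textbook Z-order encoding, not Flickr’s code; the function names and the 16-bit grid size are my own assumptions.

```python
def interleave_bits(x: int, y: int, bits: int = 16) -> int:
    """Interleave the low `bits` bits of x and y into one Morton (Z-order) code."""
    code = 0
    for i in range(bits):
        code |= ((x >> i) & 1) << (2 * i)      # x bits land on even positions
        code |= ((y >> i) & 1) << (2 * i + 1)  # y bits land on odd positions
    return code


def morton_code(lat: float, lon: float, bits: int = 16) -> int:
    """Quantize a lat/lon pair onto a 2^bits grid and return its Morton code.

    The grid resolution (16 bits per axis) is an arbitrary choice for this sketch.
    """
    cells = (1 << bits) - 1
    x = int((lon + 180.0) / 360.0 * cells)  # longitude -> column
    y = int((lat + 90.0) / 180.0 * cells)   # latitude  -> row
    return interleave_bits(x, y, bits)


# Sorting points by Morton code gives a locality-preserving walk through them.
points = [(37.77, -122.42), (40.71, -74.01), (37.80, -122.27), (51.51, -0.13)]
ordered = sorted(points, key=lambda p: morton_code(*p))
```

Sorting by the code is the cheap part; the locality isn’t perfect (the curve jumps at cell boundaries), which is why it’s usually a helper for clustering or range scans rather than the whole answer.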

Talks I didn’t make but that have interesting decks:

Some stuff that sounded interesting but doesn’t have slides online includes Marc Davis’ Mobile talk, Opportunity Computing in the Cloud, Social Networks and Avatars (caught a few min of this, looks like they haven’t done a lot of work on the numbers (even organizing across cohorts), but still would like to see the deck), Global Design Trends (there are slides, but only enough to wish you had a recording of the talk)

Some bonus talks if you’ve made it through all those:

Did I forget something (quite probable), miss one of your favs? Post links in the comments.

Digging out Old Crap

I was doing a search and came across some old notes/presentations that I wrote back in 2003 (references)

Honestly, 5 years isn’t a long time ago (time flies), but it’s interesting to look back at my thinking at the time, and also to see where we are. There are some things that have happened that I wasn’t expecting, and some that I would have expected to have happened by now (honestly, I expected Digital ID consolidation by 2008, and that’s not gonna happen anytime soon)… Good times.

One thing led to the next, and soon I was looking through an old Recommended Reading List I had started making. That actually seems like a good idea, and I’ll be working on one as I have time (I’ll post about it once I’ve hashed out the basics).

Along the way of all this searching I found a dead link to an old UNIX tutorial I had written. I have a tarball of it somewhere, but honestly, it was much easier to just grab it from the Wayback Machine. So, here it is, for later reference: Stupid Unix Tricks.

It looks like it’s aged pretty well (better than these old webdev workshops I also taught), although it does remind me that I need to publish my latest bash setup sometime (and maybe start writing down the new things I pick up before I forget them). The only big change (vs addition) is that when I’m not using cmd-line file transfers on OS X these days, I use Cyberduck (an open source SFTP, FTP, WebDAV, S3 client).

And lastly, where would a nostalgia post be without some old music? I’ve recently finally started digging through old boxes and am encoding all the CDs I lay my hands on. Here’s a track, part of my “catching up on rock” period in the early 2000s:

Clay Shirky on Gin, Television, and Social Surplus

Clay Shirky gave the best keynote talk that I caught at Web 2.0 Expo last week. He’s posted a transcript, entitled Gin, Television, and Social Surplus on his new book’s site (also quite recommended; it makes it onto my “understand the internet” bookshelf).

So how big is that surplus? So if you take Wikipedia as a kind of unit, all of Wikipedia, the whole project–every page, every edit, every talk page, every line of code, in every language that Wikipedia exists in–that represents something like the cumulation of 100 million hours of human thought. I worked this out with Martin Wattenberg at IBM; it’s a back-of-the-envelope calculation, but it’s the right order of magnitude, about 100 million hours of thought.

And television watching? Two hundred billion hours, in the U.S. alone, every year. Put another way, now that we have a unit, that’s 2,000 Wikipedia projects a year spent watching television. Or put still another way, in the U.S., we spend 100 million hours every weekend, just watching the ads. This is a pretty big surplus. People asking, “Where do they find the time?” when they’re looking at things like Wikipedia don’t understand how tiny that entire project is, as a carve-out of this asset that’s finally being dragged into what Tim calls an architecture of participation.

If you didn’t catch it, this is well worth reading.

Donations Done!

After Super Tuesday, I started a little fundraising campaign where I promised to match contributions dollar for dollar up to $2K (since $2K was about what I had left on my campaign contribution limit). I’m happy to announce that the donations passed the goal tonight (15m after a last call tweet).

The final stats: $2070 donated by 19 people (all friends or friends of friends; an average of $109). With $2070.19 matched by me.

Hooray! Gobama!


Also, since it’s hilarious (although actually, Obama won Delaware as well):

Internet Asshattery, Armchair Scaling Experts Edition

I know it’s never good to pay attention to the nattering classes, but there was a pretty high profile fusillade that Mike Arrington launched on Blaine Cook which seemed to bring out the armchair experts in full force in the comments. Now, while I think that Arrington’s post is way out of line (I’ll explain that in a bit), I’m almost not as bothered by it (as long as he’s not too bothered by being called out on it)… What really bugs me is the number of clueless “developers” throwing in their two cents. That includes Arrington’s two Rails developers with “finger on the pulse of the rails community” (ha!). My discontent was further exacerbated by this (unrelated) completely clueless piece on The Register. Is this the best that tech journalism has to offer?

First, a disclaimer: I don’t know Blaine very well, and I don’t have any privileged info on Twitter or Obvious Corp.

There’s no question that Twitter has suffered and continues to suffer from capacity, load, and other stability issues, and pointing that out is fair game. However, pointing at Blaine’s scaling talk as a personal dig is a disservice to everyone, especially since:

  1. The advice in the slides is generally good (and the “It’s Easy” is obviously snark – just look at the failcat in the next slide; it’d be easy to confirm that by asking anyone who was in the talk (like me or several hundred other people) instead of projecting prideful boasting to justify his attack – I’ll avoid ascribing motivations to why Arrington chose to do this).
  2. More crucially, the slides themselves point to the issues that a proper tech journalist would be able to spot and follow up on to try to find out what was really going on (assuming he cared about that).

For example, 600 QPS on 8 machines is pretty decent – but this raises the question of utilization and capacity planning. You can see from the 1×1 MySQL structure and the note on DRb that there were many single points of failure – again, this raises questions of BCP and redundancy. With the constant bumping of limits, you could guess that they were running really hot (and from a single data center, even after the move (probably w/o backup routers, etc.)) – all these issues are as much business/financial decisions as architectural/technological ones (if not more so, since these are technical no-brainers).

Now, I don’t know what happened between ops, management, and engineering, but guess what? Arrington doesn’t either, and he never bothered to follow up and kicked Blaine in the head instead, even when such clues obviously raise significant doubts about whether it’s appropriate. I agree with Arrington’s point about accountability, which is why I say now that Arrington wasn’t posting journalistically (the minimal followup with someone w/ half a clue would have pointed out exactly what I did), and Blaine deserves an apology. If you’re gonna shit on someone and start pointing fingers, you better have the goods to back it up. Whiny, uninformed personal attacks belong on Arrington’s LiveJournal or (wait for it…) Twitter stream.

Now, onto the clueless comments from wannabe developers. Well first, of the entire thread, I only saw one half-decent attempt at a technical critique, and even that falls down when you look at it. I don’t want to belabor the point, but the poster, Jordan, actually raises technical points worth addressing (and refuting):

  1. On indexing: while it’s true you don’t want to index willy-nilly and it’s incomplete to say “index everything”, if your ORM isn’t automatically indexing frequently used keys, you can be sure as heck that you’ll want to make a point of indexing them, especially if you’re doing joins. Yeah, you don’t index what you don’t need, but even if you have frequent writes, you need to eat it if you’re ever going to query. People suffer far more often from a lack of indexes; unless you’re adding indexes without examining the results, you’re not gonna have a problem “over indexing.” I don’t know the exact fan-out/pub-sub architecture, but you can be sure you’ll be doing a lot more reads even if you cache the hell out of it. If you’re thrashing, you’re looking at having mis-configured index caches more than anything else.
  2. DRb: This is a case where it looks like he just misread. It’s easy if you don’t have the context of the talk. DRb was good enough… until it wasn’t – which is why Starling was written to replace it. Now, we still don’t know if it’s a single point of failure, but it obviates that whole rant (as to why DRb was chosen in the first place, more on that later)
  3. Caching: again, the same thing as with indexes. Of course over-caching is bad, but that’s never going to be your problem because you start with no caching, and you add caching until you start losing performance. Also, the “no substitute for fixing the underlying problem” is naive – most of the time, the underlying problem is that you’re doing complex queries or processing that there’s no need to redo, since the data doesn’t change and should be cached. durrf.
  4. Profiling: ok, this I’d sorta agree with. Mentioning ruby-prof would probably be good, but honestly, 90%+ of optimizations can be done on simple timers, explains, and logs alone. (And also, performance tuning doesn’t have all that much to do with scaling anyway.)
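To make the indexing point concrete, here’s a small sketch of the “add an index and examine” loop. It uses Python’s built-in sqlite3 as a stand-in (the discussion above is about MySQL, where you’d use EXPLAIN instead), and the follows table, its columns, and the index name are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database as a stand-in; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE follows (follower_id INTEGER, followee_id INTEGER)")
conn.executemany(
    "INSERT INTO follows VALUES (?, ?)",
    [(i, i % 100) for i in range(10_000)],
)


def query_plan(sql: str) -> str:
    """Return SQLite's plan for a query (the last column of each plan row)."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[3] for row in rows)


LOOKUP = "SELECT follower_id FROM follows WHERE followee_id = 42"

plan_before = query_plan(LOOKUP)  # no index yet: expect a full table scan
conn.execute("CREATE INDEX idx_follows_followee ON follows (followee_id)")
plan_after = query_plan(LOOKUP)   # planner should now search via the index

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording varies between SQLite versions, but the before/after difference (a scan versus a search using the index) is the thing to look for. Same drill for any frequently used key: add it, look at the plan, keep it or drop it.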

As to the rest of the wannabes, it really is true that if you haven’t done it, that is: been intimately involved growing a social web app from prototype to Internet-scale on a UNIX stack, then you really don’t know shit. (I know more than my fair share of people that have, and I didn’t see any of them posting armchair bs on the comments). I’m not trying to say this just to be dismissive, but only to say, you really, really don’t understand the technical challenges involved. Generating target sets on social objects is extremely expensive and ill-suited to traditional 4NF data models in RDBMSs. So is social activity fan-out and any number of activities core to Twitter’s message routing/storage and to social web apps in general. These are not traditional problems and standard HA solutions just aren’t available.
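If “fan-out” is an unfamiliar term, here’s a toy sketch of the two usual strategies, with plain dictionaries standing in for storage. To be clear, this is not Twitter’s architecture (I’ve already said I don’t know it); every name here is hypothetical and it only illustrates where the cost lands.

```python
from collections import defaultdict

followers = defaultdict(set)   # author -> users following that author
timelines = defaultdict(list)  # user -> precomputed timeline (fan-out on write)
posts = defaultdict(list)      # author -> that author's own messages


def follow(follower: str, author: str) -> None:
    followers[author].add(follower)


def post_fanout_on_write(author: str, message: str) -> int:
    """Copy the message into every follower's timeline; return the write count."""
    posts[author].append(message)
    for user in followers[author]:
        timelines[user].append((author, message))
    return len(followers[author])  # cost grows with the author's follower count


def read_fanout_on_read(user: str) -> list:
    """Build a timeline at read time by visiting every author the user follows."""
    following = [a for a, fans in followers.items() if user in fans]
    return [(a, m) for a in following for m in posts[a]]


follow("alice", "bob")
follow("carol", "bob")
writes = post_fanout_on_write("bob", "hello")  # one post, one write per follower
```

Either way somebody pays: per-post writes that scale with follower counts, or per-read work that scales with how many people you follow. Neither maps nicely onto a single normalized table and a join once the numbers get big.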

Even if you’re architecturally sound, you’re dealing with development with extremely tight timelines/pressures, so you have to make decisions to pick things that will work but will probably need to eventually be replaced (e.g. DRb for Twitter) – usually you won’t know when and what component will be the limiting factor since you don’t know what the use cases will be to begin with. Development from prototype on is a series of compromises against the limited resources of man-hours and equipment. In a perfect world, you’d have perfect capacity planning and infinite resources, but if you’ve ever experienced real-world hockey-stick growth on a startup shoestring, you know that’s not the case. If you have, you understand that scaling is the brick that hits you when you’ve gone far beyond your capacity limits and when your machines hit double or triple digit loads. Architecture doesn’t help you one bit there.

And the people that have experienced this and lived to tell the tale also know that it’s impossible to critique the technical/operational decisions made, let alone comment with any sort of authority, w/o seeing and understanding the QPS targets, load graphs, profiling data/sar info and all manner of other architectural/technical data and details (that none of us are privy to).

Anyway, if you were given the choice between working with/hiring someone like Blaine who has had the firsthand full life-cycle scaling experience and any random developer (and definitely anyone from the TechCrunch comments), I think it’s fairly obvious what the right decision would be. I guess I’ll leave it at that.

This leads to Part Deux of my rant… this lead-paint baby of an article entitled Backlash starts against ‘sexy’ databases which has the following quote, I shit you not:

“The bottom line is don’t tell me RDMBS [sic] can’t scale if you can’t write a decent query or design a normalized database schema.”

This is by one John Holland. Now, no doubt the WordPress code can be pretty shitty (although sometimes there are good reasons for the multiple queries to support various hooks/plugins), but you will never hit the type of performance problems in a WP (non-mu) installation that have people looking for MySQL alternatives because WP just doesn’t have the types of queries that destroy RDBMSs.

I can understand that it’s not the article author’s (Phil Manchester) fault for conflating the “con” argument that WP is badly coded with the (correct!) “pro” argument that you can’t write the kinds of queries you need for social apps. If he’s like the reporters I know, he probably doesn’t actually understand it at all and is just doing his beat writeup. But dammit, can’t he get some decent frickin’ technical advisors to explain this if he’s doing tech journalism? The entire article is based on characterizing a misinformed blog post as a “brewing controversy.”

I mean, I don’t want to be more mean than I have to about this, but John Holland just has no idea what he’s talking about. He picks up on Atwood’s post on WP inefficiency, and then uses that to (completely incorrectly, and not without a tinge of reverse elitism) generalize on why the “cool kids” are hyping non-relational data stores. He goes on to boldly state “Relational databases are not the bottleneck” due to his complete lack of understanding of the actual problem set (hint: I don’t know anyone who’s suggesting WP should be switched off MySQL). This then leads to a horribly ignorant article being published by a writer who is, in the best case, lazy and doesn’t understand what he’s reporting (just show two equal sides and do a writeup) or, in the worst case, is simply looking for a manufactured conflict that will only serve to stir controversy and confuse the non-savvy reader.

(The reasons for alternative data-stores actually exist along a couple of axes – one is for more development flexibility or the ability to change functionality w/o expensive downtime (schemaless), one is for issues of scale and availability (distributed), and then a whole bunch for supporting social queries that just are horribly suited to RDBMSs (multi-attribute, inverse index, mq/pubsub, etc.). Many of the alternatives are combinations of various axes.)

A Free Lunch: Just Do The Numbers

Yesterday Gordon commented on Silicon Alley Insider’s back of the envelope calculations for Google’s food costs. IMO they’re not doing the right numbers. I spent a fair amount of time fighting (somewhat unsuccessfully) this sort of backwards thinking, so I just did my own back of the napkin numbers. Here’s a copy of what I posted on Gordon’s site:

I think they’re not doing the right numbers. GOOG headcount is at 18,000. If half the workforce ends up working for an extra 1 hr/day (it’s probably closer to 2) b/c of breakfast and dinner, and assuming that an FTE hour is worth $50 to Google (probably low since the fully loaded cost for the avg knowledge worker is probably about $100/hr, and the Goog’s profit margin is >25% and that’s including all operating costs), that means:

9,000 employees * 251 days * 1 hour * $50 = $113M

If it does cost $75M (again, we’ll use the worst case number, even if it’s more likely 70% of that), we’re talking about a 50% return on investment. Since we’re using worst case numbers, the direct returns are probably much higher.

And that’s before calculating *any* second order benefits like: better retention, increased loyalty/productivity, increased morale/quality of life, increased reputation/easier hiring and recruiting – and of course free write ups. By almost every set of these important (but harder to quantify) metrics, there is a non-trivial qualitative improvement. Even if it were a complete wash, any knowledge company that could* would be (and are IMO) stupid for not following suit. Any proper calculation would properly include/calculate offsets for these externalities (certainly recruiting/retention cost projections would be easy to model).

Finally, speaking from personal experience, I doubt SAI realizes how disruptive and hard it is to replace (recruit/hire/train/etc) high performing engineers (especially w/ the huge ramp up w/ the custom systems and processes at a cutting edge tech firm). Does anyone realize how non-linearly time scales and what it means in terms of development productivity? (Talking to lots of people, and certainly for myself, I find it takes a couple hours to ramp up into a flow state, after which I get more productive (until a certain point – but again, depending on blood sugar).) Anyway, I’ll stop railing against stupidity. I’m sure Google has run the numbers themselves. They do love doing that, so I hear.

Based on my napkin calculations, and the failure of other organizations to grasp the work environment tactics that Google employs, my conclusion is that everyone (employees, the corporation, and the shareholders) at Google is very much enjoying a “free lunch.”

(* Where “could” means a company that could successfully profit from the additional man-hours and the reduced turnover costs. While the former may be a smaller slice, I would suspect that once you add retention, almost every company would be over the line.)
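For anyone who wants to plug in their own guesses, the napkin math above fits in a few lines. The inputs are the same ones from my comment (18,000 headcount, half of them working an extra hour, 251 workdays, $50/hr, $75M food cost); the function and parameter names are just made up for this sketch.

```python
def extra_hours_value(headcount, participation, workdays, extra_hours_per_day, hourly_value):
    """Dollar value of the extra hours worked per year under the stated assumptions."""
    return headcount * participation * workdays * extra_hours_per_day * hourly_value


def simple_roi(value, cost):
    """Return on investment as a fraction: (value - cost) / cost."""
    return (value - cost) / cost


# Inputs from the comment above; all of them are rough estimates, not reported figures.
value = extra_hours_value(
    headcount=18_000,
    participation=0.5,      # half the workforce stays for the extra hour
    workdays=251,
    extra_hours_per_day=1,
    hourly_value=50,        # dollars per FTE hour, deliberately low
)
roi = simple_roi(value, cost=75_000_000)

print(f"value of extra hours: ${value / 1e6:.0f}M")
print(f"return on food spend: {roi:.0%}")
```

With those inputs the value comes out to $112.95M and the return to just over 50%. Bump the extra hours to 2 or the hourly value to $100 and the numbers only get more lopsided.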