Web 2.0 Expo Presentation Rundown

There were actually a surprising number (to me, at least – most of the people I talked to had low expectations) of very good presentations at Web 2.0 Expo last week. Most of them are now posted on either SlideShare (more presentations here) or, for the keynotes, on Blip.tv. This is, I think, a very exciting and positive development for industry conferences (which I think will only have net-positive effects on attendance; conference proceedings are de rigueur at academic conferences). Here are the ones I thought were most interesting.

Keynotes (overall, I liked the 10min What X Knows format that asks companies to boil down numbers and insights):

Lots of awesome sessions; the quality of the presentations (primarily in terms of prep/interestingness) was higher than usual:

  • A Flickr Approach to Making Sense of the World – my favorite session of the conference. If you’re doing “geo stuff,” you owe it to yourself to take a look at this. The divisive hierarchical agglomerative clustering bit is great (using Morton curves for better pathing, clever). Now there’s not a lot on reverse-geocoding, which I believe I am now doing unique and interesting work on – once I prove it works, I’ll have to publish/present about that. 🙂
  • Capacity Planning for Web Operations – sure you can’t clone Allspaw, but reading what he has to say is probably the next best thing.
  • Website Psychology – linking to an earlier version of Gavin’s talk (with notes, yay) – he does a really great job mapping cognitive psychology concepts onto site usage and development. Well worth reading and thinking about
  • Grasping Social Patterns – by far my favorite Ignite talk this year, all kinds of hooks for thinking about how far social apps and the “social graph” need to go
  • Making Email a Useful Web App – Bots are awesome and underrated. I’ve been working a lot more w/ them recently and this was a good overview (would love an even more comprehensive history of cool bots…)
  • Even Faster Website (PPT) – Steve Souders (now at GOOG, doing the same sorta thing he was doing at YHOO) talks about the current stuff he’s working on, which is optimizing JS (the logical progression). Great new stuff, just as useful as the older stuff
  • Adding “Where” to Mobile and Web Applications – a bit basic, but a good overview of how location stands today. Come to Where 2.0 and Wherecamp to learn more…
  • Polite, Pertinent, and… Pretty: Designing for the New-wave of Personal Informatics – slides not online. Boo-urns
  • Casual Privacy – slides not online. Boo-urns
  • Next Generation Mobile UIs – slides not online. Boo-urns

Talks I didn’t make but that have interesting decks:

Some stuff that sounded interesting but doesn’t have slides online: Marc Davis’ Mobile talk, Opportunity Computing in the Cloud, Social Networks and Avatars (caught a few min of this; looks like they haven’t done a lot of work on the numbers, or even organized across cohorts, but I’d still like to see the deck), and Global Design Trends (there are slides, but only enough to make you wish you had a recording of the talk).

Some bonus talks if you’ve made it through all those:

Did I forget something (quite probable), miss one of your favs? Post links in the comments.

Clay Shirky on Gin, Television, and Social Surplus

Clay Shirky gave the best keynote talk that I caught at Web 2.0 Expo last week. He’s posted a transcript, entitled Gin, Television, and Social Surplus on his new book’s site (also quite recommended; it makes it onto my “understand the internet” bookshelf).

So how big is that surplus? So if you take Wikipedia as a kind of unit, all of Wikipedia, the whole project–every page, every edit, every talk page, every line of code, in every language that Wikipedia exists in–that represents something like the cumulation of 100 million hours of human thought. I worked this out with Martin Wattenberg at IBM; it’s a back-of-the-envelope calculation, but it’s the right order of magnitude, about 100 million hours of thought.

And television watching? Two hundred billion hours, in the U.S. alone, every year. Put another way, now that we have a unit, that’s 2,000 Wikipedia projects a year spent watching television. Or put still another way, in the U.S., we spend 100 million hours every weekend, just watching the ads. This is a pretty big surplus. People asking, “Where do they find the time?” when they’re looking at things like Wikipedia don’t understand how tiny that entire project is, as a carve-out of this asset that’s finally being dragged into what Tim calls an architecture of participation.

If you didn’t catch it, this is well worth reading.

Internet Asshattery, Armchair Scaling Experts Edition

I know it’s never good to pay attention to the nattering classes, but there was a pretty high-profile fusillade that Mike Arrington launched on Blaine Cook which seemed to bring out the armchair experts in full force in the comments. Now, while I think that Arrington’s post is way out of line (I’ll explain that in a bit), I’m almost not as bothered by it (as long as he’s not too bothered by being called out on it)… What really bugs me is the number of clueless “developers” throwing in their two cents. That includes Arrington’s two Rails developers with “finger on the pulse of the rails community” (ha!). My discontent was further exacerbated by this (unrelated) completely clueless piece on The Register. Is this the best that tech journalism has to offer?

First, a disclaimer: I don’t know Blaine very well, and I don’t have any privileged info on Twitter or Obvious Corp.

There’s no question that Twitter has suffered and continues to suffer from capacity, load, and other stability issues, and pointing that out is fair game; however, pointing at Blaine’s scaling talk as a personal dig is a disservice to everyone, especially since:

  1. The advice in the slides is generally good (and the “It’s Easy” is obviously snark – just look at the failcat in the next slide; it’d be easy to confirm that by asking anyone who was in the talk (like me or several hundred other people) instead of projecting prideful boasting to justify his attack – I’ll avoid ascribing motivations to why Arrington chose to do this).
  2. More crucially, the slides themselves point to the issues that a proper tech journalist would be able to spot and follow up on to try to find out what was really going on (assuming he cared about that).

For example, 600 QPS on 8 machines is pretty decent – but this raises the question of utilization and capacity planning. You can see from the 1×1 MySQL structure and the note on DRb that there were many single points of failure – again, this raises questions of BCP and redundancy. With the constant bumping of limits, you could guess that they were running really hot (and from a single data center, even after the move (probably w/o backup routers, etc.)). All these issues are as much business/financial decisions as architectural/technological ones (if not more so, since these are technical no-brainers).
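To make the utilization question concrete, here’s a back-of-the-envelope sketch. The only real numbers are the 600 QPS and 8 machines from the slides; the assumption that load spreads evenly across machines is mine, and real traffic is never that tidy.

```python
def per_machine_qps(total_qps, machines):
    """Average QPS per machine, ASSUMING load is spread perfectly evenly."""
    return total_qps / machines


total_qps = 600  # figure quoted from the slides
machines = 8     # figure quoted from the slides

normal = per_machine_qps(total_qps, machines)
# What each survivor has to absorb if a single machine drops out.
one_down = per_machine_qps(total_qps, machines - 1)
extra_load = one_down / normal - 1  # fractional increase per surviving machine

print("normal: %.1f QPS/machine" % normal)
print("one machine down: %.1f QPS/machine (+%.1f%%)" % (one_down, extra_load * 100))
```

Whether a bump like that is survivable depends entirely on how much headroom each box had to begin with, which is exactly the data nobody outside the company has.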

Now, I don’t know what happened between ops, management, and engineering, but guess what? Arrington doesn’t either, and he never bothered to follow up and kicks Blaine in the head instead, even when such clues obviously raise significant doubts about whether it’s appropriate. I agree with Arrington’s point about accountability, which is why I say now that Arrington wasn’t posting journalistically (the minimal followup with someone w/ half a clue would have pointed out exactly what I did), and Blaine deserves an apology. If you’re gonna shit on someone and start pointing fingers, you better have the goods to back it up. Whiny, uninformed personal attacks belong on Arrington’s Live Journal or (wait for it…) Twitter stream.

Now, onto the retarded comments from wannabe developers. Well first, of the entire thread, I only saw one half-decent attempt at a technical critique, and even that falls down when you look at it. I don’t want to belabor the point, but the poster, Jordan, actually raises technical points worth addressing (and refuting):

  1. On indexing: while it’s true you don’t want to index willy-nilly and it’s incomplete to say “index everything”, if your ORM isn’t automatically indexing frequently used keys, you can be sure as heck that you’ll want to make a point of indexing them, especially if you’re doing joins. Yeah, you don’t index what you don’t need, but even if you have frequent writes, you need to eat it if you’re ever going to query. People far more often suffer from a lack of indexes, so unless you’re adding indexes without examining their effect, you’re not gonna have a problem “over indexing.” I don’t know the exact fan-out/pub-sub architecture, but you can be sure you’ll be doing a lot more reads than writes even if you cache the hell out of it. If you’re thrashing, you’re looking at having mis-configured index caches more than anything else.
  2. DRb: This is a case where it looks like he just misread. It’s easy if you don’t have the context of the talk. DRb was good enough… until it wasn’t – which is why Starling was written to replace it. Now, we still don’t know if it’s a single point of failure, but it obviates that whole rant (as to why DRb was chosen in the first place, more on that later)
  3. Caching: again, the same thing as with indexes. Of course over-caching is bad, but that’s never going to be your problem because you start with no caching, and you add caching until you start losing performance. Also, the “no substitute for fixing the underlying problem” is naive – most of the time, the underlying problem is that you’re doing complex queries or processing you don’t need to do, since the data doesn’t change and should be cached. durrf.
  4. Profiling: ok, this I’d sorta agree with. Mentioning ruby-prof would probably be good, but honestly, 90%+ of optimizations can be done on simple timers, explains, and logs alone. (And also, performance tuning doesn’t have all that much to do with scaling anyway.)
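As a rough illustration of the “simple timers” and “start with no caching, then add it” points above, here’s a minimal sketch. Everything in it is hypothetical: `slow_profile_lookup` is a made-up stand-in for an expensive query whose result rarely changes, and the 50 ms sleep is an arbitrary number, not a measurement of anything.

```python
import functools
import time


def timed(fn):
    """Wrap fn so each call appends its wall-clock duration to wrapper.timings."""
    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return fn(*args, **kwargs)
        finally:
            wrapper.timings.append(time.perf_counter() - start)
    wrapper.timings = []
    return wrapper


def slow_profile_lookup(user_id):
    """Hypothetical stand-in for an expensive query on data that rarely changes."""
    time.sleep(0.05)  # pretend this is a complex join
    return {"id": user_id, "name": "user%d" % user_id}


# Cache layer: repeat lookups for the same id skip the expensive call.
cached_profile_lookup = functools.lru_cache(maxsize=1024)(slow_profile_lookup)

uncached = timed(slow_profile_lookup)
cached = timed(cached_profile_lookup)

for _ in range(3):
    uncached(42)
    cached(42)

print("uncached:", ["%.4fs" % t for t in uncached.timings])
print("cached:  ", ["%.4fs" % t for t in cached.timings])
```

The first cached call still pays full price; the later ones shouldn’t. A timer and a log line like this is usually enough to tell you where to look before reaching for a real profiler.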

As to the rest of the wannabes, it really is true that if you haven’t done it (that is, been intimately involved in growing a social web app from prototype to Internet-scale on a UNIX stack), then you really don’t know shit. (I know more than my fair share of people that have, and I didn’t see any of them posting armchair bs in the comments.) I’m not saying this just to be dismissive, but only to say that you really, really don’t understand the technical challenges involved. Generating target sets on social objects is extremely expensive and ill-suited to traditional 4NF data models in RDBMSs. So is social activity fan-out and any number of activities core to Twitter’s message routing/storage and to social web apps in general. These are not traditional problems, and standard HA solutions just aren’t available.
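To give a feel for why fan-out gets expensive, here’s a toy fan-out-on-write sketch. To be clear, this is not Twitter’s architecture (which I don’t know); the in-memory dicts, user names, and follower counts are all made up for illustration.

```python
from collections import defaultdict

followers = defaultdict(set)  # author -> set of users following them
inboxes = defaultdict(list)   # user -> messages delivered to that user


def follow(follower, author):
    followers[author].add(follower)


def post(author, text):
    """Fan out on write: copy the message into every follower's inbox.

    Returns the number of inbox writes this single post caused.
    """
    writes = 0
    for user in followers[author]:
        inboxes[user].append((author, text))
        writes += 1
    return writes


# Hypothetical graph: one popular author, one quiet one.
for i in range(1000):
    follow("fan%d" % i, "popular")
follow("fan0", "quiet")

writes_popular = post("popular", "hello")
writes_quiet = post("quiet", "hi")

print("one post by 'popular' caused %d writes" % writes_popular)
print("one post by 'quiet' caused %d write(s)" % writes_quiet)
```

The cost of a single write depends on the shape of the social graph rather than on the size of the message, which is part of why a normalized schema and a standard replication setup don’t get you very far.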

Even if you’re architecturally sound, you’re dealing with development with extremely tight timelines/pressures, so you have to make decisions to pick things that will work but will probably need to eventually be replaced (e.g. DRb for Twitter) – usually you won’t know when and what component will be the limiting factor since you don’t know what the use cases will be to begin with. Development from prototype on is a series of compromises against the limited resources of man-hours and equipment. In a perfect world, you’d have perfect capacity planning and infinite resources, but if you’ve ever experienced real-world hockey-stick growth on a startup shoestring, you know that’s not the case. If you have, you understand that scaling is the brick that hits you when you’ve gone far beyond your capacity limits and when your machines hit double or triple digit loads. Architecture doesn’t help you one bit there.

And the people that have experienced this and lived to tell the tale also know that it’s impossible to critique the technical/operational decisions made with any sort of authority w/o seeing and understanding the QPS targets, load graphs, profiling data/sar info, and all manner of other architectural/technical data and details (that none of us are privy to).

Anyway, if you were given the choice between working with/hiring someone like Blaine, who has had the firsthand full life-cycle scaling experience, and any random developer (and definitely anyone from the TechCrunch comments), I think it’s fairly obvious what the right decision would be. I guess I’ll leave it at that.

This leads to Part Deux of my rant… this lead-paint baby of an article entitled Backlash starts against ‘sexy’ databases which has the following quote, I shit you not:

“The bottom line is don’t tell me RDMBS [sic] can’t scale if you can’t write a decent query or design a normalized database schema.”

This is by one John Holland. Now, no doubt the WordPress code can be pretty shitty (although sometimes there are good reasons for the multiple queries to support various hooks/plugins), but you will never hit the type of performance problems in a WP (non-mu) installation that have people looking for MySQL alternatives because WP just doesn’t have the types of queries that destroy RDBMSs.

I can understand that it’s not the article author’s (Phil Manchester) fault for conflating the “cons” (arguments that WP is badly coded) with the “pros” (correct!) that you can’t write the kinds of queries you need for social apps: if he’s like the reporters I know, he probably doesn’t actually understand it at all and is just doing his beat writeup. But dammit, can’t the author get some decent frickin’ technical advisors to explain this if he’s doing tech journalism? The entire article is based on characterizing a misinformed blog post as a “brewing controversy.”

I mean, I don’t want to be more mean than I have to about this, but John Holland just has no idea what he’s talking about. He picks up on Atwood’s post on WP inefficiency, and then uses that to (completely incorrectly, and not without a tinge of reverse elitism) generalize on why the “cool kids” are hyping non-relational data stores. He goes on to boldly state “Relational databases are not the bottleneck” due to his complete lack of understanding of the actual problem set (hint: I don’t know anyone who’s suggesting WP should be switched off MySQL). This then leads to a horribly ignorant article being published by a writer who, in the best case, is lazy and doesn’t understand what he’s reporting (just show two equal sides and do a writeup) or, in the worst case, is simply looking for a manufactured conflict that will only serve to stir controversy and confuse the non-savvy reader.

(The reasons for alternative data-stores actually exist along a couple of axes – one is for more development flexibility or the ability to change functionality w/o expensive downtime (schemaless), one is for issues of scale and availability (distributed), and then a whole bunch for supporting social queries that are just horribly suited to RDBMSs (multi-attribute, inverse index, mq/pubsub, etc.). Many of the alternatives are combinations of various axes.)

Sorting Lists of Lists in Python

No language is perfect, and Python is no exception. In Python 2.4, for example, sorting a list of lists by a value in the nested list (more common than you’d think), say to grab the item with the largest value, requires a bit of work and a library:

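import operator
# Sort descending on the second element of each nested list, then take the first (largest) item.
# (Named "largest" so the built-in max() isn't shadowed.)
largest = sorted(list_of_lists, reverse=True, key=operator.itemgetter(1))[0]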

Python 2.5 makes things easier:

# max() accepts a key function as of Python 2.5
largest = max(list_of_lists, key=lambda x: x[1])

The official Python Wiki has a nice HOWTO on Sorting in Python (also a much less comprehensive one on sorting dictionaries).

(note: OS X Leopard comes w/ Python 2.5, while Debian Etch remains on 2.4 by default)
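For anyone who wants to try both approaches side by side, here’s a small self-contained sketch. The sample data (name/score pairs) is made up, and it’s written to run on a current Python rather than 2.4.

```python
import operator

# Hypothetical sample data: [name, score] pairs.
list_of_lists = [["a", 3], ["b", 10], ["c", 7]]

# Full sort on the second element, largest first.
by_score_desc = sorted(list_of_lists, key=operator.itemgetter(1), reverse=True)

# Two ways to get just the largest item.
largest_via_sort = by_score_desc[0]
largest_via_max = max(list_of_lists, key=lambda x: x[1])

# In-place variant: reorders the original list, ascending by score.
list_of_lists.sort(key=operator.itemgetter(1))

print(by_score_desc)
print(largest_via_sort, largest_via_max)
print(list_of_lists)
```

If all you need is the single largest item, `max()` with a key avoids sorting the whole list.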

Getting Started w/ Python

As you might have heard, Google App Engine launched tonight, with Python as its initial (and only) programming language to interface with its services. I started switching over to Python (from Perl) a few years ago for general processing and daemon tasks (mostly for its sweet RPC bindings and its comprehensive, if still somewhat convoluted, Unicode handling). Over time, as the libraries matured, I started moving more and more over – some things were long overdue, like a CPAN equivalent (PyPI and EasyInstall have finally stepped up to the plate), but in some areas, like with cross-platform GUI toolkits, things like py2app/py2exe, or with libraries like Twisted, SciPy, and Beautiful Soup, Python has long since blown past the competition.

Earlier this year, as I was wrapping up at Yahoo!, I knew I wanted a clean start, and after reviewing what was out there decided on switching to Python as my primary language and making a go of writing my new web apps in Django (deployment, performance, and decoupling being among the primary factors; less wankery in the development community was also a big part of it). I’ve been somewhat sidetracked by a slew of other projects, but so far it’s been a good experience (and I hope to have some stuff to publish soon).

Anyway, all this is a very, very long setup for a list of resources that may help those who are looking to get started working w/ Python. I’m still not as proficient as I’d like, so here are the references that I typically reach for:

  • PLEAC Python – PLEAC (Programming Language Examples Alike Cookbook) is a project that aims to port the Perl Cookbook to other languages. The Python port has been at 85% for years, but is invaluable when looking at basic constructs.
  • (the eff-bot guide to) The Standard Python Library – although a bit out of date and not comprehensive, it offers short and useful examples for most of the modules in Python. This is great because the official library docs, while technically complete, are often completely opaque. If I were to give any advice to people writing API docs, it would be to 1) have some simple real-world usage examples and to 2) allow user annotations (PHP was (and remains!) way ahead of the curve on this one. It’s amazing how primitive the core language/library docs are.)
  • Dive Into Python – I waffle back and forth on how much I like Mark Pilgrim’s book – it’s oftentimes just short of useful and not organized so well (I’m still looking for a good language reference), but it also has really useful tidbits, like when I forget how to append the system import path
  • Python-by-example – this is a new one, and I haven’t used it much (inline-search would do wonders for this) but I wholly approve of the intent: “This guide aims to show examples of use of all Python Library Reference functions, methods and classes.”
  • Otherwise, I’ve found that doing a web search almost always turns up something on ASPN or on a mailing list somewhere.
  • Lastly, there are some interactive shells that are useful, specifically IPython. Reinteract is less of a tool that I use every day and more of something that’s damn cool. The same w/ Nodebox.
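Since the Dive Into Python bullet above mentions appending to the system import path as the thing I keep forgetting, here’s the gist as a minimal sketch. The `~/lib/python` directory is purely a hypothetical example location.

```python
import os
import sys

# Hypothetical directory holding modules that aren't installed system-wide.
extra_dir = os.path.join(os.path.expanduser("~"), "lib", "python")

# sys.path is just a list of directories searched on import, in order.
if extra_dir not in sys.path:
    sys.path.append(extra_dir)

print(sys.path[-1])
```

Appending puts the directory at the end of the search order; use `sys.path.insert(0, extra_dir)` instead if it should win over installed packages.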

Of course, one of the biggest benefits of Python is how readable the source code is – it’s definitely a big help for seeing how things work. Have any of your own favorite Python resources? Please post ’em up in the comments.

Getting up to speed on Django probably deserves its own post…

Dashboard Widget for Posting Blog Entries to Confluence via XML-RPC

First of all, here’s a zip of the working widget (10.4.3+): Confluence Daily Log.zip. I also checked in the Dashcode project.

I had an old WordPress widget from a year or two back that I had written, so I thought this would be a simple port, but I forgot that I had written it w/ a set of Python proxies because 1) xmlrpclib is awesome and 2) the JavaScript XML-RPC libraries I had tried (at the time I believe the best was jsolait) were maddening.

I’ve moved onto jsxmlRPC, which is an improvement from the prior options, but still has some issues. I also gave JS-XMLRPC a try, but the lack of documentation, examples, and the verbose retardedness of it all quickly convinced me otherwise. And I looked at @tomic briefly, but just couldn’t justify 300+K of JS dependencies for it. If I continue to have problems w/ jsxmlRPC, I may switch. Mostly I went w/ jsxmlRPC because I approve of its interface and of its documentation (it’s not that hard, a reference implementation would cover most of it).

My preference would have been something that magically did its business, but I actually had to really dig into jsxmlRPC’s code to get things working. While the demo worked well enough, it was barfing when interacting with Confluence. Turns out this is because jsxmlRPC does not handle parsing of payloads according to the spec – param values can be returned without a nested type tag (defaults to a string data format). jsxmlRPC tried to find a nested value all the time, which as you might guess, caused all sorts of brokenness. This can be fixed by changing the following (swap the while with this if) at the top of the getResultFromValueNode() function (line 385ish):

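// A bare text node means the value has no nested type tag; per the XML-RPC spec it is a string.
if ("#text" == valueNode.nodeName) {
  return valueNode.textContent;
}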

Note: I saw that jsxmlRPC was still barfing on empty values. That’s probably a simple fix, but my brain’s pretty fried and I’m tired of looking at the code.
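For reference, here’s the same rule (no nested type tag means string) sketched in Python with the standard library’s ElementTree. This is not jsxmlRPC’s code, just an illustration of the behavior the spec asks for; it only handles strings and ints, and `parse_value` is a name I made up. It also covers the empty-value case.

```python
import xml.etree.ElementTree as ET


def parse_value(value_node):
    """Decode an XML-RPC <value> element (strings and ints only, for brevity)."""
    children = list(value_node)
    if not children:
        # No nested type tag: the spec says the content is a string.
        # An empty <value></value> has text None, so fall back to "".
        return value_node.text or ""
    type_node = children[0]
    if type_node.tag in ("int", "i4"):
        return int(type_node.text)
    if type_node.tag == "string":
        return type_node.text or ""
    raise ValueError("unsupported XML-RPC type: %s" % type_node.tag)


untyped = parse_value(ET.fromstring("<value>hello</value>"))
typed = parse_value(ET.fromstring("<value><string>hello</string></value>"))
empty = parse_value(ET.fromstring("<value></value>"))
number = parse_value(ET.fromstring("<value><i4>42</i4></value>"))

print(repr(untyped), repr(typed), repr(empty), repr(number))
```

The point is that the typed and untyped forms of a string have to decode to the same thing, and an empty value should come back as an empty string rather than blowing up.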

I developed this widget in Dashcode, which is many sorts of awesome and very much simplified the process; however, the debugger and stack frame were for some reason not quite as helpful as one would imagine them to be. (They were all sorts of unhelpful in tracking down the errors I was looking for, even when stepping through.) Once I ported the code to Firebug and added a few console.log() calls, the problems became much clearer.

In terms of functionality, everything seems to work for me. I whipped up a very dumb local autosave as well. Here’s what’s missing:

  • Confirm dialog when navigating away from edited entries – I track the editing, but when I found out that Dashboard doesn’t support confirm() I tabled that feature
  • Handling reauth: I have no idea how long tokens last (forever? it’s not specified in the Confluence RPC docs), but if they expire, I don’t have a good way of trying to reauth (this would just involve adding some extra timers and some exception handling so not that bad to implement) – Update: looks like these expire pretty quickly. I’ve uploaded a new version that just reauths before every API call (getBlogEntries, getBlogEntry, and storeBlogEntry)
  • Help – Some notes for the settings might be useful, i.e., the “Space Key” is the shortname for your Confluence Space and the “Endpoint” is http://yourconfluenceinstall/rpc/xmlrpc
  • Widget Icon
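The reauth-before-every-call approach from the list above looks roughly like this when sketched in Python with the standard library’s XML-RPC client (the widget itself is JavaScript on jsxmlRPC, so this is an illustration, not the widget’s code). Assumptions to flag: the `confluence1.` method prefix and the `login(username, password)` call are how I understand Confluence’s XML-RPC API to work, and the class and function names are made up.

```python
import xmlrpc.client


class ConfluenceBlog:
    """Log in again before every call so an expired token never matters."""

    def __init__(self, server, username, password):
        # server: an xmlrpc.client.ServerProxy (or anything shaped like one).
        self.api = server.confluence1  # ASSUMED namespace prefix
        self.username = username
        self.password = password

    def _token(self):
        # ASSUMED login signature: returns a fresh session token.
        return self.api.login(self.username, self.password)

    def get_blog_entries(self, space_key):
        return self.api.getBlogEntries(self._token(), space_key)

    def get_blog_entry(self, page_id):
        return self.api.getBlogEntry(self._token(), page_id)

    def store_blog_entry(self, entry):
        return self.api.storeBlogEntry(self._token(), entry)


def connect(endpoint, username, password):
    """endpoint is e.g. http://yourconfluenceinstall/rpc/xmlrpc"""
    return ConfluenceBlog(xmlrpc.client.ServerProxy(endpoint), username, password)
```

It costs an extra round trip per call, but for a widget that posts a handful of entries a day that’s a lot simpler than timers and retry-on-exception logic.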

OK, it’s late. I’m going to bed.

Adventures in MacPorts: FuseFS Edition

On Saturday, Richard Crowley published an awesome looking hack called PownceFS that creates a FUSE filesystem mounting your friends’ files from Pownce as a local filesystem. Cool! Now let’s try getting it working on the Mac…

# port install fuse-bindings-python
...
--->  Activating fusefs 1.1_3+darwin_9
Error: Target org.macports.activate returned: Image error: /Library/Filesystems/fusefs.fs/Contents/Info.plist already exists and does not belong to a registered port.  Unable to activate port fusefs.
Error: Status 1 encountered during processing.

Well, that sucks. MacPorts doesn’t play nice if you installed macfuse. Apparently there’s a proper way to uninstall, but I missed this, and just did a quick and dirty fix…

# rm -rf /Library/Filesystems/fusefs.fs/
# port install fusefs
--->  Activating fusefs 1.1_3+darwin_9
********************************************************
*  fusefs is already loaded. You may need to restart.  *
*  Alternatively, if feeling adventurous, you can run  *
*  `sudo kextunload -b com.google.filesystems.fusefs`  *
********************************************************
--->  Cleaning fusefs

Now libfuse installs properly. If you get errors, fusefs probably didn’t install properly (see Ticket #11471: fusefs misses common/fuse_param.h).

Now, if you install on 10.5, fuse-bindings-python will install python24 (Leopard ships w/ 2.5 as default), which is sort of retarded, but whatever. You may get some errors w/ fuse-bindings-python, where it gets confused about the install location – just run it again if you just installed python24 and it should work. Next, oauth:

# cd /opt/local/lib/python2.4/site-packages
# svn co http://oauth.googlecode.com/svn/code/python/oauth

Now, check out PownceFS – you’ll need to change the shebang to: #!/opt/local/bin/python2.4.

This is where I’d like to report great success, but well, it looks like fuse-bindings-python is broken:

>>> import fuse
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/opt/local/lib/python2.4/site-packages/fuse.py", line 26, in ?
    from fuseparts._fuse import main, FuseGetContext, FuseInvalidate
ImportError: Failure linking new module: /opt/local/lib/python2.4/site-packages/fuseparts/_fusemodule.so: Symbol not found: ___CFConstantStringClassReference
  Referenced from: /opt/local/lib/libfuse.0.dylib
  Expected in: flat namespace

Hey, looks like CoreFoundation isn’t being linked. Turns out someone submitted a patch just yesterday to fix this. You can fix this manually by editing the libfuse Portfile. In my case, it’s in /opt/local/var/macports/sources/rsync.macports.org/release/ports/fuse/libfuse/Portfile – you can add the following line below the patchfiles fuse-2.7.1-macosx.patch line:

configure.ldflags-append    -framework CoreFoundation

Now, reinstall (uninstall the python-bindings and the libfuse packages and reinstall) and import fuse will stop barfing at you. Sweet Jesus we’re almost there… The final step is installing json-py – just unzip this in your site-packages folder.

And… well, it sort of works. It’s mounted:

Python@fuse1     0Bi    0Bi    0Bi   100%    /Users/lhl/powncefs

And while I can’t do any file operations, I can tab complete and see some friends:

lhl@octo powncefs $ ls powncefs/
ls: powncefs/: Operation not permitted
lhl@octo powncefs $ ls 
MarcD              deprimer           mattb              rnair
TheBrad            edwardho           maximolly          ryancarson
adactio            elatable           me3dia             samfelder
agendacide         elbowdonkey        meandmybadself     spullara
akoblin            fauxstor           migurski           sugarlime
allaboutgeorge     fraying            mlaaker            symphonicknot
ask                iamcal             monstro            t
basictheory        jamescronin        mroth              thincvox
beach              jmacias            natekoechley       uvince
benvoluto          jmcnally           neb                waxpancake
botz               joshuakaufman      nickf              whatevernevermind
buzz               kentgoldman        paulh              xeni
carriewestlake     kevnull            peterme            yahooza
caterina           laughingsquid      photojunkie        
chaddickerson      leia               plasticbagUK       
dansays            lhl                rabble             

Well, it’s getting late. Good luck and hope this helps for anyone trying to get PownceFS working on OS X.

Thunderbird vs. Mail.app

I’ve been a Mail.app user for a long time. Every time that I’ve tried switching to Mozilla Mail or Thunderbird over the past few (4?) years, I’ve ended up back on Mail.app. Mail has never been the fastest or most featureful application, but its IMAP, while sometimes slow, has been rock steady, and certain things like the auto-saving/window reopening and the Address Book integration are really quite nice. I recently figured out how to view Exchange invites, and of course, MailActOn has made organizing my mail entirely in the realm of possibility.

What’s the point of all this? Basically to describe that Mail.app was working for me… until my corporate mail was switched around that is. My new setup requires me to tunnel an IMAPS connection. It turns out that Mail.app has a bug where it’ll try to connect to the server with the default port 993 regardless of what you specify the server port is (and it’ll fail silently without telling you that’s what the problem is – thanks Apple!). Since opening tunnels as root wasn’t high on my yes-I’d-like-to-do-this-every-day list, I decided to once again check out the latest build of Thunderbird.

And, with a mess of extensions and some tweaks, I’m settling in. Thunderbird is much faster than Mail.app (1.5.0.4+ is Universal) and has support for IMAP subscriptions and IDLE which is nice. (I also figured out the weird Inbox nesting issues I’ve had in the past: you need to set the IMAP server directory as “INBOX/” in the IMAP server advanced settings). Here are the major changes I’ve made so far to make things work better:

  • Advanced Remove Duplicates saved me hours helping to remove the 20K dups generated while my getmail was freaking out
  • In the account settings, turning on the “Offline” settings for folders to emulate Mail.app’s sweet offline IMAP message caching behavior
  • Headers Toggle gives me back full header toggling w/ the ‘H’ key
  • GMailUI – this extension is AWESOME, enabling a whole bunch of useful key bindings, better search, and best of all, one key archiving
  • Nostalgy – another priceless extension, this allows easy keyboard navigation and mail moving. Hooray!
  • keyconfig – now this is the mother lode – if I had found this last time, I probably wouldn’t have switched back. keyconfig lets you write arbitrary JavaScript and bind it to keys – right now, I’ve only gotten around to writing some quick binds to switch between text/html message views, but with enough rooting around through chrome/extension JARs and XPIs, I think I can solve most of my remaining niggles
  • Remember Mismatched Domains – this is useful if you’re tunneling since the cert won’t match

I’m also running a couple of plugins that aren’t publicly available for parsing dates out of Outlook VCALENDARs, but once there’s a Universal Binary Lightning build, that shouldn’t be a problem. Also, I’m fervently waiting for Address Book integration.

It’s been a long road for Thunderbird, but I think that like Firefox, the extension architecture will be what will give it the edge in the long run (as it’s been bearing out).

While I have some tweaks I want to make, I’m confident that I’ll be able to easily make them with keyconfig (almost a GreaseMonkey equivalent – now if there were something that could bind arbitrary onloads…). On my list: better pane/folder navigation, a message rewrapping/dynamic replacement script, and custom JS expression-based filtering.

We’ll have to see what happens over the next few weeks (and I’m sure I’ll be looking at Mail.app again in Leopard), but I have a feeling that Thunderbird may end up sticking around this time.

Lunix Tech Tips

Almost every engineer at Yahoo! gets a *NIX workstation in addition to a PC/laptop (in my case, I requested a Linux box instead of the more traditional BSD, and a Powerbook). While KVMs come standard, a fair number use their workstations almost exclusively headlessly for local development, me among them. I’ve never been a fan of X Windows (can’t I just have working mouseless copy and paste for all my applications?), and my life seemed like it was fine without it.

Yesterday, I couldn’t log in through my laptop, and I decided to bite the bullet and finally try to get X working for me. I’ve actually made a lot more progress than I thought I would, and I’ve learned some interesting new things (that I’m writing down so I don’t forget), but this experience has served to confirm my previous assumptions that my life will continue to be fine without interacting with UNIX on the desktop.

  • 1920×1200 on 2405FPW – One problem that plagued me was that I couldn’t get the 2405FPW monitor running at native resolution in X. As far as I knew, it should have been working, but it wasn’t. I finally tracked down a lead that I had missed, and after downloading and compiling read-edid, I found out the missing ModeLine arguments, and also had to correct the HorizSync settings in the Monitor Section of the xorg.conf
  • RHEL4 up2date sucks – Many of the problems I’ve had wouldn’t be issues in Debian. Up2date reads YUM repositories, so I added the Fedora Extras in (RHEL4 is based off of FC3). It’s not perfect, and some of the packages just fail (or have unresolved dependency trees), but it’s an improvement
  • Quicksilver-like tools – I found a couple, but ended up switching window managers (for a bunch of other reasons) so I never got to try them out. ion, the window manager I’m now using, lets you do keyboard binds and scripting, which is good enough for my purposes, even if it’s not very slick
  • A better window manager – so, being fed up with how I couldn’t have a decent copy-and-paste experience (1 set of keyboard shortcuts across all applications – honestly, can it be that hard?) I set off to find something to solve my clipboard woes… and I haven’t found it. I did however try a bunch of window managers (on my list: tiling w/ arbitrary window splitting and sizing, remembering layouts, full keyboard access, customizability). Ratpoison and wmii proved too limiting, but ion seems to be almost there with the other window manager features I was looking for. It has built in Lua for high-level scripting of behaviors and arbitrary keybinding, so if I had a couple days to spare (which I don’t), I could probably get it near how I think a window manager should actually work. Of course, it falls down on the copy and pasting, but maybe I can find a third party app to do what I want.
  • RHEL4 window manager switching – RHEL4 uses gdm to handle X Windows logins. It’s really bizarre. There are lots of config files lying around, none of which seem to work in actually displaying a third-party window manager in the selection list. After lots of searching, I found that the script I wanted to access was switchdesk-helper. I then added custom branches for my window managers and the appropriate switchdesk files. I suspect there’s a better and easier “correct” way to do it, but I know better than to assume that’s the case…
  • Reversing mouse buttons – One of the things employees go through arriving at Yahoo is a full ergo review. Since then, I’ve been mousing w/ my left hand at work. It takes a while to get used to it, and I have to admit, I still don’t feel very comfortable doing fine dragging operations (hence my extreme desire for the shift-arrow control-key ranging and copy-and-paste that you get with any non-terminally retarded UI). While KDE had this built into its preferences, other window managers don’t. While a command is given in the BSDE FAQ, it’s actually wrong and doesn’t work on Linux. You need to run xmodmap with the command “pointer = 3 2 1 4 5” not “pointer = 3 2 1”. If you do the latter, it’ll barf at you. wee!
  • xterm colors – The defaults for xterm are a blinding black on white. I solved this problem years ago, but for some reason, my .Xdefaults changes weren’t loading. Turns out that if that happens you have to run xrdb on the .Xdefaults to update (ha ha!, of course!). I also finally got around to changing the xterm*color4, which by default is an illegibly dark blue when running on a dark background, to something better (I’ve settled for now on RoyalBlue).

As you can see, that’s what I like about Linux. It just works.

On an unrelated techie note, in Firefox with Adblock Plus, you can go into about:config, filter for the adblock preferences, and change extensions.adblockplus.defaultstatusbaraction to 3 so that clicking the statusbar icon will default to toggling Adblock Plus on and off. I was just playing around with that and I realized it was just counting down the menu. Very cool.