Some Notes on Distributed Key Stores

Last week I ended up building a distributed keystore for a client. That wasn’t my original intention, but after doing testing on just about every project out there, it turned out to be the best (only?) solution for our needs.

Specifically, a production environment handling at least 100M items with an accelerating growth curve, very low latency retrievals, and the ability to handle 100s of inserts/s w/ variable-sized data (avg 1K, but in many cases well beyond that) … on EC2 hardware. The previous system had been using S3 (since SDB is limited to 1K values) – err, the lesson there, BTW, is don’t do that.

So, these requirements are decent – something that actually requires a distributed system, but something that shouldn’t be beyond what can be handled by a few nodes. My assumption was that I’d actually just be doing some load testing and documenting installation on the keystore the client picked out, and that would be that. This was not the case.

I’m still catching up on a number of other projects, so I don’t have a great deal of time to do a formal writeup; however, the work I’ve done may be useful for those who might actually need to implement a production keystore.

Some other recent useful starting points may be Richard Jones’ Anti-RDBMS roundup and Bob Ippolito’s Drop ACID and think about data Pycon talk.

  • MySQL – while the BDB backend is being phased out, MySQL is a good baseline. With my testing, on a single m1.large, I was able to store 20M items within one table at 400 inserts/s (with key indexes). Key retrievals were decently fast but sometimes variable. There are very large production keystores being run on MySQL setups. Friendfeed has an interesting writeup of something they’re doing, and I have it on good authority that there are others running very big key stores w/ very simple distribution schemes (simple hashing into smaller table buckets – see the sketch after this list). If you can’t beat this, you should probably take your ball and go home.
  • Project Voldemort – Voldemort has a lot of velocity, and seems to be the de facto recommendation for distributed keystores. A friend had used this recently on a similar-scale (read-only) project, and this was what I spent the majority of my time initially working with. However, some issues…
    • Single node local testing was quite fast – 1000+ inserts/s – however, once run in a distributed setup, it was much slower. After about 50M insertions, a multinode cluster was running at <150 inserts/s. This… was bad, and ultimately led me to abandon Voldemort, although there were other issues…
    • There is currently only a partially complete Python client. I added persistent connections as well as client-side routing w/ the RouteToAll strategy, but, well, see above.
    • Embedded in the previous statement is something worth mentioning – server-side routing currently doesn’t exist.
    • While I’m mentioning important things that don’t exist, there is currently no way to rebalance or migrate partitions, either online, or, as far as I could tell, even offline. This puts a damper on things, no?
    • As a Dynamo implementation, a VectorClock (automatic versioning) is used – potentially a good thing for a large distributed infrastructure, but without the ability to add nodes or rebalance, a write-heavy load would lead to huge growth with no way to clean up old/unused items (this, of course, is also not implemented)
  • LightCloud – this is a simple layer on top of Tokyo Tyrant, but the use of two hash rings was a bit confusing, and the lack of production usage beyond the author’s (on a whopping 2 machines containing “millions” of items) didn’t exactly inspire confidence. Another problem was that its setup was predicated on using master-master replication, which requires update-logs to be turned on (again, storing all updates == bad for my use case). This was, of course, discovered by rooting through the source code, as the documentation (including basic setup or recommendations for # of lookup & storage nodes, etc.) is nonexistent. The actual manager itself was pretty weak, requiring setup and management on a per-machine basis. I just couldn’t really figure out how it was useful.
  • There were a number of projects, including Cassandra (actually has some life to it now, lots of checkins recently), Dynomite, and Hypertable, that I tried and could not get compiled and/or set up – my rule of thumb is that if I’m not smart enough to get it up and running without a problem, the chances that I’ll be able to keep it running w/o problems are pretty much nil.
  • There were a number of other projects that were unsuitable due to their non-distributed nature or other issues like lack of durable storage or general skeeviness, and so were dismissed out of hand: Scalaris (no storage), memcachedb (not distributed, weird issues/skeeviness, issues compiling) and redis (quite interesting, but way too alpha). Oh, and although not in consideration at all because of previous testing with a much smaller data set, on the skeeviness factor I’ll give CouchDB a special shout out for having a completely aspirational (read: vaporware) architectural post-it note on its homepage. Not cool, guys.
  • Also, there were one or two projects I didn’t touch because I had settled on a working approach (despite the sound of it, the timeline was super compressed – most of my testing was done in parallel with lots of EC2 test instances spun up, since loading millions of items and watching for performance degradation just takes a long time no matter how you slice it). One was MongoDB, a promising document-based store, although I’d wait until the auto-sharding bits get released to see how it really works. The other was Flare, another Japanese project that sort of scares me. My eyes sort of glazed over while looking at the setup tutorial (although having a detailed doc was definitely a pleasant step up). Again, I’d finished working on my solution by then, but the release notes also gave me a chuckle:

    released 1.0.8 (very stable)

    • fixed random infinite loop and segfault under heavy load
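
As an aside on the MySQL baseline above: the “simple hashing into smaller table buckets” scheme is only a few lines of glue. Here’s a minimal sketch in Python (the bucket count, table naming, and schema are all made up, and cur is assumed to be a MySQLdb-style cursor):

import zlib

NUM_BUCKETS = 64  # fixed up front; resharding later is the painful part

def bucket_table(key):
    # stable hash -> a table name like kv_00 .. kv_63
    n = (zlib.crc32(key.encode('utf-8')) & 0xffffffff) % NUM_BUCKETS
    return 'kv_%02d' % n

def get(cur, key):
    # the table name can't be a bind parameter, so it's interpolated;
    # the key itself still goes through the driver as a real parameter
    cur.execute('SELECT v FROM %s WHERE k = %%s' % bucket_table(key), (key,))
    row = cur.fetchone()
    return row[0] if row else None

def put(cur, key, value):
    cur.execute(
        'INSERT INTO %s (k, v) VALUES (%%s, %%s) '
        'ON DUPLICATE KEY UPDATE v = VALUES(v)' % bucket_table(key),
        (key, value))

The catch, as the notes above suggest, is that the bucket count is fixed up front – resharding later is on you.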

OK, so enough with all that. What did I end up with, you might ask? Well, while going through all this half-baked crap, what I did find that impressed me (a lot) was Tokyo Cabinet and its network server, Tokyo Tyrant. Here was something fast, mature, and very well documented, with multiple mature language bindings. Testing performance showed that storage-size/item was 1/4 of Voldemort’s, and actually 1/2 of the raw item size (Tokyo Cabinet comes with built-in ZLIB deflation).

Additionally, Tokyo Tyrant came with built-in threading, and I was able to push 1600+ inserts/s (5 threads) over the network without breaking a sweat. With a large enough bucket size, it promised to average O(1) lookups and the memory footprint was tiny.
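
For what it’s worth, measuring insert rates like these doesn’t require anything fancier than a threaded harness along these lines (a sketch – store_item is a stand-in for whatever client call you’re testing, e.g. a Tyrant put):

import threading
import time

N_THREADS = 5
N_ITEMS = 100000
VALUE = 'x' * 1024  # ~1K values, like the real data

def store_item(key, value):
    pass  # stand-in: replace with your client's put call

def worker(tid):
    # each thread inserts its own slice of the keyspace
    for i in range(N_ITEMS // N_THREADS):
        store_item('key-%d-%d' % (tid, i), VALUE)

start = time.time()
threads = [threading.Thread(target=worker, args=(t,)) for t in range(N_THREADS)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print('%d inserts/s' % (N_ITEMS / (time.time() - start)))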

So, it turns out the easiest thing to do was just throw up a thin layer to consistently hash the keys across a set of nodes (starting out with 8 nodes w/ a bucket-size of 40M – which means O(1) access on 80% of keys at 160M items). There’s a fair amount of headroom – I/O bottlenecks can be balanced out with more dedicated EC2 instances/EBS volumes, and the eventual need to add more nodes shouldn’t be too painful (i.e. adding nodes and either backfilling the 1/n items or adding inline moves).
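
The consistent hashing layer itself is tiny. Here’s a sketch of the core of it (node addresses and replica count are made up; the real layer wraps this with the Tyrant client calls and connection handling):

import bisect
import hashlib

NODES = ['10.0.0.%d:1978' % i for i in range(1, 9)]  # 8 hypothetical Tyrant nodes
REPLICAS = 100  # virtual points per node, to smooth the key distribution

def _hash(s):
    return int(hashlib.md5(s.encode('utf-8')).hexdigest()[:8], 16)

# sorted ring of (point, node) pairs
ring = sorted((_hash('%s#%d' % (node, r)), node)
              for node in NODES for r in range(REPLICAS))
points = [p for p, _ in ring]

def node_for(key):
    # walk clockwise to the first point at or past hash(key), wrapping around
    i = bisect.bisect_left(points, _hash(key)) % len(ring)
    return ring[i][1]

The virtual points keep the distribution reasonably even, and adding a node later only remaps roughly 1/n of the keys – which is what makes the backfill option above plausible.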

There are still some issues (e.g., hanging on idle sockets), but current gets are at about 1.2-3ms across the network (ping is about 1ms) and it seems to otherwise be doing OK.

Anyway, if you made it this far, the takeaways:

  1. The distributed stores out there are pretty half-baked at best right now. Your comfort-level running in prod may vary, but for most sane people, I doubt you’d want to.
  2. If you’re dealing w/ a reasonable number of items (<50M), Tokyo Tyrant is crazy fast. If you’re looking for a known quantity, MySQL is probably an acceptable solution.
  3. Don’t believe the hype. There’s a lot of talk, but I didn’t find any public project that came close to the (implied?) promise of tossing nodes in and having it figure things out.
  4. Based on the maturity of projects out there, you could write your own in less than a day. It’ll perform as well and at least when it breaks, you’ll be more fond of it. Alternatively, you could go on the conference circuit and talk about how awesome your half-baked distributed keystore is.

UPDATE: I’d be remiss if I didn’t stress that you should know your requirements and do your own testing. Any numbers I toss around are very specific to the hardware and (more importantly) the data set. Furthermore, most of these projects are moving at a fast clip so this may be out of date soon.

And, when you do your testing, publish the results – there’s almost nothing out there currently so additional data points would be a big help for everyone.

Infrastructure for Modern Web Sites

One of the things that I did when I was wrapping up at Yahoo! was to begin to take a look at the current state of web frameworks. I ended up picking Django, but I have to say, I was disappointed with the state of what’s out there. Friends will have heard me bemoaning this sad state of affairs – that while Rails and Django might make CRUD easier, the ORMs weren’t suitable for scaling beyond “toy” sizes, and, more importantly, they didn’t seem to address almost any of the pain points of building and maintaining a modern website.

A couple recent posts, most notably Krow’s Scaling, Systems Required list, but also Tom Kleinpeter’s post asking Where Are the AB Testing Frameworks? reminded me that I had made my own list. I was originally going to start working on these, but since I’ve now been side-tracked by a few projects, I thought I’d put it out there before it gets too completely irrelevant.

I’ve split this into two sections. The first I call “below the line,” which are more system level (some things straddle the line):

  • API Metering
  • Backups & Snapshots
  • Counters
  • Cloud/Cluster Management Tools
    • Instrumentation/Monitoring (Ganglia, Nagios)
    • Failover
    • Node addition/removal and hashing
    • Autoscaling for cloud resources
  • CSRF/XSS Protection
  • Data Retention/Archival
  • Deployment Tools
    • Multiple Devs, Staging, Prod
    • Data model upgrades
    • Rolling deployments
    • Multiple versions (selective beta)
    • Bucket Testing
    • Rollbacks
    • CDN Management
  • Distributed File Storage
  • Distributed Log storage, analysis
  • Graphing
  • HTTP Caching
  • Input/Output Filtering
  • Memory Caching
  • Non-relational Key Stores
  • Rate Limiting
  • Relational Storage
  • Queues
  • Real-time messaging (XMPP)
  • Search
    • Ranging
    • Geo
  • Sharding
  • Smart Caching
    • dirty-table management

The second section, which I call “above the line” are common application level components that typically depend on one or more of the components above. There are of course a huge list of features for any component, but I’ve highlighted some that either aren’t commonly implemented or are particularly important:

  • AuthX (AuthN + AuthZ)
    • Capabilities
    • Multifactor Auth
    • Rate Limiting
    • Signup
    • OpenID
    • OAuth
    • External import
  • Groups
  • Invites
  • Lists
  • Notifications
    • Spam filtering
    • Multi-protocol routing
    • Fine-grained controls/rules
  • Presence
  • Social Activity Log (Newsfeed)
    • Filtering
  • Social Model
    • Connectivity (uni/bidi)
    • Privacy (private, reciprocal, public)
    • Views
    • Traversal
  • Social Object
    • Privacy, Social Scoping
    • Voting
    • Sharing
    • Publishing
    • Comments
    • Favoriting
    • Social editing
    • Permissions
  • Tagging
    • Combinations
    • Relatedness
  • User
    • Achievements/Awards
    • Activity Log
    • External User ID Mapping
    • Permissions (see AuthX)
    • Deletion/Archival
    • Flagging
    • Direct Messaging
    • User Cards

This list is by no means complete, but maybe a good starting point. I’d be interested to hear what other people have had to build/would most miss if they had to start anew.

(What seems the biggest shame to me is that everyone is currently rebuilding this stuff over and over again and rationalizing it as some sort of secret sauce competitive advantage when it’s really infrastructure – stuff that really should be standardized so you can actually get around to doing the new and interesting stuff.)

Update: For those of you who feel the urge to comment about not needing this functionality: if existing frameworks work for you, that’s great. Also, if you’re not building a site that provides a service to users, or aren’t planning on growing one, then you’ve likely not faced these pain points. Feel free to move along.

Now, I would like to hear from others working on similar problems, although I understand that most of those people remain under the corporate veil where this sort of information remains “competitive advantage.” Hopefully putting this list out there helps people realize that everyone’s building the same stuff over and over again (to varying levels of quality).

On Application Development

The other day, Jeff Atwood posted a piece entitled A Scripter at Heart that distinguished programming vs scripting. Simon Willison had a strong (negative) reaction to that, and proposed distinguishing by the term “dynamic languages”. Yesterday Matt Biddulph posted a bit about some of his experiences as a web developer working with Objective-C and the iPhone (some more discussion), and since I’ve been doing something similar this month, I thought I’d throw in my 2 cents (my experience so far has differed from Matt’s), since it also relates to how I view the divide between two very different types of programming (systems vs application?).

To preface, like Matt, my background is also primarily as a web developer, although not exclusively – I’ve written my share of Lingo, Java Applets, OpenGL, Shake scripting, Max/MSP and Processing and other stuff. These days I hang my “expertise” hat on web architecture and systems, but I’ve done a fair amount of just about everything on the web side of things including some lower level things like working on Apache modules.

This isn’t to brag (you’ll note no accomplishments of merit mentioned above :), but simply to give some context of where I’m coming from. Learning Cocoa has been interesting. Of course, first and foremost, there’s the unique feeling of being a newbie again – that awful confusion, but also the excitement, and then that feeling of incomprehension at how something works, which you somehow retroactively forget once it clicks…

This learning phase may have been more painful than it could or should have been. “Learning Cocoa” encompasses not just a language (Objective-C) tied intimately to multiple very large sets of libraries (collectively Cocoa, but also CoreFoundation, AppKit, and in my case Quartz, Core Image and CoreAnimation, as well as an inscrutable third party API), but also XCode and Interface Builder, each with a myriad of settings, plists, etc.

While I think that a further discussion of the total lack of context, and of the bits and pieces of documentation/tutorials that did help me get my bearings, may be the topic of another post, I did want to mention that the Apple Developer Documentation did not help me as much as I would have hoped in terms of orienting myself.

Some more observations:

  • It’s sort of amazing how much more work seems to go into accomplishing very little, and how your ambitions scale along with that. I’ve spent more time working on looping some animations and making sure they don’t leak memory, for example, than on, say, the Event SRP, or heck, the entire offline-task system on MyBO. Maybe it’s just my experience so far (biased, say, by spending a solid week fighting a certain third party SDK while learning the fundamentals), but I can see now why desktop apps don’t seem to have evolved as quickly as web services have. There’s just a lot of slog involved.
  • Note: PyObjc doesn’t make things easier – it’s just … hideous
  • Although… it would avoid Objective-C 2.0’s ridiculous memory handling – there’s garbage collection on the RunLoop, but only in some cases (for explicitly init’d, alloc’d and retain’d objects), and AutoRelease doesn’t happen in threads, which, by the way, NSTimer launches, so make some subpools, but be sure not to over-CFRelease lest you cause an exception (and crash) down the line, and good luck w/ MallocDebug if you missed anything and need to track it down… Don’t I have better things to do with my brain cells?
  • Casting through contexts is just out of control. NSImage, CGImage, and CIImage? Really?
  • Get used to writing at least 10 LoC to do what seemingly should be a single easy action (or declaring something in at least two, if not more, files and sections). Coming from scripting languages, the amount of boilerplate is mind boggling
  • Also, as someone used to CPAN, PEAR, and PyPI, it’s been interesting discovering how spoiled I’ve been by the ease of third party libraries, and how much less common and more effortful they are here. Maybe I just haven’t gotten quite that far yet…

As a web developer, I’ve often complained about the crudity and lack of development and debugging tools, but having dipped my toe on the flip side, I guess it’s tough all around. Application development seems to be dense, convoluted and, well, sometimes just plain masochistic.

It’s also interesting how, for as many calls as there are in the standard Frameworks (and there are many), it seems equivalently difficult to do anything that you *want* to do (this will be another near-future post, where I talk at length about the current state of web “frameworks”).

But, who knows, maybe in a few months I’ll look back at this post and shake my head and wonder how I could ever have been so confused.

Recently Reading

The middle of last year was pretty much completely dominated by politics for me. Memeorandum replaced Techmeme as my starting page, and TPM and FiveThirtyEight were at the top of my reading list. Since then, my attention has started floating back. Here are some of my recent faves (blogs that have been intersecting well with some of my current interests):

  • aaronland – for whatever reason (having more spare time? 🙂) I noticed myself reading more of Aaron’s excellent (and lengthy) essays this year. (geo, maps, photos)
  • tecznotes – another one of the blogs that I’ve been following for a while that I’ve been digging a lot more – probably also has to do w/ being able to get back into doing cool stuff. Mike’s everyoneiknowisdoingawesomeshit tag seems apt to mention here (visualization, maps)
  • Duke Listens! – there are a couple music technology blogs I’ve been following, but this one, by Sun researcher Paul Lamere, has I think been the most consistently interesting (music, recommendations)
  • SmugBlog: Don MacAskill – I’ve spent more than my share of time the past couple years thinking about scaling, and it was nice to find a blog/community of people talking about some of the nuts and bolts (mysql, hardware, scaling)
  • Perspectives – James Hamilton (AWS) has also been publishing some great stuff along those lines, mostly around data center efficiency (data center, hardware, scaling)

I also now have a pretty reliable stream of AV stimulation through my Vimeo channels and groups. Not that there’s any shortage of interesting things – it always amazes me when people talk about being bored online – attention continues to be what’s in short supply for me, even now being able to set my own schedule.

I’ve been cranking away for the past few weeks on a new project, but hopefully I’ll have a chance this week to catch up w/ some posts, including some of the stuff I’ve been working on.

A Few Random Observations on Events

As one might expect, I have a few thoughts now and again about “events,” even if I have continued my life as a shut-in so far this year. That being said, some days are more event-filled than others (today for example there’s a dinner, drinks w/ an out of town friend, Larry Lessig’s last SF book reading, and a show I just found out about – all, unfortunately, happening at the same time tonight…).

In any case, it was the last event that I want to write about a bit, since it was a bit of a serendipitous discovery, and is a good example of some of the gaps that still exist with event tools.

A couple days ago, I saw a pretty neat video featuring a new Electro Harmonix effects box. This demo was definitely a cut above the average music gear demo (this one is even better). Browsing around today, it turned out that Mark blogged about this video yesterday on Boing Boing, and that an EHX employee ended up posting some more info about the performers, including a link to their YouTube accounts. Their current video is entitled Show on Wednesday! and 10 seconds in, it turns out that it’s in San Francisco (the YouTube profile doesn’t have anything about the location). A quick check on the posting date (yesterday) confirmed that the show is in fact happening tonight.

The reason for this lengthy description of how I discovered this event is because it’s really quite a long (and fragile) chain of serendipitous events (particularly in clicking into the comment thread (which definitely wouldn’t have happened if I had caught this post later in my feed instead of randomly browsing on the site), and then choosing the right YouTube account (of three linked), and then clicking play on their new video to discover that they were local).

Now, of course, as any band of any sort would, they have a MySpace page. Which has a big graphic highlighting their show – and they have their show entered (and a MySpace blog post), so if someone happened directly on there, I guess they could find out about it.

At this point, I had enough information to enter it onto the old red and yellow, where I could continue to add to my copious notes on entry improvements (perhaps the topic of some future post).

Now, of course, since I’m an events geek, I decided to continue along this trail, and the next stop was the Red Devil Lounge’s site and, more specifically, their calendar. It, like most other venue sites, is about par for the course, appearing to be hand-generated in Dreamweaver. Interestingly, it does have an RSS feed, generated by a commercial desktop app no less (FeedForAll). The interesting (and somewhat amazing) thing about venue sites is that across the board, they haven’t really changed much in the past decade…

Now, one interesting thing about the event is that I got the official title wrong (I didn’t change it on the Upcoming event – multiple representations/ownership of the same event was one of the things neither we nor anyone else ever tackled). But, rather than go off on tangents about the minutiae of event modeling (there are a bunch of more interesting coinciding process issues with editing even of canonical entries as well), I did want to point out something that caught my eye.

Pirate Cat Radio‘s official link for the “Baghdad By The Bay Showcase” is actually an Upcoming RSS syndication link of the guy who runs the Baghdad By The Bay show, RICK!. Now, it’s pretty cool that Rick has a somewhat active account that has his radio show schedule (sorry recurring events never got better), but surprisingly, the actual physical show wasn’t listed (score one for Upcoming’s entry-dupe checking).

Now, since I don’t do this for a living anymore (and haven’t for a while now, so I’m out of the loop), this isn’t really any sort of rigorous analysis. And there are some guys attacking the music side of things much more vigorously (for example, it looks like Rick added the concert to Sonic Living last month, and there’s a heckuva lot of Tour/Ticket related activity), so that area, while still incomplete, is actually getting a lot more attention than others.

But I guess one of the things that struck me was how there’s lots of information out there, but it’s not particularly well connected. There are some pretty huge gaping holes and the “serendipity” feels more haphazard than gratifying, and well, it’s all just a lot of work.

Hmm, I’ll just end here. I think I started off wanting to talk more about interactions of calendaring/semi-private/public event planning/interactions and proactive discovery, but this is getting a bit long for a ramble, so maybe next time when the spirit moves me (i.e. when I’m avoiding real work).

Clipboard Copying in Flash 10

One of the things I noticed about the DevFormatter plugin I’m using was that the clipboard copying code was no longer working. This apparently is because of the new security upgrades in Flash 10 which now have additional user-initiated action (UIA) requirements for various functions, including System.setClipboard().

While inconvenient, especially since it’s somewhat commonly used, it was a necessary change due to some high-profile clickjacking attacks, and well, a good idea regardless, when you think about it.

Surprisingly, months after Flash 10’s release, it seems that neither the WP plugins I looked at, nor the most popular general syntax highlighter script, have fixed their clipboard functionality, as the workaround isn’t too onerous. Instead of having a JS-triggered Flash copy, just reverse it with a Flash button calling a JS function – not quite as elegant, since you’ll need as many Flash buttons as you want copy triggers, but workable. It’s been a while since I’ve done any Flash work, but luckily, it didn’t take very long at all.

Since this might prove useful to others, I’ve done this in AS 2.0 for more compatibility and made the package available here under a GPLv3 license: clipboard_copy.

Of course, if you want to create your own unencumbered version, the code is easy enough to write yourself. The JS call looks something like:

function clipboard_copy(id) {
    var el = document.getElementById(id);
    // innerHTML returns the markup; for plain text instead, use:
    // return el[(document.all) ? "innerText" : "textContent"];
    return el.innerHTML;
}

Note, you can have the clipboard JS function act on a selection if you’d like, but for my purposes (integration w/ code blocks), getting the text was better.

The Flash is similarly simple. Here’s the AS 2.0 event attachment and ExternalInterface:

// AS 2.0 Event
copyButton.addEventListener("click", click);
function click(event:Object):Void {
    // id of the element whose text we want to grab
    var item:String = "testid";
    var jsFunction:String = "clipboard_copy";
    // ask the page for the text, then put it on the clipboard
    var returnValue:String = ExternalInterface.call(jsFunction, item).toString();
    System.setClipboard(returnValue);
}

Easy peezy.

A couple of notes:

  • If you’re testing, you’ll want to run from a web server, not the file system, otherwise you’ll get sandbox errors
  • Regular cross-domain rules also apply of course
  • I used the Button Component for my version, which is admittedly a bit fugly. You could in theory have a text-only Flash link that you subsequently styled w/ JS (i.e., to match font-family and font-size), but I’ll leave that as an exercise for the reader

Online Tools for A New Small Business

One of the interesting things I’ve been doing recently has been looking at support tools for running a new company. I remember Ev writing about this a couple years ago. This research was pretty new for me since none of the following services even existed when we started Upcoming (not that we had any need for most of these anyway; we were focused exclusively on building a cool app: our only capital cost was servers [offset by AdSense] and our burn rate was our cost of living).

Anyway, after a day or two of poking around, here’s a list of the top picks (and in some cases, worthy alternatives):

  • Google Apps – a no-brainer for email and document sharing. Unfortunately, while good for individual services, its functionality for even basic sharing is rudimentary to non-existent. Shared documents require manually sharing each document (no shared spaces) and there’s no concept of shared email (for handling shared support, customer service, etc.)

    Price: free

  • Dropbox – Fully integrated w/ the Desktop, up to 2GB. It just works.

    Price: free

  • FogBugz On Demand – I’ve been using hosted FogBugz for a couple years now. It still has some UI rough edges (although fewer than JIRA, I suppose), and its Evidence-Based Scheduling is a unique (and awesome) feature. Also, it’ll hook up to email for handling support, which fills in that gap. So, we’re using it for Task, Issue, Effort, and Support Tracking.

    Price: free (2 person Student and Startup Edition)

  • Xero – the international edition (they are New Zealand-based) of this Accounting service was released just a couple days ago, but so far I’ve been incredibly impressed by the functionality and polish. It’s far better than anything else we looked at. Besides all the regular banking features, it also does Invoicing and Expense claims tracking. (Reading about the company itself is interesting – I guess there aren’t lots of NZ startups, and the fact that they did an early IPO means all their early growth numbers are public record.)

    Price: ~$25/mo (NZ$499/yr)

  • PipelineDeals – after reviewing all the big CRM tools (starting with Salesforce and SugarCRM) I was feeling pretty depressed – they’re all ridiculously bloated, clunky, and just pretty much unusable. I couldn’t imagine being forced to use anything like that on a daily basis. PipelineDeals was a breath of fresh air and supported everything we need for contact tracking as well as providing the best lead/sales tools that I found.

    Price: $15/mo per user

    One alternative worth highlighting is Relenta (Demo l/p:demo). It integrates a shared email system with contact management (it also supports pretty robust email campaigns/newsletters) with support for canned response, auto-responders, role filtering, etc. I remember talking about an app like this w/ some friends years ago, and it’s a great implementation. It wasn’t a good fit for us since we needed something for, well, selling stuff (a surprise, I know), but if your needs are more customer support focused, be sure to take a look at Relenta. I also looked at Highrise, which is slick, but found it to be pretty shallow.

  • MailChimp – although CampaignMonitor is nice, its per-campaign pricing model didn’t make a lot of sense for our use. MailChimp’s more flexible pricing (which includes monthly pricing) was a better fit, and support for segmentation and A/B testing I guess makes up for individual stats being an add-on. (Vertical Response is another option with some interesting offerings like Online Surveys and Snail Mail Postcards, so that might be worth looking into, but at least by my Twitter @replies, MailChimp won out unanimously.)

    Price: $10/mo (0-500 subscribers)

Lastly, while Silicon Valley Bank got a lot of love for being the bank for startups, for the day to day business needs (bill/direct payments, business taxes, payroll, merchant account) it looks like Wells Fargo Small Business is a much better fit. Other payroll options include SurePayroll (which used to do WF’s payroll) and PayCycle, although I’m not sure there’s enough of a cost difference to justify the extra hassle. That being said, it might be worthwhile to use Costco/Elavon Merchant Processing.

There are a few other things that we’ll probably end up trying out (UserVoice, GetSatisfaction, maybe some MOO cards) but I think this pretty much covers most (if not all) of our business needs. Anything I’m missing? Or are there any favorite apps/services that people like? Feel free to comment.

See also:

Update: Zoho looks pretty decent as an all-around solution, anyone try it? One caveat I should mention w/ the use-lotsa-apps approach is that I’ll need to spend a bit of time writing glue code for syncing contacts between the CRM and everything else (most of the tools appear to have decent APIs, but it’s still a bit of a pain).

Python os.walk() vs ls and find

Since I wasn’t able to find a file cataloguer and dupe-finding app that quite fit my needs (for the Mac, DiskTracker was pretty close – I’d definitely recommend it of all the apps I tried), I started to code some stuff up. One of the things I was interested in starting out was how well using Python’s os.walk() (and os.lstat()) would perform against ls. I threw in find while I was there. Here are the results for a few hundred-thousand files; the relative speed was consistent over a few runs:

python (44M, 266173 lines)
---
real  0m54.003s
user  0m18.982s
sys 0m19.972s

ls (35M, 724416 lines)
---
real  0m45.994s
user  0m9.316s
sys 0m20.204s

find (36M, 266174 lines)
---
real  1m42.944s
user  0m1.434s
sys 0m9.416s   

The Python code uses the most CPU time but is still I/O bound, and is negligibly slower in real time than ls. The options I used for ls were -aAlR, which apparently produces output with lots of line breaks, but ends up being smaller than find’s single-line, full-path output. The find run was really a file-count sanity check (the difference of 1 from the Python script is because find lists the starting directory itself). Using Python’s os lib has the advantage of returning all the attributes I need w/o additional parsing, and since the performance is fine, I’ll be using that. So, just thought it’d be worth sharing these results for anyone who needs to process a fair number of files (I’m guessing I’ll be processing in the ballpark of 2M files (3-4TB of data?) across about a dozen NAS, DAS, and removable drives). Obviously, if you’re processing a very large number, you may want a different approach.
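
For reference, the os.walk()/os.lstat() side of a test like this is essentially just the following (a minimal sketch; the real script holds on to the lstat results for cataloguing and dupe-finding):

import os

def walk_stats(top):
    # yields (path, lstat result) for everything under top
    for dirpath, dirnames, filenames in os.walk(top):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            try:
                yield path, os.lstat(path)
            except OSError:
                pass  # vanished/unreadable between listing and stat; skip

count = 0
for path, st in walk_stats('.'):
    count += 1
print(count)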

Firefox 3, Developing and Browsing


I tend to leave a lot of tabs open when I’m browsing (mostly because of a lack of good research/organizing tools – but that’s another day’s post). Right now for example, I have 231 tabs open in 36 windows (I’ve been working on a counting extension which I should be releasing soon).

Firefox 3 has been great for my style of browsing – the Session Restore is more reliable, and overall it’s much zippier and generally more stable. The last part however comes with a caveat: I’ve found that while FF3 is more stable on a fresh install, the instability caused by extensions has also greatly increased. A few weeks ago, after it got particularly bad, I made a fresh profile, moved my data over, and now just have Adblock Plus and Filterset.G installed. This has seemed to work pretty well for my day to day browsing.

But what about development? Glad you asked. I now have a separate copy of Firefox 3 in my Applications folder named “Firefox Dev” that runs a separate “Development” profile – John Resig has a good writeup on how to set this up. To help reduce confusion when app switching, I also replaced the ICNS file with an alternate set (copy the new firefox.icns file into FFDev.app/Contents/Resources/).

And now I’m free to run lots of leaky/beta (but useful for dev) extensions in a separate process:

(Listing generated courtesy of InfoLister)

Rearchitecting Twitter: Brought to You By the 17th Letter of the Alphabet

Since it seemed to be the thing to do, I sat down for about an hour Friday afternoon and thought about how I’d architect a Twitter-like system. And, after a day of hanging out and movie watching – and since Twitter went down again while I was twittering (with a more detailed explanation this time: “Twitter is currently down for database replication catchup.”; see also) – I thought I’d share what I came up with, notably since my design doesn’t really have much DB replication involved in it.

Now, to preface, this proposal is orthogonal to the issue of whether statuscasts should be decentralized and what that protocol should look like (yes, they should be, and XMPP, respectively). That is, any decentralized system would inevitably require large-scale service providers and aggregators, getting you back to the architecture problem.

So now, onto the meat of it.

As Alex’s post mentions (but it’s worth reiterating), at its core Twitter is two primary components: a message routing system, where updates are received and processed, and a message delivery system, where updates are delivered to the appropriate message queues (followers). Privacy, device routing, groups, filtering, and triggered processing are additional considerations (only the first two are currently implemented in Twitter).

Now, this type of system sounds familiar, doesn’t it? What we’re looking at most closely resembles a very large email system with a few additional notifications on reception and delivery, one that is more broadcast-oriented (every message includes lots of CCs, and inboxes are potentially viewable by many). Large email systems are hard, but by no means impossible, especially if you have lots of money to throw at it (*ahem* Yahoo!, Microsoft, Google).

Now, how might you go about designing such a thing on the cheap w/ modern technologies? Here’s the general gist of how I’d try it:

  • Receive queue
    • Receive queue server – this cluster won’t hit limits for a while
    • Canonical store – the only bit that may be DB-based, although I’d pick one of the new fancy-schmancy non-relational data stores on a DFS; mostly writes, and you’d very rarely need to query (only if you, say, had to checkpoint and rebuild queues after something disastrous, or after profile changes). You’d split the User and Message stores, of course
    • Memory-based attribute lookups for generating delivery queue items
    • Hooks for receive filters/actions
  • Delivery queues – separate queues for those w/ large follower/following counts, and separate queues for high priority/premium customers (see the fan-out sketch after this list)
    • Full messages delivered into DFS-based per-user inboxes (a recent mbox, then date-windowed mboxes generated lazily – mboxes are particularly good w/ cheap appends)
    • Write-forward only (deletes either appended or written to a separate list and applied on display)
    • Hooks for delivery filters/actions (ie…)
  • Additional queues for alternate routing (IM, SMS delivery, etc) called by deliver hooks
  • The Web and API is standard caching, perhaps with some fanciness on updating stale views (queues, more queues!)
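
To make the delivery side a bit more concrete, the fan-out step is conceptually just this (a sketch with made-up names – in practice it’d be a queue consumer, with inbox_append doing a cheap append to the user’s recent mbox on the DFS):

import time

def fan_out(author, message, followers, inbox_append):
    # one received update becomes one append per follower inbox
    line = '%d\t%s\t%s\n' % (time.time(), author, message)
    for user_id in followers:
        inbox_append(user_id, line)
    # the high-follower queues would batch instead: buffer lines per
    # inbox and flush on a size/time window rather than per message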

Note that this architecture practically never touches the DB, is almost completely asynchronous, and shouldn’t have a SPOF – that is, you should never get service interruption, just staleness until things clear out. Also, when components hotspot, they can be optimized individually (lots of ways to do it, probably the first of which is to create buffers for bundled writes and larger queue windows, or to simply defer writes to no more than once a minute or something; you can also add more queues and levels of queues/classes).

The nice thing about this is that, technologically, the main thing you have to put together that isn’t out there is a good consistently hashed/HA queue cluster. The only other bit of fanciness is a good DFS. MogileFS is more mature, although HDFS has the momentum (and perhaps, one day soon, atomic appends *grumble* *grumble*).

Now, that’s not to say there wouldn’t be a lot of elbow grease to do it, especially for the loads of instrumentation you’d want to monitor it all, and that there aren’t clever ways to save on disk space (certainly I know for sure at least two of the big three mail providers are doing smart things with their message stores), but creating a system like this to get to Internet scale is pretty doable. Of course, the fun part would be to test the design with a realistic load distribution…