Kindle: +1 Week RFEs

Busy with pre-moving tasks, but thought I’d post a quick followup on the Kindle. I’ve bought 3 books so far (I’m keeping a spreadsheet; so far I’ve saved $33.22, or 47.43% off of buying the physical books from Amazon – one of them was also out of stock, so that was an extra bonus. When I get a chance I’ll have to compare the book buying rate to the past year). I also sent out an email to kindle-feedback. Here are the points of improvement I included (specifically software, and not the industrial design, which I’m sure they’ve heard about ad nauseam):

  1. One of the first things I did even before getting the Kindle was to queue up a bunch of samples. This is great, but even with this limited number of titles, it’s pretty hard to find the title I’m looking for. My first set of suggestions all relate to library management:
    • A smarter dashboard-style listing would be nice. For example, you could have the Home screen split in two, with a “Recently Read” and a “Newly Arrived” listing. Paging to the next page would get you the traditional listing.
    • While full-text indexing may be out of the question, allowing searching/filtering by limited metadata from the “Show and Sort” menu – or, if there’s a dashboard, having a search/filter box accessible at the bottom of the first page – would be a great way to let a user quickly find a title in a large (100+ volume) library.
    • Along the lines of organizing a bookshelf (the potential storage capacity, even without any additional SD storage, far outstrips the current Content Lister’s ability to manage), a number of improvements would make things better:
      • Tagging of titles (and allowing listing by tag/section, filtering by tag)
      • Archiving – for example, read books
      • Read % / Status – related to the former, but being able to filter or organize by which books you’re currently reading, haven’t started, and have finished – the metadata is all there, but it’s not being displayed
    • Along the lines of metadata and display, the current separation of listing and managing seems unnecessary. One alternative, especially if you add a second smaller line that contains status and other metadata, is to give each book two click areas: the current 3-segment tall title, which remains the same – clicking opens the book – and a second 1-2 segment tall status line which brings up a context menu (note, this space already exists, so it wouldn’t even affect the # of books that could be listed by much…)
  2. A related request would be for storage of a reading journal — this data is stored by the device (it auto-bookmarks and knows which books were last opened, for how long, etc.) and, at least according to the Kindle TOS, is being reported to Amazon.com. It seems like a big opportunity is being missed by not having a user-accessible journal (the Wii is a good example of what this might look like to the end user).
  3. Although I’m not a fan of DRM, I really like what you guys are doing with the media management of purchased books. This is very compelling, although I’m disappointed that it doesn’t extend to periodicals. There are some periodicals I’d be open to subscribing to (any hope of getting The Economist?), but that’s definitely a sticking point for me. I like to annotate and file articles of interest – the latter functionality doesn’t seem to exist at all, and the former works, although it’s too bad that there’s no way to better manage the annotations or get them off the device wirelessly.
  4. In terms of legibility, if there were different fonts or line-height adjustment, that’d be quite welcome. This is especially noticeable w/ the experimental web browser.
  5. I very much like the ability to make annotations, especially when reading technical papers and essays/articles. (Unfortunately, the conversion process is somewhat lackluster/tedious – when I tried sending an HTML file to the kindle.com address, it converted it as plain text (tags in the page galore), and since I’m on a Mac, I had to use a third-party toolchain (Mobiperl).) Err, in any case, my suggestion for annotations is fairly simple – viewing an annotation currently requires a second click to show it. I can (somewhat) understand a second click to edit, but wouldn’t it be better to just show the note (and menu) when one clicks on a line w/ a note?
  6. Along the lines of notetaking, I’ve taken to carrying around the Kindle when I’m out and about – there’s lots of times where it’d be useful to use it to type a quick note, but there isn’t any way to do that in a standalone manner. Lists are another potentially useful app, which leads me to ask…
  7. Is there any particular reason there isn’t an SDK available? Is there one planned? It seems like there’s a lot of potential for the Kindle’s functionality to be extended, whether in terms of additional apps or things related to its core capabilities. I can think of a half dozen things off the top of my head that would do a lot, I think, to help get a random person to plunk down $360 on the device. The e-book space is littered with devices that require enormous amounts of low-level effort just to get to a point where useful apps can be developed (these, of course, are very different skillsets, so rarely has anything exciting to end-users ever happened). It seems like the Kindle is well positioned to be different in this regard. I know there are potential pitfalls (although, having been intimately involved in making similar design decisions [open APIs and web services], they’re somewhat overblown, since it’d be easy enough to control via dev keys, or just by the fact that without easy/automatic distribution, the userbase is self-limiting), but I believe the rewards are manifold, and I hope you guys at least give it a try.

There’s one additional issue that I didn’t mail in that’s been getting on my nerves – when buying a book, it comes down the pipe quite quickly, and it’s a simple (almost one-click) process once you get to the end of the sample, but the purchase doesn’t replace the sample chapters, and in fact starts you off all over again. IMO, the ideal experience would be to have some additional pages unlocked so you can continue reading, then, when the full book has finished downloading, to port your annotations over, remove the sample file, and open the full book at the location where you left off in the sample. True, that kind of polish is typically missing from 1.0 products, but it’s usually the difference between the magical product you love and… well, everything else.

OK, I Got a Kindle

Over the weekend, I broke down and ordered a Kindle (which arrived today). There are lots of good reasons not to get one. Heck, I wrote a screed about it myself last year. (What? Speak up, I can’t hear you over the cognitive dissonance.)

So, why’d I end up getting one? Ironically for a “gadget” purchase, it was the practical aspect that finally pushed me over: I’ll be out of town the next few months and it’ll be inconvenient and impractical for me to buy/store books, or have access to my bookshelf.

While I’m strongly against DRM, I’m also a big proponent of what Amazon is doing with their yourmedialibrary initiative. Anyone who’s heard my spiel on digital media knows that I see media management as a primary value-add that makes paying for digital media worthwhile. As we accrue more and more digital stuff, having a convenient service that stores, tracks, organizes, and delivers it when and where we want it is going to be increasingly important (and necessary).

I have a lot of books that I really like (and that are quite nicely formatted and probably won’t be replaced anytime soon by eBooks) but looking at the couple hundred volumes on my bookshelves, I’m having a hard time finding many that have truly sentimental value. I think at the end of the day, I could cut down my shelf by at least two-thirds, maybe more. The upshot, besides much easier future moving, is that I’d probably use the books much more when the text of my library is fully searchable and easily annotable.

(Obviously, this will probably be different for everyone, but I think more and more will start thinking like this, especially as digital music and video take over. I have about 100 DVDs. None have been touched in months. And the only time I touch the albums I’ve bought are to rip them.)

Kindle and iLiad

And now for some talk about the devices. This will be somewhat more of an iRex exit review than a Kindle review (since I just got the latter), but regardless, I think the former will give some insight into what I’m looking for and what I expect of the Kindle.

In terms of the actual reading experience, having had the iLiad e-ink device since its release (Summer 2006), I knew what to expect of the screen. In comparison, the Kindle’s screen is smaller (6″ vs 8″ diagonal), very slightly denser (167ppi vs 160ppi), and has worse grayscale (4 vs 16 shades). It is slightly faster refreshing and a little brighter (40% reflectance vs 32-35%) thanks to a newer Vizplex screen, but overall it’s very similar. The serifed font on the Kindle is heavier and wider, but also better hinted than the iLiad’s, so while it fits even less text on the page, it may be a bit more legible. If you’ve never seen an e-ink screen, it’s really worth seeking one out. You won’t really understand the fuss until you do. It’s much easier on the eyes than any backlit display, and much more “solid” than any reflective LCD. It’s a flat matte plastic that’s hard to describe. The closest thing I can liken it to is the fake screens on the computer stand-ins in office furniture displays.

The iLiad supports more formats, of particular interest being PDF (it runs a modified version of xpdf) and has had a fair amount of hacking done to it. It also has built-in wifi. Unfortunately, a number of issues conspire to make these advantages moot. (Actually, there’s one main one which I’ll get to last.)

Even though the screen is larger than the Kindle’s, it’s still comparatively small (about A6), so A4 PDFs aren’t very legible (the zooming doesn’t work well). This means that it’s not very good for reading technical papers on, and that most real reading (books, etc.) needs to go through a reformatting/conversion process. If you’ve dealt with PDFs, you know how difficult that can be, since PDFs aren’t semantic, but layout-based by nature. HTML files are an option, but the built-in browser doesn’t paginate (or remember your position, or font size for that matter), so if you’re looking to read a book… well, good luck. And while the wifi sounds great in theory, in reality there’s never been any way to load documents onto the device wirelessly.

All these (and the many other design flaws, both in the hardware and software) could be overlooked or worked around if not for the one major, MAJOR flaw that made the iLiad useless for me – it never had any working power management. That’s right, no sleep, suspend, or hibernate. The lowest power screen in the world (which, come to think of it, these e-ink screens are) doesn’t help one bit in that case. Despite many promises to the contrary, iRex has never been able to address that problem.

Now, granted, as an early adopter, I don’t expect things to always work, but unfortunately, despite the original claims of long battery life (measured in page turns, with no hint that it’d be constantly sucking juice), the device barely makes it through a few hours, not even a full day. This is a bit mystifying considering the success that the OLPC and Amazon have had with instant suspend. Even worse, there’s no sleep or hibernate, so a full power cycle is required before reading. Surprisingly, they’ve released additional products (presumably aimed at real consumers) that haven’t addressed the problem at all.

To give you an idea of what this means: the iLiad took 49 seconds to boot up, and then another 14 seconds to load up the PDF. That’s over a full minute just to do the equivalent of opening a book up. I don’t think they mention that in the “features” section of their marketing. Considering that the average cell phone wakes up instantly, and heck, my laptop is up in 5s, this failing is really just incomprehensible to me.

This aspect of course was the Kindle’s easiest sell. The reviews and reports give it an average of 4-5 days of battery life w/ the wireless off, and 1-2 days with it on. More importantly, resuming from suspend to where you left off takes 3-4 seconds. That’s not too shabby (opening a new book from the menu also takes about 3-4 seconds). That there is basically the difference between a daily-use device and an overly expensive toy.

In terms of data loading, the Kindle has both the email gateway, which I’ve tested and is certainly convenient (after giving it some thought, I’m pretty sanguine about using it, since I’m pretty sure the liability implications of keeping/tracking the files sent trump any value they might get from storing them for future data mining), and the ability to simply mount as an external drive when connected via mini-USB (another failing of the iLiad is its ridiculously large and awkward dongle attachment for power, USB, and network connectivity).

While there is no official Mobipocket software for the Mac, there is an alpha version of a linux tool, and more importantly, an open source set of tools called Mobiperl that seems to work well.

All in all, it’s doubtful that I’ll ever touch my iLiad again (well, we’ll see how OpenInkpot does), but from my limited time playing around with the Kindle so far, it looks like it should do the job that the iLiad never could.

Which isn’t to say it’s perfect. Even with my limited usage, it’s obvious there’s definitely lots that could be improved (for example, the content lister is pretty much useless for organizing anything close to the storage limit – it’s just a straight file listing with no ability to organize (tag, search, look up) or way to keep track of read/unread status). And yes, the industrial design is heinous – even ignoring the aesthetics, it’s pretty much impossible to pick it up without accidentally turning the page (death by 700ms cuts?). And it’d be nice if there were a way to open up or work on the device itself (igorsk has been the only person who’s done anything of note so far), but for now, I’ll be happy with having a device that should be usable for what I got it for.

Leonard for Obama ’08

Friends and followers of my blog know that I’ve been pretty vocal and proactive in my support for Barack Obama. A couple weeks ago, a call went out for web geeks, and I threw my hat in the ring.

With the paperwork all sent in, this is just letting people know that I’ll be heading out to Boston at the end of the month to work full-time on BarackObama.com and related shenanigans. I believe they’re looking for more people, so if you have an interest in jumping on, drop an app. Also, I know that there are lots of friends that are enthusiastic but can’t necessarily drop what they’re doing… drop me a line, I have schemes1.

1: Schemes evolving as I begin reviewing FEC regulations.

Google Spreadsheet, My Asset Allocation, Investing 101

I recently started porting my spreadsheets over to Google Spreadsheets. I’d long resisted Google Docs, but I’ve got to say that it’s been pretty painless. The imports seem to have worked, although with limited success on fancier formatting and on charts, and I’ve run into some annoyances (it’d be really nice if there were a “view source” mode where you could just edit formulas, a mass replace to change a specific column in a formula, and a paste-without-formatting option). There are good things that have made it worthwhile, like easy collaboration/sharing and painless publishing.

Among the spreadsheets I’ve been importing are some of my financial docs. One nice function is GoogleFinance(), which lets you get not only current but also historical stock data. Random tidbit: when returning historical quotes, the GoogleFinance() function wants to dump an array (with a header row) into multiple cells, so you can wrap it in INDEX() to pull out a single value. Here for example is how you’d return just the closing price of VTSMX from January 1st:

=INDEX(GoogleFinance("VTSMX", "close", "1/1/2008"), 2, 2)

A couple years ago, it occurred to me that learning some basic finance/investing (which, sadly, isn’t really taught anywhere in the school system) might not be a bad idea. Except for the odd post or two, I haven’t published much about it, even though my interest has taken me far into “financial geek” territory (like reading the Journal of Indexes for fun). My original plan was to kick off a financial section for my blog, but since that’s going to have to wait at least a few months, I thought I’d at least get the ball rolling by publishing the target asset allocation that I ended up with (and presumably will be sticking to for a long while):

Most of it’s pretty standard, with a few exceptions.

  • I’ve sliced and diced because a decade plus after Fama and French published on the Three Factor Model, it still seems to hold up (suggesting that it’s not just a matter of market inefficiency).
  • I’ve added REITs and CCFs as major asset classes for diversification purposes. The larger-than-average numbers and the addition of CCFs are influenced by a lot of reading; two of the strongest articles on that are The Benefits of Low Correlation and The Rewards of Multiple-Asset-Class Investing. This year has been a good illustration of how adding some CCFs can decrease volatility. I’ve added some “standard” plans at the bottom of the spreadsheet so you can see how the performance compares.
  • Lastly, I’m over-weighted on international, at least by conventional wisdom (although, if you wanted a true index, you’d actually not be far off if you invested by market cap). Financial advisors usually recommend something like a 70/30 or 80/20 split favoring domestic investment because historically the US market has been safer, more efficient, and better performing, but I don’t think that’s as true moving forward. Also, since it’s as likely as not that I’ll be living/traveling internationally, the relative-buying-power argument for overweighting your local market doesn’t make as much sense for me either.
  • There aren’t really many (any?) good general purpose tools for keeping track of your investments long term (that generate personal rate of return factoring taxes, dividends, etc). That might be a neat little tool to design. I started writing a scraper for my Vanguard account, so I might have something useful sooner rather than later there.

And yes, ultimately, the exact percentages of your asset allocation aren’t as important as picking something reasonable, sticking with it, and rebalancing regularly. That pretty much sums up what I’ve learned and the extent of my future maintenance. Although in case I don’t get around to elaborating for a while, here are the main takeaways:

  • Get familiar with Modern Portfolio Theory, particularly the Random Walk – Very few of the most dedicated, privileged, and informed investors consistently beat the market. You are unlikely to do much better.
  • Costs – One of the few things you can control (besides asset allocation) is costs (expense ratios, and to some degree taxes). John Bogle is of course the most famous champion of this. Here’s a more recent article of his talking about AA and costs (along w/ the standard charts showing aggregate differences).
  • Losses are worse than gains – this is a simple bit of mathematics, but it’s worth elaborating on as it’s at the center of MPT and diversification. Basically, if you lose 50% you need to gain 100% just to get back to where you started. Cumulative losses are even worse, as compounding (Einstein’s quoted “most powerful force in the universe”) cuts both ways. A tool like FIRECalc will generate very pretty graphs of the range of badness depending on when you start withdrawing money.
  • Rebalance – Basically it’s a rule that forces you to buy low and sell high. An annual or semi-annual rebalance is the only time you really need to look at how your investments are doing. The rest is just watching grass grow (or giving yourself ulcers), unless you’re doing some crazy tactical asset allocation scheme. There are lots of different articles discussing when/how often to do this, but not much conclusive that isn’t dependent on backtesting. Picking once a year seems like a good compromise between increased risk (as your AA skews) and minimizing transaction costs.
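
The loss/gain asymmetry and the rebalancing rule above are easy to make concrete. Here’s a minimal sketch (the function names and the sample portfolio are mine, purely for illustration):

```python
# Two bits of arithmetic from the notes above, as code.

def gain_to_recover(loss):
    """Fractional gain needed to recover from a fractional loss."""
    return loss / (1 - loss)

# Losing 50% requires a 100% gain just to get back to even:
recovery = gain_to_recover(0.50)  # -> 1.0, i.e. +100%

def rebalance(holdings, targets):
    """holdings: {asset: current value}; targets: {asset: weight, summing to 1}.
    Returns the dollar amount to buy (+) or sell (-) per asset."""
    total = sum(holdings.values())
    return {asset: targets[asset] * total - holdings.get(asset, 0.0)
            for asset in targets}

# After a stock run-up skews a 60/40 target, rebalancing forces you to
# sell the winner and buy the laggard:
trades = rebalance({"stocks": 70000.0, "bonds": 30000.0},
                   {"stocks": 0.6, "bonds": 0.4})
# trades == {"stocks": -10000.0, "bonds": 10000.0}
```

Running the rebalance after a year where stocks outran bonds tells you to sell stocks and buy bonds – exactly the forced buy-low/sell-high behavior described above.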

These, along w/ the notes I mentioned above, I think really cover most of the basics. The only other really big thing is to learn about the tax implications (CGT, tax efficiency, and tax loss harvesting) and why tax-deferred investing is your friend. If there’s interest, I can post a spreadsheet that illustrates some of the latter.

Oh, also, listen to Warren Buffett.

Firefox 3, Developing and Browsing


I tend to leave a lot of tabs open when I’m browsing (mostly because of a lack of good research/organizing tools – but that’s another day’s post). Right now for example, I have 231 tabs open in 36 windows (I’ve been working on a counting extension which I should be releasing soon).

Firefox 3 has been great for my style of browsing – the Session Restore is more reliable, and overall it’s much zippier and generally more stable. The last part however comes with a caveat: I’ve found that while FF3 is more stable on a fresh install, the instability caused by extensions has also greatly increased. A few weeks ago, after it got particularly bad, I made a fresh profile, moved my data over, and now just have Adblock Plus and Filterset.G installed. This has seemed to work pretty well for my day to day browsing.

But what about development? Glad you asked. I now have a separate copy of Firefox 3 in my Applications folder named “Firefox Dev” that runs a separate “Development” profile – John Resig has a good writeup on how to set this up. To help reduce confusion when app switching, I also replaced the ICNS file with an alternate set (copy the new firefox.icns file into FFDev.app/Contents/Resources/).

And now I’m free to run lots of leaky/beta (but useful for dev) extensions in a separate process:

(Listing generated courtesy of InfoLister)

Electric Cars, Tilting At Windmills

Philip Greenspun (yeah, that one) recently wrote a post on the cost of converting the entire U.S. to electric cars that has kicked off a pretty interesting discussion. I ended up doing my own back-of-the-napkin calculations, which pleasantly appear to corroborate what Brad Templeton (yeah, that one) posted at the same time.

In chasing down the numbers (lots and lots of searching) and reading the comments (and looking at some other recent discussions, like the recent O’Reilly search/platform posts), I was reminded how lame the current technology for online deliberation is. Even simple things like representing citations, agreement, rebuttals, etc. aren’t supported at all. It’d be interesting to try to bring some of the research and projects being done on deliberative software to the web at large. (What’s interesting is that, 5+ years since I was last really into this space, nothing usable has really come out/caught on… that in itself is worth further exploration.)

This conversation also got me to finally watch Who Killed The Electric Car? and really dig into the history and the latest stuff going on with EVs and PHEVs. There’s a lot there, which I’ll eventually be posting (again, pointing to the complete dearth of good knowledge collection/research tools). Wherefore art thou Gobbler? (That’s a rhetorical question. teachers.yahoo.com got shit-canned and nothing ever came out of the gobbler despite its obvious usefulness just about everywhere.)

What was sad about the documentary was that it highlighted a true failure of collective action. The “climax” of the doc films a scene that takes place in February/March 2005 (modern times! with Internet!) when GM sends the last of the EV1s sitting in a lot to destruction while protesters try to stop them. Despite having collected $1.9M to try to buy back the cars, keeping vigil for almost a month, and getting arrested for civil disobedience, GM “prevails” (apparently, even now there continues to be bizarre controversy over the few remaining EV1s). In the larger scheme of things, that particular battle was lost years ago, when the CARB regulations were overturned (this actually appears to be a continuing battle), and GM’s “victory” was Pyrrhic, at best (the ironies are many-fold, like GM selling their Ovonics battery stake to Chevron, with those batteries now being used in Toyota Priuses, or that once again, oil prices are killing Detroit).

Well, it’s easy to get distracted by schadenfreude, but here we are, 30 years later, with no progress made, and the evidence now abundantly clear (and getting consistently worse) that our energy consumption habits are unsustainable and likely to bite us in the ass sooner rather than later. From this perspective, GM and the whole auto industry were just a bunch of marks, no less bamboozled or inveigled than anyone else. After all, they’re simply working within their short-term economic self-interest. The ten-year outlook doesn’t even register in the 10-Q, much less the “fate of the world” or the “consequences for your grand-children.” In economic terms, those would be externalities – not their problem, and not a cost that they would have to bear. This, I believe, applies doubly so for the Oil Industry, which I’ve concluded are the true mustache-twirling villains (well, probably the most apt comparison would be to the tobacco companies, although the Oil Industry dwarfs them, hence the caps). Quoting myself:

It seems to me (and I’m sort of surprised that others haven’t mentioned it) that the real reason this is a thought experiment is that the government and the drivers aren’t the sole (or even the primary) actors, but rather the oil and related industries. The combined market cap of the oil & gas industry is just shy of $1.9 trillion – larger than #2 (drug manufacturers) and #3 (central banking) combined. Exxon Mobil (XOM) alone is $480B – larger than the combined market cap of all auto manufacturers ($381B). XOM currently has an EBITDA of $76.5B. It continues to post record profits as oil prices have gone up (currently 35.4% quarterly revenue and 17.3% quarterly profit growth). The profits they stand to reap over the next several decades (which is the timeframe – remember XOM started as Standard Oil back in 1870) are in the trillions.

Now, here’s what might be the most interesting continuation of this thought experiment: in light of this sort of economic landscape and these economic motivations, what would a transition plan to a saner automotive and energy system look like? How would you make the dominoes fall, and what would be involved? Hmm…

I’ve been giving it some thought, and it’s a tough problem. It seems that there’s a number of strings that could/should be tugged to sort of unravel things. What worries me is that while reasonably, it would seem that we should all be taking this quite seriously (after all, no matter how rich you are, global environmental problems would end up affecting you and your progeny), it’s quite plausible that as individuals, a society, a species, we just don’t have the mental faculties or social organization to rationally/cohesively react to global/species-affecting crises.

That’s one way of saying that the technology of social augmentation and collective action may have a key role to play in catching us up, so that we can be as smart in fixing our problems as we have been dumb in causing them.

(For those looking for Don Quixote references, sorry, I was just using the phrase as a way to sum up the gist of the second part of this post. I’ll change the title if I can think of something more apt or descriptive.)

Rearchitecting Twitter: Brought to You By the 17th Letter of the Alphabet

Since it seemed to be the thing to do, I sat down for about an hour Friday afternoon and thought about how I’d architect a Twitter-like system. And, after a day of hanging out and movie watching, and since Twitter went down again while I was twittering (with a more detailed explanation: “Twitter is currently down for database replication catchup.”; see also) I thought I’d share what I came up with — notably, since my design doesn’t really have much DB replication involved in it.

Now, to preface: this proposal is orthogonal to the issue of whether statuscasts should be decentralized and what that protocol should look like (yes, they should be, and XMPP, respectively). That is, any decentralized system would inevitably require large-scale service providers and aggregators, getting you back to the same architecture problem.
So now onto the meat of it.

As Alex’s post mentions (but it’s worth reiterating), at its core Twitter is two primary components: a message routing system, where updates are received and processed, and a message delivery system, where updates are delivered to the appropriate message queues (followers). Privacy, device routing, groups, filtering, and triggered processing are additional considerations (only the first two are currently implemented in Twitter).

Now this type of system sounds familiar, doesn’t it? What we’re looking at most closely resembles a very large email system with a few additional notifications on reception and delivery, and being more broadcast oriented (every message includes lots of CCs and inboxes are potentially viewable by many). Large email systems are hard, but by no means impossible, especially if you have lots of money to throw at it (*ahem* Yahoo!, Microsoft, Google).

Now, how might you go about designing such a thing on the cheap w/ modern technologies? Here’s the general gist of how I’d try it:

  • Receive queue
    • Receive queue server – this cluster won’t hit limits for a while
    • Canonical store – the only bit that may be DB-based, although I’d pick one of the new fancy-schmancy non-relational data stores on a DFS; mostly writes, since you’d very rarely need to query (only if, say, you had to checkpoint and rebuild queues after something disastrous happening or profile changes). You’d split the User and Message stores, of course
    • Memory-based attribute lookups for generating delivery queue items
    • Hooks for receive filters/actions
  • Delivery queues – separate queues for those w/ large follower/following counts, and separate queues for high priority/premium customers
    • Full messages delivered into DFS-based per-user inboxes (a recent mbox, then date-windowed mboxes generated lazily – mboxes are particularly good w/ cheap appends)
    • Write-forward only (deletes either appended or written to a separate list and applied on display)
    • Hooks for delivery filters/actions (ie…)
  • Additional queues for alternate routing (IM, SMS delivery, etc) called by deliver hooks
  • The Web and API is standard caching, perhaps with some fanciness on updating stale views (queues, more queues!)
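
To make the delivery design above concrete, here’s a toy in-memory sketch of the write-forward inbox scheme (dicts stand in for the DFS-backed mboxes, and all the names are mine, just for illustration):

```python
from collections import defaultdict

# Fan out on write to per-user append-only inboxes, with deletes recorded
# as tombstones that are applied only at read time ("write-forward only").
# In the real design the inboxes would be mbox files on a DFS.

class StatusStore:
    def __init__(self):
        self.inboxes = defaultdict(list)    # user -> appended (msg_id, text)
        self.tombstones = defaultdict(set)  # user -> deleted msg_ids
        self.followers = defaultdict(set)   # author -> set of followers

    def follow(self, follower, author):
        self.followers[author].add(follower)

    def post(self, author, msg_id, text):
        # Delivery: append the full message to the author's inbox and
        # every follower's inbox (cheap appends, no DB writes).
        for user in {author} | self.followers[author]:
            self.inboxes[user].append((msg_id, text))

    def delete(self, user, msg_id):
        # Never rewrite an inbox; just record the delete.
        self.tombstones[user].add(msg_id)

    def read(self, user):
        # Tombstones are applied at display time.
        return [text for msg_id, text in self.inboxes[user]
                if msg_id not in self.tombstones[user]]

store = StatusStore()
store.follow("bob", "alice")
store.post("alice", 1, "hello world")
store.delete("bob", 1)  # bob hides the message; alice's copy is untouched
```

Note that nothing here ever mutates an existing record in place, which is what makes the real thing amenable to cheap mbox appends and lazy compaction.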

Note that this architecture practically never touches the DB, is almost completely asynchronous, and shouldn’t have a SPOF – that is, you should never get a service interruption, just staleness until things clear out. Also, when components hotspot, they can be optimized individually (there are lots of ways to do it; probably the first would be to create buffers for bundled writes and larger queue windows, or to simply defer writes to no more than once a minute or so. You can also add more queues and levels of queues/classes.)

The nice thing about this is that, technologically, the main thing you have to put together that isn’t already out there is a good consistently-hashed/HA queue cluster. The only other bit of fanciness is a good DFS. MogileFS is more mature, although HDFS has the momentum (and perhaps, one day soon, atomic appends *grumble* *grumble*).
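
For the consistently-hashed queue cluster piece, the core primitive is a hash ring. Here’s a minimal sketch (the node names are made up, and the use of virtual replicas to smooth the key distribution is a standard technique, not anything Twitter-specific):

```python
import bisect
import hashlib

# Toy consistent-hash ring: the kind of primitive the queue cluster
# described above needs, so that adding/removing a queue node only
# remaps a small slice of the keyspace instead of rehashing everything.

class HashRing:
    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, smooths distribution
        self.ring = []            # sorted list of (hash point, node)
        for node in nodes:
            self.add(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            bisect.insort(self.ring, (self._hash(f"{node}:{i}"), node))

    def node_for(self, key):
        # Walk clockwise to the first point at or past the key's hash.
        points = [p for p, _ in self.ring]
        i = bisect.bisect(points, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["queue-a", "queue-b", "queue-c"])
node = ring.node_for("user:12345")  # deterministic: same key, same queue node
```

The payoff of the virtual replicas is that when a node joins or leaves, only the keys in its slices of the ring move to a new owner.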

Now, that’s not to say there wouldn’t be a lot of elbow grease to do it, especially for the loads of instrumentation you’d want to monitor it all, and that there aren’t clever ways to save on disk space (certainly I know for sure at least two of the big three mail providers are doing smart things with their message stores), but creating a system like this to get to Internet scale is pretty doable. Of course, the fun part would be to test the design with a realistic load distribution…

Blueball-o-rama

After the craziness last week (Where 2.0, WhereCamp, and a GAE Hackathon in between), I was looking forward to taking a breather, but have instead jumped headlong into some much-delayed music hacking and getting serious (with self-imposed deadlines) about Objective-C. I’m also catching up on publishing stuff from last week, so here’s the summary of my Bluetooth work.

As Brady mentioned in his Radar writeup, Blueball came primarily out of discussions on how to do an interesting Fireball-like service/tool for a single-track, stuck-in-the-middle-of-nowhere (sorry Burlingame) conference. Also, my desire for Brady to say “blueball” on stage. (score!) Fireball and Blueball are also both parts of a larger track that I’m exploring. It may take a while, but hopefully something interesting will start emerging.

I had a session on Proximity and Relative Location at WhereCamp, where I stepped through my code (really simple collectors, very half-assed visualizations running a simple spring graph) and talked a bit about the useful things that can be done with this sort of sensing.

The particularly interesting bits (IMO) are in applying the type of thinking that was being done on Bluetooth scatternets a few years back on patching together “piconets.” That is, by stitching together the partial meshes, you can pull out all sorts of transitive (inferred) properties. There are of course visualizations and pattern extraction you can do on that, but by matching the relative with the absolutes, you can get far wider coverage for LBS and related services. And of course, you can do your own reality mining on social connections when you start relating devices to people.
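
Here’s a tiny sketch of what “stitching the partial meshes” might look like in practice (the device names and the pair-sighting input format are made up for illustration):

```python
from collections import defaultdict, deque

# Each scanner only sees devices in its own radio range, but merging the
# partial sightings into one graph lets you infer transitive proximity --
# and, anchored on one absolutely-located node, rough positions for the rest.

def build_graph(sightings):
    """sightings: iterable of (device_a, device_b) co-sighting pairs."""
    graph = defaultdict(set)
    for a, b in sightings:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def hops_from(graph, anchor):
    """BFS hop counts from a device with a known absolute location."""
    dist = {anchor: 0}
    queue = deque([anchor])
    while queue:
        cur = queue.popleft()
        for nbr in graph[cur]:
            if nbr not in dist:
                dist[nbr] = dist[cur] + 1
                queue.append(nbr)
    return dist

g = build_graph([("phone1", "laptop1"), ("laptop1", "phone2"),
                 ("phone2", "phone3")])
# phone3 was never directly sighted by phone1's scanner, but the merged
# mesh still places it 3 hops away.
```

From there, the fun part is layering time windows and signal strength on the edges to turn hop counts into something closer to actual relative location.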


blueball v1 from lhl on Vimeo.