Google Spreadsheet, My Asset Allocation, Investing 101

I recently started porting my spreadsheets over to Google Spreadsheets. I’d long resisted Google Docs, but I have to say that it’s been pretty painless. The imports have mostly worked, though with limited success on fancier formatting and charts, and I’ve run into some annoyances (it’d be really nice if there were a “view source” mode where you could just edit formulas, a mass replace to change a specific column in a formula, and a paste w/o formatting option). There are good things that have made it worthwhile, like easy collaboration/sharing and painless publishing.

Among the spreadsheets I’ve been importing are some of my financial docs. One nice function is GoogleFinance(), which lets you get not only current but also historical stock data. Random tidbit: when returning historical quotes, the GoogleFinance() function wants to dump an array into multiple cells. Here, for example, is how you’d return just the closing price of VTSMX from January 1st:

=INDEX(GoogleFinance("VTSMX", "close", "1/1/2008"), 2, 2)

A couple years ago, it occurred to me that learning some basic finance/investing (which sadly, isn’t really taught anywhere in the school system) might not be a bad idea. Except for the odd post or two, I haven’t published much about it, even though my interest has taken me far into “financial geek” territory (like reading the Journal of Indexes for fun). My original plan was to kick off a financial section for my blog, but since that’s going to have to wait at least a few months, I thought I’d at least get the ball rolling by publishing the target asset allocation that I ended up with (and presumably, will be sticking to for a long while):

Most of it’s pretty standard, with a few exceptions.

  • I’ve sliced and diced because a decade plus after Fama and French published on the Three Factor Model, it still seems to hold up (suggesting that it’s not just a matter of market inefficiency).
  • I’ve added REITs and CCFs as major asset classes for diversification purposes. The larger than average numbers and addition of CCFs are influenced by a lot of reading. Two of the strongest articles on that are: The Benefits of Low Correlation and The Rewards of Multiple-Asset-Class Investing. This year has been a good illustration of how adding some CCFs can decrease volatility. I’ve added some “standard” plans at the bottom of the spreadsheet so you can see how the performance compares.
  • Lastly, I’m over-weighted on international, at least by conventional wisdom (although, if you wanted a true index, you’d actually not be far off investing by market cap). Financial advisors usually recommend something like a 70/30 or 80/20 domestic/international split because historically the US market has been safer, more efficient, and better performing, but I don’t think that’s as true moving forward. Also, since it’s as likely as not that I’ll be living/traveling internationally, the relative buying power argument for overweighting your local market doesn’t make as much sense for me either.
  • There aren’t really many (any?) good general-purpose tools for keeping track of your investments long term (ones that generate your personal rate of return factoring in taxes, dividends, etc.). That might be a neat little tool to design. I started writing a scraper for my Vanguard account, so I might have something useful there sooner rather than later.
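For what it’s worth, the core of such a tool would be a money-weighted (XIRR-style) return calculation over dated cash flows. Here’s a minimal sketch; the `xirr` helper, its bisection bounds, and the cash flows are all made up for illustration, not anything Vanguard exposes:

```python
from datetime import date

def xirr(cashflows, lo=-0.99, hi=10.0, tol=1e-7):
    """Money-weighted annual return for dated cash flows, via bisection.

    cashflows: list of (date, amount); negative = money in (purchases),
    positive = money out (sales, or the final portfolio value).
    """
    t0 = min(d for d, _ in cashflows)

    def npv(rate):
        return sum(amt / (1.0 + rate) ** ((d - t0).days / 365.25)
                   for d, amt in cashflows)

    # NPV is decreasing in rate for the usual invest-then-withdraw
    # pattern, so bisect for the root.
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if npv(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# $10,000 invested, worth $11,000 a year later: roughly a 10% annual return
flows = [(date(2007, 1, 1), -10000), (date(2008, 1, 1), 11000)]
```

Dividends and tax payments would just be additional dated flows, which is why this formulation scales to the messy real-world case.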

And yes, ultimately, the exact percentages of your asset allocation aren’t as important as picking something reasonable, sticking with it, and rebalancing regularly. That pretty much sums up what I’ve learned and the extent of my future maintenance. Although in case I don’t get around to elaborating for a while, here are the main takeaways:

  • Get familiar with Modern Portfolio Theory, particularly the Random Walk – Very few of the most dedicated, privileged, and informed investors consistently beat the market. You are unlikely to do much better.
  • Costs – One of the few things that you can control (besides Asset Allocation) is costs (expense ratios, to some degree taxes). John Bogle is of course the most famous champion of this. Here’s a more recent article of his talking about AA and costs (along w/ the standard charts showing aggregate differences)
  • Losses are worse than gains – this is a simple bit of mathematics, but it’s worth elaborating on since it’s at the center of MPT and diversification. Basically, if you lose 50% you need to gain 100% just to get back to where you started. Cumulative losses are even worse, as compounding (Einstein’s apocryphal “most powerful force in the universe”) cuts both ways. A tool like FIRECalc will generate very pretty graphs of the range of badness depending on when you start withdrawing money.
  • Rebalance – Basically it’s a rule that forces you to buy low and sell high. An annual or semi-annual rebalance is really the only time you need to look at how your investments are doing. The rest is just watching grass grow (or giving yourself ulcers) unless you’re doing some crazy tactical asset allocation scheme. There are lots of articles discussing when/how often to do this, but little that’s conclusive and not dependent on backtesting. Picking one year seems like a good compromise between increased risk (as your AA skews) and minimizing transaction costs.
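Two of the points above are just arithmetic, so here’s a quick sketch of both; the numbers are made up, and `gain_to_recover` and `rebalance_trades` are hypothetical helpers, not from any particular library:

```python
def gain_to_recover(loss):
    """Fractional gain needed to climb back from a fractional loss."""
    return loss / (1.0 - loss)

# lose 50% -> need +100% just to break even
assert gain_to_recover(0.50) == 1.0

def rebalance_trades(holdings, targets):
    """Dollar amount to buy (+) or sell (-) per asset to hit target weights.

    holdings: {asset: current dollar value}; targets: {asset: weight}.
    """
    total = sum(holdings.values())
    return {a: targets[a] * total - holdings[a] for a in holdings}

# A 60/40 portfolio where stocks ran up to 70/30: rebalancing mechanically
# sells the winner and buys the laggard, i.e. "buy low, sell high."
trades = rebalance_trades({"stocks": 70000, "bonds": 30000},
                          {"stocks": 0.60, "bonds": 0.40})
```

Running this, `trades` comes out to selling $10,000 of stocks and buying $10,000 of bonds, which is the whole discipline in one dictionary.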

These, along w/ the notes I mentioned above, I think really cover most of the basics. The only other really big thing is to learn about the tax implications (CGT, tax efficiency, and tax loss harvesting) and why tax-deferred investing is your friend. If there’s interest, I can post a spreadsheet that illustrates some of the latter.

Oh, also, listen to Warren Buffett.

Firefox 3, Developing and Browsing


I tend to leave a lot of tabs open when I’m browsing (mostly because of a lack of good research/organizing tools – but that’s another day’s post). Right now for example, I have 231 tabs open in 36 windows (I’ve been working on a counting extension which I should be releasing soon).

Firefox 3 has been great for my style of browsing – the Session Restore is more reliable, and overall it’s much zippier and generally more stable. The last part however comes with a caveat: I’ve found that while FF3 is more stable on a fresh install, the instability caused by extensions has also greatly increased. A few weeks ago, after it got particularly bad, I made a fresh profile, moved my data over, and now just have Adblock Plus and Filterset.G installed. This has seemed to work pretty well for my day to day browsing.

But what about development? Glad you asked. I now have a separate copy of Firefox 3 in my Applications folder named “Firefox Dev” that runs a separate “Development” profile – John Resig has a good writeup on how to set this up. To help reduce confusion when app switching, I also replaced the ICNS file with an alternate set (copy the new firefox.icns file into FFDev.app/Contents/Resources/).
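For reference, here’s roughly what that setup looks like from the command line; the app path, profile name, and icon source path are assumptions based on my own setup, so adjust to taste:

```shell
# Create a second profile named "Development" (the name is arbitrary):
/Applications/FFDev.app/Contents/MacOS/firefox-bin -CreateProfile Development

# Launch the dev copy against that profile as its own process
# (-no-remote keeps it from attaching to the already-running Firefox):
/Applications/FFDev.app/Contents/MacOS/firefox-bin -no-remote -P Development &

# Swap in the alternate icon so the two apps are distinguishable:
cp ~/Downloads/alternate-firefox.icns \
   /Applications/FFDev.app/Contents/Resources/firefox.icns
```

The `-no-remote` flag is the important bit; without it the second launch just focuses the existing instance instead of starting a separate process.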

And now I’m free to run lots of leaky/beta (but useful for dev) extensions in a separate process:

(Listing generated courtesy of InfoLister)

Electric Cars, Tilting At Windmills

Philip Greenspun (yeah, that one) recently wrote a post on the cost of converting the entire U.S. to electric cars that has kicked off a pretty interesting discussion. I ended up doing my own back-of-the-napkin calculations, which pleasantly appear to corroborate what Brad Templeton (yeah, that one) posted at the same time.

In chasing down the numbers (lots and lots of searching), and reading the comments (and looking at some other recent discussions, like the recent O’Reilly search/platform posts) I was reminded how lame the technology currently is in terms of online deliberation tools. Even simple things like representing citations, agreement, rebuttals, etc. aren’t supported at all. It’d be interesting to try to bring some of the research and projects being done on deliberative software to the web at large. (What’s interesting is that 5+ years since I was last really into this space, nothing usable has really come out/caught on… that in itself is worth further exploration.)

This conversation also got me to finally watch Who Killed The Electric Car? and really dig into the history and the latest stuff going on with EVs and PHEVs. There’s a lot there, which I’ll eventually be posting (again, pointing to the complete dearth of good knowledge collection/research tools). Wherefore art thou Gobbler? (That’s a rhetorical question. teachers.yahoo.com got shit-canned and nothing ever came of the Gobbler despite its obvious usefulness just about everywhere.)

What was sad about the documentary was that it highlighted a true failure of collective action. The “climax” of the doc films a scene that takes place in February/March 2005 (modern times! with Internet!) when GM sends the last of the EV1s sitting in a lot to destruction while protesters try to stop them. Despite the protesters collecting $1.9M to try to buy back the cars, keeping vigil for almost a month, and getting arrested for civil disobedience, GM “prevails” (apparently, even now there continues to be bizarre controversy over the few remaining EV1s). In the larger scheme of things, that particular battle was lost years ago, when the CARB regulations were overturned (this actually appears to be a continuing battle), and GM’s “victory” was Pyrrhic, at best (the ironies are many-fold, like GM selling its Ovonics battery stake to Chevron, whose batteries are now used in the Toyota Prius, or that once again, oil prices are killing Detroit).

Well, it’s easy to get distracted by schadenfreude, but here we are, 30 years later, with no progress made, and the evidence now abundantly clear (and getting consistently worse) that our energy consumption habits are unsustainable and likely to bite us in the ass sooner rather than later. From this perspective, GM and the whole auto industry were just a bunch of marks, no less bamboozled or inveigled than anyone else. After all, they’re simply working within their short-term economic self-interest. The ten-year outlook doesn’t even register in the 10-Q, much less the “fate of the world” or the “consequences for your grandchildren.” In economic terms, those would be externalities – not their problem, and not a cost they would have to bear. This, I believe, applies doubly so for the Oil Industry, which I’ve concluded is the true mustache-twirling villain (well, probably the most apt comparison would be to the tobacco companies, although the Oil Industry dwarfs them, hence the caps). Quoting myself:

It seems to me (and I’m sort of surprised that others haven’t mentioned it) that the real reason this is a thought experiment is that the government and the drivers aren’t the sole (or even the primary) actors; the oil and related industries are. The combined market cap of the oil & gas industry is just shy of $1.9 trillion – larger than #2 (drug manufacturers) and #3 (central banking) combined. Exxon Mobil (XOM) alone is $480B – larger than the combined market cap of all auto manufacturers ($381B). XOM currently has an EBITDA of $76.5B. It continues to post record profits as oil prices have gone up (currently 35.4% quarterly revenue and 17.3% quarterly profit growth). The profits it stands to reap over the next several decades (which is the timeframe – remember, XOM started as Standard Oil back in 1870) are in the trillions.

Now what might be a most interesting continuation of this thought experiment is that, in light of this sort of economic landscape and these economic motivations, what would a transition plan to a saner automotive and energy system look like? How would you make the dominoes fall and what would be involved? Hmm…

I’ve been giving it some thought, and it’s a tough problem. It seems there are a number of strings that could/should be tugged to start unraveling things. What worries me is that while it would seem reasonable for us all to be taking this quite seriously (after all, no matter how rich you are, global environmental problems would end up affecting you and your progeny), it’s quite plausible that as individuals, a society, a species, we just don’t have the mental faculties or social organization to rationally/cohesively react to global, species-affecting crises.

That’s one way of saying that the technology of social augmentation and collective action may have a key role to play in catching us up, so that we can be as smart in fixing our problems as we have been dumb in causing them.

(For those looking for Don Quixote references, sorry, I was just using the phrase as a way to sum up the gist of the second part of this post. I’ll change the title if I can think of something more apt or descriptive.)

Rearchitecting Twitter: Brought to You By the 17th Letter of the Alphabet

Since it seemed to be the thing to do, I sat down for about an hour Friday afternoon and thought about how I’d architect a Twitter-like system. And, after a day of hanging out and movie watching, and since Twitter went down again while I was twittering (with a more detailed explanation: “Twitter is currently down for database replication catchup.”; see also) I thought I’d share what I came up with — notably, since my design doesn’t really have much DB replication involved in it.

Now, to preface, this proposal is orthogonal to the issue of whether statuscasts should be decentralized and what that protocol should look like (yes, they should be, and XMPP, respectively). That is, any decentralized system would inevitably require large-scale service providers and aggregators, which gets you back to the architecture problem.
So now onto the meat of it.

As Alex’s post mentions (but it’s worth reiterating), at its core Twitter is two primary components: a message routing system, where updates are received and processed, and a message delivery system, where updates are delivered to the appropriate message queues (followers). Privacy, device routing, groups, filtering, and triggered processing are additional considerations (only the first two are currently implemented in Twitter).

Now, this type of system sounds familiar, doesn’t it? What we’re looking at most closely resembles a very large email system with a few additional notifications on reception and delivery, one that’s more broadcast-oriented (every message includes lots of CCs, and inboxes are potentially viewable by many). Large email systems are hard, but by no means impossible, especially if you have lots of money to throw at the problem (*ahem* Yahoo!, Microsoft, Google).

Now, how might you go about designing such a thing on the cheap w/ modern technologies? Here’s the general gist of how I’d try it:

  • Receive queue
    • Receive queue server – this cluster won’t hit limits for a while
    • Canonical store – the only bit that may be DB-based, although I’d pick one of the new fancy-schmancy non-relational data stores on a DFS; mostly writes, since you’d very rarely need to query (only if, say, you had to checkpoint and rebuild queues after something disastrous happened, or on profile changes). You’d split the User and Message stores of course
    • Memory-based attribute lookups for generating delivery queue items
    • Hooks for receive filters/actions
  • Delivery queues – separate queues for users w/ large follower/following counts, and separate queues for high-priority/premium customers
    • Full messages delivered into DFS-based per-user inboxes (a recent mbox, then date-windowed mboxes generated lazily – mboxes are particularly good w/ cheap appends)
    • Write-forward only (deletes either appended or written to a separate list and applied on display)
    • Hooks for delivery filters/actions (ie…)
  • Additional queues for alternate routing (IM, SMS delivery, etc) called by deliver hooks
  • The Web and API are standard caching, perhaps with some fanciness on updating stale views (queues, more queues!)
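To make the receive/delivery split concrete, here’s a toy in-process sketch of the fan-out; the deque and dicts stand in for the real clustered queues and DFS-backed inboxes, and all the names are made up for illustration:

```python
import collections
import time

# Stand-ins for the real clustered queue and per-user DFS inboxes.
followers = {"alice": {"bob", "carol"}, "bob": {"carol"}}
receive_queue = collections.deque()
inboxes = collections.defaultdict(list)   # per-user, append-only ("mbox")

def post(user, text):
    """Receive side: accept the update and enqueue it; no DB in the path."""
    receive_queue.append({"from": user, "text": text, "ts": time.time()})

def deliver():
    """Delivery side: fan each queued message out to follower inboxes."""
    while receive_queue:
        msg = receive_queue.popleft()
        inboxes[msg["from"]].append(msg)             # sender's own timeline
        for f in followers.get(msg["from"], ()):     # plus every follower
            inboxes[f].append(msg)

post("alice", "rearchitecting twitter")
deliver()
```

The point of writing it this way is that `post` and `deliver` never block on each other: reception is just an enqueue, and delivery is pure appends, which is where the “almost completely asynchronous” claim below comes from.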

Note that this architecture practically never touches the DB, is almost completely asynchronous, and shouldn’t have a SPOF – that is, you should never get a service interruption, just staleness until things clear out. Also, when components hotspot, they can be optimized individually (there are lots of ways to do it; probably the first would be to create buffers for bundled writes and larger queue windows, or simply to defer writes to no more than once a minute or so. You can also add more queues and levels of queues/classes.)
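As a sketch of that bundled-writes optimization: buffer appends in memory and only write through at most once per time window. The `BufferedAppender` class and its parameters are my invention, and the list `sink` stands in for an inbox file on the DFS:

```python
import time

class BufferedAppender:
    """Bundle appends and flush at most once per `interval` seconds."""

    def __init__(self, sink, interval=60.0):
        self.sink = sink              # stand-in for a DFS-backed inbox
        self.interval = interval
        self.buf = []
        self.last_flush = time.monotonic()

    def append(self, item):
        self.buf.append(item)
        # Only write through once the window has elapsed.
        if time.monotonic() - self.last_flush >= self.interval:
            self.flush()

    def flush(self):
        if self.buf:
            self.sink.extend(self.buf)   # one bundled write
            self.buf.clear()
        self.last_flush = time.monotonic()

sink = []
w = BufferedAppender(sink, interval=60.0)
for i in range(5):
    w.append(i)        # buffered in memory, not yet written
w.flush()              # e.g. on shutdown, or when the window closes
```

The trade-off is exactly the staleness-for-throughput one described above: followers may see a timeline that’s up to a minute behind, but the write load drops by the bundling factor.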

The nice thing about this is that, technologically, the main thing you have to put together that isn’t already out there is a good consistently-hashed/HA queue cluster. The only other bit of fanciness is a good DFS. MogileFS is more mature, although HDFS has the momentum (and perhaps, one day soon, atomic appends *grumble* *grumble*).
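In case consistent hashing is unfamiliar, here’s a toy ring showing why it matters for a queue cluster: keys map stably to nodes, and adding or removing a node only remaps a small fraction of keys instead of reshuffling everything. The `HashRing` class is illustrative, not a production implementation:

```python
import bisect
import hashlib

class HashRing:
    """Toy consistent-hash ring mapping keys (users, queues) to nodes."""

    def __init__(self, nodes, replicas=100):
        # Each node gets `replicas` virtual points on the ring for balance.
        self.ring = sorted(
            (self._hash(f"{n}:{i}"), n)
            for n in nodes for i in range(replicas))
        self.keys = [h for h, _ in self.ring]

    @staticmethod
    def _hash(s):
        return int(hashlib.md5(s.encode()).hexdigest(), 16)

    def node_for(self, key):
        # Walk clockwise to the first virtual point at or past the key.
        i = bisect.bisect(self.keys, self._hash(key)) % len(self.ring)
        return self.ring[i][1]

ring = HashRing(["queue-a", "queue-b", "queue-c"])
node = ring.node_for("user:12345")   # stable user -> queue-node assignment
```

With three nodes, growing to four should leave roughly three quarters of keys where they were, which is what lets you scale the queue cluster without a global rebalance.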

Now, that’s not to say there wouldn’t be a lot of elbow grease to do it, especially for the loads of instrumentation you’d want to monitor it all, and that there aren’t clever ways to save on disk space (certainly I know for sure at least two of the big three mail providers are doing smart things with their message stores), but creating a system like this to get to Internet scale is pretty doable. Of course, the fun part would be to test the design with a realistic load distribution…

Blueball-o-rama

After the craziness last week (Where 2.0, WhereCamp, and a GAE Hackathon in between), I was looking forward to taking a breather, but have instead jumped headlong into some much-delayed music hacking and getting serious (with self-imposed deadlines) about Objective-C. I’m also catching up on publishing stuff from last week, so here’s the summary of my Bluetooth work.

As Brady mentioned in his Radar writeup, Blueball came primarily out of discussions on how to do an interesting Fireball-like service/tool for a single-track, stuck-in-the-middle-of-nowhere (sorry Burlingame) conference. Also, my desire for Brady to say “blueball” on stage. (score!) Fireball and Blueball are also both parts of a larger track that I’m exploring. It may take a while, but hopefully something interesting will start emerging.

I had a session on Proximity and Relative Location at WhereCamp, where I stepped through my code (really simple collectors, very half-assed visualizations running a simple spring graph) and talked a bit about the useful things that can be done with this sort of sensing.

The particularly interesting bits (IMO) are in applying the kind of thinking that was being done on Bluetooth scatternets a few years back to patching together “piconets.” That is, by stitching together the partial meshes, you can pull out all sorts of transitive (inferred) properties. There are of course visualizations and pattern extraction you can do on that, but by matching the relative positions with the absolutes, you can get far wider coverage for LBS and related services. And of course, you can do your own reality mining on social connections when you start relating devices to people.
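The transitive part is easy to sketch: treat pairwise sightings as edges in a graph and take connected components, so a single absolute fix (say, one GPS reading) can position an entire inferred cluster. The device names and the `components` helper here are made up for illustration:

```python
import collections

def components(sightings):
    """Group devices into proximity clusters from pairwise Bluetooth
    sightings (transitive closure via graph traversal)."""
    graph = collections.defaultdict(set)
    for a, b in sightings:
        graph[a].add(b)
        graph[b].add(a)

    seen, clusters = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, cluster = [start], set()
        while stack:
            d = stack.pop()
            if d in cluster:
                continue
            cluster.add(d)
            stack.extend(graph[d] - cluster)
        seen |= cluster
        clusters.append(cluster)
    return clusters

# phone1 saw phone2, and phone2 saw laptop1, so all three are inferred
# to be together even though phone1 never saw laptop1 directly.
clusters = components([("phone1", "phone2"), ("phone2", "laptop1"),
                       ("phone3", "phone4")])
```

A real version would want time-windowed sightings and signal-strength weighting, but the transitive inference itself is just this.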


blueball v1 from lhl on Vimeo.

Catching Up On Reading

In between the projects and mini-projects, I’ve also begun to do some catching up on reading. Carrying around a book everywhere and taking lots of public transportation has helped tremendously in that regard. I started with Shirky’s Here Comes Everybody a few weeks ago, just finished Weinberger’s Everything Is Miscellaneous (that’s been sitting around since his original book tour when he came to Yahoo!) and polished off Cory’s new YA novel Little Brother in a single sitting (what’s so compelling and chilling is the realization that we’re about half-a-step away from a sort of grim-meathookiness. It gave me more of an appreciation of the fiction coming out of the 80s). And I’m now starting Glut.

One thing that’s interesting is that all these books have related/complementary media (podcasts, talks, etc.) attached to them (and all worth the time spent, I’d say, particularly Alex Wright’s Web That Wasn’t presentation). Now, obviously anyone who’s done much spelunking on Wikipedia (lately, I’ve begun doing a bit of random clicking around on Slideshare as well), or you know, random Internet browsing can tell you there’s a lot of “stuff” out there. However, I thought I’d share a list of some of the more useful/structured resources I’ve found (online video, lectures, etc):

This post sort of got kicked off while I was watching this very engaging recent talk (via) that I can’t favorite because YouTube still has a 500-favorite limit (now below 0.6 favs/day):

May 2008 Mix

It’s been a good couple months for music. Sometimes mixes don’t really have a name or a theme. Here’s some stuff I’ve been digging lately. 43 minutes.

Oh yeah, I’m trying Schiller‘s latest SoundManager 2 release (mostly based off of the inline player) and some custom posting code. (Mixwit doesn’t seem to have an API?)

Tech note: for those looking to use SoundManager, you might want to check whether your build has the onpause handler. If it doesn’t, you can add it to the defaultOptions at the top of the SM source and then call it in the pause() function around line 727 (just apply onpause the same way the onstop above it is applied).

Adium Productivity Tip

Recently, in my quest to minimize distractions, I’ve been going through and …fixing things. First was updating my procmail and my spam filtering so that the only mail that ends up in my Inbox is real mail. (Inbox Zero for the past month!) And here’s what I did for Adium so I could leave IM on while I was working:

Adium Productivity Tips

Turning off dock animations (no more bounces) and hiding Adium while in the background (no more window popping) were really huge. I also have my dock hidden, which helps as well. (The menu bar gives me enough cues so that I can see what’s going on, but that isn’t too bothersome while I’m trying to work.)

And of course, I have my work.py script, which lays out my Terminals appropriately. At some point I might start scripting Space switching for that, but at least right now, this seems to be working out OK.
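My work.py isn’t published, but here’s a hedged sketch of the general approach such a script can take on OS X: drive Terminal via osascript. The commands and window bounds below are placeholders, not my actual layout:

```python
import subprocess

# Hypothetical layout: (command to run, window bounds (left, top, right, bottom))
LAYOUT = [
    ("ssh devbox", (0, 0, 600, 400)),
    ("tail -f /var/log/system.log", (0, 420, 600, 800)),
]

def applescript_for(cmd, bounds):
    """Build the AppleScript that opens a Terminal window running `cmd`
    and positions it at `bounds`."""
    l, t, r, b = bounds
    return (f'tell application "Terminal"\n'
            f'  do script "{cmd}"\n'
            f'  set bounds of front window to {{{l}, {t}, {r}, {b}}}\n'
            f'end tell')

def layout():
    """Open and position every window in LAYOUT (OS X only)."""
    for cmd, bounds in LAYOUT:
        subprocess.run(["osascript", "-e", applescript_for(cmd, bounds)])

# Build-only example; call layout() to actually open the windows.
script = applescript_for("ssh devbox", (0, 0, 600, 400))
```

Scripting Spaces switching would mean another osascript call into System Events, which is where this approach starts getting fragile.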