Structured Blogging and Data Representation

Most conversation about structured blogging has dealt with the idea at the application and delivery level (microformats, etc.). I’ve been interested in the relationship between blogging and other loose KM applications (specifically wikis and outliners) for a while now. I believe that ultimately these applications are more alike than they are different, and can/should be intrinsically tied together (that’d make it a blikiliner, right?). However, I’ve been hung up on the best way to store, represent, and relate the common data. With the imminent promise of time to pursue these issues, I’ve started to pick my earlier work back up, with a fresh pair of eyes.

  • Caching – my original (and still current) approach to representing pieces of data (microcontent) has been to use directed graphs (with typed nodes and relationships, very much like the RDF model). One of the roadblocks I hit was that, unlike with trees, there are no high-performance ways to store this in SQL. Last year it occurred to me that I should completely forget that, store the graphs as simply as possible (nodes and relationships), and just build the partial sets (views) and cache those with an appropriate lookup table. (Or I could bite the bullet and see if using an RDF database makes sense.)
  • First-class data structures – this is actually something I’m still thinking about. In a graph model, maybe it doesn’t really matter as long as you can extract types and relationships. There are some things that you’d definitely want to extract (like external and internal hyperlinks), but that can be done pretty trivially post hoc… Once you’ve committed to that sort of data representation, all you have to worry about is how to usably combine that information. Still an open question is how fine-grained nodes should be, and how best to point within nodes (think about annotations and purple-number-style addressing). One could conceivably split elements into their DOM constituents pretty easily, but that increases the number of nodes you need to keep track of by an order of magnitude or two (though it might be a better alternative to an XPath-type approach).
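
To make the caching idea from the first bullet a bit more concrete, here is a minimal sketch of how it might look. Everything specific in it is an assumption for illustration rather than my actual implementation: it uses SQLite through Python's sqlite3 module, the table names (nodes, edges, view_cache) and function names are made up, a "view" is simply the list of node ids reachable from a root over one relationship type, and cache invalidation is deliberately crude.

```python
import json
import sqlite3

# Hypothetical schema: plain node and edge tables, plus a lookup table
# that caches precomputed views keyed by (root node, relationship type).
SCHEMA = """
CREATE TABLE nodes (id INTEGER PRIMARY KEY, type TEXT NOT NULL, content TEXT NOT NULL);
CREATE TABLE edges (src INTEGER NOT NULL, dst INTEGER NOT NULL, type TEXT NOT NULL);
CREATE TABLE view_cache (root INTEGER NOT NULL, edge_type TEXT NOT NULL,
                         payload TEXT NOT NULL, PRIMARY KEY (root, edge_type));
"""


def open_store():
    """Create an in-memory store (an assumption; a file path works the same way)."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def add_node(db, node_type, content):
    """Insert a typed node and return its id."""
    cur = db.execute("INSERT INTO nodes (type, content) VALUES (?, ?)",
                     (node_type, content))
    return cur.lastrowid


def add_edge(db, src, dst, edge_type):
    """Insert a typed, directed relationship and drop affected cached views."""
    db.execute("INSERT INTO edges (src, dst, type) VALUES (?, ?, ?)",
               (src, dst, edge_type))
    # Crude invalidation: throw away every cached view for this edge type.
    db.execute("DELETE FROM view_cache WHERE edge_type = ?", (edge_type,))


def build_view(db, root, edge_type):
    """Breadth-first walk over edges of one type; returns node ids, root first."""
    seen = [root]
    queue = [root]
    while queue:
        current = queue.pop(0)
        rows = db.execute(
            "SELECT dst FROM edges WHERE src = ? AND type = ? ORDER BY dst",
            (current, edge_type)).fetchall()
        for (dst,) in rows:
            if dst not in seen:  # guards against cycles in the graph
                seen.append(dst)
                queue.append(dst)
    return seen


def get_view(db, root, edge_type):
    """Return the cached view if present; otherwise build it and cache it."""
    row = db.execute(
        "SELECT payload FROM view_cache WHERE root = ? AND edge_type = ?",
        (root, edge_type)).fetchone()
    if row is not None:
        return json.loads(row[0])
    view = build_view(db, root, edge_type)
    db.execute(
        "INSERT INTO view_cache (root, edge_type, payload) VALUES (?, ?, ?)",
        (root, edge_type, json.dumps(view)))
    return view
```

The point of the split is that the expensive graph walk happens once per view, and reads after that are a single primary-key lookup. A real version would want finer-grained invalidation than "drop everything for that relationship type", which is where most of the actual work would be.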

I’m still doing searches to see if there’s been anything new published over the past year or two. Some links: