There is nothing inherently new about demanding faster dissemination of content and news. The hubbub about the real-time web is a continuation of the same trend that made us command the radio airwaves, lay millions of miles of fiber optic cable, even introduce RSS. If there is one constant, then it’s the certainty that the technology powering this trend is always evolving. Better delivery mechanisms are introduced into the ecosystem, vetted by the community over time, and, if successful, become the new standard in our distributed web.
The continuous cycle of innovation has only one major downside: there is never a clear technological winner. Take RSS, for example. Instead of repeatedly visiting your favorite sites just to discover that there is no new information, RSS inverts the relationship and lets your news reader deliver the stories to you: a great time saver! However, how does the news reader know when to check for new content? Well, it could periodically ask the server (polling), or perhaps the publisher could announce via a ping server that new content is available. In a few cases, data can be transferred via near real-time protocols such as XMPP or AMQP. Or, if you prefer to stay within the HTTP specification, perhaps you could leverage the emerging PubSubHubbub and RSSCloud standards.
Seems complicated? It is, and if you are looking to deliver a seamless experience to your users, you have to cover all of the delivery mechanisms to fulfill the promise of the real-time web. This isn’t something you can knock off in a weekend.
Cogs & Wheels of the Real-Time Web
- Polling – the consumer has to periodically check the news feed for content. This is by far the most pervasive method of propagating news and also the most inefficient. The more frequently you check, the more frequently you’ll discover that there is no new content. Plus, every check consumes resources on both the consumer and the producer side.
- Ping servers – launched in late 2001, ping servers are a mechanism for publishers to notify the world of newly available content. A consumer could then theoretically subscribe to this global stream and listen for notifications of publisher updates. It is a great mechanism but there are a few problems. Ping servers do not guarantee 100% coverage, and if you’ve ever subscribed to the ping stream you will know that most of it is spam. In other words, it’s an accelerator technology and it cannot replace polling.
- XMPP, AMQP and long-lived HTTP connections – dedicated channels built specifically for routing messages in (near) real time. A few publishers have enabled these channels, but due to their deployment and integration costs the overall adoption is incredibly low.
- PubSubHubbub and RSSCloud are emerging standards that promise the functionality of a publish-subscribe architecture within the bounds of the HTTP specification, which means lower entry barriers and a faster adoption rate. Of course, having FeedBurner, TypePad and WordPress adopt these standards also goes a long way: a good fraction of the most popular feeds can now be consumed via these protocols. Having said that, this does require new server and client infrastructure, so don’t expect your local newspaper, blogger, or news reader to have it enabled just yet.
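To put rough numbers on the polling trade-off from the list above, here is a minimal simulation sketch. The names `simulate_polling` and `PollingStats`, as well as the publication times, are made up for illustration and are not part of any feed library or of our platform. It counts how many polls come back empty and how long a new story waits before it is discovered.

```python
from dataclasses import dataclass


@dataclass
class PollingStats:
    polls: int         # total requests sent to the publisher
    empty_polls: int   # requests that found nothing new
    mean_delay: float  # average seconds between publication and discovery


def simulate_polling(publish_times, interval, horizon):
    """Poll every `interval` seconds until `horizon` and report the cost.

    All arguments are in seconds; `publish_times` are hypothetical moments
    at which the publisher adds a new story to its feed.
    """
    pending = sorted(publish_times)
    polls = 0
    empty_polls = 0
    delays = []
    next_item = 0
    t = interval
    while t <= horizon:
        polls += 1
        found = 0
        # Every story published at or before this poll is discovered now.
        while next_item < len(pending) and pending[next_item] <= t:
            delays.append(t - pending[next_item])
            next_item += 1
            found += 1
        if found == 0:
            empty_polls += 1
        t += interval
    mean_delay = sum(delays) / len(delays) if delays else 0.0
    return PollingStats(polls, empty_polls, mean_delay)


# Three stories published over a three-hour window (assumed values).
stories = [630, 4050, 9030]
three_hours = 3 * 60 * 60

hourly = simulate_polling(stories, interval=3600, horizon=three_hours)
every_minute = simulate_polling(stories, interval=60, horizon=three_hours)

print(hourly)
print(every_minute)
```

With these assumed inputs, hourly polling sends 3 requests and none are empty, but each story waits about 44 minutes on average (2630 seconds) before it is seen. Polling every minute cuts the average wait to 30 seconds at the price of 180 requests, 177 of which find nothing new. A push mechanism such as a ping or a PubSubHubbub notification aims for the low delay without the empty requests.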
Leveraging PostRank’s Real-Time Web platform
Want to build a real-time delivery platform? You have to think about all of the technologies described above, and that is before you even consider the dozens of different publishing standards (RSS, Atom, etc.), spam filtering, and metadata and language normalization issues. At PostRank, we’ve dedicated the last two years to building just such a platform to power our service.
PubSubHubbub, RSSCloud, XMPP, ping servers, and polling – we’ve got you covered. Whatever the mechanism, we’re always looking to shave off a few extra seconds and get you one step closer to the real-time web. Additionally, all the content is normalized, passed through language and sentiment analysis filters, enhanced with extra metadata (feed engagement, topics and tags, etc.) and then streamed to our clients and internal applications via the best-suited protocols (WebHooks, AMQP, RSS, etc.).
Building, or thinking of building, a real-time web application? Come by and talk to us at the ReadWriteWeb Real-Time Summit this week (October 15th) in Mountain View, or ping us directly at any time.