Metrics API Stats & Query by URL

The Metrics API is one of the central components of the entire PostRank infrastructure. After we aggregate, clean, and save each of the individual events (tweets, comments, bookmarks, and so on) associated with each URL, we also need a way to query this database. On average, our Metrics API serves ~20,000 URL lookups every minute (~300 lookups a second) for our internal applications at PostRank, as well as for our partners.

The full PostRank archive contains over 5 billion individual engagement events with all kinds of online content. When you query the Metrics API, you are actually accessing this entire archive in real-time and retrieving the latest set of aggregated metrics. There is no caching, there are no delays; all of the ranking data is served in real-time for each request. Needless to say, it takes more than just a single database or an API server to power such a service.

Metrics API Update: Query by URL

If you have used our Metrics API before, you will know that each query had to specify the MD5 hash (a 32-character string) of each URL. That is how the data is stored internally at PostRank, so it was a simple direct mapping. However, as many of our partners know, hashing URLs is tricky business: a single character change will result in a different key. So, if you didn’t remove a tracking parameter from your URL, or omitted that extra slash, you might get a completely different set of results.

To simplify this case for everyone, we just pushed out an update to the Metrics API, which allows you to query by URL directly, without having to specify your own MD5 keys. A simple change to the API and a big win for every user. PostRank will do all of the URL normalization, handle all the edge cases, and return to you all the metrics associated with the URL you have submitted. Let’s take a look at a simple example:

Both of the queries are for the same URL; the second one just happens to have some extra Google Analytics tracking parameters — should you remove those when you query the API? Don’t worry about it, just pass in the URL as you have it, and let PostRank do the rest. Both requests now return the same results.
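To make the problem concrete, here is a minimal Python sketch of why raw MD5 keys are fragile and how normalizing first makes two variants of a URL map to the same key. The normalization rules below (lowercasing the host, dropping `utm_` parameters, trimming the trailing slash) and the example URLs are assumptions for illustration only; they are not PostRank's actual rules.

```python
import hashlib
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Assumption: which query parameters count as "tracking" noise.
TRACKING_PREFIXES = ("utm_",)


def normalize_url(url: str) -> str:
    """Return a canonical form of the URL (illustrative rules only)."""
    parts = urlsplit(url.strip())
    scheme = parts.scheme.lower()
    host = parts.netloc.lower()
    # Treat "/post" and "/post/" as the same resource.
    path = parts.path.rstrip("/") or "/"
    # Drop tracking parameters and sort the rest for a stable order.
    pairs = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith(TRACKING_PREFIXES)
    ]
    query = urlencode(sorted(pairs))
    # Fragments never reach the server, so they are discarded.
    return urlunsplit((scheme, host, path, query, ""))


def url_key(url: str) -> str:
    """MD5 hex digest (32 characters) of the normalized URL."""
    return hashlib.md5(normalize_url(url).encode("utf-8")).hexdigest()


# Hypothetical URLs: the same page, with and without tracking parameters.
plain = "http://www.example.com/blog/post/"
tracked = "http://www.example.com/blog/post?utm_source=feed&utm_medium=rss"

raw_plain = hashlib.md5(plain.encode("utf-8")).hexdigest()
raw_tracked = hashlib.md5(tracked.encode("utf-8")).hexdigest()

print(raw_plain == raw_tracked)              # hashing the raw strings
print(url_key(plain) == url_key(tracked))    # hashing after normalization
```

Hashing the raw strings gives two unrelated keys, while hashing after normalization gives one. Moving that normalization step to the server side is what lets clients pass in URLs as they have them.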

Human Based Computation & the Social Web

At PostRank we have spent a lot of time and effort thinking about how to help publishers identify where their audiences are engaging. However, the converse is also a very interesting question: as a consumer, which sites should you engage with to find like-minded communities of people and relevant content?

Earlier today, Alex Kosorukoff (Chief Scientist at StumbleUpon) posted some really interesting data analysis he performed over the weekend using PostRank’s Metrics API:

“I took a sample of 2169 urls pulled from about 200 feeds in my google reader. Those feeds cover a pretty diverse set of topics, including science, engineering, entrepreneurship, business, management, psychology, legal, photography, music, humor, lifestyle, etc. I pulled the Postrank metrics for each of those urls into a user engagement matrix. Each row of the matrix represents a url information, and each column has values of a single engagement metric (e.g. number of posts on twitter) across all the 2169 urls. I computed the Pearson correlation between every pair of columns. This resulted in a matrix visualized below.”

The Pearson correlation matrix above highlights that certain sites tend to exhibit similar behaviours, which is arguably not surprising. Some of this, as Alex points out, can be attributed to data exchange between the different sites, but just as likely, it simply shows that no matter what tool we prefer (e.g. Delicious or Diigo for bookmarking), we tend to engage with similar content. So, whether you are a publisher or a consumer, exploring different networks could help you discover more like-minded people or a wider audience for your content: a win-win.
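For readers who want to try a similar analysis, the following Python sketch computes the Pearson correlation between every pair of metric columns in a small engagement matrix. It is not Alex's code; the metric names and counts are made up, and a real run would use the per-URL metrics returned by the Metrics API.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return float("nan")  # correlation is undefined for a constant column
    return cov / sqrt(var_x * var_y)


def correlation_matrix(rows, metrics):
    """rows: one dict of metric counts per URL; returns a nested dict."""
    columns = {m: [row[m] for row in rows] for m in metrics}
    return {
        a: {b: pearson(columns[a], columns[b]) for b in metrics}
        for a in metrics
    }


# Made-up engagement counts: one row per URL, one column per metric.
rows = [
    {"twitter": 10, "delicious": 2, "digg": 0},
    {"twitter": 20, "delicious": 4, "digg": 5},
    {"twitter": 30, "delicious": 6, "digg": 1},
    {"twitter": 40, "delicious": 8, "digg": 7},
]
metrics = ["twitter", "delicious", "digg"]
matrix = correlation_matrix(rows, metrics)

for a in metrics:
    print(a, [round(matrix[a][b], 2) for b in metrics])
```

In this toy data the twitter and delicious columns move in lockstep, so their coefficient is 1, while digg correlates only partially with either.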

The topic of Human Based Computation (HBC) is an incredibly interesting one to explore and it is always exciting to see intriguing studies and results in the space. Check out Alex’s post (“Mapping the social web with PostRank”), and let us know if you have other interesting results or applications of this data! Social search, discovery, collaboration, gaming, and the list goes on: the Social Web is a deep well of ideas to explore and improve on.

Introducing Enhanced Sorting and Filtering for PostRank Analytics

One of the greatest values of the social web for publishers is its power to keep content “evergreen”. A post published a year ago is not necessarily finished: if it contains interesting, relevant content, someone could discover it anew at any time and share it, and it can get noticed by a whole new audience.

Publishers also know that people who come to their sites already engaged with the content tend to stay longer, read more, and come back, which helps them grow reach and influence.

These content truths help explain why additional sorting and filtering functionality have been our most-requested features since we launched PostRank Analytics.

By default posts are sorted and displayed by publication date — the most recent content appears first. But if an older post is getting engagement, publishers want to know about it, to be able to engage with the conversation, and perhaps point readers to additional relevant content.

We get it, so today we’re launching new sorting and filtering functionality!

When you’re logged in to your Analytics account, once you navigate to a site’s Analyze view, you can now sort either by publication date or by activity.

Clicking the activity option will display posts in the order of most recent engagement. So if a post just got a new tweet or digg, it’ll be at the top, even if it doesn’t necessarily have the most engagement overall. Doesn’t matter if it’s a new post getting its first engagement, or an older post getting new life thanks to the social web.
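The difference between the two sort orders can be sketched in a few lines of Python. This is only an illustration of the idea, not the Analytics implementation; the `Post` fields, titles, and timestamps are hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Post:
    title: str
    published_at: datetime
    last_engaged_at: datetime  # hypothetical: time of the newest tweet, digg, etc.
    total_engagement: int


def sort_by_date(posts):
    """Default view: most recently published first."""
    return sorted(posts, key=lambda p: p.published_at, reverse=True)


def sort_by_activity(posts):
    """Activity view: most recently engaged first, regardless of totals."""
    return sorted(posts, key=lambda p: p.last_engaged_at, reverse=True)


# Made-up sample data.
posts = [
    Post("Year-old evergreen post", datetime(2009, 3, 1), datetime(2010, 3, 15, 9, 30), 12),
    Post("Last week's big hit", datetime(2010, 3, 8), datetime(2010, 3, 10, 14, 0), 450),
    Post("Brand new post", datetime(2010, 3, 14), datetime(2010, 3, 14, 18, 0), 1),
]

print([p.title for p in sort_by_date(posts)])
print([p.title for p in sort_by_activity(posts)])
```

In the activity ordering the year-old post comes first because it received the newest engagement, even though the big hit from last week has far more engagement overall.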

We also know that on sites with multiple authors, comparing the content’s engagement as a whole might not always provide the most relevant picture. Publishers also want to know which of the topics they cover tend to be the most popular. Not a problem. In the same site Analyze view, you can now filter posts by the author who wrote them, by topic tags, or both.

Where do the topic tags come from? From the most relevant source there is — you! When you published your posts, you added descriptive tags or topic categories to make your content more organized and searchable.

When we gather and analyze your content, we collect those tags and display them in a drop-down field for you to use in filtering posts. It works the same way for authors; it’s just names instead of topics. (For those publishers who have not been using tags or category topics, we recommend enabling that in your blog platform or CMS.)
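As a rough sketch of how this could work, the Python below collects the tags and author names found on a set of posts (the values that would populate the drop-downs) and filters by author, tag, or both. The post structure, field names, and sample data are assumptions for illustration, not the actual Analytics data model.

```python
def available_tags(posts):
    """Every tag that appears on at least one post, sorted for display."""
    return sorted({tag for post in posts for tag in post["tags"]})


def available_authors(posts):
    """Every author name that appears on at least one post."""
    return sorted({post["author"] for post in posts})


def filter_posts(posts, author=None, tag=None):
    """Keep posts matching the given author and/or tag; None means no filter."""
    return [
        post
        for post in posts
        if (author is None or post["author"] == author)
        and (tag is None or tag in post["tags"])
    ]


# Made-up sample posts.
posts = [
    {"title": "Taking better photos", "author": "Alice", "tags": ["photography", "shooting"]},
    {"title": "Editing photos in 5 steps", "author": "Bob", "tags": ["photography", "editing", "lists"]},
    {"title": "10 editing shortcuts", "author": "Alice", "tags": ["editing", "lists"]},
]

print(available_tags(posts))
print(available_authors(posts))
print([p["title"] for p in filter_posts(posts, author="Alice", tag="editing")])
```

Passing only `author` or only `tag` applies a single filter, and passing both narrows the list to posts that satisfy both conditions.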

There are lots of cool ways you can use these features to beef up your knowledge of how your content is doing with your audience.

  • Make new connections with readers by seeing which posts have gotten the most recent engagement and, using info from the Activity stream, contacting those readers to get a conversation going or answer questions.
  • Filter by author and tag to compare your engagement when writing about a specific topic to other authors’ engagement when writing about similar topics. Who’s getting more engagement from the audience? Is the key writing style, data presentation, or another factor?
  • Filter by topic, either for a single author or group. Do posts about taking photos get as much engagement as posts about editing photos? Do readers seem to love list posts, no matter what they’re about? Get an insider’s view of how your audience ticks, and learn from that how to increase your engagement.

You know your content better than anyone, and now PostRank Analytics offers functionality to help you see the hard data to prove your intuition, help you see new trends, and make even stronger, more influential audience connections.

PostRank extension for Google Chrome & Reader!

Are you an infovore? Do you keep up with hundreds of RSS feeds, Twitter, Facebook, LinkedIn and maybe a few other sources on a daily basis? If so, then an inbox with an unread count in the hundreds, or even a thousand plus stories, should not be an unusual experience. Many of us at PostRank fall into this category, which is why our original product (AideRSS at the time) was all about enabling you to filter and customize your RSS feeds.

PostRank + Google Reader

Our Firefox extension has become an indispensable tool for many daily Google Reader users, and today we’re happy to announce the official PostRank extension for Google Chrome! Install it, load up Google Reader, and enjoy all of the benefits of filtering by PostRank engagement.

Because the Chrome extension is using native APIs (Firefox requires Greasemonkey support), we’ve been able to significantly improve the user experience as well — all of the ranking is done in real-time and all the ranks are completely contextual. So if you are reading a specific feed, the PostRank scores are all relative to that feed. But if you are navigating through a folder (e.g. Business), or even the “All Items” tab, then all the stories are ranked with respect to the stories in that view.
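To illustrate what "contextual" means here, the Python sketch below rescales raw engagement totals relative to whatever set of stories is currently in view. The 1.0 to 10.0 output range and the simple min-max rescaling are assumptions chosen for illustration; the extension's actual ranking logic is not described in this post.

```python
def contextual_ranks(raw_scores):
    """Rescale raw engagement totals to 1.0-10.0 relative to the current view.

    raw_scores: dict mapping story id -> raw engagement total (hypothetical input).
    """
    if not raw_scores:
        return {}
    low = min(raw_scores.values())
    high = max(raw_scores.values())
    if high == low:
        # Assumption: when every story is equally engaging, use the midpoint.
        return {story: 5.5 for story in raw_scores}
    return {
        story: round(1.0 + 9.0 * (value - low) / (high - low), 1)
        for story, value in raw_scores.items()
    }


# Made-up raw totals: the same two stories seen in a feed, then in a folder
# that also contains a much more popular story from another feed.
feed_view = {"story-a": 5, "story-b": 50}
folder_view = {"story-a": 5, "story-b": 50, "story-c": 500}

print(contextual_ranks(feed_view))
print(contextual_ranks(folder_view))
```

The same story gets the top rank within its own feed but a much lower one inside the folder, because the comparison set has changed.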

Additionally, you will notice a PostRank drop-down in the top navigation menu, which enables you to set and modify your filter criteria for each feed, or folder. Only have 5 minutes to spare? Switch the filter to “Best stories” only and the extension will do the rest.

For more details, check out our PostRank Labs site, or head directly to the Chrome Gallery to install the extension!

Oh, and one more thing…

If you go to the extension options page, you can enable the PostRank extension on a selection of other popular sites as well: Digg, Reddit, Google News, and even Google search results! For each story or search result, the extension will pull in the PostRank metadata for the site, and if you hover over the link, you can preview all of the metrics we have aggregated for that site.

Update: We pushed out an update for the extension that considerably improves the UI and rankings in expanded view. Check it out!

Social Graph, Influence and the Social Web

PostRank began with a simple vision of helping people “find and read what matters”. To do so, we focused on what in 2006-2007 was still a relatively young phenomenon: the Social Web. Because many of the user-to-user interactions were done in public, we observed that we could harness this wisdom of the crowd to create an algorithm that would identify the most engaging stories for any topic, theme, or time period. In practice, this worked out even better than we expected.

Today PostRank collects over 17 million user generated engagement events every day, which, combined with an accumulated archive of many billions of metrics, gives us an incredibly rich dataset to meet our original goal.

However, the more time we have spent working towards helping our users “find and read what matters”, the more we realized that oftentimes, it is not just the content, but the person behind it that many of us want to know more about. It turns out that finding engaging and relevant people, whatever your interests are, also goes a long way toward identifying engaging and relevant content.

Owning the Social Graph vs. Knowing the Person

Facebook, Google, and Twitter are locked in a battle for the Social Graph. Each is trying to maximize its share of the graph, offers a single sign-on solution, and in general would love to see you use it as your primary online identity.

However, just as in real life, most of us belong to many, often distinct, social networks, which span different contexts (personal, professional, hobby, etc.) and which, as Google researchers found out, we often have no incentive to mix. In other words, to capture a good profile of a person and his or her interests online, one needs to look across all the different networks.

With that in mind, and because PostRank is already collecting tens of millions of user-generated events on a daily basis, we have been prototyping a number of social network analysis tools that go well beyond a site URL or a single network profile. We want to understand the person behind each interaction for a number of reasons:

  1. When it comes to ranking based on audience engagement, influence matters. To measure influence reliably, we need to understand who the person is on a deeper level than number of followers on any one given network.

  2. When it comes to discovery and helping our users find relevant content, understanding the people behind the content is where the next big breakthrough will happen.

  3. When it comes to advertising in the Social Web, identifying individual experts and influencers, and providing a mechanism to effectively track their engagement is a critical piece of the overall puzzle.

Social Graph, Influence and PostRank

In fact, PostRank is already weighting some engagement events based on the cross-network profile, or influence, of the person (#1). Before the end of the year we fully expect that all of our engagement events will be enriched with this data.

As for the discovery piece (#2), today we are launching a technical preview on one of our topics: click into any site within the Ruby topic (e.g. yehudakatz.com, railscasts.com, igvita.com, or click on the screenshot above) and near the top, you will see a full, machine-aggregated profile of the individual behind that URL – name, title, geo information where available, and all the social networks they are on.

Finally, today we are also announcing PostRank Connect to help – you guessed it – connect brands and PR agencies to the experts and influencers on the Social Web. This is a service we have run behind the scenes for a number of partners and we are excited to make it public.

Visit the Connect site to learn about all the details (and if you are an expert or influencer in your field, register to get your PostRank Analytics account for free!)

Needless to say, this is just the beginning, so stay tuned for more.