Comparing Protocols
People want a comparison of the concrete differences between fat pinging (PubSubHubbub, XMPP pubsub) and light pinging (rssCloud, XML-RPC pings, changes.xml, SUP, SLAP). This document aims to construct and convey an evaluation of these protocols that's easy to understand.
The core difference is how new information from feeds is delivered from a publisher to a subscriber:
- Light pings: Send the URL of the feed that has updated to the subscriber.
- Fat pings: Send the updated content of the feed to the subscriber.
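As a rough illustration of the two delivery styles, the sketch below models each notification as a small data object. The class and function names are hypothetical and do not come from any of the protocols compared here.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LightPing:
    """Hypothetical light ping: carries only the URL of the updated feed."""
    feed_url: str


@dataclass
class FatPing:
    """Hypothetical fat ping: carries the updated content itself."""
    feed_url: str
    new_entries: List[str]


def fetches_needed_by_subscriber(ping) -> int:
    """Extra HTTP fetches a subscriber makes after receiving the ping."""
    # A light ping only says "something changed", so the subscriber must
    # fetch the feed. A fat ping already contains the new entries.
    return 1 if isinstance(ping, LightPing) else 0
```

The extra fetch per subscriber in the light case is what the latency, bandwidth, and CPU comparisons later in this document build on.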
Several other criteria also matter for each protocol. In the table below, (+) marks a good property and (-) a bad one.
| Consideration | XML-RPC ping | changes.xml | SUP | SLAP | XMPP pubsub | rssCloud | PubSubHubbub |
|---|---|---|---|---|---|---|---|
| Transport | (+) HTTP | (+) HTTP | (+) HTTP/HTTPS | (-) UDP | TCP/XMPP | (+) HTTP | (+) HTTP/HTTPS |
| Distribution style | Ping/Poll | (-) Polling | (-) Polling | Ping/Poll | (+) Push | Ping/Poll | (+) Push |
| Latency | Low | (-) High | (-) High | Low | (+) Minimum possible | Low | (+) Minimum possible |
| Thundering herd | (-) Yes | (-) Yes | (-) Yes | (-) Yes | (+) No | (-) Yes | (+) No |
| Spammable (no topics) | (-) Yes | (-) Yes | (+) No | (+) No | (+) No | (+) No | (+) No |
| DoSes Publishers | Preventable | (+) No | (+) No | Preventable | Preventable | Preventable | Preventable |
| DoS Relay attacks | (-) Yes | (+) No | (+) No | (+) No | (+) No | (-) Yes | (+) No |
| Possible to implement on $5/month hosting | (-) No | (-) No | (-) No | (-) No | (-) No | Maybe | (+) Yes |
| Message format | XML schema | XML schema | JSON | (-) Binary packet | (-) Complex XMPP | XML schema | (+) Original RSS or Atom content |
| Secure notifications | (-) No | (-) No | Somewhat | (-) No | (+) Yes | (-) No | (+) Yes |
| Publisher complexity | XML-RPC client | XML-RPC client | SUP IDs | (-) UDP send | (-) XMPP send | XML-RPC/(+) REST ping | (+) REST ping |
| Subscriber complexity | (-) Crawl pipeline | (-) Crawl pipeline | (-) Crawl pipeline | (-) Crawl pipeline | XMPP client | (-) Crawl pipeline | (+) Simple webapp |
The rest of this document compares light and fat pinging on latency, bandwidth, CPU usage, and the complexity of publishers, subscribers, and hubs.
To simplify this explanation, latency is represented as network "hops": the time it takes on average for data to propagate between two Internet nodes.
Naively, light pings and fat pings look the same:

There are four network hops required to deliver new content to a subscriber.
However, this leaves light pinging hubs open to relay denial of service attacks, so the Hub must verify there is new content:

- Result:
  - Light pings with hub verification require at least six network hops.
  - Fat pings require at most four network hops.
  - Fat pinging is 33% faster than verified light pinging (four hops instead of six).
Often publishers will be combined with their own hub (for better integration with their application, better statistics gathering, optimizations) yielding:
- Light ping: 3 hops.

- Fat ping: 1 hop.

- Result: Fat ping is roughly 66% faster (one hop instead of three).
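The hop arithmetic above can be checked with a few lines of Python. This is only a sketch: it assumes every hop costs the same on average, and the function name is made up for illustration.

```python
def percent_fewer_hops(light_hops: int, fat_hops: int) -> float:
    """Percentage of hops saved by fat pinging relative to light pinging."""
    return 100.0 * (light_hops - fat_hops) / light_hops


# Separate hub: verified light ping (6 hops) vs fat ping (4 hops).
separate_hub_saving = percent_fewer_hops(6, 4)  # one third of the hops saved

# Combined publisher/hub: light ping (3 hops) vs fat ping (1 hop).
combined_hub_saving = percent_fewer_hops(3, 1)  # two thirds of the hops saved
```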
With popular sites, a feed will be served from multiple datacenters:
- Light ping case:
  - The publisher must wait for propagation delay to fill all caches before sending pings, so the whole system operates only as fast as the slowest node.
  - Every cache is a dependent point of failure, which means more waiting and retries.

- Fat ping case:
  - An integrated hub may send fat pings immediately, before feeds are updated externally.

- Result:
  - Latency incurred by caching/replication delays can be zero with fat pings.
  - In the light pinging case, it is always non-zero.
  - This is irrelevant for single-host sites, but the problem grows with the size of the site.
  - Mitigating this problem for light pings requires specialized knowledge of datacenter topology, which violates the abstraction of DNS.
For bandwidth, assume an average feed is 100KB, consisting of fifty 2KB posts.
Take the case of a single new item with 2KB of data and 100 subscribers to the feed:
- Light ping: 10MB served by publisher

- Fat ping: 200KB served by publisher

- Result:
  - Fat ping requires 98% less data (i.e., light ping requires 50x more).
  - Light pinging requires the publisher to serve 100x more HTTP requests (the "thundering herd").
  - The 50x bandwidth ratio remains the same even with 1,000,000 subscribers to a feed.
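The bandwidth figures follow directly from the stated assumptions (a 100KB feed, 2KB per item, one new item). A minimal sketch of the arithmetic, with hypothetical function names:

```python
FEED_KB = 100  # assumed size of the whole feed
ITEM_KB = 2    # assumed size of one post


def light_ping_kb(subscribers: int) -> int:
    """Every subscriber re-fetches the whole feed from the publisher."""
    return subscribers * FEED_KB


def fat_ping_kb(subscribers: int, new_items: int = 1) -> int:
    """Only the new items are delivered to each subscriber."""
    return subscribers * new_items * ITEM_KB


light = light_ping_kb(100)  # 10,000KB, i.e. 10MB
fat = fat_ping_kb(100)      # 200KB
ratio = light // fat        # 50x, independent of the subscriber count
```

Because both totals scale linearly with the number of subscribers, the ratio depends only on the feed size relative to the size of the new content.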
To prevent denial of service attacks, light pinging Hubs must verify there is new content:

- Result:
  - Even worse than the naive light pinging case.
  - Same bandwidth overhead as naive light pings.
  - Fat pinging is 33% faster than light pinging.
Light-ping advocates suggest that the Hub should re-serve only the new content on behalf of the publisher:

- Result:
  - Bandwidth equal to fat pings.
  - Still 100x as many incoming HTTP requests as fat pings.
  - Higher latency than fat pings: naive fat pings use 33% fewer hops, and combined publisher/hub fat pings use 66% fewer.
  - The trust/security model for a feed proxied on behalf of the publisher is unclear.
For CPU usage, assume parsing a feed takes 10ms per item on average. For this estimate, assume an average feed has 25 items.
Take the case of a single new item being sent to 100 subscribers to the feed with naive pings:
- Light ping: 25 seconds of CPU time consumed by subscribers (250ms each).

- Fat ping: 1.25 seconds of CPU time consumed total; 250ms by the hub, 10ms by each subscriber.

- Result:
  - Fat pings require 95% less CPU (i.e., light pinging requires 20x).
  - Fat pings are 96% cheaper for subscribers (i.e., light pinging requires 25x).
  - Light pinging requires the publisher to serve 100x more HTTP requests (the "thundering herd"), each of which has overhead.
  - The advantage holds even with 1,000,000 subscribers to a feed.
However, the Hub must verify the feed is new to make this safe:

- Result:
  - Light pings consume 25.25 CPU seconds: 250ms by the hub and 250ms by each subscriber.
  - Even worse than the naive case.
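The CPU estimates can be reproduced from the stated assumptions (10ms of parsing per item, 25 items per feed, one new item, 100 subscribers). The sketch below uses hypothetical function names and charges the hub for one full parse of the feed, matching the 250ms figure used in the text.

```python
PARSE_MS_PER_ITEM = 10                          # assumed parse cost per item
FEED_ITEMS = 25                                 # assumed items per feed
FULL_PARSE_MS = PARSE_MS_PER_ITEM * FEED_ITEMS  # 250ms for a whole feed


def light_ping_cpu_ms(subscribers: int, hub_verifies: bool = False) -> int:
    """Each subscriber parses the whole feed; the hub may also verify it."""
    hub_ms = FULL_PARSE_MS if hub_verifies else 0
    return hub_ms + subscribers * FULL_PARSE_MS


def fat_ping_cpu_ms(subscribers: int, new_items: int = 1) -> int:
    """The hub parses the whole feed once; subscribers parse only new items."""
    return FULL_PARSE_MS + subscribers * new_items * PARSE_MS_PER_ITEM


naive_light = light_ping_cpu_ms(100)           # 25,000ms
verified_light = light_ping_cpu_ms(100, True)  # 25,250ms
fat = fat_ping_cpu_ms(100)                     # 1,250ms
```

Per subscriber, the cost drops from 250ms to 10ms under these assumptions, which is a 25x difference.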
And when light-pinging hubs re-serve only the new content on behalf of the publisher:

- Result:
  - CPU usage equal to fat pings.
  - Still 100x as many incoming HTTP requests as fat pings.
  - Higher latency than fat pings: naive fat pings use 33% fewer hops, and combined publisher/hub fat pings use 66% fewer.
  - The trust/security model for a feed proxied on behalf of the publisher is unclear.
For publisher complexity, assume the publisher tells hubs the feed URLs.
- Light ping: Send a feed URL in some format.
- Fat ping: Send a feed URL in some format.
- Result: More or less equivalent, though some interop issues (e.g., SOAP) could exist.
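In both cases the publisher's job amounts to sending a feed URL to a hub. The sketch below builds a form-encoded request body for such a ping without sending it. The parameter names `hub.mode` and `hub.url` are assumed here to follow the PubSubHubbub publish request, and the function name and example URL are hypothetical.

```python
from urllib.parse import urlencode


def build_publish_ping(feed_url: str) -> str:
    """Build a form-encoded body telling a hub that feed_url has updated.

    The field names are an assumption modeled on PubSubHubbub's publish
    request; other protocols use other formats (for example XML-RPC).
    """
    return urlencode({"hub.mode": "publish", "hub.url": feed_url})


body = build_publish_ping("http://example.com/feed.xml")
```

A publisher would POST this body to its hub with the content type `application/x-www-form-urlencoded`. The hub, not the publisher, decides whether to forward a light or a fat ping.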
For subscriber complexity, each approach requires the following components:
- Light ping:
  - Subscription protocol code
  - Feed fetching pipeline
  - Feed parsing code
- Fat ping:
  - Subscription protocol code
  - Parsing code
- Result:
  - Fat pinging does not require the complexity of a feed fetching pipeline.
  - This is significant because fetching feeds efficiently and asynchronously can be hard, if not impossible, on simple hosting providers.
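The difference in subscriber components can be sketched as two handlers that share the same parsing code. This is an illustrative outline only: it assumes Atom feeds, the function names are hypothetical, and the fetcher is passed in as a callable so the example needs no network access.

```python
import xml.etree.ElementTree as ET
from typing import Callable, List

ATOM = "{http://www.w3.org/2005/Atom}"


def parse_entry_titles(feed_xml: str) -> List[str]:
    """Parsing code needed by both kinds of subscriber."""
    root = ET.fromstring(feed_xml)
    return [e.findtext(ATOM + "title", default="") for e in root.iter(ATOM + "entry")]


def handle_light_ping(feed_url: str, fetch: Callable[[str], str]) -> List[str]:
    """Light ping: the subscriber must fetch the feed before parsing it.

    `fetch` stands in for the feed fetching pipeline (timeouts, retries,
    caching, concurrency), which is the hard part in practice.
    """
    return parse_entry_titles(fetch(feed_url))


def handle_fat_ping(body: str) -> List[str]:
    """Fat ping: the content arrives in the request body, so just parse it."""
    return parse_entry_titles(body)
```

The fat ping handler fits into an ordinary request handler of a small web application, while the light ping handler depends on an outbound fetch that a simple host may not support well.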
For hub complexity, assume that hubs must verify that the original feed has changed; otherwise they would be an open relay for DoS attacks.
- Light ping:
  - Receive ping
  - Verify feed document has changed (one hash of the text contents)
  - List subscribers
  - Send ping to subscribers
- Fat ping:
  - Receive ping
  - Parse feed document, determine if individual entries have changed (multiple hashes, one for each item)
  - List subscribers
  - Send new content to subscribers
- Result:
  - Roughly the same.
  - Fat pinging requires a reparse of the feed on each publish notification; light pinging only needs to check whether a hash of the whole content has changed.
  - Light pinging requires only one hash per feed instead of one per item for fat pings. Fat pinging hubs can limit storage usage by capping the total storage allowed per feed.
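The two verification strategies can be sketched as follows. This is a minimal illustration, assuming Atom feeds and SHA-256 as the hash; the function names are hypothetical, and a real hub would persist its digests rather than keep them in memory.

```python
import hashlib
import xml.etree.ElementTree as ET
from typing import List, Set

ATOM = "{http://www.w3.org/2005/Atom}"


def feed_digest(feed_xml: str) -> str:
    """One hash over the whole document (light pinging hub)."""
    return hashlib.sha256(feed_xml.encode("utf-8")).hexdigest()


def light_hub_changed(feed_xml: str, last_digest: str) -> bool:
    """Light hub: compare a single stored digest, no parsing required."""
    return feed_digest(feed_xml) != last_digest


def fat_hub_new_entries(feed_xml: str, seen: Set[str]) -> List[str]:
    """Fat hub: parse the feed and keep one digest per entry."""
    new_entries = []
    for entry in ET.fromstring(feed_xml).iter(ATOM + "entry"):
        entry.tail = None  # ignore whitespace between entries when hashing
        raw = ET.tostring(entry, encoding="unicode")
        digest = hashlib.sha256(raw.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            new_entries.append(raw)
    return new_entries
```

The `seen` set grows by one digest per item, which is the storage cost mentioned above. A per-feed cap could be enforced by evicting the oldest digests.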