Congestion Control II

A couple of days ago, I wrote an article about congestion control on the internet. In the comments, I promised a more technical exploration of the subject, and this is it.

Congestion is a problem on the internet. It will always be a problem on the internet, just as it is on our roadways. Yes, bandwidth keeps increasing. So does the number of users, and the number of network applications per user, and the required bandwidth per application. Supply will never stay ahead of demand for long, if at all. Sooner or later, it will become logistically or economically infeasible for service providers to add more bandwidth all throughout their networks. Sooner or later, enough people will be sending enough traffic over the network’s weakest links that it won’t matter if other parts of the network are overbuilt. The “more bandwidth” crowd will lose to those who recognized the need for congestion control and did something about it.

When thinking about how to handle congestion, it’s possible to look at things from many different perspectives. Here are some examples:

  • Different kinds/definitions of fairness.
  • The core of the network vs. the edge.
  • Distinguishing flows vs. hosts vs. “economic entities” (i.e. users).

To keep this from turning into a major dissertation, I’m going to say little about these.

  • On fairness, I think Briscoe has really said all that needs to be said about flow-rate fairness vs. cost fairness. Note that “flow-rate fairness” does not necessarily refer to flows in the sense that they’re often discussed in the congestion-control literature. It refers to the rate at which things (packets) flow, whether they’re associated with connections or hosts or anything else. It’s an unfortunate re-use of terminology, but that’s life. I think cost fairness is the proper goal/standard for congestion control. If you disagree, keep it to yourself. I might discuss things in terms of cost fairness, but for the most part the concerns I’ll address are directly translatable to a flow-rate-fairness world and I don’t want to get bogged down trying to make the flow-rate zealots happy.
  • As for the core vs. the edge, I bring it up because they’re very different environments calling for very different kinds of congestion management. While routers at the edge can reasonably hope to make flow/host/user distinctions (modulo the mess that NAT introduces), routers at the core have no such hope. They must also process packets at a much higher rate, so the most they can do is make hopefully-useful guesses about what packets to drop when they’re overloaded. The dumbest approach is simply to drop the last packets to arrive. The various flavors of RED (Random Early Detection) are slightly better, and I believe Stochastic Fair Blue (which I’ve written about before) is better still, but they’re all still basically guesses; a minimal sketch of the tail-drop/RED difference appears after this list. While congestion control in the core is a fascinating subject, it’s not what I’ll be writing about here.
  • On flows vs. hosts vs. users, most of the people weighing in on these issues have tended to focus on users and I’ll follow suit. NAT makes a hopeless muddle of the whole thing anyway, so the only thing you can really be sure of is one entity paying the bill for one physical line. There’s a lot of merit to the argument that if there are multiple users huddled behind NAT then that’s their problem anyway.
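To make the edge/core contrast a bit more concrete, here is a minimal sketch of the difference between tail drop and a RED-style queue. This is purely illustrative Python of my own; the class names and parameters (min_th, max_th, max_p) are assumptions for the example, not values or code from any real router.

    import random

    class TailDropQueue:
        """Dumbest approach: drop whatever arrives once the buffer is full."""
        def __init__(self, capacity):
            self.capacity = capacity
            self.packets = []

        def enqueue(self, packet):
            if len(self.packets) >= self.capacity:
                return False                    # drop the newest arrival
            self.packets.append(packet)
            return True

    class RedQueue:
        """RED-style queue: drop probability rises with the average occupancy."""
        def __init__(self, capacity, min_th, max_th, max_p=0.1, weight=0.2):
            self.capacity = capacity
            self.min_th, self.max_th, self.max_p = min_th, max_th, max_p
            self.weight = weight                # EWMA weight for average queue size
            self.avg = 0.0
            self.packets = []

        def enqueue(self, packet):
            # Track a moving average of queue depth, not the instantaneous depth.
            self.avg = (1 - self.weight) * self.avg + self.weight * len(self.packets)
            if len(self.packets) >= self.capacity or self.avg >= self.max_th:
                return False                    # forced drop
            if self.avg > self.min_th:
                p = self.max_p * (self.avg - self.min_th) / (self.max_th - self.min_th)
                if random.random() < p:
                    return False                # early, probabilistic drop
            self.packets.append(packet)
            return True

The point of the early, random drops is to signal senders before the buffer is completely full instead of penalizing whoever happens to arrive last; Blue and Stochastic Fair Blue refine the same basic guess, but they are all still guesses made without knowing who the offending flows, hosts, or users are.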

OK, enough stage-setting. On with the show.

Comment Policy

I was talking to someone who’s just getting into blogging – more about that soon, I’m sure – and the topic of commenting came up. As we discussed it, I remembered that I had meant to post something about comments here, so here we go. My comment policy is essentially very simple: I only delete spam. I have comment moderation turned on not to restrict legitimate comments but because lately Akismet – as good as it is, and it really is very good – hasn’t been catching everything. Rather than have icky spam attached to my most-current and most-read articles, I’m leaving moderation on until Akismet catches up with whatever trick this wave is using. I expect that won’t be long, and then I’ll turn moderation off again because I prefer to run that way. There are only two cases in which I can even imagine a reasonable claim that I deleted or disallowed something that wasn’t spam.

  • In one case, I chose to hide a particular post and its comments from search engines, which included adding a referrer check that declines to produce the content if the request came from a search engine. The post is still there, you can still see it if you come here and use my search feature or otherwise know to look for it, but you can’t get to it from Google. I am sensitive to the internet sometimes remembering things longer than it should; that was my compromise between being an accessory to continued harm and being guilty of censorship.
  • I define spam rather broadly. My definition includes advocacy or endorsement for a product, party, or position without what I consider a good-faith attempt to do more than advertise. Yes, I reserve the right to make that determination myself. This is my own virtual property, after all. On exactly one occasion, I deleted a comment that was spam by the just-given definition but probably not considered so by its author. Too bad for him.

That said, I’m sure some readers will be tempted to point out that I’ve often been rude to people who have crossed me here, told them to go away, etc. Yes, I have. I will almost undoubtedly continue to do so when I think the situation warrants, though I’d prefer to have more civil conversations. What I won’t do, unless and until I find a reason to change my policy and announce such a change, is delete people’s comments. You can disagree with me, you can annoy me, you can do whatever you want but – at least for now – as long as you don’t spam me I’ll leave your comment up.

Yes, comments are enabled on this post. Can I comment on yours?

Corporate Liability

While I continue to revise and re-revise and re-re-revise my tech-oriented congestion-control post in my head, here are some thoughts on the other topic that has been rattling around in there recently. I got into a discussion recently about corporate liability protection. I call it “liability shedding” because I think that saying “the buck stops here” when “here” is a legal fiction created for the express purpose of not allowing people to be held accountable for their actions is unnecessary and evil. Yes, that really is the express purpose of corporations. It’s why the institution was created, and the main thing that distinguishes it from any other kind of partnership or contractual relationship. The theory is that in certain situations there is a thing that needs to be done, where the business risk from potential liability precludes its being done by the private sector under normal conditions, but the benefit to society is large enough to justify offering an inducement in the form of protection from that liability. As I wrote almost three years ago, the building of the Erie Canal is an example of this theory being put into practice. As I also wrote, the theory no longer applies to modern corporations, which are still given liability protection plus many other novel benefits of legal personhood without any requirement that the public good be served either initially or on a continuing basis. It’s a totally one-sided deal. If it were considered as a contract, the standard of mutual consideration would not be met and the contract would be considered invalid.

Of course, we’re not going to eliminate corporations. What we can do, though, is address the evil that is liability shedding. The real problem with limiting corporate liability to the current assets of the corporation itself is that it screws up the incentive system upon which capitalism and free markets (not quite the same thing BTW but that’s a different topic) rely. A reasonable incentive system colocates risks and rewards. You take the risk, you reap the rewards. With a corporation in the way, though, you’re matching limited risk with unlimited reward. It’s a sweet deal for those hiding behind the liability shield – which is all many “shell” corporations are really intended to be – but it comes at the expense of those who actually create value by building and inventing and so on. My very simple solution is to restore the connection between risks and rewards. Instead of associating liability – i.e. risk – with the assets of the corporation, associate it with the profits. In other words, not with what went in but with what went out. Similarly, apportion that liability not based on who holds stock (what’s in) but based on who took profits (what’s out). If you choose to partake of profits, you partake of risks as well.

In more concrete terms, if you have ever bought stock in a corporation and then sold it at a profit, then you remain responsible for a share of its liability proportional to that profit divided by the total profit (or current value) associated with that corporation. If that doesn’t sound good to you, don’t buy the stock. Don’t take that risk. That association of risk with profit is unavoidable when you purchase more tangible products or services, and there’s no reason it shouldn’t apply to stock as well. It might be reasonable to have people’s share of risk expire after a certain time as a book-keeping simplification (more about that in a moment) or as a statute-of-limitations sort of thing, but not because it’s inherently unfair for those who profit to share in risk proportionally. I’d like to make a couple more observations about the practical consequences of this change as well.

  • The book-keeping would not be particularly complex or burdensome. In database terms it’s a simple table: transaction X, person Y, profit Z. To determine a person’s share of liability, all you need to do is add up the profits in all records for person Y and divide that sum by the total (a sketch of that calculation appears after this list). Corporations and financial institutions have to deal with calculations that are orders of magnitude more complicated every single day. If mutual funds and hedge funds and other intermediate holders of stock want to pass either profit or liability on to their respective stockholders, that’s their problem, and one they already know how to address in the case of profit, so it doesn’t add a burden either.
  • In addition to restoring the proper incentives in the form of a balance between risk and reward, this idea would also discourage non-value-creating speculation. The effect on “buy and hold” would be minimal, but frantic flipping of stocks to profit from momentary fluctuations might not be so tempting if liability accumulates in the process. I think that’s a good thing, as I believe such speculation is parasitic: it benefits a very few while harming the companies in which they speculate and the markets themselves.
  • This would not kill incentives to invest, because there would still be far more upside than downside. If you gain no money by investing in a risky venture, you gain no liability either. On the other hand, if that venture generates more profit than liability, then you gain both. Since the amount of profit generated in the economy is far larger than the amount of liability, and the amount of profit for a particular company has no limit, that’s still a very large potential upside. But wait, you say, the company could be liable for more than was invested. That means investors could lose more than they put in. Damn right. If the company does more damage than it’s worth, then why shouldn’t the victims have that recourse? That’s kind of the whole point here. You can’t limit liability to the corporation itself (or its value) without denying recourse to those who are harmed, and that’s unfair to them. Again, if investors don’t feel comfortable with that risk then they shouldn’t buy the stock – or they should buy insurance, just as they do against other risks. In fact, many kinds of private partnerships and practices already purchase such insurance. Some are even required to carry such insurance, either by regulation or as a requirement to receive private (non-stock) investment, and it doesn’t prevent them from pursuing their entrepreneurial goals.
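For what it’s worth, here is a minimal sketch of the book-keeping described in the first bullet above. The table layout and the numbers are hypothetical, invented only to show that the calculation is a sum and a division, nothing more.

    from collections import defaultdict

    # Hypothetical realized-profit records: (transaction, person, profit).
    records = [
        ("tx1", "alice", 1000.0),
        ("tx2", "bob",    500.0),
        ("tx3", "alice",  250.0),
    ]

    def liability_shares(records):
        """Each person's share of liability = their realized profit / total profit."""
        per_person = defaultdict(float)
        for _tx, person, profit in records:
            per_person[person] += profit
        total = sum(per_person.values())
        return {person: profit / total for person, profit in per_person.items()}

    print(liability_shares(records))   # {'alice': 0.714..., 'bob': 0.285...}

Expiring old records, if that book-keeping simplification were adopted, would just be a date filter on the same table.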

In short, I think this fairly simple change would be a very large step toward truly free markets, removing one of the major distortions in the markets we have by restoring the incentives that everyone since Adam Smith has recognized as essential. I think I’ve managed to address most of the more predictable objections. What did I miss?

Elephants and Land Mines

Apparently, elephants in Angola and Botswana have an uncanny ability to avoid land mines.

“It’s quite a mystery,” said Curtice R. Griffin, the UMass professor of wildlife ecology leading the research. He estimates that hundreds of elephants have crossed through the mine fields without incident.

“We have not detected any carcasses. We know elephants have an incredibly powerful sense of smell. But how they have come to associate the smell of a mine with danger is not known,” he said.

Sadly, I suspect this will lead to elephants being pressed into mine-clearing duty.

Congestion Control

I had been meaning to write about network congestion control for a while anyway, but it seems like now I have an extra reason. George Ou just posted an article called Fixing the unfairness of TCP congestion control, lauding (and somewhat misrepresenting) Bob Briscoe’s excellent paper on Flow Rate Fairness: Dismantling a Religion. Because Ou is an ardent opponent of so-called “network neutrality,” the response has been predictable; the net-neuts have all gone nuts attacking Ou, while all but ignoring Briscoe. One example, sadly, is Wes Felter.

This is fine, but for the same cost as detecting unfair TCP, ISPs could probably just implement fair queueing.

I think you’re a great guy, Wes, but did you actually read Briscoe’s paper? You cite Nagle’s suggestion of queuing at ingress points, as though it somehow contradicts Briscoe, but that is in fact pretty much what Briscoe ends up recommending. The only difference is in how we define “fair,” and Briscoe makes a pretty good case for cost fairness instead of flow-rate fairness. He points out, for example, how current techniques merely create an “arms race” between service providers and application developers, creating new problems as the race continues but never really solving the old one. He even takes aim at some of the tricks that providers like Comcast have been using.

While everyone prevaricates, novel p2p applications have started to thoroughly exploit this architectural vacuum with no guilt or shame, by just running more flows for longer. Application developers assume, and they have been led to assume, that fairness is dealt with by TCP at the transport layer. In response some ISPs are deploying kludges like volume caps or throttling specific applications using deep packet inspection. Innocent experimental probing has turned into an arms race. The p2p community’s early concern for the good of the Internet is being set aside, aided and abetted by commercial concerns, in pursuit of a more pressing battle against the ISPs that are fighting back. Bystanders sharing the same capacity are suffering heavy collateral damage.

That sounds like a pretty serious indictment of forging RST packets and so on. What more could you want? The simple fact is that congestion control is necessary, it will always be necessary, and current methods aren’t doing a very good job. This issue is almost orthogonal to network neutrality, as it’s about responses to actual behavior and not preemptive special treatment based on source or destination before any behavior is manifested, so don’t let opposition to Ou or to his views on network neutrality color your evaluation.

Gaming-resistance is just as desirable a property in congestion control as it is in other areas we’ve both studied, and right now real users are seeing degraded performance because a few developers are gaming this system. I remember talking to Bram Cohen about this issue while BitTorrent was being developed, for example. He was very well aware that congestion control would sometimes slow his transfer rates and very deliberately designed BitTorrent to circumvent it. That benefits BitTorrent users, and I just used BitTorrent to download an ISO yesterday so I’m not unaware of its value, but really, what makes BitTorrent users so special? Why should those who create congestion not feel its effects first and most? How, exactly, is it “fair” to anyone else that they don’t?
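To illustrate the point with a toy example of my own (not anything from Briscoe’s paper or from BitTorrent’s actual code): if a bottleneck divides its capacity equally per flow, a host that simply opens more flows captures a proportionally larger share.

    def per_flow_shares(capacity_mbps, flows_per_host):
        """Split a bottleneck equally per flow, then total the result per host."""
        total_flows = sum(flows_per_host.values())
        share_per_flow = capacity_mbps / total_flows
        return {host: n * share_per_flow for host, n in flows_per_host.items()}

    # Hypothetical 100 Mb/s link: one web user with 1 flow, one p2p user with 20.
    print(per_flow_shares(100, {"web_user": 1, "p2p_user": 20}))
    # {'web_user': 4.76..., 'p2p_user': 95.23...}

Under flow-rate fairness that outcome is “fair” by definition; under cost fairness it is exactly the problem.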

Wrong Again, Megan

Megan McArdle is, predictably, trying to rationalize her refusal to learn a darn thing from having been wrong about Iraq.

The universe being a complicated place, you can usually tell multiple stories from the same pieces of evidence. We learn by gambling on what we think the best answer is, and seeing how it turns out. Most of us know that we have learned more about the world, and ourselves, from failing than from success. Success can be accidental; failure is definite. Failure tells us exactly what doesn’t work.

Baloney. Failure can be accidental too, but the Iraq war was no accident. What bothers many people about the decision to initiate that war is not that the conclusion was wrong but that the very process leading up to it was practically designed to guarantee a wrong conclusion. It is, unfortunately, the very same process and mindset still very much in evidence among people like Megan. Not all failures are deliberate, but the deliberate ones provide an opportunity for self-improvement. That can only happen, though, if the people who were deliberately wrong engage in a little introspection, and change those disastrous mental habits. Megan not only fails to engage in such introspection, but actively refuses to do so. Putting on airs because you were wrong is just a little bit silly, but claiming the “virtue of failure” while refusing to do the one thing that might be virtuous in this situation simply piles a moral flaw on top of the intellectual one.

Good for the Goose

From a story on the missing White House email scandal:

It would be costly and time-consuming for the White House to institute an e-mail retrieval program that entails pulling data off each individual workstation, the court papers filed Friday state.

But wait, isn’t that sort of records retention something the government routinely requires of businesses under Sarbanes-Oxley, and less than what it requires of certain businesses such as financial and health-care institutions under even more stringent laws? Surely the operation of our all-powerful federal executive branch is more important, and thus should be more subject to scrutiny, than mere commerce, right? The people in the White House who seem to think they’re better than economists and scientists and generals and intelligence analysts (not to mention legislators and judges) at their respective jobs don’t get to claim “the dog ate my homework” either. What excuse is there for their failure to do themselves what they demand of others?

World Champion

I’ve always had a weakness for “make groups to clear the board” kinds of games, from Tetris to Zuma, Sokoban to Shisen-Sho. Lately I’ve been playing Colored Symbols II, a variant of SameGame (which itself comes in many forms). One feature is a high-score list, and I’m proud to announce that I now hold the global all-time high score of 3156. I pretty much lucked into it, really. The game dropped a killer combo right into my lap at the outset, and then all I had to do was not screw it up. Nonetheless, it’s something to be proud of for at least a few seconds.

OK, I’m done.

Caching vs. Replication

Ned Batchelder started an interesting discussion about caches. In the course of that discussion, the distinction between caching and replication came up, and I think there are some very frequent misunderstandings about that distinction and its implications so I’ll attempt to clarify. Here are some definitions:

  • Cache: a data location created/deployed to provide lower request latency than the main data store (either by being located nearer to requesters or by using faster components).
  • Replica: a data store, separate from the one where a request is served, that is created/deployed to continue service after a failure.

In short, a cache exists to improve performance and a replica exists to improve resilience. A cache that doesn’t improve performance is a failure, as is a replica that doesn’t improve resilience, but the possibility of failure doesn’t turn one thing into another. Since defining things in terms of purpose or intent often leaves things unclear, here are some practical implications of the difference.

  • Caches need not be current or complete. They may return stale data, or no data at all, although many caches are designed to avoid stale data and “transparent” caches will re-request data from the main store instead of requiring that the requester do so after a miss (a minimal sketch of such a cache appears after this list).
  • Replicas must be both current and complete (perhaps not perfectly but always within defined limits), and authoritative or at least capable of becoming authoritative. “Authoritative” means that they may not be contradicted by alternative sources of information; if a conflict exists, the authoritative source is unconditionally given precedence over any non-authoritative one. (Authority loses its meaning if authorities disagree, of course, but that’s a philosophical issue best left for another time. For now, assume that authorities always agree.)
  • Caches exist to improve request latency, but replication might actually degrade request latency at the nearer data store as messages are exchanged with the further one to preserve the required replica behavior.
  • Replicas exist to improve resilience, but caching might degrade resilience as the number of components (the caches and extra data paths) and logical complexity both increase.
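To ground those bullets, here is a minimal sketch of a transparent, possibly-stale cache in front of an authoritative store. The class names and the TTL-based staleness policy are assumptions made for the example, not a description of any particular product.

    import time

    class Store:
        """Stand-in for the authoritative data store."""
        def __init__(self):
            self.data = {}
        def get(self, key):
            return self.data.get(key)
        def put(self, key, value):
            self.data[key] = value

    class TransparentCache:
        """A cache may be incomplete and may serve stale data within its TTL.

        On a miss it re-requests from the authoritative store itself, so the
        requester never has to know the cache exists.
        """
        def __init__(self, store, ttl_seconds=30.0):
            self.store = store
            self.ttl = ttl_seconds
            self.entries = {}                   # key -> (value, fetched_at)

        def get(self, key):
            entry = self.entries.get(key)
            if entry is not None:
                value, fetched_at = entry
                if time.time() - fetched_at < self.ttl:
                    return value                # fast path, possibly stale
            value = self.store.get(key)         # miss or expired: ask the authority
            self.entries[key] = (value, time.time())
            return value

A replica could not get away with any of that: it has to see every update (within defined limits) and be able to take over as the authority, which is exactly why it costs more to maintain.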

Much of the confusion arises because a single pool of data can serve as both a cache and a replica. In fact, if multiple replicas are simultaneously accessible at all (i.e. “live” or “dual active” replication vs. “standby” or “active/passive” replication) then it’s often easy to use the nearest replica as a cache. Enabling this can often yield big advantages for little work, but it can also be disastrous if such simultaneous access leads to thrashing. I’m sure some of my readers will recognize this phenomenon as it applies to simultaneous access using two disk controllers with “auto-trespass” enabled, leading to 100x performance degradation. It’s still safe to use a replica as a cache, though, even if it’s terribly inefficient. By contrast, using a cache as though it were a replica might qualify as one of the classic mistakes in computing. Often, you can get away with doing that 99% of the time, until that one time when the cache does what a cache does and returns stale data. Then you can have very hard-to-debug data corruption or misbehavior on your hands. This is but one example of a lesson I often have to pound into people, regardless of whether caching or replication is involved. In fact, it’s so important that I’ll put it in a quote box.

Never make copies of data without a strategy for dealing with currency/consistency issues.

As I said in Ned’s thread, “don’t sweat it and live with the consequences” is a valid strategy so long as it’s a conscious choice; simply ignoring or forgetting the issue is not.

…but I digress. Getting back to the topic at hand, another source of confusion seems to revolve around performance. In Ned’s thread, the claim was made that replication can improve performance more than caching. I consider that untrue; what I would say instead is that a hybrid can outperform a “pure cache” (i.e. one which is not also a replica). This can happen for two reasons. The simpler reason is that since a replica must be complete, it’s also likely to be larger than a cache, so any comparison is really apples to oranges. The more complicated reason is that replicas “push” data to one another before any need for it is recognized, whereas caches usually “pull” data in response to a user request. Thus, at the time the user requests data, the hybrid replica/cache will already have it locally while the pure cache might have to request it from the (remote) authoritative store.

This is not really a performance benefit of replication itself, though, which becomes apparent when one considers that push-based caches are also possible – and do exist in the form of web Content Distribution Networks. (CPU and disk caches also tend to have prefetch features which are essentially similar.) Such a cache would, for equal resource levels, outperform a cache/replica hybrid. The supposed performance benefit of replication is really a performance cost associated with how caches are usually implemented. Caches are implemented that way because it usually involves less complexity, and because there are cases – involving read/write ratios and other things I won’t get into – where “push” would be a disaster and “pull” is the correct model. So it’s not a mistake, but it does sometimes make it appear that replication is faster than caching when it’s not. A pure replica will always degrade performance at least somewhat, while a pure cache is supposed to improve it. The advantage accrues not to the function but to devices that happen to implement multiple functions and exploit synergy between them.
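As a hedged sketch of that push-versus-pull distinction (my own toy code, not how any real CDN or replication product is implemented): the eager copy already has the data when a local request arrives, while the lazy cache pays one remote round trip on the first request for each item.

    class PushCopy:
        """Receives data eagerly, before any local request arrives."""
        def __init__(self):
            self.entries = {}
        def receive(self, key, value):          # the origin calls this on every update
            self.entries[key] = value
        def get(self, key):
            return self.entries.get(key)        # already local: no remote round trip

    class PullCache:
        """Fetches data lazily, on the first local request."""
        def __init__(self, fetch_from_origin):
            self.fetch = fetch_from_origin      # function that does the remote fetch
            self.entries = {}
        def get(self, key):
            if key not in self.entries:
                self.entries[key] = self.fetch(key)   # pay the latency here, once
            return self.entries[key]

The first request is where the pure pull cache loses; after that the two behave alike, which is why the apparent speed advantage belongs to eager propagation rather than to replication as such.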

Joe’s Hair Salon and School for the Blind

I’m not a big fan of the “mullet” hairstyle on anyone, but a mullet on a woman is a crime. On the other hand, combining the rural look of a mullet with the urban look of rimless glasses might constitute an insanity defense.