Sorry, AccordionGuy

Couldn’t resist. (local copy just in case)

Fighting Cliques

In response to recent events over at Catallarchy, I was all set to write an article about online cliques…until I remembered that one of the very first articles I wrote for this site over four years ago already covered most of that ground pretty well. It started with the following observation:

The only thing that is necessary for a clique to form is the existence of a small set of people less willing to criticize each other than to criticize others.

I went on to elaborate on how cliques form and how they’re maintained, and I don’t have much to add from a blog/forum user’s perspective. However, I’ve had some experience since then that allows me to expand a little on the theme from the perspective of a forum administrator or moderator.

Site Traffic and Image Leeches

It looks like it’s that time of year again, when I start looking at how the site’s doing. I thought traffic had slacked off a bit, but I went back and found my numbers from last year and I see it’s not so. At that time, 358 visits/day and 3.4 GB/month represented significant increases over what the numbers had been before then. Now, I’ve been over 500 visits/day for two months in a row. Last month that resulted in 4.26 GB consumed, which means it’s a good thing I changed hosting plans/providers. This month I’ve been far more aggressive about blocking image hits from discussion forums (more about that in a moment), so my bandwidth has only been 3.49 GB with two days left to be counted. The bandwidth/visit is mostly lower than last year because an ever-higher percentage of hits are from RSS readers, which are much more bandwidth-efficient.

The image-leech problem is a common one, so I figured I’d discuss it a little further. For those who aren’t familiar with it, here are the basics. If you link to one of the images here from your page, any browser loading your page will also fetch the image from here, increasing my bandwidth usage for the month by the size of the image. If your page is popular, that increases my bandwidth usage a lot, possibly pushing me above my monthly limit – at which point it becomes money out of my pocket. Needless to say, I’d resent that. It’s a constant worry/annoyance for everyone who runs a site with popular images, which is why the usual (somewhat counterintuitive) advice is to copy images and host them somewhere else either free or at your own expense instead of linking to them.

Though impolite, image leeching probably wouldn’t be a problem if it weren’t for web forums. If you post a link to one of my images (my cubic-wombat-poop picture is by far the most popular target) in a web-forum thread, I’ll get hit every time someone views the thread. On a busy forum, that adds up very quickly. Even worse, if you use one of these images as an avatar (the little picture that shows up next to your screen name on many forums), I’ll get hit every time someone views any thread where you’ve posted. I really hate that.

Here’s where things get a bit technical. For a long time, I tried to block abusers by site. In HTTP/Apache terms, I had entries in my .htaccess file that would look for certain host names in the “Referer” field for each request. That didn’t work all that well, because almost every day someone would post links on a new forum that I’d never heard of before. This month, I started blocking particular kinds of software instead. Most forum software can easily be identified by the appearance of strings like “viewtopic” or “showthread” in the Referer. I’ve developed a collection of about two dozen such strings, and this approach seems to be working very well. When I look at my logs I see appropriate denials even for sites I’ve never heard of before (but which are using familiar software). Meanwhile, requests from search engines or forums that Cindy or I use (for which I’ve created exceptions) or various personal webpages are being served just fine. If anybody wants to borrow my list of banned Referer strings, or collaborate on making it a complete list, just drop me an email.

In Love With Kate

OK, now that I’ve given Cindy a heart attack, I should mention that Kate is a text/code editor developed as part of the KDE project. Since Firefox came along to provide a decent web browser and Thunderbird to do likewise for email, one of the biggest obstacles to my use of Linux as my primary desktop has been the apparent lack of a code-editing tool that could approach the capabilities and ease (or pleasantness) of use that I’ve become accustomed to using EditPlus on Windows. To be specific, the must-haves for me are:

  • Stability.
  • A modern interface that allows effective use of either mouse or keyboard (key bindings must be configurable) and with sane cut/paste behavior. This excludes every version of emacs I’ve ever seen.
  • Auto-indent and word-wrap that actually work properly.
  • Decent search/replace functionality, including regular expressions and multi-file searches.
  • Syntax highlighting for the several programming languages I use, plus HTML. Configurability is also a key here, since I generally dislike the default highlighting schemes that come with most editors. Also, Python tends to be an acid test for whether the syntax highlighting is really robust. Many editors I’ve tried, both on Windows and on Linux, rely too much on braces (which Python doesn’t use) to indicate control-structure nesting or can’t handle the """ form of multi-line string (often used as a comment).
  • Built-in FTP support so I can edit files on this website directly instead of having to download/edit/upload using two separate programs.

There are probably some others I’ve missed, but which I notice when I’m actually trying to get work done. That should do for a start, though. Kate is the first Linux-based editor that really satisfies these criteria. It does lack a few of the nuances I’ve become used to with EditPlus, but it also has some advantages (such as the ability for one program instance to span multiple windows). I’ve been using various versions of vi and emacs just about forever, but Kate is the first Linux editor in which I actually enjoy working on code. Now Outlook’s calendar and Zuma are pretty much the only reasons I switch to my Windows box.

Amy Crawling Video

14 seconds, 1.1MB (WMV format). Probably more to come shortly.

Genetic Backup and Restore

According to a story at SciScoop, some plants apparently keep a “backup copy” of grandparents’ DNA, and can “restore” to avoid adverse genetic traits inherited from both parents.

Contrary to inheritance laws the scientific world has accepted for more than 100 years, some plants revert to normal traits carried by their grandparents, bypassing genetic abnormalities carried by both parents.

These mutant parent plants apparently have hidden templates containing genetic information from the preceding generation that can be transferred to their offspring, even though the traits aren’t evident in the parents.

Needless to say, this has caused some consternation in certain circles where such things were thought to be impossible.

Irrational Fear

I found a pretty good quote about dynamic vs. static typing today, from Dave Thomas (one of the Pragmatic Programmers and an advocate for the Ruby programming language).

For a long time, developers felt that the type-safety of static languages would mean their code was more reliable.

That seems pretty intuitive. But increasingly, people are finding that not to be the case. They’re finding that the productivity gains they get from dynamic languages are enormous, and that type safety is generally not an issue. Sure, it is theoretically possible for you to have a variable called ‘person’ but discover at runtime that it’s referencing an object of class PurchaseOrder. But it just doesn’t happen in practice.

Indeed. As you all know, I’m a quantitative kind of guy. I believe the effort expended to address a problem should be proportional to the size (likelihood of occurrence times severity of consequences times magnitude of effect, or something like that) of the problem. Are type errors that big a problem? How often do type errors occur that are really errors and not just instances of legitimate polymorphism (“if it has behavior X I don’t care what type it is”), and which wouldn’t be caught immediately by even the most rudimentary unit tests? How much effort is wasted tweaking declarations or performing casts and conversions, to deal with something that wasn’t really going to be a problem anyway? Are strong compile-time type systems really no more than a crutch for the kind of programmer who doesn’t do unit tests? The more I think about it, the more I share Dave’s disdain for type bondage.

Desktop-Wallpaper Trompe l’Oeil

Exercise: take a digital picture of whatever’s behind your laptop. Use it as desktop wallpaper on said laptop, creating an illusion of transparency. For extra credit, repeat with a second laptop placed in front of the first. The effect is very cool.

(from Boing Boing, who got it from Waxy, etc.)

Privoxy filters

Just for fun, I’ve been surfing from Linux instead of Windows for the last couple of days. Because I use Firefox as my primary browser, this results in little change except that I can’t use my Proxomitron filters on Linux. I’ve installed Privoxy instead, which mostly does the same thing but seems to be missing a couple of the filters I’ve become used to having. Therefore, I ported a few of my favorites from Proxomitron to Privoxy and present them here for others’ browsing pleasure.

FILTER: jd-target Disable certain kinds of popups.
s@(<a.*)\starget="?_?(blank|new)"?(.*/a>)@$1$3@giu

FILTER: jd-shell Disable shell exploits.
s@(<a.*)shell:(.*/a>)@$1snail:$2@giu

FILTER: jd-on-click Disable onclick insanity.
s@onclick=@noclick=@gi

FILTER: jd-on-unload Disable onunload insanity.
s@onunload=@nounload=@gi

FILTER: jd-blur Disable popunders.
s@(<script.*)\.blur\(\)(.*/script>)@$1.close()$2@gisu

The first filter is actually similar to the “all-popups” filter, but it’s less specific in some ways (recognizes target names without an underscore) and more so in others (operates only within an anchor tag) so I prefer it and others might too. Note that I’ve had to replace left and right angle brackets with their corresponding HTML entity codes in the above, so you’ll have to undo that if you want the target/shell/blur filters to work.

Misunderstanding “volatile”

This site has tilted so far toward the political (or personal) lately that it almost feels weird to be writing about technical stuff again. Here goes, and I hope this doesn’t bore my non-technical readers to death.

One of the most misunderstood parts of C/C++ is the volatile keyword. Put simply, it forbids the compiler from optimizing out references to a variable. The problem comes about because many programmers assume that it disallows all optimizations involving that variable, but the careful reader will note that the above definition says nothing about adding references. This became a problem several years ago at Dolphin, where we had a pretty innocent-looking piece of code:

a = b = *c;

The compiler generated code that dereferenced c twice, as though we had written

a = *c;
b = *c;

instead. This seemed rather silly, since adding memory references (and particularly pointer dereferences) is generally a good way to make code run slower instead of faster, but it was perfectly legal even with c declared as a pointer to a volatile value. The problem was that c didn’t actually point to a memory location. It pointed to a memory-mapped register, and accessing that register twice had undesirable side effects.

I think we eventually worked around the problem by creating (yet another) temporary variable to hold *c, but the way that volatile turned out to be insufficient has bothered me ever since. I decided at the time that there should be a second keyword – let’s call it fragile – that is the inverse of volatile; tagging a variable as fragile means that the compiler cannot add references. If both keywords were used, the effect would be that no optimizations (or “pessimizations”) at all involving that variable would be allowed and that the generated code would have to reproduce the accesses as they were expressed in C — exactly what many programmers think volatile alone does now. I believe that some specialized compilers have added keywords more or less like this but, as far as I know, it has never made it into a standard. It still seems like a strange omission.