And maybe librarians can start pasting ads in books?

[Image: “Christina's Strip Mall” by t-squared]

TechCrunch shares the disturbing news that some bright bulb has developed software that allows ISPs to insert ads into web pages as they pass through the ISP's servers. The user is then unable to tell the difference between ads that are “supposed” to be there (and presumably benefit the creator of the content they're perusing) and ads inserted by their ISP without the knowledge or consent of the author(s) of the web pages in question. Quoting TechCrunch:

As a content creator I’m horrified that any page I create could be plastered with advertisements I don’t approve of as I’m sure many others will be as well. There are probably copyright issues as well in terms of hijacking original works for profit. We can only hope that this evil form of advertising does not spread beyond Texas.

This blog, for example, is free of ads so far, and is likely to remain so for the foreseeable future. We simply don’t get enough traffic to warrant such things. You, however, might be seeing it with ads that I know nothing about, which put no money in my pocket, and without you being aware that I didn’t put them there.

Ugly as this is, the extreme case is truly horrifying. Potentially every server that passes along your HTML packets could be inserting ads and otherwise altering or rearranging content. We could reach a point where the page viewed bears almost no resemblance to the page served, totally undermining any and all efforts to introduce some reasonable design principles to this intarweb thing. (Not to mention all the privacy and censorship issues that would naturally arise.)

As useful as I find blogs like TechCrunch and Pharyngula, for example, both are already horribly blighted by oceans of ads, many of which blink, move, wiggle, and jump in ways that make me want to beat someone publicly. And those ads are placed there by the people that run (or at least manage) those sites, who have a vested interest in not making their web pages so awful people just won’t go there. ISPs will have no such compunction because they’ll pollute everything equally, so their only real risk is annoying their users so much that they switch ISPs (not always an easy option) or simply stop going on-line (which would take a lot of annoying).

I suspect, however, that encryption could do a lot to solve this problem. I haven't thought through the details, but I suspect that encrypting all web pages using something like HTTPS would make it impossible for ISPs to insert their ads because they'd no longer have access to the raw HTML. It would cause an increase in the amount of traffic on-line (encrypted information is almost always larger than its raw, unencrypted form), but I suspect this would be nothing compared to all the traffic generated by YouTube. And I'm betting nearly everyone would gladly wait a tiny fraction of a second longer for their page to download and decrypt if that kept it free of all this ISP spam. What would be nasty is if encryption schemes ran contrary to the powerful monitoring urges of governments; it would really suck to drown in spam just so Big Brother can keep an eye on what crazy music I'm listening to on-line.
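For the curious, here is roughly what the HTTPS idea looks like in practice, as a minimal Python sketch using only the standard library. The cert.pem and key.pem file names are placeholders for a certificate and private key you would have to supply (a self-signed pair is fine for experimenting).

```python
# Minimal HTTPS file server sketch (Python standard library only).
# "cert.pem" and "key.pem" are placeholder names for a certificate and
# private key you must supply, e.g. a self-signed pair for testing.
import http.server
import ssl

server = http.server.HTTPServer(
    ("localhost", 4443), http.server.SimpleHTTPRequestHandler
)

context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
context.load_cert_chain(certfile="cert.pem", keyfile="key.pem")

# Wrap the listening socket so every byte on the wire is encrypted;
# an ISP in the middle sees only ciphertext, not HTML it could rewrite.
server.socket = context.wrap_socket(server.socket, server_side=True)
server.serve_forever()
```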

10 thoughts on “And maybe librarians can start pasting ads in books?”

  1. If you use Mozilla, there are wonderful add-ons like Adblock and Flashblock. I don't actually see any ads on ScienceBlogs. The most I see are the little “f”s telling me that there's Flash content, if I wish to look at it. I don't know how well (or if) Adblock would work against this latest…debacle.

  2. Using SSL is a generally good idea — but it'll slow things down more than the “tiny fraction of a second” that the larger payload takes. The bulk of the overhead of using encrypted pages is the SSL negotiation that happens when the HTTPS session starts, and that's significant. It also means that all content on the page (all the images that are loaded) has to use SSL too, or else the browser will give a popup warning about mixed secure and non-secure content. If the images come from multiple sites, the SSL negotiation has to be performed with each site, slowing things down further. (A rough timing sketch of this handshake cost appears after the comments.)

    So it’s a good idea, but I’d hate to have to use it. Better to shun the ISPs that do things like this. And there’ll definitely be a market for ones that don’t, because those that do will be hurting their own customers with it.

  3. I am continually frightened by the human capacity to ruin things for others. Thanks for bringing this interesting thing to my attention.

  4. Qalmlea: I'm aware of such plugins, but have never bothered installing them. Part of this is raw laziness (which I possess in abundance), but part of it is a weird inclination to see pages for what they are, both good and bad. That little lump of coal banner ad on ScienceBlogs.com is remarkably annoying, but at some level I'm glad I see it because it informs my broader vision of ScienceBlogs.

    Barry: You're right about the times, especially the whole negotiation phase at the beginning. And I hadn't even thought about the issues of encrypting images, and pages with content from multiple sites. Ugh. It definitely would be better if we could simply thwack the stupid ISPs into behaving than to end up with great oceans of SSL just because of a few bad apples. I'd be a little anxious for people who don't have (or don't realize that they have) ISP options. The set of those without options is probably shrinking daily, but the set of people who aren't aware of their options is probably large, and the kind of people who'd use this sort of software probably aren't above a little FUD to keep the less confident/knowledgeable users in line.

    CoryQ: Agreed. It really sucks that the whole intarweb has been a brilliantly frustrating series of case studies in the tragedy of the commons.

  5. There's always Tor. If the problem is the ISP on your end, that'll fix the problem, as they'll only see an encrypted stream of mishmosh flowing into your browser. (Of course, actually logging into anything is problematic.) If it's the ISP on their end, well, the site administrator will definitely know about it.

  6. I’m not up on the details of Tor, but I guess I’m not clear on how this would solve the problem for most users. If you’re not running a Tor server, then wouldn’t that last hop (which would presumably be visible to your ISP) not be encrypted? If that’s correct, then Tor would protect you from everyone along the route except the people at the end (including your ISP).

    We can presumably work around this by having everyone set up a Tor server, but that’s clearly not going to be an option for most users for a whole variety of reasons.

  7. The Tor server generally runs on your own machine. The unencrypted communications take place (a) between your browser and your Tor instance, running on the same machine, and (b) between the Tor endpoint, which is somewhere random on the internet, and the site which you connect to. There's a good diagram here.

    It's clearly not a fix for the problem as a whole, but it is something end users can do to protect themselves from their ISPs. (A sketch of routing traffic through a local Tor proxy appears after the comments.)

  8. It was the red dotted line in that diagram that I was assuming was the weak spot. Is it reasonable to assume that all sorts of random individuals are going to install Tor servers on their computers and then route all their traffic through that?

    All that said, it does appear that Tor provides a reasonable option for people who both care about the issues and are comfortable with a technical solution that will obviously take at least some care and feeding.

  9. Um, just a quibble, but encryption doesn't necessarily mean larger in size. One big benefit of having encrypted HTML would be that the same mechanism could easily compress/decompress the page at the same time.

    Slowness-wise, it shouldn't make any sort of difference unless you have a very, very slow (processor-wise) machine. On the server side, you can just pregenerate and/or cache the encrypted pages, so no problem there.

    SSL is overkill. A public-key/private-key encryption scheme (like PGP) would be fine. No need for per-session key exchange and such, which really does add latency. And PGPish encryption would have the significant benefit of ensuring the page was actually written by whoever it says it is (assuming the usual constraints, of course: the public key comes from a trusted server and the private key stays private).

    Anyway, I have thought about this quite a bit since the early days of the web, and never found a real reason “why not”… especially now that the decryption is basically nothing compared with the power of the average computer (in the olden days, it would have actually slowed things down). (Sketches of the compress-then-encrypt and page-signing ideas appear after the comments.)

  10. Good point on compressing the HTML – if you're going to encrypt it, compressing it along the way makes a lot of sense. That won't help much with pre-compressed material like images, however, which often make up the bulk of a web page's size.

    I also agree with your thoughts on the speed issues. A decade back, you could really tell the difference in load times between visiting a secure page and one that wasn’t. Current hardware plus broadband has really washed that out.

    I'd have to think more about the PGP solution. Are you suggesting only doing public-key encryption, or using a public key to encrypt a session key? The former is pretty slow and does tend to bloat messages, and the latter seems to just be a different kind of key exchange. Maybe I'm just missing something? (A sketch of that session-key pattern appears after the comments.)
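A few quick sketches of the ideas batted around above, for anyone who wants to play along at home. First, to put rough numbers on the negotiation overhead Barry describes in comment 2, here is a timing sketch; example.com is just a stand-in host, and the numbers will vary with distance and server load.

```python
# Rough sketch: time the TCP connect, the TLS handshake, and the transfer
# separately for one HTTPS request. "example.com" is just a stand-in host.
import socket
import ssl
import time

host = "example.com"
context = ssl.create_default_context()

t0 = time.perf_counter()
raw = socket.create_connection((host, 443))
t1 = time.perf_counter()

# The TLS (SSL) negotiation happens inside wrap_socket().
tls = context.wrap_socket(raw, server_hostname=host)
t2 = time.perf_counter()

request = (b"GET / HTTP/1.1\r\nHost: " + host.encode()
           + b"\r\nConnection: close\r\n\r\n")
tls.sendall(request)
body = b""
while chunk := tls.recv(4096):
    body += chunk
t3 = time.perf_counter()
tls.close()

print(f"TCP connect:   {(t1 - t0) * 1000:.1f} ms")
print(f"TLS handshake: {(t2 - t1) * 1000:.1f} ms")
print(f"Transfer:      {(t3 - t2) * 1000:.1f} ms")
```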
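Next, the Tor discussion in comments 5 through 8. In practice you don't route traffic by running a relay; you point your software at the SOCKS proxy a local Tor client exposes. A sketch, assuming Tor is listening on its default port 9050 and the requests package is installed with SOCKS support (pip install requests[socks]):

```python
# Sketch: route one request through a local Tor client's SOCKS proxy.
# Assumes Tor is running with its default SOCKS listener on 127.0.0.1:9050.
import requests

proxies = {
    # "socks5h" (note the h) makes DNS resolution happen inside Tor too,
    # so your ISP doesn't even see which hostnames you look up.
    "http": "socks5h://127.0.0.1:9050",
    "https": "socks5h://127.0.0.1:9050",
}

resp = requests.get("https://example.com/", proxies=proxies)
print(resp.status_code, len(resp.content))
```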
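Comment 9's size claim is easy to demonstrate: compress first, then encrypt with a stream mode, and what goes over the wire is the size of the compressed page, not something larger than the original. A sketch using zlib and AES-CTR from the third-party cryptography package; the page and key are throwaway stand-ins.

```python
# Sketch: compress-then-encrypt. With a stream mode like AES-CTR the
# ciphertext is exactly the size of the compressed input.
import os
import zlib

from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# A throwaway stand-in for a real page; repetitive HTML compresses well.
html = (b"<html><body>"
        + b"<p>Lorem ipsum dolor sit amet.</p>" * 200
        + b"</body></html>")

compressed = zlib.compress(html, level=9)

key = os.urandom(32)    # throwaway key, for illustration only
nonce = os.urandom(16)
encryptor = Cipher(algorithms.AES(key), modes.CTR(nonce)).encryptor()
ciphertext = encryptor.update(compressed) + encryptor.finalize()

print(f"original:   {len(html):5d} bytes")
print(f"compressed: {len(compressed):5d} bytes")
print(f"encrypted:  {len(ciphertext):5d} bytes (same as compressed)")
```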
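The authenticity half of comment 9 (knowing the page was written by whoever it claims) is really a signature rather than encryption. A sketch with RSA-PSS from the same cryptography package; the key pair is generated on the fly purely for illustration, and distributing the public key in a trustworthy way remains the hard part.

```python
# Sketch: the publisher signs the exact bytes served; any in-flight ad
# injection changes the bytes, and verification fails.
from cryptography.exceptions import InvalidSignature
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa

# Throwaway key pair, generated here purely for illustration.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
public_key = private_key.public_key()

page = b"<html><body><p>Exactly the HTML the author served.</p></body></html>"

pss = padding.PSS(mgf=padding.MGF1(hashes.SHA256()),
                  salt_length=padding.PSS.MAX_LENGTH)
signature = private_key.sign(page, pss, hashes.SHA256())

tampered = page.replace(b"</body>", b"<div>INJECTED AD</div></body>")

for label, doc in (("original", page), ("tampered", tampered)):
    try:
        public_key.verify(signature, doc, pss, hashes.SHA256())
        print(label, "-> signature OK")
    except InvalidSignature:
        print(label, "-> signature FAILED")
```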
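Finally, the question in comment 10. The usual answer is the second option, a hybrid scheme: a public key encrypts (“wraps”) a random symmetric session key, and the symmetric cipher does the heavy lifting. It is a kind of key exchange, but one that needs no extra round trips. A sketch, again with throwaway keys:

```python
# Sketch of the hybrid (session-key) pattern: RSA-OAEP wraps a random AES
# key; AES-CTR encrypts the actual page. No round trip is needed, unlike
# SSL's live negotiation.
import os

from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers import Cipher, algorithms, modes

# Throwaway recipient key pair, for illustration only.
recipient_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)

page = b"<html><body><p>A large page goes here.</p></body></html>"

# Sender: encrypt the page with a fresh AES session key...
session_key = os.urandom(32)
nonce = os.urandom(16)
encryptor = Cipher(algorithms.AES(session_key), modes.CTR(nonce)).encryptor()
ciphertext = encryptor.update(page) + encryptor.finalize()

# ...then wrap the session key with the recipient's public key.
oaep = padding.OAEP(mgf=padding.MGF1(hashes.SHA256()),
                    algorithm=hashes.SHA256(), label=None)
wrapped_key = recipient_key.public_key().encrypt(session_key, oaep)

# Recipient: unwrap the session key, then decrypt the page.
recovered_key = recipient_key.decrypt(wrapped_key, oaep)
decryptor = Cipher(algorithms.AES(recovered_key), modes.CTR(nonce)).decryptor()
assert decryptor.update(ciphertext) + decryptor.finalize() == page
print("round trip OK")
```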
