Pragmatic IT

IT Infrastructure and Software Development from the Customer's Perspective

Privacy and the Cloud

A friend pointed me at articles from the Privacy Commissioners of Canada and Ontario about cloud computing. By and large they’re good articles, and they raise points that you should consider.

I want to put a bit of context around them. I don’t think the cloud should be dismissed because of privacy concerns, but I wouldn’t blindly jump onto the cloud, either.

The article from the Privacy Commissioner of Canada had quite a few comments that weren’t directly related to privacy, and I think some of them need to be looked at.

First, the Privacy Commissioner of Canada states that cloud computing can mean an on-going cost instead of a one-time fee. But there is no such thing as a one-time fee in computing. Your computing gear lasts three to five years. You need to replace it, and you need to service it while you own it. It’s much better in computing to convert your costs to a monthly cost, either by using the lease price, or by using the depreciation that your accountant would use.
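
To make that concrete, here is a minimal sketch of the depreciation arithmetic in shell. It assumes straight-line depreciation, and the function name and the dollar figures are invented for illustration; your accountant’s method may well differ.

```shell
# Rough monthly cost of owned gear, assuming straight-line depreciation.
# Arguments (all hypothetical inputs):
#   $1 = purchase price, $2 = service life in years, $3 = yearly service cost
monthly_cost() {
    awk -v price="$1" -v years="$2" -v service="$3" \
        'BEGIN { printf "%.2f\n", price / (years * 12) + service / 12 }'
}

# Example with invented numbers: a 3600 dollar server kept for three years,
# with 600 dollars a year of servicing: 3600/36 + 600/12.
monthly_cost 3600 3 600   # prints 150.00
```

A figure like that is the one to set beside a cloud provider’s monthly price.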

Consumer lack of control refers to the challenge of moving from one cloud provider to another. For example, you want to take your blog from Blogger to WordPress. It’s an absolutely important point to consider with cloud computing. It’s also an absolutely important point to consider when you use proprietary software (e.g. Microsoft) on your own equipment. There is a roughly equivalent amount of technical effort to switch to a different platform in either scenario.

In fact, technically you always have a way to get your data from a web site. The terms of service of the web site may prevent it, but technically you can do it. That’s not always the case with a proprietary, in-house solution.

Compromising meaningful consent refers to the fact that the cloud tends towards a single provider of most services: Facebook, Google (for search), and Twitter are all dominant in their sphere. However, twenty-five years of Microsoft wasn’t exactly a world of diversity, either. Again, it’s the monoculture that’s undesirable, not the means by which we arrive at a monoculture.

Most of the Ontario Privacy Commissioner’s paper is actually about identity. I am not by any means an expert on identity. I learned some interesting things from the Ontario Privacy Commissioner’s paper.

One point I’d like to draw your attention to: Identity is impossible without the cloud, or at least the Internet. Most of the effective, practical identity mechanisms rely on a trusted third party. I believe the experts can demonstrate that this is required. You need the Internet to get to the trusted third party, and that third party is effectively a cloud service.

(What I mean by “practical” in the previous paragraph is to rule out the public/private key approaches that work, but are too much of a pain for even most geeks to use.)

Finally, I want to step away from the privacy commissioners and talk about one aspect of the cloud debate: Many IT people are reluctant to embrace the cloud. Here is an example of IT backlash against the cloud. It’s important to remember that IT jobs will disappear as users migrate to the cloud. If you work in a 4,000-person organization you probably have a couple of people working full-time to support Exchange (the back end of your e-mail system). If your organization used Gmail, they wouldn’t be needed.

What’s that got to do with privacy? Well, it affects the cases that the IT experts bring forward. For example, you’ll hear about the Chinese infiltration of Gmail (an attack on a cloud service), but you won’t be reminded about the Chinese attacks on Tibetan nationalists and their supporters, which were primarily about compromising people’s personal computers.

I know that Google has way smarter people than me working on security, and they do it full time. I think I have a reasonably secure network, but I don’t even have time to monitor it to see if I’m being compromised. Security and privacy will be a differentiating factor in the evolution of cloud providers. The market advantage will go to those who provide the level of privacy their customers desire.

In the proprietary, self-hosted world, security and privacy are usually the last thing that gets any resources, because the competitive pressures are always something else.

On-Line Presence

My friend Elena Murzello just got her web site going. Elena is an actor who’s appeared as Anna in The L Word, Tenant #1 in Da Vinci’s Inquest, and a Nurse Educator in the Vancouver Coastal Health’s Unit Dose Project, among other roles. (One of these things is not like the other.)

She asked me for hints about how to generate traffic to her web site. I realized that I really should blog my thoughts, because other people could comment and correct what I say. So the rest of this post is written to Elena, but you can read it as if it’s written to you.

First, your site looks great. I’m really glad you have a blog on it, and that you’re writing new posts frequently. New posts generate traffic to your site, and traffic to your site improves your ranking on search engines. Write blog posts about what’s important to you, and about your experiences in the dramatic arts. Mention names: If you had a great moment with Nicholas Campbell while filming for Da Vinci, blog that. The more names the better.

(Let’s face it, you’ll get more traffic blogging about The L Word. Who am I kidding. And a note to the regular technical readers of this blog: The L Word is not a TV series about Linux.)

Tumblr looks like a great blog service. As usual, I learn something from you. Thanks!

Also, put links in your blog post, like I’m doing with this blog post. It’s a service to your readers, but it also generates traffic back to your site. Think of it as if you get points when someone goes somewhere popular because you pointed them there.

Of course, make sure that all search engines are indexing your site, but especially Google. Your web site designer should have done this for you already, but don’t assume it was done. Ask.

You’re also absolutely doing the right thing by using Google Analytics (or any kind of traffic measuring tool). If you decide to pay someone to do some search engine optimization on your site, you need to have baseline data on how well your traffic was growing just from your own efforts. No point in paying someone for growth that you generated by yourself.

Next, get on Twitter (Update: Elena’s at @ElenaMurzello) and put Echofon, a free app from the App Store, on your iPhone. Echofon makes tweeting from your iPhone easy, so that you’ll tweet a lot.

Then, use TwitterFeed or something like it to feed your blog posts to Twitter and Facebook. Your web site designer will have to put an RSS or Atom feed on your blog. Once that’s done, you can set up TwitterFeed yourself. If you don’t want to, your web site designer should be able to help you. And if they can’t, I’ll help you.

Add “Follow me on Twitter” and “Tweet this” icons and text on your web site (meaning your blog). Your web site designer will have to do that. The first makes it easy for people to follow you. The second makes it easy for people to publicize your site – free publicity!

Get yourself a YouTube account. Post clips of yourself with some commentary about why you like the clip. Put links back to your site in the YouTube comment, and blog about the video link and embed the video in your blog (as you’ve done already with other people’s work). If you want some help editing videos, let me know.

Finally, start reading Seth Godin’s blog. Seth is the original marketing brain of the Internet age and he’s an incredible generator of ideas. I find reading his blog overwhelming, but if you take even one of his ideas and implement it you’re way ahead of everyone else.

Now, about search engine optimization. I’m no expert, but I’ve heard from reliable sources that Google and the other big search engines put a lot of effort into preventing people from “gaming” search results. It makes sense. People won’t use a search engine if they don’t get the answers they really want. So Google does a lot to make sure that your rank is based on people who found your site useful. That’s why real traffic from real people is the best way to rise up in the search engine ranks.

Most search engine optimization techniques already don’t work. What I mean is that every time someone comes up with a new trick, Google and the others find a way to filter it out. No search engine optimization “expert” that you and I can afford to hire is likely to know how to outsmart Google. And even if they do today, you’ll find that next month Google has neutralized the trick.

If you really want to get someone to do search engine optimization, ask if they’ll agree to be paid based on the sustained additional growth in traffic they provide to your site. It will take some work to come up with a fair formula, but you have the raw data you need since you’re using Google Analytics. Really, if someone isn’t confident they can produce results for you, why should you be confident they can produce results?

I hope this helps. Let me know what you think.

Looking for IP Addresses in Files

I’ve moved a couple of data centres. And I’ve virtualized a lot of servers. In all cases, the subnets in which the servers were installed changed. If anything depends on hard-coded IP addresses, it’s going to break when the server moves.

The next data centre I move, I’m going to search all the servers for files that contain hard-coded IP addresses. The simplest thing to do for Linux and Unix is this:

    egrep -R "\b([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}\b" root_of_code

The regular expression matches one to three digits followed by a “.” exactly three times, then matches one to three digits, with word boundaries at either end.

That’s not the most exact match of an IP address, because valid IP addresses won’t have anything higher than 255 in each component. This is more correct:

    egrep -R "\b((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\b" /!(tmp|proc|dev|lib|sys) >/tmp/ips.out

It yields about two percent fewer lines when scanning a Linux server (no GUI installed). (Thanks to this awesome site for the regular expression.)
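
Here’s a small sketch of why the stricter pattern reports fewer lines. It assumes GNU grep (for “\b”) and runs both patterns over a few made-up sample lines; the variable and function names are only for illustration.

```shell
# Loose pattern: any four dot-separated groups of one to three digits.
LOOSE='\b([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}\b'

# Strict pattern: each group must be in the range 0 to 255.
OCTET='(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)'
STRICT="\\b(${OCTET}\\.){3}${OCTET}\\b"

# Made-up sample input: two real addresses, one impossible one,
# and a three-part version number.
SAMPLE='listen 10.0.0.1
junk 999.888.777.666
mask 255.255.255.0
build 1.2.3'

# Count the lines on stdin that match the pattern given as $1.
count_lines() {
    grep -cE "$1"
}

printf '%s\n' "$SAMPLE" | count_lines "$LOOSE"    # prints 3
printf '%s\n' "$SAMPLE" | count_lines "$STRICT"   # prints 2
```

Neither pattern can tell a real address from a four-part version number such as 1.2.3.4, so expect to review the output by hand.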

When I run the above egrep command from “/”, I have problems. There are a few directories I had to exclude: /tmp, /proc, /dev, /lib and /sys. I used this file pattern match to get all the files in root except those directories (it’s a bash extended glob, so it needs “shopt -s extglob” turned on):

    /!(tmp|proc|dev|lib|sys)

The reason I wanted to exclude /tmp is that I wanted to put the output somewhere. /tmp is a good place, and by excluding it I didn’t end up writing to the file while reading it. /sys on a Linux server has recursive directories in it. /proc and /dev have special files in them that cause egrep to simply wait forever. /lib also caused egrep to stop, but I’m not sure why (apparently certain combinations of regular expressions and files cause egrep to take a very long time – perhaps that’s what happened in /lib.)
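
If you’d rather not depend on the shell’s pattern matching, one alternative is to let find do the pruning and hand only regular files to grep. This is a sketch, not the command used above: it assumes bash, a find that supports “-exec ... {} +”, and a grep that supports -H, -I and “\b” (GNU grep does). The function name is hypothetical.

```shell
# Same loose IP pattern as the first egrep command.
IP_PATTERN='\b([[:digit:]]{1,3}\.){3}[[:digit:]]{1,3}\b'

# Search every regular file under $1, skipping the top-level
# tmp, proc, dev, lib and sys directories of that root.
find_ips() {
    local root="${1%/}"   # drop a trailing slash, so "/" becomes ""
    # -prune stops find from descending into the excluded directories.
    # -type f passes only regular files, so device files and FIFOs never
    # reach grep, and find does not follow symbolic links by default.
    # grep: -H always prints the file name, -I ignores binary files,
    # -s suppresses errors about unreadable files.
    find "${root:-/}" \
        \( -path "$root/tmp" -o -path "$root/proc" -o -path "$root/dev" \
           -o -path "$root/lib" -o -path "$root/sys" \) -prune \
        -o -type f -exec grep -HIsE "$IP_PATTERN" {} +
}

# Usage, writing the results where the search will not read them back:
#   find_ips / >/tmp/ips.out
```

Because find only follows symbolic links when asked to, this form may also sidestep the recursive directories under /sys, though the exclusion is kept here to match the list above.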

I’ll write about how to do this for Windows in another post. I’ll also write about how to do it across a large number of servers.

Ubuntu Netbook Remix Desktop Disappears

I have a netbook running Ubuntu Netbook Remix (UNR) 9.04. I switched it to the regular Ubuntu desktop just to try. Before I switched back, I rebooted (the battery ran all the way down). Apparently, this is known to be a bad thing. When you restart, all you get is a blank desktop – no panels at the top and bottom to allow you to get at any commands.

The fix is described in Launchpad here, but I’m going to summarize it because it’s a little spread out in the comments to the bug.

  1. Right click on the desktop and select "Create Folder..."
  2. Double click on the folder you just created
  3. Navigate to "/usr/bin/desktop-switcher" and run it
  4. Switch back to the UNR desktop
  5. Now navigate to your home directory
  6. Show hidden files (View -> Show Hidden Files, or Ctrl-h)
  7. Delete the .gconf, .gconfd, and .config folders
  8. Log out and log back in
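
If you can get to a shell (for example from a virtual console), steps 5 to 7 can be sketched as a small bash function. Note that this version moves the folders into a backup directory instead of deleting them, which is a deliberately more cautious swap for step 7. The function name and the backup directory name are hypothetical.

```shell
# Move the desktop settings folders out of the way instead of deleting them.
# $1 is the home directory to clean up (defaults to $HOME).
# Prints the backup directory so the folders can be restored if needed.
reset_desktop_settings() {
    local home_dir="${1:-$HOME}"
    local backup_dir="$home_dir/desktop-settings-backup-$(date +%Y%m%d%H%M%S)"
    local name
    mkdir -p "$backup_dir" || return 1
    for name in .gconf .gconfd .config; do
        # Only move folders that actually exist.
        if [ -e "$home_dir/$name" ]; then
            mv "$home_dir/$name" "$backup_dir/" || return 1
        fi
    done
    echo "$backup_dir"
}

# Usage: run reset_desktop_settings, then log out and log back in.
```

Once the desktop is behaving again, the backup directory can be deleted.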

This should fix the problem. Now, with respect to the classical desktop, don’t do that :-)

Ubuntu Support Saturday in Vancouver

The ever-fantastic Ubuntu Vancouver Local Committee is organizing a Support Saturday. Come on down and learn about the world’s most popular free operating system. If you already use Ubuntu, get some help to make your experience even better.

The details:

Saturday January 30th, 2010 11am - 2pm
Vancouver Community College
1155 East Broadway (Broadway Campus)
Building B, Room G219

Click here for the poster.

Here’s the best door to use:

Map: http://maps.google.ca/maps?q=49.264165,-123.080798

Healthcare IT Doesn't Reduce Costs

A study in the Harvard Medical Journal shows that healthcare IT doesn’t reduce costs, and perhaps provides a very marginal increase in quality of care. The authors speculate that one of the reasons may be that the cost of obtaining and running the system outweighs the benefit. No kidding. That fits what I’ve seen completely.

It’s a short article, and worth reading. Thanks to Andrew for pointing me to it.

Ubuntu Local Committee Install and Tweak

The ever-fantastic Ubuntu Vancouver Local Committee is organizing a Support Saturday. Come on down and learn about the world’s most popular free operating system. If you already use Ubuntu, get some help to make your experience even better.

The details:

Saturday Dec 5th, 2009 11am - 2pm
Vancouver Community College
1155 East Broadway (Broadway Campus)
Building B, Room 219G

The poster is at: http://is.gd/560zM (a pdf).