Quotes of the Day

Cory Doctorow

…Once you get past the vanity of knowing exactly how many copies have been made, and find the zen of knowing that the copying will take care of itself, you’ll attain dandelionesque contentment.

Brent Simmons

I love it when TV People — newscasters, analysts, politicians — say they were “on the ground” somewhere. It’s a good and welcome reminder that they normally live in the clouds, in heaven, up with the angels. Not on the ground with us, where things are mysterious and messy.

JP Rangaswami

I have found the following to be true of large organisations:

* We stress the importance of human resources, human talent, human capital
* We stress the importance of teamwork and collaboration
* We stress the importance of openness and transparency
* We stress the importance of trust

And then, mysteriously, we somehow manage to create an environment where we jealously guard information; where we seek to create and extend power as a result of this jealous guarding; where we then exploit this power in all kinds of ways, some less abhorrent than others

Tying all of these together is left as an exercise for the reader.

Come on feel the noise

So… after a little more than a year with my first digital SLR - I decided to take the plunge and go from a camera that was just beyond my ability to one that is WAY beyond my ability. Hopefully I’ll give this one a bit more time to catch up to it.

There were a lot of reasons I went from the D80 to the D300, most of which center on being able to take continuous pictures faster. I take a lot of pictures of the dogs - and a lot of pictures of my wife’s horse jumping (she’s an eventer) - and I wanted something that would focus better, take more continuous frames, and do better at higher ISOs. I’m pretty happy with all of that so far (and the D300 meters in tough situations a lot better).

One thing that I have been more than a bit mystified by, though, is the noise at ISO 800. Every review I read about the D300 said that the noise difference (vs. the D200, D80, and others) was spectacular. I can see a difference - but it’s a lot less pronounced than the reviews seemed to make it out to be.

And the jury’s still out on this, but I’m not all that sure that what I’m seeing isn’t due to Adobe Lightroom (which is my raw processor and image organization tool of choice).

So I took this image:

Pull

The exposure is 1/640 sec at f/5.6 (basically as wide open as my 18-200mm will go at 150mm), ISO 800, 0EV adjustment. NR “in camera” is set to normal. Output is 12-bit Raw (Nikon Compressed).

And decided to spend some time in all the raw processors and see how they handled the noise. Profiled are Lightroom (Adobe Camera Raw 4.4), Aperture (Trial Version), Capture NX 1.3 (License included with D300), and iPhoto. I didn’t do ACR+Bridge+PS3; ACR and Lightroom are, from my understanding, the same.

Here’s Lightroom’s defaults (25% Color NR, 0% Luminance NR):

Aperture:

Capture NX:

For the record, Capture NX has a “Faster” or “Better Quality” setting. The differences for this image were really too subtle to include.

iPhoto’s default seems a lot like Aperture.

Now - what about some adjustments? I really didn’t spend much time playing with Capture NX: one, because it’s incredibly slow to preview anything (which may be because of Leopard, I don’t know), and two, doing anything seems to radically change things (moving the slider all the way really changes the image). So here are Lightroom and Aperture cranked up.

Lightroom:

(luminance only, increased Color NR, didn’t make much difference)


Aperture:

Here’s an iPhoto adjustment (radical effect at 50%, a lot like Capture NX)

In case you are wondering about the “No Noise Reduction” case, here’s Lightroom, and Capture NX (turning off NR for Aperture didn’t appear to do anything for me, so it looks like some kind of default NR is always applied).

Lightroom:
(you can really see the effect of the 25% Color NR here)

Capture NX:

The Verdict

Noise reduction, like beauty, is probably all in the eye of the beholder. But my eye really prefers how Capture NX handles the noise by default. I know that Lightroom is trying to preserve edge detail, and it’s a “different” kind of NR - but I think that the default leaves a bit to be desired compared to the defaults for both Capture NX and Aperture.

Cranked up, it’s hard to say. It’s good, but I’d still leave the edge to Aperture here - and certainly Capture is powerful (too powerful) - and iPhoto leaves all semblance of subtlety behind with its modifications.

I’m still a bigger fan of how Lightroom handles my photo library (though I haven’t given the Aperture trial a full and fair shake with library handling). And while I despise how Aperture handles zooming, I really like Lightroom’s handling here. Capture NX is, well, just weird. I’d only see myself using it when there’s an image that I really, really like and want to see if it will do things with it that the others can’t.

This is a really interesting software space right now: everything’s different, and good in its own way. It will be fascinating to keep watching these packages battle it out in the future.

(p.s. I did try out the Lightroom 2.0 beta; its NR handling doesn’t seem different. There are differences with other defaults, but NR seems the same.)

Test Driven

Sometimes the only way to express one’s opinion about their experience is using plastic dinosaurs

I do want to get the test religion, and I think it could be really useful, particularly in libraries, but I seriously have yet to see good QC out of tests at the small application level, other than in data transformation situations (e.g. consumption of feeds, protocol tests, etc.).

Irony

While trying to run Windows update with my VMWare Fusion-based XP install:

Script Error coming from Microsoft Update

And when running the debugger:

The line the script debugger took me to.

I’m sure it might possibly be something I’ve done to it, either with the Script Debugger install from Office Two-thousand-whatever-it-is or something else. I just find it a bit humorous.

Beware the Ides of March

So on March 15th (well, March 16th UTC and EDT) - the public facing website that I share in the responsibility for suffered an outage.

It’s a classic example of a cascaded failure - particularly a failure of human judgement (mine). I spend a lot of my time in my job trying to anticipate and either prevent, or mitigate, cascade effects - so it’s great to learn from them when they happen. You don’t get enough stories like this on the web, mostly because failures - particularly judgement failures - don’t get public exposure. That’s probably a lot out of fear, embarrassment, trade secrets, whatever. Which is a shame really, because you can’t learn unless you screw up. And even after 15+ years of doing this day in and day out, and with a confidence that I can solve and work through any computing problem before me, I still do it, and it’s not the least bit ironic.

So I’ll write about mine - so you don’t have to :-)

I wrote this up as an email on the 16th for my peers in my engineering team - and I got a lot of response back saying “thanks for the hard work” - which kind of embarrassed me really. I didn’t write it up for that. I wrote it up for edification about cascade effects of both technical and judgement errors. So I wrote up another note with the followup - and it’s a great summary of what follows:

  • Logic errors in code, triggered by a somewhat rude web spider, generated well over 10,000 error emails.
  • That email volume overwhelmed our mail server (for reasons I don’t completely understand; it’s weathered things like this before. I blame Brutus), which caused email delays that were measured in hours
  • Those delayed emails were taken at face value, and used as the basis for a flawed solution to a no-longer-existing problem
  • The implementation of the flawed solution actually caused an outage to the site, when the original problem never did

Plus a whole lot of little things in between.

I wrote “It’s a completely fascinating study of how a series of decisions, errors, and failures in due diligence, combined with on-the-spot judgement errors based on flawed (and not sanity-checked) information, can lead to cascade problems - and problems made worse.”

Like the Titanic, only without Celine Dion. Okay, nothing like the Titanic. More like, Waterworld

What follows is an incredibly verbose, but relatively interesting tale.

Background

Beginning around 2:03am EDT Saturday morning, March 15th - and continuing through 1:18pm EDT Saturday afternoon, what appears from Microsoft documentation to be a SharePoint Portal Server running at a peer institution made 72,306 spider requests to our public website.

Of these requests, 19,814 were to a “feed” URL - and ~13,000 of those (number approximate due to the grep command used to calculate it) resulted in an application error within our application. The error-generating spider requests ran from approximately 2:03am EDT Saturday until 11:10am EDT Saturday, likely generating an email for each and every one of the application crashes (there are ~9,900 in a shared mailbox; I assume most to all of the ~13,000 were generated).
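For anyone wanting something a little less approximate than grep, here is a minimal Ruby sketch of how counts like these could be tallied. It assumes the web server writes Apache "combined" format log lines and that feed URLs contain "/feed"; neither detail is given above, and the IP address, method name, and sample lines are made up for illustration.

```ruby
# Hypothetical sketch: tally one client's requests from access log lines.
# ASSUMPTION: Apache "combined" log format; the real log layout is not shown in the post.
LOG_LINE = /\A(?<ip>\S+) \S+ \S+ \[[^\]]+\] "(?<method>\S+) (?<path>\S+)[^"]*" (?<status>\d{3}) /

def spider_summary(lines, spider_ip)
  summary = { total: 0, feed: 0, feed_errors: 0 }
  lines.each do |line|
    m = LOG_LINE.match(line)
    next unless m && m[:ip] == spider_ip   # skip unparseable lines and other clients

    summary[:total] += 1
    next unless m[:path].include?("/feed") # ASSUMPTION: feed URLs contain "/feed"

    summary[:feed] += 1
    summary[:feed_errors] += 1 if m[:status] == "500"
  end
  summary
end

# Made-up sample lines; 10.0.0.5 stands in for the spider's address.
sample = [
  '10.0.0.5 - - [15/Mar/2008:02:03:11 -0400] "GET /feed/Article/12 HTTP/1.1" 500 312 "-" "ExampleSpider"',
  '10.0.0.5 - - [15/Mar/2008:02:03:12 -0400] "GET /feed/Faq/7 HTTP/1.1" 200 2048 "-" "ExampleSpider"',
  '10.0.0.5 - - [15/Mar/2008:02:03:13 -0400] "GET /about HTTP/1.1" 200 1024 "-" "ExampleSpider"',
  '10.0.0.9 - - [15/Mar/2008:02:03:14 -0400] "GET /feed/Article/3 HTTP/1.1" 500 312 "-" "Browser"'
]

summary = spider_summary(sample, "10.0.0.5")
puts "total=#{summary[:total]} feed=#{summary[:feed]} feed_errors=#{summary[:feed_errors]}"
# prints: total=3 feed=2 feed_errors=1
```

Parsing each line rather than grepping for substrings avoids double-counting lines where "500" happens to show up in a byte count or a URL.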

Now, those application errors weren’t the spider’s fault. However, when 13,000+ requests result in a “500” error from the site you are spidering? I’d really hope that the spider would stop. I’ve asked the peer institution for some clarification, but never got any. I imagine the person running the site inherited the spider. That’s kind of how it goes in higher education. But, again, the errors weren’t the spider’s problem - I just expected it to be nicer about it.

Source of the application error

The application error was the result of a logic error in the application. It determined what object to retrieve for the feed by checking a parameter corresponding to the object type. It’s an either/or check: either it’s an Article, or it’s assumed to be a Faq. One problem was that the check was case-sensitive - and the parameter in the URL was not. But the real problem was that, instead of doing the smart thing on the assumption and hard-coding Faq when assuming it was a Faq, the code used the parameter passed to it. So if the parameter had been “wtf” - the code would have tried to retrieve “wtf” objects (using the rails parlance of “object”, that is; it’s just a bunch of rows from a table). Which of course fails pretty miserably in most cases (thankfully, actually). The source of this was the application itself, with the apparent production of mixed-case URLs elsewhere in the application (or the SharePoint server downcasing URLs it was spidering; my bet is the former, but I haven’t searched for the URL-production code yet).

The application error in our application has been mitigated by actually editing code on the server. In a semi-humorous note, both myself and a co-worker implemented very similar mitigations in the trunk (I chose downcase, he chose upcase :-) ).
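The shape of the bug, and of a downcase-style mitigation, can be sketched in a few lines of plain Ruby. This is a hypothetical reconstruction for illustration only: the real controller code isn’t shown here, and the method and constant names are invented.

```ruby
# Hypothetical reconstruction; names are invented and the real code is not shown in the post.
KNOWN_TYPES = { "article" => "Article", "faq" => "Faq" }.freeze

# The buggy shape: a case-sensitive check, and a fallback that trusts the raw parameter.
def buggy_feed_type(param)
  if param == "Article"
    "Article"
  else
    param # should have been the hard-coded "Faq"
  end
end

# One possible mitigation: normalise case, and fall back to the hard-coded default.
def feed_type(param)
  KNOWN_TYPES.fetch(param.to_s.downcase, "Faq")
end

puts buggy_feed_type("article") # prints article  (not a known type, so retrieval blows up)
puts feed_type("article")       # prints Article
puts feed_type("wtf")           # prints Faq
```

Looking the parameter up in a fixed table means nothing from the URL is ever used directly as a type name, whichever case convention the mitigation picks.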

Real impacts to our public site

I’m not sure of the real impacts to the public site. I don’t think it ever went down actually. Certainly those ~13,000 failed requests terminated the ruby process that received the request, but it appears that the internal load balancing and the monitoring restarts worked fine. Google Analytics reports 1,940 “visits” on the 15th, compared to 1,945 “visits” Saturday March 8, and 1,676 “visits” on March 1. (yeah, we know, your blog gets more, it’s a work in progress) So I don’t think it had any impact on the actual user experience to the site itself.

However, what it did do, was to significantly degrade the operation of our mail server. Here’s where the story really gets good.

Blissfully Unaware

Until this incident, I personally had filtered all of the rails-generated application errors to a dedicated subfolder. This has turned out to be problematic. Particularly in this instance.

The reason I do is that in general the rails-generated application errors are benign from an operational perspective. While they usually highlight coding errors, those coding errors are of varying severity (this one is pretty serious, but mostly in the academic sense), and they rarely represent a true systems-level severity (which would be: the site itself is inaccessible or the errors are causing impacts to other software in the stack). So I keep them out of my Inbox to reduce the inbound noise, and to focus on real operational emails.

It would be fine to do this in most circumstances - because I’d see the new message counter on the folder being updated as messages were being filed into it. However, Apple’s Mail.app, apparently as part of its caching mechanisms, doesn’t always seem to check sub-folders for new messages (particularly sub-folders of sub-folders) - unless you quit Mail and start it again, or actually click the “Get Mail” button (and sometimes with the latter it doesn’t seem to update). I had been running Mail.app on both the laptop and my desktop at home all day, as is usually the case - and hadn’t noticed anything out of the ordinary.

The combination of the filtering, and Mail.app’s lazy sub-folder updates meant that I was blissfully unaware that thousands of application error emails were being delivered. Probably much to the chagrin of my co-workers, as the thousands of application error emails were likely going to their inboxes.

The only indication I had that we had some weird email issues going on was that I noticed that the email that gets sent from our issue tracking system every night to me about “due cases” hadn’t come in (we don’t set due dates on issue tracking cases for support, so the contents of the email never matter any, I just keep it coming to have an indicator that the issue tracker email is still being delivered). I didn’t think anything of that much at the time. Although, I did try to send a test email to myself from Gmail, which also didn’t deliver immediately, which was odd, but I got distracted with dinner and movies at home, and forgot about that.

Failed Troubleshooting

So just before 10pm, I take the laptop upstairs to hook it up to a charger, and check a few things on the desktop computer before bed. This is when I notice that the IMAP folder that holds the application errors has an unread count of ~4,000 emails, and when I check the folder, even more are coming in. (Remember that the spidering had actually stopped at 1:18pm EDT, and the error-generating retrieval at 11:10am EDT, but I didn’t know that yet.)

So, based on the received timestamps for the email, I made a completely inaccurate snap assumption. I thought we had a live spidering issue, generating thousands of errors right then, and figured I needed to do two things:

  1. stop the spider
  2. (possibly) fix the source of the application error

Our architecture at our hosting provider consists of a server that acts as a firewall (that we don’t have access to), and our web server/database server (which we do have access to). I figured that I would set a firewall rule on the server we do have access to, to block the spider.

Which I did - and restarted the firewall process on the server we have access to.

This is when a complete side issue happened that was really, really unfortunate. My shell connection to the box froze. Shell access to the hosted server is accomplished by shelling into a box on campus, then shelling to the hosted server (which remember, is behind a firewall). This cascaded shell access seems to exacerbate any network conditions that might temporarily impact the SSH process, and it’s possible that iptables restarts on any of the boxes, which normally don’t terminate or freeze a shell if the ssh ports aren’t blocked, will temporarily do so with this cascaded setup. I’m not totally sure. I really haven’t diagnosed what’s happening here. I just work around it.

So the shell froze, and I think to myself “OMG What did I just do? Did I typo the firewall rule somehow? And block all access to the machine?”

And then I tried to load our public site in the browser. It was inaccessible too. And these two pieces of evidence, blocked web access, and a frozen shell connection - made me leap to the assumption I had typo’ed the firewall rules, and all access to the box was blocked.

This is where the Talking Heads song comes to you in the most unfortunate of moments. “OMG this is not my beautiful server. OMG what have I done.”

I called the hosting provider’s NOC (which I’m sure was quite entertaining to them, not) to get them to reverse the change I just made. They first go to the firewall box (because in their setup, that’s where the firewall rules are). But while they are getting the root password to go into our server, I manage to get in again. I didn’t typo anything. But meanwhile, the mails ARE STILL COMING - so I figure we are still getting spidered. But the website isn’t working. This all seems very weird.

I ask them to block the spidering box at the firewall itself. Hang up, and continue troubleshooting. The public site is still inaccessible. But the emails keep coming.

I notice that one of my co-workers had been on the machine at around 3pm EDT, so I called him at 10:30pm or so, and asked if he had done any firewall troubleshooting - in case something else was going on and I needed to know. He hadn’t, he actually had forgotten his root password, and couldn’t make any changes - he was going to do the same thing I tried :-).

So I hang up with my co-worker, and continue troubleshooting.

All of the website processes are fine. But I shut down the ruby mongrels (and the emails keep coming about the errors! but nothing about my changes to the mongrels and monit! weird!). Then I look at the code to figure out what was generating the errors. Which I fix, and check in (but didn’t get the email about, weird!) - and then I fixed it on the server - and restarted all the mongrels and the web server.

But the public site is still down!

Well, when I edited the firewall rules, I hadn’t bothered to look at the rest of the firewall rules that were in the configuration file. Why should I? In theory, those don’t change - they aren’t part of the files that I use subversion to manage (not for this box) and the site had been accessible earlier. Well, at some point, either from me, or the hosting provider, or the ever-present “Not Me” from Family Circus, the rules that let port 80 and port 443 be open on the box had changed. But the firewall had never been restarted - until I did it with my attempt to block the spider host. And when it restarted, there was no rule letting in port 80. (It’s a good thing that my co-worker had forgotten his password; he’d have run into this same problem, although mid-afternoon troubleshooting is much, much better than late night troubleshooting.)
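In hindsight, the check I skipped could have been automated. Here is a minimal Ruby sketch of a pre-restart sanity check. It assumes the rules file uses iptables-save style lines such as "-A INPUT -p tcp --dport 80 -j ACCEPT" (the actual file layout isn’t shown here), and the method name, port list, and sample rules are all hypothetical.

```ruby
# Hypothetical pre-restart check: which required ports have no ACCEPT rule on disk?
# ASSUMPTION: iptables-save style lines; the real rules file is not shown in the post.
REQUIRED_PORTS = [80, 443].freeze

def missing_accept_ports(rule_lines, ports = REQUIRED_PORTS)
  ports.reject do |port|
    rule_lines.any? do |line|
      rule = line.strip
      rule.start_with?("-A") &&
        rule.match?(/--dport #{port}(\s|\z)/) && # exact port, so 8080 does not count as 80
        rule.match?(/-j ACCEPT(\s|\z)/)
    end
  end
end

# Made-up rules resembling the state described above: the port 80 rule is gone.
rules = [
  "-A INPUT -p tcp --dport 22 -j ACCEPT",
  "-A INPUT -s 192.0.2.10 -j DROP", # new rule blocking the spider (example address)
  "-A INPUT -p tcp --dport 443 -j ACCEPT"
]

missing = missing_accept_ports(rules)
puts "Not restarting: no ACCEPT rule for port(s) #{missing.join(', ')}" unless missing.empty?
# prints: Not restarting: no ACCEPT rule for port(s) 80
```

This only looks at single-port --dport rules, so multiport or range rules would need more parsing. The point is simply to compare what is on disk against what you expect before a restart makes it live.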

So I fixed the firewall rules, and finally the public site is back.

But you guessed it, the emails keep coming!

Have I mentioned the emails?

At this point, I finally get a clue that there might be a serious problem with the email delivery (uh, duh) The server is fine, the code has changed, there’s nothing in the logs about crashes. There’s nothing in the logs anymore about the spider.

So I connect to the mail server. The management tools show that CPU usage is through the roof and has been for hours on hours. And there’s 8,000+ messages backlogged in the email queue.

I’m not sure what to do. I try restarting the mail processes in the hope that something strange is hung on it.

I stop all inbound email.

I try to use a management tool to delete the thousands of queued application error emails. This turns out to be a huge mistake, because it locks up the email server for close to 5 minutes, locking me out of the box entirely. AND because the mail server is also the authentication server, it stops SSH access to all our other servers, while SSH tries to connect to the authentication server (even though I don’t use password authentication).

At this point, I’m like “Uh-oh. I don’t have physical access to reboot the thing without waking someone up in one of my peer groups on campus to let me in the building and that server room” (yeah, I don’t have after hours access to this particular box, which is mostly, but not completely my fault).

Finally, the server responds again. But is still processing only about 3-4 emails a second, out of 8,000. I start looking for scripts on the internet that will somehow let me delete queued mail. Probably 4,000 of those are app error mails (the app errors go to the mail server, then to a mailing list server, and then back to our mail server). A few dozen messages are probably legit, and the rest spam. But because of the legit mail, I can’t delete all of it.

So I find some scripts, delete the app error email. But the spam/virus processing scripts are totally messed up, letting in a lot of spam and eating 100% of the CPU. But I’m afraid to reboot, so I let it go on for a few hours, but the pace is just too slow. I cross my fingers, reboot the whole thing, which speeds up the processing.

Finally, at around 2:30am EDT or so, the queue clears, I turn on the inbound mail again, and the apperror messages keep coming. The mailing list server still has ~1400 emails in its queue (it’s a secondary list server, used for testing things related to the primary list server, and hosts only engineering staff items, so it could be sacrificed). So after deleting those, the server (and the system admin) can breathe again.

The Moral of the Story

Certainly this isn’t the first set of cascaded errors I’ve dealt with (and contributed to). And it certainly won’t be the last. The whole point of recounting stories is to be able to take a step back, look at things with fresh eyes and figure out what you can do better next time.

Did I learn anything? No, not really. All the mistakes I made are things I already know better than to do. They are the computational equivalents of “measure twice, cut once”. And all the things that caused the problems that led to the mistakes I made, I already know they have the propensity to cause problems. When you look back, you can see lots of little things that add up - many wrong, some right. (It’s almost like playing along with the home game of “spot how many errors are in this picture” :-) )

So what then, is the point? The point is - it’s about what you don’t do. One of the best “servers are down” people that I have ever worked with would go and get dinner (or coffee, or a bagel) before bothering to look at the problem. Sure, it increased downtime 15 minutes. But it gave him the time he needed to take a step back, evaluate the information at hand, and not react. That 15 minutes would often save him 3 hours. I’m not sure I quite have that moment of Zen yet. But I’m getting there. Once you figure out the things that you know to do, it’s beginning to learn the things that you don’t.

And that’s what cascade errors teach you. The wrong screw in the wrong place can weaken buildings. But pulling out that screw before you realize what all is depending on it being there is worse. I’m not sure that human history shows that we ever quite learn that.

But maybe the more stories we tell, the better.

Sigh

Make default feed format pluggable

milestone changed from 2.5 to 2.6.

Format of feed (RSS, Atom) should be an option

milestone changed from 2.5 to 2.6.

Tune in 3 months later for “Milestone changed from 2.6 to 2.7”. Y’all come back, hear?

Just FYI

We interrupt your internet with this important message.

  1. There is no Web 2.0. It’s just the Web (there’s really no web either, there’s just data, but that’s a separate subject altogether). But if you feel good about calling it Web 2.0 - that’s fine. Heck, I do too sometimes.
  2. There is no Web 1.5. Please eye anyone that says this to you with suspicion.
  3. Whatever Web 3.0 will be, it won’t be called Web 3.0. When and if you ever hear someone trumpeting some kind of vision in a presentation - and then calls anything Web 3.0? At that point you have an obligation to speak up and call their bluff. Or run away. The latter might be safer to your sanity.

This has been a public service of the Jay Broadcast Network. We now return you to your regularly scheduled internet.

Note to Self

Keeping around a half-million spam emails in a single folder? Terrible idea.

Wow that’s fast

After seeing Anne’s Twitter about problems at the Georgia Dome - I turned to Google wondering what might be going on. I wondered if this was some current event, or like, maybe issues with the facility itself being talked about during the basketball game there.

The Wikipedia article for the Georgia Dome ALREADY HAD INFORMATION ABOUT THIS:

March 14, 2008 Storm

A storm blew through the downtown Atlanta area during the 2008 SEC Men’s Basketball Tournament causing some damage to the dome. The storm occured during the overtime of the Mississippi State/Alabama quarterfinal game and stopped play.

I can’t find anything about this in the major media outlets online. Clearly a potential tornado that damages a building holding thousands of people in a sporting event should be news right?

But Google is beginning to find comments in weather forums, and links to some local media outlets (and their blogs!) with the story.

Moral of the story? User-generated content (and corroboration from local news outlets, utilizing modern tools) wins again. In short order, Twitter, Google, and Wikipedia, combined with local media give me a picture of what’s happening at the Georgia Dome, faster, and more accurately, than any major media outlet.

I bet there are flickr pictures and twitters from people actually there too that won’t take long to find.

[updated to add: Anne re-tweeted within minutes this output from tweetscan]

A tag similarity algorithm

In addition to being a systems manager (and customer support, and a manager, and an information architect, and all the other hats most of us wear in small engineering teams at universities and startups), I sometimes do development - currently almost all rubyonrails. The core application I’m responsible for is our “Identity” application - basically the user registration app for our internal tools (and an openid provider for internal users).

One of the things it’s about to do is let folks create and join communities. This isn’t as exciting as it sounds yet, and won’t be any time soon; it’s mostly an accounting feature, a reflection of existing communities (essentially committee assignments of a sort). It’ll be used to generate mailing lists, and it will probably grow to actually seeding into real networking tools to help facilitate actual virtual communities. But for right now, it’s really only accounting (which may mean it doesn’t get used all that much, that’s okay, some of this is really building block code for other applications).

One of the things that’s about to be added is user tags, both to begin to capture available interests and expertise, and to get folks used to tagging (we do have tagging in our FAQ authoring application too).

So the actual point of the post - I’m going to use those tags to generate community recommendations.

To start, communities will have tags - not from people actively tagging them, but as an aggregation of the personal/self tags of the people that are interested in and/or members (we have potentially two roles, long story, we’ll just refer to those combined as “members”) of that community.

Ben and I decided not to have folks actually actively tag communities, figuring it was enough to begin with to get them to tag themselves.

“Community Tags”

So, Communities will have a set of tags, from its members. e.g. If Ben has tagged himself:

designer html ilovemarkup ruby

And if James has tagged himself:

ilovecoffeeandchickens ruby coder html

And if Aaron has tagged himself:

ilovenascar ruby coder

And they are all in the “engineering group” - then the engineering group will have the union of those tags:

designer html ruby coder ilovemarkup ilovecoffeeandchickens ilovenascar

However, practically, I think we’ll only ever display tags on communities that are an intersection between two members in the community (get tags where tag count >= 2) - and I think it’s probably safe to keep that match going - only ever dealing with the tags on a community where at least two people in the community have those tags.

html ruby coder
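As a rough Ruby sketch of that aggregation (the method name and the min_count parameter are just illustrative, not code from the Identity application), counting how many members carry each tag and keeping only the tags shared by at least two of them looks like this:

```ruby
# Illustrative sketch: aggregate members' self-tags into community tags.
# member_tags maps a member name to that member's list of tags.
def community_tags(member_tags, min_count: 2)
  counts = Hash.new(0)
  member_tags.each_value do |tags|
    tags.uniq.each { |tag| counts[tag] += 1 } # count each member at most once per tag
  end
  counts.select { |_tag, count| count >= min_count }
end

members = {
  "Ben"   => %w[designer html ilovemarkup ruby],
  "James" => %w[ilovecoffeeandchickens ruby coder html],
  "Aaron" => %w[ilovenascar ruby coder]
}

puts community_tags(members).map { |tag, count| "#{tag}(#{count})" }.join(" ")
# prints: html(2) ruby(3) coder(2)
```

Keeping the per-tag member counts around, rather than just the tag names, costs nothing here and is what any frequency-based matching would need later.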

Matching users to communities

Say that Kevin has the tags:

wheresmyiphone html ruby thoughtleader ilovemarkup

Would the engineering group be a good match for him based on the tags of its members and interested users?

There’s a veritable cornucopia (okay, crapton) of correlation functions out there, most of which go over my head (even after years of math, I honestly have to spend a lot of time staring at the greek letters in symbolic math to understand it again, often turning it into pseudocode, believe it or not). One simple correlation function is called the “binary overlap” - which essentially boils down to counting the tags in the intersection of Kevin’s tags with the engineering group’s tags, and dividing by the size of the smaller of the two tag sets.

The idea being that if a community’s tags and Kevin’s tags completely overlap in one direction or the other - it’s a 100% match. There are other algorithms that take into account more about “different tags” - which would negatively impact the correlation (more different tags than same tags) - but I think those correlations would only be valid if people were actively picking tags for the community - and we needed to take into account how the community had different meanings for different people - based on them actively tagging their communities.

So, with the simple overlap/positive correlation (again, dealing with ONLY those tags where the count is 2 or more), Kevin has a correlation to the engineering group of:

Intersection of Kevin’s Tags with the engineering Community Tags == 2 (html, ruby)
Minimum of Kevin’s Tags or engineering Community Tags == 3 (from the engineering community)
2/3 = .67

If Kevin also tags himself “coder” - he’d have a correlation of “1”
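In Ruby the binary overlap is only a few lines. This is a minimal sketch with an illustrative method name, using the example tags from above:

```ruby
# Binary overlap: size of the intersection divided by the size of the smaller tag set.
def binary_overlap(user_tags, community_tags)
  user = user_tags.uniq
  community = community_tags.uniq
  smaller = [user.size, community.size].min
  return 0.0 if smaller.zero? # nothing to compare against

  (user & community).size.to_f / smaller
end

engineering = %w[html ruby coder] # community tags with a count of 2 or more
kevin = %w[wheresmyiphone html ruby thoughtleader ilovemarkup]

puts binary_overlap(kevin, engineering).round(2)             # prints 0.67
puts binary_overlap(kevin + %w[coder], engineering).round(2) # prints 1.0
```

Dividing by the smaller set is what makes a complete overlap in either direction come out as a 100% match.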

All well and good right? Well, I’m not sure the simple binary overlap is the best way to go here.

Modified Matching

I feel like we have to take into account a more majority intersection of the members of the community - but without weighting the results of a match toward larger communities (in fact, the opposite, letting folks find smaller communities more easily). That is, if a designer community has 100 members - and 90 of those members have the ‘html’ tag - but only 2 of those members have the ‘ilovemarkup’ tag, my match to the community should be weighted more by the ‘html’ tag than the ‘ilovemarkup’ tag. But it has to be percentage-based, I think. If the engineering group has 10 members - and nine of them have the ‘html’ tag - that’s as good a match as the designer community (for that tag and person).

So - what’s the implementation of this? Well essentially the intersection in the simple binary overlap is the summation of the matching tags, where each match is given the value of “1” - so in order to take into account the relative weight of the tags, you instead add up, for each matching tag, the percentage of the community’s members having said tag.

Again, I’m sure there’s some fancy name for this - and this idea is in that huge list (or it’s completely flawed and not in someone’s list). But I’m not sure, my eyes started glazing over at 11pm trying to read the details of “Levenshtein Distance” and I couldn’t make it any further :-)

Anyway, back to our original example, engineering tags (>= 2, frequency listed)

html(2) ruby(3) coder(2)

Kevin tags:

wheresmyiphone html ruby thoughtleader ilovemarkup

Correlation:

(html (.67) + ruby (1)) / min tag count (3) = .56

If Kevin tags himself “coder” then the correlation is:

(html (.67) + ruby (1) + coder (.67)) / 3 = .78
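Here's a minimal sketch of the weighted version in Python. The function name `weighted_overlap` is made up for illustration, and the member count of 3 for the engineering group is an assumption inferred from the .67 and 1 fractions above rather than something stated outright:

```python
def weighted_overlap(person_tags, tag_counts, member_count):
    """Add up the share of members holding each matching tag, then divide by
    the smaller of the two tag counts (same denominator as the binary overlap)."""
    person = set(person_tags)
    smaller = min(len(person), len(tag_counts))
    if smaller == 0 or member_count == 0:
        return 0.0  # nothing to compare against
    matched = sum(count / member_count
                  for tag, count in tag_counts.items()
                  if tag in person)
    return matched / smaller


# Engineering tags (>= 2) with their frequencies
engineering_counts = {"html": 2, "ruby": 3, "coder": 2}
engineering_members = 3  # assumption: inferred from html = 2/3 = .67

kevin = {"wheresmyiphone", "html", "ruby", "thoughtleader", "ilovemarkup"}

print(round(weighted_overlap(kevin, engineering_counts, engineering_members), 2))              # 0.56
print(round(weighted_overlap(kevin | {"coder"}, engineering_counts, engineering_members), 2))  # 0.78
```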

Probably a pretty good match. What about a designer community of 100 people with the following tags?

html(90) designer(100) ilovemarkup(5) thoughtleader(80)

With a straight binary overlap, Kevin would have a .75 correlation (matching 3 of the community’s 4 tags) - but with the weighted correlation, he’d have a correlation of .44, which seems more accurate within the context of all of the members of the community.
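To put the two measures side by side for the designer community, here is a small Python sketch. The function names are made up for illustration, and it assumes the community's five-member tag is one Kevin also carries (written here as `ilovemarkup`), since that is what a three-of-four match requires:

```python
def simple_overlap(person_tags, community_tags):
    """Binary overlap: shared tags over the smaller of the two tag counts."""
    person, community = set(person_tags), set(community_tags)
    smaller = min(len(person), len(community))
    return len(person & community) / smaller if smaller else 0.0


def weighted_overlap(person_tags, tag_counts, member_count):
    """Each matching tag counts for the share of members who hold it."""
    person = set(person_tags)
    smaller = min(len(person), len(tag_counts))
    if smaller == 0 or member_count == 0:
        return 0.0
    matched = sum(count / member_count
                  for tag, count in tag_counts.items()
                  if tag in person)
    return matched / smaller


designer_counts = {"html": 90, "designer": 100,
                   "ilovemarkup": 5,  # assumption: the tag Kevin shares
                   "thoughtleader": 80}
designer_members = 100

kevin = {"wheresmyiphone", "html", "ruby", "thoughtleader", "ilovemarkup"}

print(simple_overlap(kevin, designer_counts))                                # 0.75
print(round(weighted_overlap(kevin, designer_counts, designer_members), 2))  # 0.44
```

The rarely used tag still counts as a full match in the binary version, but adds only .05 to the weighted sum, which is where the gap between the two numbers comes from.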

So, essentially, the algorithm seems to reward:

  • homogeneity of the tags of the membership - at least clustered around a set of core tags
  • smaller groups (or at least more diversity with smaller groups)

Which seems to generally be the right thing to do when pulling groups of people together: smaller teams function better than larger teams. Although the real science comes later, in trying to deal with tag clusters and trying to get some heterogeneity around core connecting members (or a member interest in this case). But that’s beyond my simple positive correlation recommendation here.

Anyway, if you made it this far: how does this sound to you? Better options? Too complex? Really not complex enough?

Bonus QOTD

From Erica Sadun via The Unofficial Apple Weblog:

10:55. Salesforce: Proving even an iPhone can be boring.

I have nothing else to add.

TOTD: Craig Hockenberry

Twitter of the Day from Craig Hockenberry, while following the iPhone Dev event:

I bet it’s quiet in the Android offices right about now…

Note, at the time I write this Apple hasn’t said yet how/if they’ll control distribution. The dev tools will kick butt, but that distribution control may be the Android advantage.

Interesting, very interesting, times ahead.

Quote of the Day: Steve Jobs

In an interview with Fortune:

“People think focus means saying yes to the thing you’ve got to focus on. But that’s not what it means at all. It means saying no to the hundred other good ideas that there are. You have to pick carefully.”

(via: John Gruber)

The Change

During lunch today, I turned on the television (a pretty rare event around here) - and started flipping through the channels, when I caught a broadcast of “The Tim Russert Show” on MSNBC.

I’ve always been impressed with Russert, and I was pretty impressed by Tim’s guests too. And I was fascinated with the topic, Barack Obama vs. Hillary Clinton.

I was absolutely hooked by this segment - it’s highly recommended viewing:

  • About 33 seconds in, Tim reflects on an Obama/Winfrey rally (I think in SC?) and how they asked the audience members, who all had cell phones, to text 5 of their friends and tell them they needed to get out and vote. Tim’s exclamation - “what a way to communicate! what a way to organize!”
  • About 1:03 in, Norah O’Donnell talks about the power of “viral marketing” - which we all know to be true - but how this is really helping to drive the Obama campaign’s success.
  • At 1:20 - O’Donnell says that she is struck by how Hillary is instructing folks in speeches that they can log on to Hillary’s website. And she says “I thought to myself, ‘How 2000 that is’ - because everybody knows if you’re interested in a candidate - how to find their website”
  • And my favorite part - at 1:45 - Eugene Robinson begins talking about the Obama campaign, about how paid staffers, all the way down in the organization, operate with a “sense of agency” that they are a “thinking part of the campaign” - at 2:06 “it doesn’t seem to be strictly hierarchical; it could be a newer, more networked kind of organization” It’s a hypothesis, he says, but that’s the organization’s appearance.

How amazing, a networked organization, staffers that seem to be able to operate with latitude and make decisions, the use of viral marketing and ubiquitous technology, running up against a command-and-control, top-down, “How 2000” type of organization.

Obviously there’s a lot at play here. Obama has a charisma that Hillary doesn’t have. And the same kind of organization doesn’t seem to be doing Ron Paul much good on the Republican side (Ron Paul is certainly no Barack Obama either).

Hillary’s talking points might be right though, speeches without action don’t really work. So whose action is working the best here?

p.s. Obama has won 10 states in a row. What a way to communicate, this newer, more networked kind of organization.

[updated to add...]

I really wanted to find this clip online when I saw it on MSNBC. Did I go to MSNBC or CNBC first? No. I went to YouTube. YouTube went down today, so I had to go to the *NBC sites. Did I find it there? No. “The Tim Russert Show” doesn’t even seem to have a web presence. Where did I find it? YouTube, courtesy of a French-speaking Canadian Blogger - whose own commentary I eventually read after writing my own. Using Google Translate. What a way to communicate. This newer, more networked, technology. How NOT 2000.

Dear Time Warner Cable,

Dear Time Warner Cable,

So, I went to your corporate site to check out again your Digital Phone service? Just to see the latest pricing and what your service offering is?

Yeah, I was browsing the feature page - and I’m guessing you really could use an update of the way you manage your site:

cabletheft.jpg

Wait, let’s look at that title more closely.

cabletheft_noreally.jpg

That’s probably your corporate favorite title and all. But I’d recommend not putting that on pages where prospective customers are looking at your service offerings.

- Jay

I for one welcome my health records overlord

So, in the next few days, you’ll have to be under a technical rock to not know that Google has partnered with the Cleveland Clinic on medical records access for patients and care providers.

I imagine that a lot of the reaction that I’ll be seeing in my aggregator will be a lot like Fred Stutzman’s - because I tend to fill my aggregator with folks that think like Fred. I always respect Fred’s viewpoints and I almost universally agree with him.

But not this time.

Now, I really do think that Fred has some very good talking points. And normally, I’d be all up in arms about the privacy implications of this.

But not this time.

(conflict of interest alert - I own a whopping 2 shares of Google stock)

Admittedly, maybe it’s that I’m not passionately concerned about the strict privacy of my medical records themselves. Maybe it’s because I’m southern, and we’ll talk about our ailments with strangers like most of America talks about the weather. I am passionate about protecting privacy though in general, so I don’t think that’s it.

So what I think it is is that the state of the medical records today is garbage and Google getting into this can only make things better.

I know that my dentist makes pretty good use of information technology - in fact, the best I’ve seen. Their patient records system is available from the receptionist’s desk, to the hygienist, to the dentist themselves.

But it’s a vertical, closed system. Running on Windows. I think on XP, but it might have still been Windows 2000. And they were the most advanced I’ve seen.

I’ve gotten glimpses of the billing records system at my primary care physician. Enough to know how poor it seemed to be. And that’s billing, I think all my patient records are still on paper there. And I think they had to fax it back and forth between them and the specialist I saw early last year. And in theory, an x-ray I had was in electronic form, but that was only shared with them and the primary care physician.

The summary statement - the state of my records is likely incredibly poor. Incomplete items, various paper copies in multiple places. And I have none of them.

With a company like Google getting into this (or even, honestly, Microsoft, even though they have yet to show that they have the faintest clue about building an online service for this sort of thing) - it can’t go anywhere but up. While the privacy implications of the text comments, images, and medical terms associated with them all being wrapped up with my gmail, search history, and website analytics are certainly something to watch, at least I have the faintest glimmer of hope of finally having full access to my records, using modern systems and modern architectures, built by developers that have at least shown a far greater clue about systems design and usability than almost all vertical integrators and medical software companies whose software I’ve seen.

I would have a greater hope that I would be able to access my records, to audit their use, and at least figure out who is doing what with them (outside Google).

This revolution can’t come soon enough.

Shortcuts

The quote about the quote of the day from Bob Plankers

How many times a week do you work around shortcuts, where the original person saved a few minutes but cost weeks of time later?

My entire job seems to be centered on predicting and avoiding (or sometimes laying the foundation for) cascade effects.

Superheroes

Mild mannered best friends by night:

Duo

Practicing for their superhero debut by day:

Supah Fly Winston Snooka

So…

If Apple really does have some product with the word “Air” in it - how long is it before Adobe or Nike or like, the earth, (duuuuuuude, mother nature maaaan) sues them?

I’m sure some trademark troll has already, and Darl McBride wants to, but we won’t count them for now.

If only camera equipment wasn’t so expensive in the UK

It might be nice to move there.

The British Educational Communications and Technology Agency recently released a report that concludes, among other things:

Pupils, teachers and parents should also be made aware of the wide range of free-to-use products currently available and on how to use and access them.

That’s pretty inspiring actually. I’d hope that we could do that here in the U.S. but I don’t have much hope that we ever will.

We are too busy educating our populace to grow up and detain 5 year olds as terrorist suspects and separate them from their mothers.

My wife and brother-in-law are right, common sense isn’t so common anymore.

Hooray for the Brits.