Quote of The Day

From Keith Whitney

“Few things retain their coolness across generations.”

I had a Macintosh Plus at my co-op job at GTE Government Systems – complete with beer can mouse. When I managed to finagle myself a leftover IIci – that was the greatest day ever.

My first personal Macintosh was a 6100/AV – then I fell off the bandwagon with a bunch of NT-based PCs for years.

This post would be tagged (if I had tags) as “uphillbothwaysinthesnow”

Creating Your Own Gem Server

Update – August 1, 2008

This page remains one of my most active pages for Google searches. A number of folks want to run their own gem server, and for good reason.

Unfortunately it’s outdated and has been for a while, and I’ve been too lazy to update it.

I also hate for you to leave with outdated information – so please check out the updated information about the commands that are now built into gem for running your own gem distribution site.


I’ve been hand-updating gems on my servers for too long, but I have been hesitant to do anything more automated as long as my systems were updating against http://gems.rubyforge.org. Rubyforge has been down more than once when I really needed to update something, and I don’t want to create an automated dependency on a third-party service.

So before I automate things, I needed a way to point to my own gem server. Thankfully, it turns out, that’s not all that hard to do (though it’s harder than it has to be, and man oh man is the gem cache a pain in the rear). This all assumes that you already have rubygems installed on the boxes you’re moving over to point to your own server. (It also assumes you’re on a Linux/Unix server; I’m sure all this works on a Windows server, but I haven’t tested one and honestly don’t care.)

Setting up your server

  1. You need a web server. (Yes, I know you can run gem_server, but get a real one.) You’re on your own for that – and for bootstrapping a rubygems install on that box and on any others.
  2. Decide where you will put your gems (say in a “mycoolgems” directory off the docroot for your webserver)
  3. $ mkdir [docroot]/mycoolgems/gems
  4. Copy the .gem files that you want to host to [docroot]/mycoolgems/gems
  5. Rubygems installed an "index_gem_repository.rb" script in your path (well, probably in your path; it’s in /usr/bin on my systems). Run it to generate a yaml-based index of your gems (appropriately named “yaml”, plus a compressed yaml.Z), e.g. index_gem_repository.rb -d [docroot]/mycoolgems – the full server-side sequence is sketched after this list.
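Put together, the server-side setup looks something like this (a sketch, not gospel – /var/www/html here is just a stand-in for your docroot, and index_gem_repository.rb may live elsewhere on your system):

# create the repository layout under the docroot
mkdir -p /var/www/html/mycoolgems/gems

# copy in the .gem files you want to host
cp *.gem /var/www/html/mycoolgems/gems/

# generate the yaml index (and compressed yaml.Z) that gem clients read
index_gem_repository.rb -d /var/www/html/mycoolgems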

Pointing your systems to your own server

Setting up the server is the easy part. The harder part is pointing all your boxes to your own server – and only your own server. You’d think one neat thing about the 0.9.2 rubygems release is that it includes a “gem sources” command to add and remove the gem sources your boxes look at – but you’d be wrong. You can’t get rid of the base source of http://gems.rubyforge.org without modifying the sources gem or the sources distribution on your own box. You can theoretically modify [lib/ruby]/gems/1.8/gems/sources-0.0.1/lib/sources.rb and change:

module Gem
  @sources = ["http://gems.rubyforge.org"]
  def self.sources
    @sources
  end
end

to

@sources = ["http://yourserver.yourdomain"]

However, that customization is likely going to get blown away the next time you update rubygems with a gem update --system – because the sources gem is built by the rubygems update. So what do you do? Build your own gem update.

  1. Download the RubyGems source
  2. Edit pkgs/sources/lib/sources.rb to point to your own server
  3. Rebuild the gem by issuing a rake package – which will build the rubygems update gem with your source changes (in pkg/ – in my case pkg/rubygems-update-0.9.2.gem)
  4. Copy this gem to your gem server’s gems directory (and rebuild the yaml index as appropriate)
  5. On your other servers – clear out [lib/ruby]/gems/1.8/cache
  6. Remove the [lib/ruby]/gems/1.8/source_cache
  7. Run gem update --system --source http://yourserver.yourdomain (the whole sequence is sketched below)
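Put end to end, the rebuild-and-repoint dance looks roughly like this (again a sketch – it assumes the rubygems source unpacked in ~/rubygems-src, /usr/lib/ruby standing in for [lib/ruby], and the 0.9.2 version from my boxes):

# on a build box: point the source list at your server, then rebuild
cd ~/rubygems-src
vi pkgs/sources/lib/sources.rb        # set @sources to http://yourserver.yourdomain
rake package                          # produces pkg/rubygems-update-0.9.2.gem

# host the result alongside your other gems and rebuild the index
cp pkg/rubygems-update-0.9.2.gem /var/www/html/mycoolgems/gems/
index_gem_repository.rb -d /var/www/html/mycoolgems

# on each client box: clear the caches, then update against your server
rm -f /usr/lib/ruby/gems/1.8/cache/*.gem
rm -f /usr/lib/ruby/gems/1.8/source_cache
gem update --system --source http://yourserver.yourdomain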

That’s All Folks (probably)

Voila! You just managed to point your systems to your own gem server! Install away.

It’s good to keep one box pointed to http://gems.rubyforge.org – and take advantage of the new “gem outdated” command to keep track of changes in your installed gems that have been deployed to rubyforge.

[Updated April 19, 2007: Thanks to Jim Weirich for pointing out that the index script should actually be index_gem_repository.rb and NOT generate_yaml_index.rb. generate_yaml_index.rb is a holdover from earlier rubygems versions.]

status? updating twitter

So back in January Kevin issued a challenge:

“Will Jason start twittering before December 2007?”

Kevin decided to write a reminder about it today.

I think Kevin forgot his Dante, because I actually said “snowball” and “hell”.

I’ve always been fascinated by the idea of presence in instant messaging systems – particularly xmpp. In my old job, my boss had this “dotboard” thing – a whiteboard with in/out columns where we each had to move a magnetic dot to indicate whether we were there or not. Since it was on the second floor, and the systems team and I were in the basement, well, that didn’t happen much. I always wanted to build a “virtual dot system” (if we had to have one) – and I think one of the team started to after I left, using the xmpp status.

Making presence a string can be pretty powerful: at a glance you can get a feel for the mood/activity of your colleagues – and, used effectively, it gives a pretty good overview of the activities of the organization. I don’t really care about pseudo-random strangers. But my colleagues? That’s interesting to me.

I actually created a twitter account in January, right after Kevin’s challenge. I didn’t use it, though. I’d occasionally check twitter to figure out where Kevin was (he travels a lot, and twitter became useful for knowing when he’d be in the office). Or figuring out when James was headed out for the day – or whatever Beth was up to. Twitter became a tool for institutional knowledge (or at least work-team knowledge) – which is really, really useful, because status reports don’t really work well, and we don’t have a lot of round-the-room review in staff meetings (everybody forgets what all they’ve been working on anyway).

I didn’t use it because, honestly, I’m not in the habit of visiting web pages anymore (if I can’t get it in a feed, I really don’t want it) – and I didn’t need yet another information distraction. The IM interface wasn’t working at the time I signed up. And twitter was beginning to suffer from scaling issues, and was slow, slow, slow.

So, I was just checking it out from time to time.

So what changed – when did I go from bystander to twittering fool?

Two things happened:

One was Twitterrific.

Two was Beth, who posted a comment after a particularly dismal IT group meeting one day that said:

“Where did our “we can do it” attitude go?”

And I wanted to respond. You see, I’ve written about this time and time and time again. I absolutely hate the tendency in myself where the first phrase out of my mouth is “I can’t see how we possibly can do that” – usually in response to a one-off idea where the actual implementation details would swamp everything else we’re trying to do – or where the idea is a solution before the problem is even identified. Kevin once gave me a pep talk about this (the best pep talk ever). But there are times when being the outfielder IS NOT a good thing.

I really want to create great things. I really want my work to have REAL meaning. I want the Yak Shaving to have a purpose. It’s incredibly frustrating to see in myself an Orwellian doublethink about “can and can’t.” I have to make sure that the “can’t” is a constructive one – something that heads off real problems now and down the road. You can’t say “Yes” to everything (and those that do kill projects) – but the “can” has to outnumber the “can’t” (and organizations have to provide a framework to make this possible).

As my grandfather used to say, “can’t never could do anything.”

So, Beth’s comment hit home – hard. I didn’t want to lose the “can do it” attitude – but I wanted to respond about why it wasn’t there – at least on that day. And blogs and IMs didn’t seem to cut it. I didn’t want to respond in the same system the comment came through. I wanted to Twitter that it was due to “unclear directions and whiplash priorities, just a typical day in paradise”. (It would have been a particularly bitter twitter – but that really was a tough meeting.)

But I couldn’t – because I didn’t twitter.

So I started twittering. (with far more fun things, because the meeting was over and done, and we had new things to focus on)

And it turns out, as our team’s quote page suggests, there is a certain allure in completely random, pithy one-liners. It’s therapeutic – kinda like making silly comics. I don’t know that it’s the culmination of my belief in presence. I imagine that I’ll dump twitter when they come up with a business model that starts tweeting me Ford commercials or something. I’ll probably tweet like I blogged – in fits and starts – probably more fascinated by how the development of the tool itself could be applicable to projects that I’m working on.

But for now – I’ll enjoy the fun – staying up with the colleagues – exercising the brain reading between the lines in tweets – and maybe even watching tweets of strangers – but only when they write incredibly funny things.

But mostly, it’s about making Alice’s Restaurant references when doing OmniGraffle flowcharts of your data flow processes. No more, no less.

But really – isn’t that what life’s about? 🙂

Unsubscribed

I really like the content for the about.com photography blog.

However, I’m really annoyed by the fact that they don’t have full feeds.

Actually, I might could handle the lack of full feeds (I do handle it for arstechnica) when the content is good enough to click through from the teaser. However, about.com’s links take me to a fairly hefty ad-laden page – and it generates the Firefox pop-up warning every time. I’ve seen sites with far worse ads – but at first glance, about.com makes it as hard to tell the article content from all the junk surrounding it as any other site on the net. I’ve always avoided about.com results in Google for this very reason.

It’s a shame too, because the photography content is/was good.

But I just can’t deal with it anymore – I unsubscribed from the blog.

Good content ruined by partial feeds and an all-out assault of extraneous advertising and poor design just isn’t worth reading.

Peeling the Onion

And no, I don’t mean The Onion – which would have been far more entertaining.

Through Joi Ito’s blog I have recently become aware of the phrase “Yak Shaving.” Joi wrote about it in 2005, here’s Wikipedia’s take – and here’s an etymology from Alexandra Samuel.

When I first read Joi’s blog, I took Yak Shaving to mean a pointless activity (Joi writes in a bit more layered fashion than most folks). It’s partly that, of course (read Alexandra’s post). But it’s more about good problem solving. Especially when you go and read the entry in the Jargon File:

Any seemingly pointless activity which is actually necessary to solve a problem which solves a problem which, several levels of recursion later, solves the real problem you’re working on.

One of the things that I’ve spent my entire career doing is looking for the root causes of problems. Yes, just like every other system administrator/developer, there are times that I’ll defer the problem to another day (to this day, I’m still avoiding a mime types/magic file problem with Red Hat and PHP and MediaWiki that I’ve spent too much time on already). But I recognized a long time ago that digging into something, rather than stopping when the problem was mitigated, was going to be much better for everyone. I spent a lot of long nights doing this early on, and still do occasionally – and I’m thankful for some early examples from mentors that encouraged this. It’s made me a much, much better troubleshooter over the years.

The latest peeling-the-onion activity came last Thursday. I arrived at work with every intent of taking the example OpenID rails server in the ruby-openid distribution and beginning to pull it into our internal authentication tool. Doing that is very much a “Yak Shaving” activity. There are some more pressing problems, but doing OpenID in the internal application solves parts of 2 or 3 other problems.

Well, that fell by the wayside by mid-morning. We have a preview version of our public website. Most days I’m actually surprised that it works, given our Rube Goldberg-esque technical framework for taking mediawiki pages and getting them to the public web site. But it’s been a real benefit internally to have the preview site. Making it happen made the public application more flexible, too.

Well, mid-morning Thursday, there was a report that content wasn’t updating on the preview site. At first it was thought this might be a by-product of the previous day’s activity – pulling out a rewrite rule that was eating encoded “?” characters (i.e. %3F) in MediaWiki page titles and causing 404s by the time those URL links made it into a Flash application and a Rails application. In the process of fixing that, we actually fixed another problem, where the source URL for each page in our atom update feed was wrong.

Making that URL correct was what broke things the next day. It turns out that problem #1 was that the process that pulled in the atom feed keyed on the URL as the unique identifier for the content (a fake unique identifier, actually – it wasn’t enforced by MySQL). Since the URLs changed when they were fixed – voila! duplicate content – and of course the find was doing a find :first – and pulling the original, un-updated article.
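In Rails 1.x terms, the failure mode looked something like this (the model and column names here are hypothetical reconstructions for illustration – not our actual code):

# the importer keyed on url, and nothing in MySQL enforced uniqueness –
# so when an entry's url changed, the update branch never fired and a
# second row was created instead
article = WikiArticle.find(:first, :conditions => ["url = ?", entry_url])
if article
  article.update_attributes(:body => entry_body)
else
  WikiArticle.create(:url => entry_url, :title => entry_title, :body => entry_body)
end

# ...while readers doing a find :first by title got whichever row the
# database handed back first – the original, un-updated one
WikiArticle.find(:first, :conditions => ["title = ?", title])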

There was a whole lot of scratching our heads (okay, there was a lot of cursing) about that unique key. The URLs in question are “internal” and pretty much guaranteed to change. Keying off that certainly wasn’t great design. I guess it goes back to that original issue – it solved the problem for a day, but no one gave any future thought as to what it would impact.

So we needed to key off the page title. Well, the page titles weren’t unique either. Which was also a head scratching/cursing problem. MediaWiki page titles are unique in a namespace, and our find functions assume they’ll be unique as imported, but that uniqueness was not enforced.

Well, MySQL folks can guess what happened next. We’ve never actually dealt with the collation issues on our MySQL server (there’s a lot we haven’t dealt with on our MySQL server – but that’s another post for another day).

For good or for bad, I really didn’t understand why our collation was set to “latin1_swedish_ci” – and thought that I had made a major mistake setting up the server defaults in the first place, one that no dev ever caught when thinking about their schemas. I was pretty relieved to find out that that’s just MySQL’s default.

James’ absolute groaner of a quote?

Well at least we didn’t bork it up
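If you’ve never looked, MySQL will happily tell you what collation you’re actually running with – a quick check, using the wiki_articles table from the migration below:

-- server and database defaults
SHOW VARIABLES LIKE 'collation%';

-- per-column collation (the Collation column in the output)
SHOW FULL COLUMNS FROM wiki_articles;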

Well, MediaWiki titles are case sensitive, and it made sense for that column to be case sensitive too – so in went the migration. This gave the additional benefit that searches for article titles would actually be accurate now (even though we have some content that differs only in case that needs to be fixed).

execute "DROP INDEX title_idx ON wiki_articles" execute "alter table wiki_articles change title title text character set latin1 collate latin1_general_cs null default null" execute "alter table wiki_articles add unique title_idx (title(255))"

(p.s. select somecolumn, count(*) as n from table group by somecolumn having n > 1 is a wonderful tool to stick in the tool belt)

After all this was done, we had to import the content again. It’s about 25MB of an atom file – 5,000+ pages of content dating back to last September. Our standard process of pulling this data in with an HTTP GET takes too long to run given the HTTP timeouts in the libraries we use – so a long time ago I modified our code to read the data from a file when needed.

Well, when a contractor modified the code to stop using the FeedTools library and just do some simplified parsing for our internal purposes, they took out the “read from file” functionality and didn’t replace it. Which generated some more head scratching and cursing. So that had to go back in to get all the content back in and corrected.
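The restored functionality amounts to something like this (a minimal sketch with hypothetical names – the real code goes on to parse the atom entries):

require 'open-uri'

# small updates fetch over HTTP as usual; the big 25MB backfill reads
# the atom data from a local file, so the HTTP timeouts in our
# libraries never enter the picture
def feed_data(feed_url, local_path = nil)
  if local_path && File.exist?(local_path)
    File.read(local_path)
  else
    open(feed_url) { |f| f.read }
  end
end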

A simple problem of content not being updated highlighted four different problems: the wrong key for the content, no uniqueness enforcement on any key, the wrong collation on the column, and missing file-based data import.

We could have stopped early on by just killing the duplicated content with the wrong URL, updating it, and reimporting the latest changes. But we didn’t. The application we fixed didn’t matter for public use – but our fixes prevented some future problems.

I guess we shaved a few yaks that day. And proved yet again how important it is to get to the root of problems. And how painful it is later when you have to go back in behind yourself and others because it wasn’t done originally.

Designing Content for the Web

Shelley Powers resurrected a post a few days ago on javascript “widgets” (without much thought you could extend this to any blob of javascript doing http requests and fun little rendering things with local and server data without refreshing the page).

It’s a long piece that I think boils down to “web designers should use them responsibly.”

I’m going to extend this in another way. Shelley said something in particular that I want to point out:

The same person who wrote the comment about widgets also mentioned how browsers load top to bottom, left to right. It’s been a long established policy to make your web pages more accessible to screen readers by putting the changing material first in the page layout, and then the more static elements. In a weblog, this means putting the content first, and then the sidebars. The bars may appear to the left, but in the actual physical layout design, they should physically exist after the content, so that the content is always first.

You can’t design content for the web unless you know how the web works. I mean at least at a high level – the fact that browser software uses HTTP to request data, web server(s) return that data, and the browser is responsible for parsing that data as soon as it starts arriving back from the web server(s) and following its instructions – making more requests, rendering the text and graphics in the data, or running code embedded in the data in whatever parser/compiler the browser supports. And on top of that, other software (xml clients/“feed readers”, embedded browsers, screen readers, the code that browsers execute, even other web servers) will also request your data and follow its instructions.

Well, I mean you really can design content for the web without knowing any of this – but if and only if you trust the guidance of the people who do.

[Updated: The more I read and re-read my original post, the more I realized that it was one big long “duh.” So I just decided to edit the thing and get to the heart of the point I wanted to make.]