<?xml version="1.0" encoding="utf-8"?>
<rss version="2.0">
    <channel>
        <title>SeanKerwin.org</title>
        <link>http://seankerwin.org/</link>
        <description>Not the kind of woolly-headed liberal thinking that leads to being eaten.</description>
        <language>en</language>
        <copyright>Copyright 2008</copyright>
        <lastBuildDate>Tue, 02 Dec 2008 15:47:38 -0500</lastBuildDate>
        <generator>http://www.sixapart.com/movabletype/</generator>
        <docs>http://www.rssboard.org/rss-specification</docs>
        
        <item>
            <title>Programming For the Easily Amused: Part 1</title>
            <description><![CDATA[<div><span class="Apple-style-span" style="font-style: italic;"><span class="Apple-style-span" style="font-family: -editor-proxy;">For a long time I've semi-seriously joked about writing a humorous non-theoretical book about computer science that isn't a total crock. &nbsp;I think I've finally decided that the easiest thing to do is break the entire field down into a random series of blog posts that will in all likelihood never actually be collected into a book. &nbsp;So here's the preamble.</span></span></div><div><br /></div><div>So what is programming? &nbsp;Programming a computer is all about explaining to a computer how to do things. &nbsp;The big secret all programmers share is that computers are dumb. &nbsp;Really dumb. &nbsp;But -- and this is why we keep them around -- infallibly obedient and really, really good at math. &nbsp;Have you ever called a tech support or customer service line and had to put up with a dolt who can't do anything but follow the script in the big-honking-binder his supervisor gave him on his first day? &nbsp;Now imagine that guy with a calculator, and you're imagining someone maybe ten or twenty times smarter than a computer. &nbsp;The job of the programmer is to write that big-honking-binder.</div><div><br /></div><div>So if you're writing a big-honking-binder for your new friend the phone drone, what kind of instructions can you put in there? &nbsp;If your employees were intelligent and informed, you wouldn't need much: just a single sheet of paper that says 'handle customer issues' at the top, and voila, you're done! &nbsp;But remember, the computer you're programming here is stupid, and he's not going to understand high-level instructions like that. &nbsp;You need to break it down:</div><div><br /></div>
<ol>
<li>
	Ask the caller whether he or she currently owns a product.
	<ol style="list-style-type: upper-alpha;">
		<li>If the caller says 'yes', go to step 2.</li>
		<li>If the caller says 'no', go to step 5.</li>
	</ol>
</li>
<li>
	Ask the caller what is wrong with the product.
	<ol style="list-style-type: upper-alpha;">
		<li>If the caller says 'it will not turn on', go to step 3.</li>
		<li>If the caller says 'it will not turn off', go to step 4.</li>
	</ol>
</li>
<li>Tell the caller how to turn the product on.  Hang up.</li>
<li>Tell the caller how to turn the product off.  Hang up.</li>
<li>Tell the caller where to buy the product.  Hang up.</li>
</ol>
<br />
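<div>For the code-minded, the binder above can be sketched as an actual program. This is a minimal Python sketch under stated assumptions: the <code>SCRIPT</code> table and the <code>run_script</code> function are hypothetical names made up for illustration, and the choice to fail with an error on an unexpected answer is just one of many things a literal-minded machine might do.</div>

```python
# Each step maps a caller's exact answer to the next step number.
# Steps with no answers just say their line and hang up.
SCRIPT = {
    1: ("Do you currently own a product?", {"yes": 2, "no": 5}),
    2: ("What is wrong with the product?",
        {"it will not turn on": 3, "it will not turn off": 4}),
    3: ("Here is how to turn the product on.", None),
    4: ("Here is how to turn the product off.", None),
    5: ("Here is where to buy the product.", None),
}


def run_script(answers):
    """Follow the binder literally and return everything the drone says."""
    answers = iter(answers)
    said = []
    step = 1
    while True:
        line, branches = SCRIPT[step]
        said.append(line)
        if branches is None:
            said.append("(hangs up)")
            return said
        # A KeyError here is the binder's bug: it has no entry for
        # answers like "yup" or "I don't know".
        step = branches[next(answers)]


print(run_script(["yes", "it will not turn off"]))
```

<div>Feed it "yup" instead of "yes" and this particular sketch dies with a <code>KeyError</code>, because the binder has no entry for that answer.</div>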
<div>If you look at that script, you'll probably notice a few things (besides the fact that we're training our human computer to be unbearably rude). &nbsp;What if the answer to step one is "I don't know"? &nbsp;Different people (computers) might handle that differently. &nbsp;Some might hang up; some might continue on to step two; some might ask question one over and over until the frustrated customer picks 'yes' or 'no' (we'd call that "undefined behavior", which basically means we don't know in advance what's going to happen; undefined behavior is a source of some pretty nasty bugs). What if there's something wrong with the product besides an inability to turn it on or off? &nbsp;What if the product won't turn on because it's broken?</div><div><br /></div><div>Those are all bugs in our program, and they all serve to show part of what makes programming difficult; decomposing a task ('make customers happy') into tiny little steps that a mindless automaton can follow is very difficult. &nbsp;You can't count on a computer to make a value judgement; you can't count on a computer to recognize that 'yeah' or 'yup' also mean 'yes'; you can't count on a computer to do much of anything except for faithfully following whatever flawed or incomplete script you give it. &nbsp;That and math.</div> ]]></description>
            <link>http://seankerwin.org/archives/2008/12/programming_for_the_easily_amu.shtml</link>
            <guid>http://seankerwin.org/archives/2008/12/programming_for_the_easily_amu.shtml</guid>
            
            
                <category domain="http://www.sixapart.com/ns/types#tag">book</category>
            
                <category domain="http://www.sixapart.com/ns/types#tag">programming</category>
            
            <pubDate>Tue, 02 Dec 2008 15:47:38 -0500</pubDate>
        </item>
        
        <item>
            <title>The Morning After</title>
            <description><![CDATA[<div>As a conservative, I understand that our rights are not free; &nbsp;we pay for them every day.</div><div><br /></div><div>The price of capitalism is that the vicissitudes of the market may leave good men down or line the pockets of the undeserving. &nbsp;The price of free speech is that I must endure the discordancy of lies mingling with truth. &nbsp;And the price of democracy is the ever-present possibility that the candidate I favor may fall before the candidate I do not.</div><div><br /></div><div>And, having spent eight years deafened by the omnipresent din of untrue and unfair accusations against President Bush, I recognize that the price I pay for resenting those slings and arrows is the ethical obligation to give President-elect Obama a fair chance to prove himself in office.</div><div><br /></div><div>All of these prices, and the many other prices I must pay for my freedom, I pay willingly.</div> ]]></description>
            <link>http://seankerwin.org/archives/2008/11/the_morning_after.shtml</link>
            <guid>http://seankerwin.org/archives/2008/11/the_morning_after.shtml</guid>
            
            
            <pubDate>Wed, 05 Nov 2008 10:03:22 -0500</pubDate>
        </item>
        
        <item>
            <title>Wikis Are Irish Roadsigns</title>
            <description><![CDATA[My family has a running joke that the road signs in Ireland are there to remind you of things you already know rather than to direct you to places you've never been. &nbsp;If you've never been to Ireland you'll think I'm exaggerating, but the signs really are just a step short of saying "the place where Mary brought that lovely jacket - 12 km".<div><br /></div><div>It's even worse if you stop and ask for directions - "Oh, you'll be wantin' to take that lane just past Brian's house, you know, Brian with the dog!" &nbsp;It takes about fifteen minutes to convince your erstwhile directioneer that you don't in fact know Brian, having spent the majority of your life on the other side of the Atlantic Ocean, and that you'll really need your instructions expressed in a format that doesn't presuppose knowledge of the local geography so intimate as to render directions unnecessary. &nbsp;At which point you'll be given a confusing array of rights, lefts, and reverses, followed inevitably by "and then it's right up the road, you can't miss it!", which you'll dutifully follow in a large circle before returning, six hours later, to ask the exact same gentleman for directions.<div><br /></div><div>In general he'll pretend to have never before made your&nbsp;acquaintance&nbsp;and eventually (after repeating the previously-rendered fifteen-minute protestation of your non-acquaintance&nbsp;with Brian's dog) give you an entirely new and completely&nbsp;dissimilar&nbsp;set of instructions that will culminate in your accidental arrival in Paris.<div><br /></div><div>I'm not entirely sure how one manages to drive from Ireland to mainland Europe, but it happens, I <span class="Apple-style-span" style="font-style: italic;">swear</span>.</div><div><br /></div><div>But anyhoo, my point is that Irish road signs are designed to remind you of things you already know, or provide you with details about subjects on which you already have high-level understanding. 
&nbsp;And Wikis are exactly the same way; they make excellent references, but they're largely terrible as first-order sources or methods of communicating information, primarily because of the structure they inspire - the same disaggregated, freeform organization that makes it possible to deep-dive into related matters as a reference makes it very difficult to arrange information in the sort of cohesive sequence necessary to teach people something new.</div><div><br /></div><div>Learning something new from a wiki is akin to recording every lecture in a college course and playing them back in a random sequence - even though every note may be hit, it's not precisely musical.</div><div><br /></div><div>All of which means precisely nothing, except that I haven't posted in a while and this was the only interesting and non-proprietary thought in my head.</div><div><br /></div></div></div>]]></description>
            <link>http://seankerwin.org/archives/2008/10/wikis_are_irish_roadsigns.shtml</link>
            <guid>http://seankerwin.org/archives/2008/10/wikis_are_irish_roadsigns.shtml</guid>
            
            
            <pubDate>Mon, 06 Oct 2008 18:33:19 -0500</pubDate>
        </item>
        
        <item>
            <title>New Names For the Final Harry Potter Movie</title>
            <description><![CDATA[Now that they've <a href="http://www.marketwatch.com/news/story/harry-potter-half-blood-prince-moves/story.aspx?guid=%7BF4F52B7F-D1B1-4DC0-BF8A-AD0D9252BE7A%7D&amp;dist=hppr">shifted the release schedule back</a> even further, I think they're going to need some new titles for the movies.<div><br /><div><ul><li>Harry Potter and the Rascal Mobility Scooter</li><li>Harry Potter and the Mystery of the Kids Blasting Their New-Fangled Rock and Roll Music</li><li>Harry Potter and the Lawn You Damned Kids Better Get Off Of</li><li>Harry Potter and the Colonoscopy of Terror</li><li>Harry Potter and the Bottle of Lipitor</li><li>Harry Potter and the Order of the AARP</li><li>Harry Potter and the Cauldron of Prunes</li></ul><br /></div><div>Half of those are Hoggle's fault.  In fact, I'm going to blame them all on him.</div></div>]]></description>
            <link>http://seankerwin.org/archives/2008/08/new_names_for_the_final_harry.shtml</link>
            <guid>http://seankerwin.org/archives/2008/08/new_names_for_the_final_harry.shtml</guid>
            
            
            <pubDate>Thu, 14 Aug 2008 17:58:16 -0500</pubDate>
        </item>
        
        <item>
            <title>Wow, MobileMe really is Exchange for the rest of us!</title>
            <description><![CDATA[In that it doesn't work and it's pissing me off.<div><br /></div><div>Web rollouts don't always go as planned; this I very much know.  But there comes a point when you hit the rollback plan, take stock of the situation, and try again tomorrow night.  You don't get points for perseverance when your customers are without service.</div>]]></description>
            <link>http://seankerwin.org/archives/2008/07/wow_mobileme_really_is_exchang.shtml</link>
            <guid>http://seankerwin.org/archives/2008/07/wow_mobileme_really_is_exchang.shtml</guid>
            
            
            <pubDate>Thu, 10 Jul 2008 21:41:08 -0500</pubDate>
        </item>
        
        <item>
            <title>MacBook Air Meets Blizzard</title>
            <description><![CDATA[For the benefit of Google: &nbsp;The MacBook Air runs Warcraft III just fine, and seems to run World of Warcraft acceptably for my purposes. &nbsp;Of course I've never been a heavy WoW player, and I suspect serious raiders may find it lacking.]]></description>
            <link>http://seankerwin.org/archives/2008/07/macbook_air_meets_blizzard.shtml</link>
            <guid>http://seankerwin.org/archives/2008/07/macbook_air_meets_blizzard.shtml</guid>
            
            
            <pubDate>Wed, 09 Jul 2008 14:21:51 -0500</pubDate>
        </item>
        
        <item>
            <title>Supposition: Is This How Google Works?</title>
            <description><![CDATA[<p> The one thing that has always puzzled me when I've tried to figure out how the Google search engine works is results ordering.  Everyone knows that Google works with massive clusters of low-end hardware, and that their algorithms are uniquely designed for this architecture.  That's the big clue, and I can figure out how (at a conceptual level) every aspect of a search can be parallelized except for the final and crucial result-ordering step.  It's been bothering me.</p>

<p>But today in the course of doing something entirely unrelated I stumbled over a <a href="http://www.wired.com/science/discoveries/magazine/16-07/pb_sorting">layman-level summary of the MapReduce algorithm</a> that somehow made the whole thing click in my head.  What's funny there is that I've read about <a href="http://labs.google.com/papers/mapreduce.html">MapReduce</a> plenty in the past; I just never made the connection until reading <span class="Apple-style-span" style="font-style: italic;">Wired</span>'s decidedly nontechnical summary.</p>

<p>So, long story short, I think I have a general idea how Google works now.  I'm never really sure I understand something until I can explain it though, so in a fit of irony I'm going to explain my guesses about Google on a page that will probably only be read by Google's indexing bot.  So hurray for antiphrasis[1]!</p>

<p>But first, some context is in order:</p>

<p>Calling what you do with Google a 'search' is somewhat misleading; it evokes the image of Google's servers actively navigating the internet looking for your content.  With a little thought it becomes obvious that that's not how it works; not only would individual Google searches take hours, but the traffic generated by even a small subset of Google's normal users would grind the entire internet to a standstill.</p>

<p>Someone far more clever than I once pointed out that "almost all programming can be viewed as an exercise in caching"[2], and modern search has taken that maxim to heart.  Just about every search engine -- from Google all the way down to Spotlight -- works by generating an index of the content and conducting actual queries by examining the index.  Think of the index as a great big map (or hash or dictionary, whatever term you prefer) between tokens and sets of documents.  If you look in the index under 'marklar' you'll get back a set of every document containing that 'word'.</p>
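
<p>To make the index idea concrete, here is a minimal Python sketch of a map from tokens to sets of document IDs. The three-document corpus, the <code>build_index</code> function, and the whitespace tokenizer are all assumptions made up for illustration; a real search engine's index is far more elaborate.</p>

```python
from collections import defaultdict

# Hypothetical three-document corpus, keyed by document ID.
DOCS = {
    "doc1": "smurfing marklar for beginners",
    "doc2": "advanced marklar",
    "doc3": "smurfing safety tips",
}


def build_index(docs):
    """Map each token to the set of IDs of documents that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


INDEX = build_index(DOCS)
print(sorted(INDEX["marklar"]))  # ['doc1', 'doc2']
```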

<p>Notice that I'm using the word 'set' instead of 'list'.  There are two reasons for that -- first, it is indeed a set rather than a list, in the strict mathematical sense that each item (a document in this case, or more accurately a URL, ID, GUID, or other form of document identifier) appears in the collection exactly zero or one times (<a href="http://www.imdb.com/title/tt0071853/quotes">five is right out!</a>).  The second reason I'm being careful to say 'set' has to do with complex queries.</p>

<p>So if you search for 'marklar' you look in the index for the set of documents containing that term, you order them (more on that later!) and you show them to the user.  Bada-bing, bada-boom, Bob's your uncle, and so on and so forth.  But what if you want to search for 'smurfing marklar'?  Ideally that would show you every document that contains both 'smurfing' and 'marklar', right?  Well, if our index can give us the result sets for each individual term, then we can find the desired combined result set by taking the intersection of those two sets.</p>
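
<p>The multi-term case can be sketched in a few lines of Python. This is only an illustration: the tiny <code>INDEX</code> and the <code>search_all</code> function are hypothetical, and Python's built-in hash sets stand in for whatever structure a real engine would use.</p>

```python
# Hypothetical index: token -> set of document IDs.
INDEX = {
    "smurfing": {"doc1", "doc3"},
    "marklar": {"doc1", "doc2"},
}


def search_all(index, query):
    """Return the documents that contain every term in the query."""
    term_sets = [index.get(term, set()) for term in query.lower().split()]
    if not term_sets:
        return set()
    # Start from the smallest set so the working set shrinks quickly.
    term_sets.sort(key=len)
    return set.intersection(*term_sets)


print(search_all(INDEX, "smurfing marklar"))  # {'doc1'}
```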

<p>Finding the intersection of two lists is a pain in the butt -- O(n*m) in general, O(n+m) if you can guarantee both are sorted, but either way not the sort of thing that scales well to searching large numbers of documents (such as, for instance, the entire freaking internet).  But if you store your sets in something clever called a <a href="http://en.wikipedia.org/wiki/Treap">treap</a> (which is, exactly as it sounds, the bastard child of a <a href="http://en.wikipedia.org/wiki/Tree_data_structure">tree</a> and a <a href="http://en.wikipedia.org/wiki/Heap_%28data_structure%29">heap</a>) you can get it to work in O(m log(n/m)), and you can even parallelize things to work in O(log m/n) if you happen to have m cores lying around, which is, in a word, fast as hell.  So: Sets?  Important.</p>
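
<p>For reference, the O(n+m) sorted-list case mentioned above looks something like the following Python sketch. It is the plain two-pointer merge, not a treap, and the function name <code>intersect_sorted</code> is made up for illustration.</p>

```python
def intersect_sorted(a, b):
    """Intersect two ascending lists of document IDs in O(n + m) steps."""
    i, j, out = 0, 0, []
    while i != len(a) and j != len(b):
        if a[i] == b[j]:
            # Present in both lists: keep it and advance both cursors.
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] > b[j]:
            # b is behind; move its cursor forward.
            j += 1
        else:
            # a is behind; move its cursor forward.
            i += 1
    return out


print(intersect_sorted([1, 3, 5, 7], [3, 4, 5, 8]))  # [3, 5]
```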

<p>Once you look at search as an exercise in set combination, you start to see that the search term is itself a statement in a very loose and forgiving programming language: "<span class="Apple-style-span" style="font-style: italic;">smurfing marklar </span>OR<span class="Apple-style-span" style="font-style: italic;"> foo </span>NOT<span class="Apple-style-span" style="font-style: italic;"> bar</span>" can be parsed into a syntax tree which can then be expressed as a series of set operations: ( (<span class="Apple-style-span" style="font-style: italic;">smurfing</span> <span class="Apple-style-span" style="font-weight: bold;">∩</span> <span class="Apple-style-span" style="font-style: italic;">marklar</span>) <span class="Apple-style-span" style="font-weight: bold;">∪</span> <span class="Apple-style-span" style="font-style: italic;">foo</span> ) <span class="Apple-style-span" style="font-weight: bold;">∩</span> ~<span class="Apple-style-span" style="font-style: italic;">bar</span>, for instance.  If you know the performance characteristics well enough to devise appropriate heuristics, you can even write an optimizer to tweak the abstract syntax tree for your query into a more performant form.  Yet another victory for the thesis that everything in computer science is a compiler, I guess.&nbsp; </p><p>Point: <a href="http://steve-yegge.blogspot.com/2007/06/rich-programmer-food.html">Yegge</a>.</p>
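
<p>Here is a minimal Python sketch of evaluating that example query as set operations. It skips the parsing step and starts from a hand-built syntax tree; the nested-tuple tree format, the <code>evaluate</code> function, the sample <code>INDEX</code>, and the <code>ALL_DOCS</code> universe (needed to give NOT a meaning) are all assumptions made for illustration.</p>

```python
# Hypothetical index and document universe.
INDEX = {
    "smurfing": {1, 2},
    "marklar": {2, 3},
    "foo": {4, 5},
    "bar": {5},
}
ALL_DOCS = {1, 2, 3, 4, 5}


def evaluate(node):
    """Evaluate a query tree of nested tuples into a set of document IDs."""
    if isinstance(node, str):
        return INDEX.get(node, set())
    op, *args = node
    if op == "not":
        return ALL_DOCS.difference(evaluate(args[0]))
    left, right = (evaluate(arg) for arg in args)
    if op == "and":
        return left.intersection(right)
    if op == "or":
        return left.union(right)
    raise ValueError(f"unknown operator: {op}")


# ((smurfing AND marklar) OR foo) AND NOT bar
QUERY = ("and",
         ("or", ("and", "smurfing", "marklar"), "foo"),
         ("not", "bar"))
print(sorted(evaluate(QUERY)))  # [2, 4]
```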

<p>There's a problem there, though: how do you distribute it?  Merging sets can actually be parallelized pretty nicely on a single machine (like I said, m cores get you O(log m/n) time), but there's no good way to find the intersection of a set on machine A and a set on machine B; the communication overhead would dwarf any gain you got by distributing the problem.  So you really need to have the full index available on the same machine to run a query, which might not work so well when you're indexing, say, the freaking entire internet.  It also means your scaling solution is going to be 'adding more and more powerful machines' instead of 'adding <i>lots</i> of <i>wimpy</i> machines', so even if that approach were viable it's not explaining Google.  I'll come back to this in a few paragraphs.  Wait for it.</p>

<p>So anyway, now you've got a set: a great big pile of documents that match your query.  That's actually the 'easy' part -- Yahoo was doing that long before Google showed up and ate their lunch, and they'll probably just keep on doing it until the end of time if Microsoft doesn't buy them and turn them into Facebook 2 (subtitle: 'The Facebookening') first.  The really tricky part, and the part that Google got so right, is putting that pile of search results into a meaningful order.  There are traditional ways of calculating the relevance of a search result, and most of them involve looking at the incidence of a term within a document and trying to calculate how 'important' the word is to that semantic meaning of the document, and then making the assumption that if 'marklar' is heavily important to document A (40% of the tokens in the document are marklar or related terms) and minimally important to document B (the word 'marklar' only shows up once) then A is a more relevant result than B (there's more to it, but that's the conceptual gist, at least as I choose to understand it).</p>
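
<p>The conceptual gist of that traditional approach fits in a short Python sketch. It is deliberately naive: it counts exact token matches only (no "related terms"), and the names <code>term_frequency</code> and <code>rank_by_relevance</code> and the two sample documents are made up for illustration.</p>

```python
def term_frequency(term, text):
    """Fraction of the document's tokens that are exactly `term`."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return tokens.count(term.lower()) / len(tokens)


def rank_by_relevance(term, docs):
    """Order document IDs from most to least relevant for one term."""
    return sorted(docs, key=lambda d: term_frequency(term, docs[d]),
                  reverse=True)


DOCS = {
    "A": "marklar marklar care and marklar feeding",
    "B": "a long post that mentions marklar once in passing",
}
print(rank_by_relevance("marklar", DOCS))  # ['A', 'B']
```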

<p>There are two big problems with that approach.</p>

<p>Problem one is that it just doesn't work very well; it's easy to game the heuristic by just repeating the target keyword over and over again through the page, a manipulation that spawned the field of <a href="http://en.wikipedia.org/wiki/Search_engine_optimization">search engine 'optimization'</a> and led to a cartload of annoying crap on the internet.  You can make your heuristic smarter, but ultimately you're in an arms race against a whole host of people with strong economic incentive to break your system, and you're not going to win.</p>

<p>The second big problem with the sets-then-sorts approach is performance.  To do this effectively you need to complete the entire set operation phase prior to sorting and displaying results, and you need to sort the entire result set before you can reliably return the first page of results to the user.  That's clearly suboptimal; ideally you'd like to front-load the effort and get the first page of results out there ASAP, and then finish page 2 at your leisure while the user looks through page 1.  That's the part that's always given me a headache, because I <i>know</i> (or at least, <i>think</i> I know) that Google doesn't compute the whole result set before I see page one; if that were the case you'd see a vast performance disparity between loading page 1 and page 2 of the results; the first page would take longer, and subsequent pages would be almost instant, which I certainly haven't ever noticed.  It would also be the case that Google knew on page 1 exactly how many pages of results there would be, which often isn't the case (more on <i>this</i> later, too!).</p>

<p>So how do I think Google does it?  They cheat!  Sort of.</p>

<p>I don't think Google actually uses MapReduce for searching (it's used for indexing, but not searching), but the design of MapReduce is the key; it's a two-stage algorithm.  As the first step each distributed node produces an intermediate result set, <i>in the form of a map</i>.  That's the important bit -- the first phase produces not a set, not a list, but a map.  So back to searching: What if each page had an absolute relevance score that has nothing to do with the search, but is in fact an intrinsic property of the page -- some sort of a <a href="http://en.wikipedia.org/wiki/PageRank">Page Rank</a>, let's say?  That way instead of generating a result set as one big set, you could create a bunch of different sets: the rank 1 set, the rank 0.9 set, and so on.  Just break things down by rank ranges and throw results into buckets.</p>

<p>What does that get you?  Distribute your index to multiple machines, such that every document is on exactly one node (in reality there'd be redundant, backup and failover nodes, but that's not relevant to the core algorithm).  Further, have each node maintain multiple indices, one for all the rank 1 pages, one for all the rank 0.9 pages, et cetera (this could also be done by having multiple indices to the same data, but I don't think it makes a conceptual difference).  Then sprinkle in a handful of master/controller nodes to manage all these index nodes.  When a master node gets a search request, broadcast the search down to every slave and let them get started.</p>

<p>Program slaves to front-load the high-ranking work, and have them signal the controller as each bucket is filled.  When a controller has heard a completion signal for rank 1 pages from every slave node, have it pull back just those rank 1 results, order them based on old-fashioned relevance ranking, and feed them back to the user.  Meanwhile the slave nodes can keep chugging away on the rank 0.9 - 0.1 indices.</p>
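
<p>The controller's side of that scheme can be sketched in Python as follows. This is a single-process simulation of a guess, not a description of any real system: the nodes' matches are hard-coded instead of arriving over a network, and <code>NODE_MATCHES</code>, <code>results_by_bucket</code>, and every score in the sample data are hypothetical.</p>

```python
# Hypothetical per-node matches for one query, already bucketed by
# page rank: node -> {rank bucket: [(document ID, relevance score)]}.
NODE_MATCHES = {
    "node_a": {1.0: [("a1", 0.2)], 0.9: [("a2", 0.9), ("a3", 0.4)]},
    "node_b": {1.0: [("b1", 0.7)], 0.9: [("b2", 0.5)]},
}


def results_by_bucket(node_matches):
    """Yield document IDs one rank bucket at a time, best bucket first.

    Within a bucket, results from every node are merged and ordered by
    the old-fashioned relevance score.
    """
    ranks = sorted({r for buckets in node_matches.values() for r in buckets},
                   reverse=True)
    for rank in ranks:
        merged = []
        for buckets in node_matches.values():
            merged.extend(buckets.get(rank, []))
        merged.sort(key=lambda pair: pair[1], reverse=True)
        yield rank, [doc_id for doc_id, _ in merged]


for rank, docs in results_by_bucket(NODE_MATCHES):
    print(rank, docs)
```

<p>Because it is a generator, the rank 1 bucket can be handed to the user before the lower buckets are even looked at, which is the whole point of the exercise.</p>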

<p>This would mean a couple of things: firstly, you wouldn't know exactly how many total results you had when your first page of results was ready; you could approximate it by checking the number of matches vs. documents checked on each node (node A has found 12 matches after scanning 10,000 pages, so since node A contains a total of 100,000 pages we probably have about 120 matches on node A).  If that guess is off your pager will be incorrect; a user might click page seven only to find himself on page five, with no indication six or seven exist, or a user might click the last link in the pager to find there are suddenly many more pages.  Secondly, this model would mean that relevance is a second-order consideration when ranking results; a highly-ranked page that mentions 'marklar' in passing might outrank a minimally-linked post on a free hosting site that happens to be entirely about marklars.  Both of those predicted outcomes match behavior I've seen from Google.</p>
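
<p>The back-of-the-envelope estimate above is simple proportional scaling, sketched here in Python. The function name <code>estimate_matches</code> is made up for illustration, and it assumes matches are spread evenly across a node's pages.</p>

```python
def estimate_matches(matches_so_far, docs_scanned, docs_on_node):
    """Extrapolate a node's total matches from the fraction scanned so far."""
    if docs_scanned == 0:
        return 0
    return round(matches_so_far * docs_on_node / docs_scanned)


# Node A: 12 matches after scanning 10,000 of its 100,000 pages.
print(estimate_matches(12, 10_000, 100_000))  # 120
```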

<p>So is this how Google works?  Probably not.  And even if I got some of the stuff right at a high level, there are doubtless innumerable details I've missed out on.</p>

<p>But it was fun to think about.</p>

<p>[1] Antiphrasis is a word that means irony; of course my usage here is not according to the standard definition of irony (<a href="http://www.tv.com/futurama/the-devils-hands-are-idle-playthings/episode/165490/trivia.html">as Bender explained</a>, "The use of words expressing something/Other than their literal intention") but according to the far-more-common yet far-less-correct <a href="http://en.wikipedia.org/wiki/Irony#Usage_controversy">Alanis Morissette meaning</a> ("It's like rain/On your wedding day").  Ironically enough (Alanis), that means that my incorrect usage of the word antiphrasis here is in fact ironic (Bender).  So by using it incorrectly, I'm using it correctly. &nbsp;Noodle on that one for a while,&nbsp;<a href="http://en.wikipedia.org/wiki/Kurt_Gödel">Gödel</a>.</p><p>[2] Terje Mathisen.  He doesn't seem to have a blog or home page, or I'd link to him.  Sorry, semantic web. &nbsp;I'll <span class="Apple-style-span" style="text-decoration: line-through;">never</span> do it again.</p>]]></description>
            <link>http://seankerwin.org/archives/2008/07/supposition_is_this_how_google.shtml</link>
            <guid>http://seankerwin.org/archives/2008/07/supposition_is_this_how_google.shtml</guid>
            
            
            <pubDate>Tue, 08 Jul 2008 15:25:38 -0500</pubDate>
        </item>
        
        <item>
            <title>Posted From My Shiny New MacBook Air</title>
            <description><![CDATA[Shiny it is.<div><br /></div><div>Took two tries: first one was a dud with a non-charging battery; took it back the next day and they swapped it out with a minimum of fuss.  Xcode, TextMate, and Clan Lord run well; I'm currently downloading WoW more out of curiosity than any actual desire to play.</div><div><br /></div><div>The screen is gorgeous (very bright), the keyboard is very nice thus far -- the gaps between the keys don't impair my touch-typing any and the keys have a nice spring and click to them, and the solid state hard disk is speedy as hell thus far.</div><div><br /></div><div>The biggest thing I wanted out of the Air was a cool lower case that could be used on my lap without jeopardizing the continuation of the Kerwin name, and on that it has thus far delivered.</div><div><br /></div><div>So far, it has the Skirwan seal of approval.  Go forth and purchase, my hypothetical legions of nonexistent readers who hang on my every capricious opinion.</div>]]></description>
            <link>http://seankerwin.org/archives/2008/07/posted_from_my_shiny_new_macbo.shtml</link>
            <guid>http://seankerwin.org/archives/2008/07/posted_from_my_shiny_new_macbo.shtml</guid>
            
            
            <pubDate>Mon, 07 Jul 2008 19:14:59 -0500</pubDate>
        </item>
        
        <item>
            <title>Testing Image Uploads</title>
            <description><![CDATA[<div><div style="text-align: left;"><br /></div><img alt="BSG Chat.png" src="http://seankerwin.org/images/BSG%20Chat.png" width="448" height="198" class="mt-image-right" style="text-align: left;float: right; margin-top: 0px; margin-right: 0px; margin-bottom: 20px; margin-left: 20px; " /></div><div><div style="text-align: left; ">Actually I just wanted still another demonstration of the sort of insanity that dominates my IM window.</div><div style="text-align: left; "><br /></div><div style="text-align: left; "><span class="mt-enclosure mt-enclosure-image" style="display: inline; border-style: initial; border-color: initial; margin-top: 0px; margin-right: 0px; margin-bottom: 0px; margin-left: 0px; border-top-width: 0px; border-right-width: 0px; border-bottom-width: 0px; border-left-width: 0px; border-style: initial; border-color: initial; padding-top: 0px; padding-right: 0px; padding-bottom: 0px; padding-left: 0px; font-size: 1em; font-weight: normal; ">Whoo, BSG!</span></div><br /></div>
]]></description>
            <link>http://seankerwin.org/archives/2008/04/testing_image_uploads.shtml</link>
            <guid>http://seankerwin.org/archives/2008/04/testing_image_uploads.shtml</guid>
            
            
            <pubDate>Fri, 04 Apr 2008 16:51:24 -0500</pubDate>
        </item>
        
        <item>
            <title>Ch-ch-ch-ch-changes</title>
            <description>I&apos;ve just migrated seankerwin.org to an actual hosting service (up till now it had been hosted on an old G4 in a closet at home).  Hopefully nothing went wrong with the blog migration.  iClan and Stylunk will get transferred eventually.</description>
            <link>http://seankerwin.org/archives/2008/03/chchchchchanges.shtml</link>
            <guid>http://seankerwin.org/archives/2008/03/chchchchchanges.shtml</guid>
            
            
            <pubDate>Tue, 25 Mar 2008 22:24:48 -0500</pubDate>
        </item>
        
        <item>
            <title>A Real Blogger Said My Name!</title>
            <description><![CDATA[<p><a href="http://instapundit.com/archives2/016395.php">My name on Instapundit.com</a>.</p>

<p>It's like I've made it or something.</p>

<p>Except for my not putting a link in my email and consequently not picking up the oodles and oodles of Google juice.  Which is actually a good thing, because the fan in this computer is pretty bad and there's a real possibility that an Instalanche could literally set my computer on fire.  Or cause Comcast to send a hit squad.</p>]]></description>
            <link>http://seankerwin.org/archives/2008/03/a_real_blogger.shtml</link>
            <guid>http://seankerwin.org/archives/2008/03/a_real_blogger.shtml</guid>
            
            
            <pubDate>Tue, 11 Mar 2008 17:35:29 -0500</pubDate>
        </item>
        
        <item>
            <title>If Only I Could Write Cocoa in Visual Studio...</title>
            <description><![CDATA[<p>The new version of Xcode included with the iPhone SDK has a nifty new feature that causes the autocompletion placeholders to display in a more integrated fashion.  When using Xcode autocomplete, Objective-C message parameters end up in source in the form <code>"&lt;#(NSStringCompareOptions)options#&gt;"</code>; now those surrounding tags are omitted and the whole placeholder is displayed as a single encapsulated item, similarly to email addresses in Mail.app.</p>

<p>Unfortunately, the old way of doing things had one big benefit: I could command-double-click on <code>NSStringCompareOptions</code> to jump straight to the declaration of that symbol in NSString.h, which is really the only way I could ever remember that <code>NSAnchoredSearch</code> is the setting I need to make this particular line work properly.</p>

<p>On the one hand, I'm glad to see Apple working on features that compete more directly with IntelliSense, and I've got to think that that's exactly what integrating placeholders into the text renderer is meant to be.  On the other hand, this pretty-but-not-terribly-useful feature has now broken something I used to use regularly and have often found myself wishing Visual Studio did.</p>

<p>Whoops.</p>]]></description>
            <link>http://seankerwin.org/archives/2008/03/if_only_i_could.shtml</link>
            <guid>http://seankerwin.org/archives/2008/03/if_only_i_could.shtml</guid>
            
            
            <pubDate>Sun, 09 Mar 2008 00:41:29 -0500</pubDate>
        </item>
        
        <item>
            <title>Thoughts on the iPhone SDK</title>
            <description><![CDATA[<ul><li>It's going to feel much more like 'real' Cocoa once they've got Interface Builder in the mix.  Building a UI programmatically feels incredibly primitive - even more so than building a web page, in fact, because there at least you have the ability to add the main elements and then tweak them in CSS.  As of right now, I'm spending far more time on the UI than I'd really like.</li><li>The lack of CoreData doesn't really bother me, but then I'm pretty handy with SQL.  I can see this being annoying for some people.</li><li>The $99 fee to get a signing certificate is reasonable, but personally it's annoying because it means that if I want to produce - for example - iClan Touch, I have to either charge for it or eat the expense myself.</li><li>In removing CoreData they've <span class="Apple-style-span" style="font-style: italic;">also</span> removed NSPredicate, and with it filteredArrayUsingPredicate:, one of my favorite NSArray categories.  Which, you know, sucks.</li></ul><div>So at the moment I'm trying to build iClan Touch as a simple teeth-whetting exercise.  After that I may have to see how much of Clieunk/Xclan is salvageable on the new platform - watching the demos from the SDK announcement event today gave me a few ideas for how to make Clan Lord work in a touch-based UI.</div>]]></description>
            <link>http://seankerwin.org/archives/2008/03/thoughts_on_the.shtml</link>
            <guid>http://seankerwin.org/archives/2008/03/thoughts_on_the.shtml</guid>
            
            
            <pubDate>Fri, 07 Mar 2008 02:47:12 -0500</pubDate>
        </item>
        
        <item>
            <title>The I&apos;m-My-Own-Grandpa Design Pattern</title>
            <description><![CDATA[<p>I wrote about this <a href="http://seankerwin.org/archives/2007/11/returning_a_sub.shtml">a few months back</a>, and today it proved the solution to an otherwise insoluble problem yet again.<br />
<code><pre class="code"><br />
class BaseMarklar&lt;T&gt; where T:BaseMarklar&lt;T&gt; {<br />
	public T Self() { return (T)this; }<br />
}<br />

class BlueMarklar : BaseMarklar&lt;BlueMarklar&gt; {<br />
	public void Frob() { ... }<br />
}<br />

class RedMarklar : BaseMarklar&lt;RedMarklar&gt; {<br />
	public void Frizzle() { ... }<br />
}<br />
</pre></code><br />
I've decided to call it the "I'm My Own Grandpa" design pattern.  I've always regarded design patterns as primarily solutions to failures of the underlying language, and hence I consider the designation appropriate here.</p>]]></description>
            <link>http://seankerwin.org/archives/2008/01/the_immyowngran.shtml</link>
            <guid>http://seankerwin.org/archives/2008/01/the_immyowngran.shtml</guid>
            
            
            <pubDate>Mon, 14 Jan 2008 19:15:22 -0500</pubDate>
        </item>
        
        <item>
            <title>Like Social Security For Coders</title>
            <description><![CDATA[<p>I had an interesting discussion with a coworker today about zero-based indexing.  It all started with a method signature for accessing a range of a collection; I had specced the interface to be in the form:<br />
<code><pre><br />
int GetMarklarCount( );<br />
IEnumerable&lt;Marklar&gt; GetMarklars( int skip, int count );<br />
</pre></code><br />
I intended this to mean 'skip this number of items, and then give me the next this many' -- the first page would be ( 0, 10 ) and the second would be ( 10, 10 ), for instance (assuming ten items per page).  I had referred to this idiom as 'pagination support', because the key goal of the design was to allow clients of the service to support pagination in the UI.</p>

<p>Other folks had duplicated the idiom but misinterpreted it, assuming that the <code>skip</code> parameter represented the number of <i>pages</i> to skip, and the count parameter specified the size of a page, so fetching those same two pages would be ( 0, 10 ) and ( 1, 10 ).</p>
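
<p>The two readings can be put side by side in a rough sketch. The names <code>get_range</code> and <code>get_page</code> are made up for illustration, and a plain vector of ints stands in for the Marklar collection; the original service was C#, so this is an analogy in C++ rather than the actual interface.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the post's GetMarklars( int skip, int count ):
// skip the first `skip` items, then return up to `count` of what follows.
std::vector<int> get_range(const std::vector<int>& items,
                           std::size_t skip, std::size_t count) {
    if (skip >= items.size()) {
        return {};  // skipped past the end: nothing left to return
    }
    const std::size_t available = items.size() - skip;
    const std::size_t n = std::min(count, available);
    const auto first = items.begin() + static_cast<std::ptrdiff_t>(skip);
    return std::vector<int>(first, first + static_cast<std::ptrdiff_t>(n));
}

// The page-based misreading, expressed on top of the item-based call:
// a page index only becomes a skip once it is multiplied by the page size.
std::vector<int> get_page(const std::vector<int>& items,
                          std::size_t page, std::size_t page_size) {
    return get_range(items, page * page_size, page_size);
}
```

<p>With ten items per page, <code>get_page(items, 1, 10)</code> and <code>get_range(items, 10, 10)</code> return the same items, while <code>get_range(items, 1, 10)</code> merely drops the first item. Both calls are legal under either reading, which is exactly why the misinterpretation is easy to miss.</p>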

<p>We had a discussion and my meaning won out, but it quickly segued into a discussion of the relative merits of <code>GetMarklars( int skip, int count )</code> and the similar <code>GetMarklars( int first, int count )</code>.  I favored the signature using 'skip' as I feel it's less ambiguous: saying 'first' brings up the obvious question of whether the first element is element zero or element one, and I'd prefer not to have that degree of uncertainty in an interface.</p>

<p>All of which merely serves to get to my main point: the fact that programming today uses zero-based indexing is a <i>really</i> weird anachronism.</p>

<p>In assembly language zero-based indexing makes perfect sense, because the index is actually an offset from the base address.  C, being intended as a slightly-higher-level portable assembly language, maintained this convention as a way of keeping old-school assembly hackers happy (and to support the perpetual question of whether indexing an array is faster or slower than pointer arithmetic, of course).  That (to my mind) was still reasonable, because you simply can't be an effective C programmer without understanding memory management at a low enough level that this makes perfect sense.</p>
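
<p>A small C++ sketch of the index-as-offset idea, for anyone who hasn't seen it spelled out. The helper names <code>element_at</code> and <code>byte_offset</code> are invented for this illustration.</p>

```cpp
#include <cstddef>

// Hypothetical helper: read element `index` by walking `index` elements
// forward from the base address, which is what base[index] means in C and C++.
int element_at(const int* base, std::size_t index) {
    return *(base + index);
}

// Distance in bytes from the start of the array to element `index`.
// Element zero sits at offset zero: it *is* the base address.
std::size_t byte_offset(std::size_t index) {
    return index * sizeof(int);
}
```

<p>Seen this way, the "first" element has index zero simply because it is zero elements away from where the array starts.</p>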

<p>The strange part is that in today's far-higher-level languages there's no need to index from zero, and in fact indexing from zero is a point of confusion for new programmers.  A new programmer learning Python may never understand the hardware at a low enough level to recognize that an array index is an offset from an address (and in my experience lots of people who <i>do</i> have the hardware background to understand this never manage to make the connection).</p>

<p>So why do we still index from zero?  Why, because we're used to it!  A new language with collections indexed from one would be instantly regarded as a toy by most established coders -- look at how 'pro' coders regard Visual Basic (not that there aren't plenty of other, more valid, things to criticize about VB).  So new languages index from zero to keep the established coders happy, or they fail to index from zero and the established coders never look at them and they die off.  New programmers are eventually conditioned to think that indexing from zero is just the natural way to do things, and sooner or later the cycle repeats.</p>

<p>But you have to wonder: how many people out there would be entirely capable of getting good work done in Python or Ruby or whatever were it not for this single little nit?  I know there are decent programmers out there who always allocate one array space more than they need and simply ignore index zero, and it's not totally clear to me that these people are any worse at their job if you ignore their indexing eccentricity.</p>

<p>I'm forced to wonder: are today's programmers unwittingly allowing their anachronistic indexing preference to render the field less accessible to newcomers?  Are we selling out the next generation?</p>

<p>How do you solve this problem?  Obviously just launching a language with one-based indexing will fix exactly nothing, because nobody's going to use such a language.  The only real solution I can see is to launch a language with no indexing.  The <code>foreach</code> constructs that are showing up in more and more languages are a good first step, but they have a few problems, mainly that they're an all-or-nothing affair.  It's hard to imagine a <code>foresome</code> construct that works as intuitively.</p>

<p>Iterators are another solution, but they're still not perfect.  First, an iterator is a very abstract object, as objects go.  'The current state of the process of iterating over the contents of a collection' isn't something most people can think of as a 'thing', and for better or for worse people prefer to model objects as nouns (at least in <a href="http://steve-yegge.blogspot.com/2006/03/execution-in-kingdom-of-nouns.html">most kingdoms</a>).  Iterators are a complex and abstract concept and many programmers find them baffling.  And even after you've got the concept down, iterators can still make for some really hairy code:<br />
<pre><code>for_each( c.begin( ), find_if( c.begin( ), c.end( ), SomePredicate( ) ), SomeFunctor( ) )</code></pre><br />
Doesn't exactly roll off the tongue, does it?</p>
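
<p>For reference, here is one way that call might look as a complete, compilable unit. The predicate <code>IsNegative</code>, the functor <code>Summer</code>, and the wrapper function are placeholders chosen for the example, standing in for <code>SomePredicate</code> and <code>SomeFunctor</code>.</p>

```cpp
#include <algorithm>
#include <vector>

// Hypothetical predicate: true for the element that should stop the walk.
struct IsNegative {
    bool operator()(int value) const { return value < 0; }
};

// Hypothetical functor: accumulates every element it is handed.
struct Summer {
    int total = 0;
    void operator()(int value) { total += value; }
};

// Visit each element of c up to, but not including, the first negative one.
int sum_leading_non_negatives(const std::vector<int>& c) {
    // std::for_each returns the functor, carrying the accumulated state out.
    Summer result = std::for_each(c.begin(),
                                  std::find_if(c.begin(), c.end(), IsNegative()),
                                  Summer());
    return result.total;
}
```

<p>This is about as close as the standard algorithms get to a "for some" loop: <code>find_if</code> picks the stopping point and <code>for_each</code> covers everything before it. It works, but it takes two helper types and a nested call to say "do this until you hit that".</p>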

<p>So what's the answer?  Probably to do what we (where by 'we' I mean 'people who are smarter than me') are already doing -- leave our arrays zero-indexed to keep compatibility with the installed base of programmer-brains, and keep searching for cleaner syntax to describe declarative array iteration.  In the meantime, train new programmers to think in terms of <code>map</code> and <code>reduce</code> instead of in terms of looping, which probably means functional programming at an early age, and try to move that way ourselves.</p>]]></description>
            <link>http://seankerwin.org/archives/2008/01/like_social_sec.shtml</link>
            <guid>http://seankerwin.org/archives/2008/01/like_social_sec.shtml</guid>
            
            
            <pubDate>Fri, 04 Jan 2008 17:00:48 -0500</pubDate>
        </item>
        
    </channel>
</rss>
