<p>Flavor-iffic: Keep your head down and keep coding. (Mike, http://www.blogger.com/profile/01750555903559042849)</p>
<h1>Y U NO GEMSPEC!?</h1>
<p>Posted by Mike on 2012-03-14.</p>
<h2>tl;dr</h2>
<ol>
<li>Team Nokogiri are not 10-foot-tall code-crunching robots, so <code>master</code> is usually unstable.</li>
<li>Unstable code can corrupt your data and crash your application, which would make everybody look bad.</li>
<li>Therefore, the <em>risk</em> associated with using unstable code is severe; for you <em>and</em> for Team Nokogiri.</li>
<li>The absence of a gemspec is a risk mitigation tactic.</li>
<li>You can always ask for an RC release.</li>
</ol>
<h2>Why Isn't There a Gemspec!?</h2>
<p>OHAI! Thank you for asking this question!</p>
<p>Team Nokogiri gets asked this pretty frequently. Just a sample from
the historical record:</p>
<ul>
<li><a href="https://github.com/tenderlove/nokogiri/issues/274">Issue #274</a></li>
<li><a href="https://github.com/tenderlove/nokogiri/issues/371">Issue #371</a></li>
<li><a href="https://github.com/tenderlove/nokogiri/commit/7f17a643a05ca381d65131515b54d4a3a61ca2e1#commitcomment-667477">A commit removing nokogiri.gemspec</a></li>
<li><a href="http://groups.google.com/group/nokogiri-talk/browse_thread/thread/4706b002e492d23f">A nokogiri-talk thread</a></li>
<li><a href="http://groups.google.com/group/nokogiri-talk/browse_thread/thread/0b201bb80ea3eea0">Another nokogiri-talk thread</a></li>
</ul>
<p>Sometimes people imply that we've forgotten, or that we don't know how to
properly manage our codebase. Those people are super fun to respond
to!</p>
<p>We've gone back and forth a couple of times over the past few years,
but the current policy of Team Nokogiri is to <strong>not</strong> provide a
gemspec in the GitHub repo. This is a conscious choice, not an
oversight.</p>
<h2>But You Didn't Answer the Question!</h2>
<p>Ah, I was hoping you wouldn't notice. Well, OK, let's do this, if
you're serious about it.</p>
<p>I'd like to start by talking about <em>risk</em>. Specifically, the risk
associated with using a known-unstable version of Nokogiri.</p>
<h3>Risk</h3>
<p>One common way to evaluate the <em>risk</em> of an incident is:</p>
<pre><code>risk = probability x impact
</code></pre>
<p>You can read more about this on <a href="http://en.wikipedia.org/wiki/Risk_Matrix">the internets</a>.</p>
<p>The <em>risk</em> associated with a Nokogiri bug could be loosely defined by
answering the questions:</p>
<ul>
<li>"How likely is it that a bug exists?" (probability)</li>
<li>"How severe will the consequences of a bug be?" (impact)</li>
</ul>
<h3>Probability</h3>
<p>The <code>master</code> branch should be considered unstable. Team Nokogiri are
not 10-foot-tall code-crunching robots; we are humans. We make
mistakes, and as a result, any arbitrary commit on <code>master</code> is likely
to contain bugs.</p>
<p>Just as an example, Nokogiri <code>master</code> was unstable for about five
months between November 2011 and March 2012. It was unstable not
because we were sloppy, or didn't care, but because the fixes were
hard and unobvious.</p>
<p>When we release Nokogiri, we test for memory leaks and invalid memory
access on all kinds of platforms with many flavors of Ruby and lots of
versions of libxml2. Because these tests are time-consuming, we don't
run them on every commit. We run them often when preparing a release.</p>
<p>If we're releasing Nokogiri, it means we think it's rock solid.</p>
<p>And if we're not releasing it, it means there are probably bugs.</p>
<h3>Impact</h3>
<p>Nokogiri is a gem with native extensions. This means it's not pure
Ruby -- there's C or Java code being compiled and run, which means
that there's always a chance that the gem will crash your application,
or worse. Possible outcomes include:</p>
<ul>
<li>leaking memory</li>
<li>corrupting data</li>
<li>making benign code crash (due to memory corruption)</li>
</ul>
<p>So, then, a bug in a native extension can have much worse downside
than you might think. It's not just going to do something unexpected;
it's possibly going to do terrible, awful things to your application
and data.</p>
<p><strong>Nobody</strong> wants that to happen. Especially Team Nokogiri.</p>
<h3>Risk, Redux</h3>
<p>So, if you accept the equation</p>
<pre><code>risk = probability x impact
</code></pre>
<p>and you believe me when I say that:</p>
<ul>
<li>the probability of a bug in unreleased code is high, and</li>
<li>the impact of a bug is likely to be severe,</li>
</ul>
<p>then you should easily see that the <em>risk</em> associated with a bug in
Nokogiri is quite high.</p>
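<p>To make the multiplication concrete, here is a toy calculation. The 1-to-5 scale and the scores themselves are made-up assumptions for illustration, not measurements from the Nokogiri project.</p>

```ruby
# Hypothetical scores on a 1..5 scale; these numbers are illustrative
# assumptions, not data from Team Nokogiri.
Scenario = Struct.new(:name, :probability, :impact) do
  # risk = probability x impact, as in the formula above
  def risk
    probability * impact
  end
end

scenarios = [
  Scenario.new("released gem, native extension", 1, 5),
  Scenario.new("unreleased master, native extension", 4, 5),
]

scenarios.each do |s|
  puts format("%-38s risk = %d", s.name, s.risk)
end
```

<p>The impact term is the same in both rows, because a native-extension bug is just as nasty either way. Only the probability moves, and that is the term a release process can actually do something about.</p>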
<p>Part of Team Nokogiri's job is to try to mitigate this risk. We have a
number of tactics that we use to accomplish this:</p>
<ul>
<li>we respond quickly to bug reports, particularly when they are possible memory issues</li>
<li>we review each other's commits</li>
<li>we have a thorough test suite, and we test-drive new features</li>
<li>we discuss code design and issues on a core developer mailing list</li>
<li>we use valgrind to test for memory issues (leaks and invalid
access) on multiple combinations of OS, libxml2 and Ruby</li>
<li>we package release candidates, and encourage devs to use them</li>
<li><strong>we do NOT commit a gemspec in our git repository</strong></li>
</ul>
<p>Yes, that's right, the absence of a gemspec is a risk mitigation
tactic. Not only does Team Nokogiri not want to imply support for
<code>master</code>, we want to <strong>actively discourage</strong> people from using
it. Because it's not stable.</p>
<h2>But I Want to Do It Anyway</h2>
<p>One option is to email the <a href="http://groups.google.com/group/nokogiri-talk">nokogiri-talk
list</a> and ask for a
release candidate to be built. We're pretty accommodating if there's a
bugfix that's a blocker for you. And if we can't release an RC, we'll
tell you why.</p>
<p>And in the end, nothing is stopping you from cloning the repo and
generating a private gemspec. This is an extra step or two, but it has
the benefit of making sure developers have thought through the costs
and risks involved; and it tends to select for developers who know
what they're doing.</p>
<h2>In Conclusion</h2>
<p>Team Nokogiri takes stability very seriously. We want everybody who
uses Nokogiri to have a pleasant experience. And so we want to make
sure that you're using the best software we can make.</p>
<p>Please keep in mind that we're trying very hard to do the right thing
for all Nokogiri users out there in Rubyland. Nokogiri loves you very
much, and we hope you love it back.</p>
<h1>Fairy-Wing Wrapup: Nokogiri Performance</h1>
<p>Posted by Mike on 2011-05-18.</p>
<h2 id='tldr'>TL;DR</h2>
<ul>
<li>Nokogiri’s DOM parser was extremely way faster than either the SAX or Reader parsers, in this particular real-world example.</li>
<li>ActiveSupport <code>Hash#from_xml</code>, I am dissapoint.</li>
<li>On JRuby, Nokogiri 1.5.0 is extremely way faster than Nokogiri 1.4.4, in this particular real-world example.</li>
</ul>
<p><img class="block" src='https://lh5.googleusercontent.com/_Ve7bb1LcoGY/TdPjpIiZgYI/AAAAAAAAA88/UhmVNIHi92c/s800/dix.png' alt='Artists Pre-enactment' /></p>
<p>(Shout-out to <a href='https://twitter.com/jonathanpberger'>@jonathanpberger</a> for the Artist’s Pre-enactment of Paul Dix wearing the fairy wings.)</p>
<h2 id='previously_on_the_fairy_wing_throwdown_'>Previously, on the Fairy Wing Throwdown …</h2>
<p>So, you might remember that a few months back, <a href='https://twitter.com/pauldix'>@pauldix</a> bet me that JSON parsing is an <a href='http://en.wikipedia.org/wiki/Order_of_magnitude'>order of magnitude</a> faster than XML parsing. (If you’re not in the loop, you can <a href='http://blog.flavorjon.es/2011/03/json-vs-xml-fairy-wing-throwdown.html'>read the dramatization of the bet</a>).</p>
<p>TL;DR, Paul lost that bet, and so will be wearing my daughter’s dress-up fairy wings during his <a href='http://en.oreilly.com/rails2011/public/schedule/speaker/2721'>RailsConf 2011 talk on Redis</a> on Thursday. Awesome!</p>
<p>You can <a href='https://gist.github.com/898920'>view the winning benchmark here</a>.</p>
<h2 id='i_want_to_go_to_there'>I want to go to there.</h2>
<p>The bet revolved around a real-world use case (Paul and I both work at Benchmark Solutions, a stealth financial market data startup in NYC).</p>
<p>You can view the data structure at the Official Fairy-Wing Throwdown Repo™, <a href='https://github.com/flavorjones/fairy-wing-throwdown'>https://github.com/flavorjones/fairy-wing-throwdown</a>, but the summary is that it’s 54K when serialized as JSON, and is composed (mostly) of an array of key-value stores (i.e., hashes).</p>
<p>Because I wanted to not just win, but to <strong>destroy</strong> Paul, I implemented the same parsing task using Nokogiri’s DOM parser, SAX parser, and Reader parser, expecting that code complexity and performance would correlate, somehow. In my mind, the graph looked like this:</p>
<p><img class="block" src='https://lh4.googleusercontent.com/_Ve7bb1LcoGY/TdPdl8qRFKI/AAAAAAAAA84/jVWZBfFAzz0/s640/fairy-wing-expected.png' alt='Expected complexity and performance' /></p>
<p>But I was shocked and dismayed to see the real results:</p>
<p><img class="block" src='https://lh3.googleusercontent.com/_Ve7bb1LcoGY/TdPdlzbcd0I/AAAAAAAAA80/IfIwHxjv06I/s640/fairy-wing-actual.png' alt='Reality bites' /></p>
<h2 id='what_the_what'>What the WHAT?</h2>
<p>Yes, that’s right. My payback for increasing the complexity of the code was a <strong>reduction</strong> in performance. The DOM parser was extremely way faster than either the Reader or SAX parsers.</p>
<p>Let me say that again: <strong>the DOM parser implementation was compellingly faster (1.3x) than the SAX parser implementation</strong>.</p>
<p>Why would that be? Good question, which I’ll deep-dive into in my next post. But suffice to say, the SAX parser is bottlenecked on making lots of callbacks from C into Ruby-land.</p>
<h2 id='activesupport_i_am_dissapoint'>ActiveSupport, I am dissapoint.</h2>
<p>Another big wow for me was how slow ActiveSupport’s <code>Hash#from_xml</code> method is. <a href='https://gist.github.com/898920'>The benchmark</a> shows that it’s about 40 times slower than the partial implementation using Nokogiri’s DOM parser.</p>
<p>Somebody should work on that! It wouldn’t be tough to hack an alternative implementation of <code>Hash#from_xml</code> on top of Nokogiri. If anybody’s looking for an interesting project, there it is.</p>
<h2 id='you_can_be_my_yokolet'>You can be my @yokolet</h2>
<p>Here’s a chart of how the DOM parser implementation works on various platforms:</p>
<p><img class="block" src='https://lh5.googleusercontent.com/_Ve7bb1LcoGY/TdPuObBoisI/AAAAAAAAA9A/t8zbUdgZT3A/s640/dom-parser.png' alt='DOM parser on various platforms' /></p>
<p>Holy cow! The pure-Java implementation on Nokogiri 1.5.0.beta.4 is 4 times faster than the FFI-to-C implementation on Nokogiri 1.4.4 (28s vs 117s). That’s crazytown!</p>
<p>Thanks to everyone who’s committed to the pure-Java code, notably <a href='https://twitter.com/headius'>@headius</a>, <a href='https://twitter.com/yokolet'>@yokolet</a>, <a href='https://github.com/pmahoney'>@pmahoney</a> and <a href='https://github.com/serabe'>@serabe</a>.</p>
<h2 id='chart_notes'>Chart Notes</h2>
<p>The “expected performance” line chart is in imaginary units.</p>
<p>The “actual performance” line chart renders performance in number of records processed per second, so bigger is better. The Saikuro and Flog scores were normalized on their values for <code>#transform_via_dom</code>.</p>
<p>The “DOM parser on various platforms” bar chart renders total benchmark runtime, so smaller is better.</p>
<h1>JSON vs XML: The Fairy-Wing Throwdown</h1>
<p>Posted by Mike on 2011-03-31.</p>
<h2 id='tldr'>TL;DR</h2>
<ol>
<li>Is XML parsing more than an order of magnitude (i.e., 10x) slower than JSON parsing in real world situations?</li>
<li>Both <a href='http://twitter.com/pauldix'>@pauldix</a> and <a href='http://twitter.com/flavorjones'>@flavorjones</a> think XML parsing is slower than JSON parsing.</li>
<li><a href='http://twitter.com/pauldix'>@pauldix</a> says XML parsing is <em>more</em> than an order of magnitude slower than JSON parsing.</li>
<li><a href='http://twitter.com/flavorjones'>@flavorjones</a> says XML parsing is <em>less</em> than an order of magnitude slower than JSON parsing.</li>
<li>The loser must wear <a href='http://twitter.com/flavorjones'>@flavorjones</a>’s daughter’s dress-up fairy wings <strong>on stage</strong> throughout <a href='http://twitter.com/pauldix'>@pauldix</a>’s RailsConf 2011 presentation.</li>
<li>Benchmarks must be performed by close of business Friday, April 1. (No, this is not an April Fool's joke.)</li>
</ol>
<p>And man, I hope Confreaks is filming it.</p>
<h2 id='the_fairywing_throwdown'>The Fairy-Wing Throwdown</h2>
<p>or</p>
<p><strong>JSON v XML</strong></p>
<p>or</p>
<p><strong>How much slower, exactly, is XML in the real world?</strong></p>
<p>(A One Act Drama)</p>
<h3 id='dramatis_personae'>Dramatis Personae</h3>
<ul>
<li>Chorus</li>
<li>Mike (<a href='http://twitter.com/flavorjones'>@flavorjones</a>), Nokogiri Mother, Righter of Wrongs.</li>
<li>Paul (<a href='http://twitter.com/pauldix'>@pauldix</a>), A Knave.</li>
<li>John (<a href='http://twitter.com/jvshahid'>@jvshahid</a>), An Instigator.</li>
</ul>
<h3 id='act_i_scene_i'>Act I, Scene I</h3>
<p><em>Cast is gathered together, drinking beverages, nerding.</em></p>
<p><strong>John</strong>: Hark! My love for Scala knows no bounds. Also, Ruby is Not Half Bad.</p>
<p><strong>Mike</strong>: Rememberest thou when we first met? You had time and love for naught but Java and its dear-lov’d cousin, strong static typing.</p>
<p><strong>Chorus</strong>: And don’t forget XML!</p>
<p><strong>Paul</strong>: Ha ha! Java doth go nowhere without bountiful XML following it around like a little puppy.</p>
<p><strong>Mike</strong>: Ha ha! And aided by Spring’s alchemy you wrote Java in XML!</p>
<p><strong>Chorus</strong>: Ha ha!</p>
<p><strong>John</strong>: A cold and drowsy humour to hear you mock XML so.</p>
<p><strong>Chorus</strong>: Why dost thou wring thy hands?</p>
<p><strong>John</strong>: Because Nokogiri hath been brought forth from his loins, and he hath intimate knowledge of XML.</p>
<p><strong>Mike</strong>: Aye, I know it well, and thus my disaffection has measure and reason. In particular, namespaces are really quite broken.</p>
<p><strong>Paul</strong>: Plus, it’s SO SLOW.</p>
<p><strong>John</strong>: Gentle Dix, put thy rapier up.</p>
<p><strong>Paul</strong>: I do protest I never injur’d thee! I would wager that XML has got to be an order of magnitude slower than JSON, at least!</p>
<p>(Pause.)</p>
<p><strong>Mike</strong>: (to John) Forbear this outrage? For shame.</p>
<p>(John shrugs.)</p>
<p><strong>Mike</strong>: Knowest I libxml2 so well, it is mos def dishonorably slow. But an order of magnitude? I will take that bet.</p>
<p><strong>Paul</strong>: I am not affrighted, nor have I need for your money.</p>
<p><strong>Mike</strong>: Then … let’s make it … interesting.</p>
<p>Exeunt omnes.</p>
<h2 id='tentative_conditions'>Tentative Conditions</h2>
<ol>
<li>Benchmarks must be performed on Ruby 1.8.7 with any standard compiled extensions / gems.</li>
<li>Objective is a specific data structure actually used by these Gentlemen at their place of business, Benchmark Solutions.</li>
<li>Code must take a string (JSON or XML), and return an inflated Ruby data structure exactly matching the objective.</li>
<li>Timing should encompass only in-memory operations (not IO).</li>
<li><a href='http://twitter.com/jvshahid'>@jvshahid</a> will be the arbiter of whether the implementations violate the spirit of “real worldiness”.</li>
</ol>
<h1>Easily Valgrind &amp; GDB your Ruby C Extensions</h1>
<p>Posted by Mike on 2009-06-08.</p>
<p><strong>Update:</strong> John Barnette (<a href="http://twitter.com/jbarnette">@jbarnette</a>) has packaged these rake tasks up as a <a href="http://blog.zenspider.com/2009/06/hoe-2-electric-boogaloo.html">Hoe plugin</a>: <a href="http://github.com/jbarnette/hoe-debugging">hoe-debugging</a>.</p>
<p>When developing <a href="http://nokogiri.rubyforge.org/nokogiri/">Nokogiri</a>, the most valuable tool I use to track down memory-related errors is <a href="http://valgrind.org/">Valgrind</a>. It <a href="http://github.com/tenderlove/nokogiri/blob/92b5d8f86bd4e2aaf3349988459d29ac39fe2808/ext/nokogiri/xml_node_set.c#L346">rocks</a>! <a href="http://tenderlovemaking.com/">Aaron</a> and I run the entire Nokogiri test suite under Valgrind before releasing any version.</p>
<p>I could wax poetic about Valgrind all day, but for now I'll keep it brief and just say: if you write C code and you're not familiar with Valgrind, <em>get familiar with it</em>. It will save you countless hours of tracking down <a href="http://catb.org/jargon/html/H/heisenbug.html">heisenbugs</a> and memory leaks some day.</p>
<p>In any case, I've been meaning to package up my utility scripts and tools for quite a while. But they're so small, and it's so hard to make them work for every project ... it's looking pretty likely that'll never happen, so blogging about them is probably the best thing for everyone.</p>
<h2>Basics</h2>
<p>Let's get to it. Here's how to run a ruby process under valgrind:</p>
<pre><code># hello-world.rb
require 'rubygems'
puts 'hello world'
# run from cmdline
valgrind ruby hello-world.rb
</code></pre>
<p>Oooh! But that's not actually what you want. The Matz Ruby Interpreter does a lot of funky things in the name of speed, like using uninitialized variables and reading past the ends of <code>malloc</code>ed blocks that aren't on an 8-byte boundary. As a result, something as simple as <code>require 'rubygems'</code> will give you 3800 lines of error messages <a href="http://gist.github.com/125669">(see this gist for full output)</a>.</p>
<p>Let's try this:</p>
<pre><code>valgrind --partial-loads-ok=yes --undef-value-errors=no ruby hello-world.rb
==15535== Memcheck, a memory error detector.
==15535== Copyright (C) 2002-2007, and GNU GPL'd, by Julian Seward et al.
==15535== Using LibVEX rev 1804, a library for dynamic binary translation.
==15535== Copyright (C) 2004-2007, and GNU GPL'd, by OpenWorks LLP.
==15535== Using valgrind-3.3.0-Debian, a dynamic binary instrumentation framework.
==15535== Copyright (C) 2000-2007, and GNU GPL'd, by Julian Seward et al.
==15535== For more details, rerun with: -v
==15535==
hello world
==15535==
==15535== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
==15535== malloc/free: in use at exit: 10,403,440 bytes in 138,986 blocks.
==15535== malloc/free: 420,496 allocs, 281,510 frees, 155,680,688 bytes allocated.
==15535== For counts of detected errors, rerun with: -v
==15535== searching for pointers to 138,986 not-freed blocks.
==15535== checked 10,654,020 bytes.
==15535==
==15535== LEAK SUMMARY:
==15535== definitely lost: 21,280 bytes in 1,330 blocks.
==15535== possibly lost: 27,368 bytes in 1,840 blocks.
==15535== still reachable: 10,354,792 bytes in 135,816 blocks.
==15535== suppressed: 0 bytes in 0 blocks.
==15535== Rerun with --leak-check=full to see details of leaked memory.
</code></pre>
<p>Ahhh, much better. We don't see any spurious errors.</p>
<p>Without going too far off-topic, I should just mention that those "leaks" aren't really leaks, they're characteristic of how the Ruby interpreter manages its internal memory. (You can see this by running this example with <code>--leak-check=full</code>.)</p>
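<p>If you'd rather have a script look at those numbers than eyeball the log (say, to compare "definitely lost" between two runs), something like the following sketch might do. The helper name <code>parse_valgrind_summary</code> is made up for this example, and the regexes assume the summary lines look like the sample output above; other Valgrind versions may word them differently.</p>

```ruby
# Sketch: pull the error count and leak totals out of Memcheck's summary.
# Assumes the line format shown in the sample output above.
def parse_valgrind_summary(output)
  summary = {}
  if output =~ /ERROR SUMMARY: ([\d,]+) errors/
    summary[:errors] = $1.delete(",").to_i
  end
  # Each leak line looks like "definitely lost: 21,280 bytes in 1,330 blocks."
  output.scan(/(definitely lost|possibly lost|still reachable):\s+([\d,]+) bytes/) do |kind, bytes|
    summary[kind.tr(" ", "_").to_sym] = bytes.delete(",").to_i
  end
  summary
end

sample = <<~LOG
  ==15535== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)
  ==15535== LEAK SUMMARY:
  ==15535== definitely lost: 21,280 bytes in 1,330 blocks.
  ==15535== possibly lost: 27,368 bytes in 1,840 blocks.
  ==15535== still reachable: 10,354,792 bytes in 135,816 blocks.
LOG

p parse_valgrind_summary(sample)
```

<p>Given the caveat above about the interpreter's own "leaks", the absolute numbers mean little for a Ruby process. A change in them between two otherwise identical runs is the more interesting signal.</p>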
<h2>Rakified!</h2>
<p>Here's an easy way to run Valgrind on your gem's existing test suite. This rake task assumes you've got Hoe 1.12.1 or higher.</p>
<pre><code>namespace :test do
  # partial-loads-ok and undef-value-errors necessary to ignore
  # spurious (and eminently ignorable) warnings from the ruby
  # interpreter
  VALGRIND_BASIC_OPTS = "--num-callers=50 --error-limit=no \
                         --partial-loads-ok=yes --undef-value-errors=no"
  desc "run test suite under valgrind with basic ruby options"
  task :valgrind => :compile do
    cmdline = "valgrind #{VALGRIND_BASIC_OPTS} ruby #{HOE.make_test_cmd}"
    puts cmdline
    system cmdline
  end
end
</code></pre>
<p>Those basic options will give you a decent-sized stack walkback on errors, will make sure you see every error, and will skip all the BS output mentioned above. You can read Valgrind's documentation for more information, and to tune the output.</p>
<p>If you're not testing a gem, or don't have Hoe installed, try this for <code>Test::Unit</code> suites:</p>
<pre><code>def test_suite_cmdline
  require 'find'
  files = []
  Find.find("test") do |f|
    files << f if File.basename(f) =~ /.*test.*\.rb$/
  end
  cmdline = "#{RUBY} -w -I.:lib:ext:test -rtest/unit \
             -e '%w[#{files.join(' ')}].each {|f| require f}'"
end
namespace :test do
  # partial-loads-ok and undef-value-errors necessary to ignore
  # spurious (and eminently ignorable) warnings from the ruby
  # interpreter
  VALGRIND_BASIC_OPTS = "--num-callers=50 --error-limit=no \
                         --partial-loads-ok=yes --undef-value-errors=no"
  desc "run test suite under valgrind with basic ruby options"
  task :valgrind => :compile do
    cmdline = "valgrind #{VALGRIND_BASIC_OPTS} #{test_suite_cmdline}"
    puts cmdline
    system cmdline
  end
end
</code></pre>
<p>Getting this to work for <code>rspec</code> suites is left as an exercise for the reader. :-\</p>
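<p>For what it's worth, one way the rspec exercise might go is sketched below. It assumes the suite is launched by an <code>rspec</code> executable on your PATH (older RSpec versions installed it as <code>spec</code>) and that your specs live in <code>./spec</code>; the helper name <code>valgrind_rspec_cmdline</code> is made up for this example.</p>

```ruby
# Same Valgrind options as the rake tasks above.
VALGRIND_BASIC_OPTS = "--num-callers=50 --error-limit=no " \
                      "--partial-loads-ok=yes --undef-value-errors=no"

# Assumption: `runner` is an executable Ruby script on your PATH
# ("rspec" here, "spec" for older RSpec) and `spec_dir` holds the specs.
# `ruby -S` looks the script up via PATH, so Valgrind wraps the
# interpreter itself rather than a shell wrapper.
def valgrind_rspec_cmdline(runner = "rspec", spec_dir = "spec")
  "valgrind #{VALGRIND_BASIC_OPTS} ruby -S #{runner} #{spec_dir}"
end

puts valgrind_rspec_cmdline
```

<p>Inside a Rakefile you would hand that string to <code>system</code> from a <code>task :valgrind => :compile</code> block, exactly as the tasks above do.</p>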
<h2>A Note for OS X Users</h2>
<p>Valgrind isn't just for Linux. You can make Valgrind work on your fancy-pants OS, too! Check out <a href="http://www.sealiesoftware.com/valgrind/">http://www.sealiesoftware.com/valgrind/</a> for details.</p>
<h2>GDB FTW!</h2>
<p>Another thing I find myself doing pretty often is running the test suite under the <code>gdb</code> debugger:</p>
<pre><code>gdb --args ruby -S rake test
</code></pre>
<p>or in your Rakefile:</p>
<pre><code>namespace :test do
  desc "run test suite under gdb"
  task :gdb => :compile do
    system "gdb --args ruby #{HOE.make_test_cmd}"
  end
end
end
</code></pre>
<h1>Nokogiri, Your New Swiss Army Knife</h1>
<p>Posted by Mike on 2008-11-17.</p>
<h2>Prologue</h2>
<p>Today I'd like to talk about the use of regular expressions to parse
and modify HTML. Or rather, the <strong>misuse</strong>.</p>
<p>I'm going to try to convince you that it's a <em>very</em> bad idea to use
regexes for HTML. And I'm going to introduce you to
<a href="http://github.com/tenderlove/nokogiri/tree/master">Nokogiri</a>, my new
best friend and life companion, who can do this job way better, and
nearly as fast.</p>
<p>For those of you who just want the meat without all the starch:</p>
<ol>
<li>You don't parse Ruby or YAML with regular expressions, so don't do it with HTML, either.</li>
<li>If you know how to use Hpricot, you know how to use Nokogiri.</li>
<li>Nokogiri can parse and modify HTML more robustly than regexes, with less penalty than formatting Markdown or Textile.</li>
<li>Nokogiri is 4 to 10 times faster than Hpricot performing the typical HTML-munging operations benchmarked.</li>
</ol>
<h2>The Scene</h2>
<p>On one of the open-source projects I contribute to (names will be
withheld for the protection of the innocent, this isn't
<a href="http://thedailywtf.com/">Daily WTF</a>), I came across the following code:</p>
<pre><code>def spanify_links(text)
  text.gsub(/&lt;a\s+(.*)&gt;(.*)&lt;\/a&gt;/i, '&lt;a \1&gt;&lt;span&gt;\2&lt;/span&gt;&lt;/a&gt;')
end
</code></pre>
<p>In case it's not clear, the goal of this method is to insert a
<code>&lt;span&gt;</code> element inside the link, converting hyperlinks from</p>
<pre><code>&lt;a href='http://foo.com/'&gt; Foo! &lt;/a&gt;
</code></pre>
<p>to</p>
<pre><code>&lt;a href='http://foo.com/'&gt; &lt;span&gt; Foo! &lt;/span&gt; &lt;/a&gt;
</code></pre>
<p>for CSS styling.</p>
<h2>The Problem</h2>
<p>Look, I love regexes as much as the next guy, but this regex is
seriously busticated. If there is more than one <code>&lt;a&gt;</code> tag on a line,
only the final one will be spanified. If the tag contains an embedded
newline, nothing will be spanified. There are probably other unobvious
bugs, too, and that means there's a
<strong><a href="http://en.wikipedia.org/wiki/Code_smell">code smell</a></strong> here.</p>
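<p>If you want to see both failures for yourself, here is a small demonstration using the method exactly as quoted above. The two input strings are made up for this example.</p>

```ruby
# The method from above, unchanged.
def spanify_links(text)
  text.gsub(/<a\s+(.*)>(.*)<\/a>/i, '<a \1><span>\2</span></a>')
end

two_links = "<a href='/one'>One</a> and <a href='/two'>Two</a>"
multiline = "<a href='/x'>Foo\nBar</a>"

# The first greedy (.*) swallows the whole first link, so only the
# last link on the line ends up with a span.
puts spanify_links(two_links)
# => <a href='/one'>One</a> and <a href='/two'><span>Two</span></a>

# "." does not match a newline, so this link comes back untouched.
puts spanify_links(multiline) == multiline
# => true
```

<p>Non-greedy quantifiers and the <code>/m</code> flag would patch these two cases, but that is precisely the whack-a-mole game.</p>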
<p>Sure, the regex could be fixed to work in these cases. But does a
trivial feature like this justify the time spent writing test cases
and playing whack-a-mole with regex bugs? <strong>Code smell.</strong></p>
<p>Let's look at it another way: If you were going to modify Ruby code
programmatically, would you use regular expressions? I seriously doubt
it. You'd use something like
<a href="http://rubyforge.org/projects/parsetree/">ParseTree</a>, which
understands all of Ruby's syntax and will correctly interpret
everything in context, not just in isolation.</p>
<p>What about YAML? Would you modify YAML files with regular expressions?
<em>Hells</em> no. You'd slurp it with <code>YAML.parse()</code>, modify the in-memory
data structures, and then write it back out.</p>
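<p>A minimal sketch of that round trip, using Ruby's standard <code>yaml</code> library. Note that it swaps in <code>YAML.load</code> and <code>YAML.dump</code>, which hand you plain hashes and arrays, rather than the node tree that <code>YAML.parse</code> returns; the config contents are made up for this example.</p>

```ruby
require "yaml"

# Hypothetical config, invented for this example.
source = <<~DOC
  name: my_app
  port: 3000
DOC

config = YAML.load(source)   # parse into plain Ruby hashes and arrays
config["port"] = 8080        # modify the in-memory structure
output = YAML.dump(config)   # serialize it back out

puts output
```

<p>At no point does the code need to know how YAML quotes, indents or escapes anything. The library owns the syntax, which is the whole argument.</p>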
<p>Why wouldn't you do the same with HTML, which has its own nontrivial
(and DTD-dependent) syntax?</p>
<p>Regular expressions just aren't the right tool for this job. Jamie
Zawinski said it best:</p>
<blockquote>
<p>Some people, when confronted with a problem, think "I know,
I'll use regular expressions." Now they have two problems.</p>
</blockquote>
<h2>Why, God? Why?</h2>
<p>So, what drives otherwise intelligent people (myself included) to whip
out regular expressions when it comes time to munge HTML?</p>
<p>My only guess is this: A lack of worthy XML/HTML libraries.</p>
<p>Whoa, whoa, put down the flamethrower and let me explain myself. By
"worthy", I mean three things:</p>
<ul>
<li>fast, high-performance, suitable for use in a web server</li>
<li>nice API, easy for a developer to learn and use</li>
<li>will successfully parse broken HTML commonly found on the intarwebs</li>
</ul>
<p><a href="http://xmlsoft.org/">libxml2</a>
and <a href="http://libxml.rubyforge.org/">libxml-ruby</a> have been around for
ages, and they're
<a href="http://www.xml.com/pub/a/2007/05/09/xml-parser-benchmarks-part-1.html">incredibly fast</a>.
But have you seen the API? It's
<a href="http://libxml.rubyforge.org/rdoc/classes/LibXML/XML/Document.html#M000354">totally sadistic</a>,
and as a result it's inappropriate and not easily usable in simple
cases like the one described above.</p>
<p>Now, <a href="http://code.whytheluckystiff.net/hpricot/">Hpricot</a> is pure
genius. It's pretty fast, and the API is absolutely delightful to work
with. It supports CSS as well as XPath queries. I've even used it
(with <a href="http://code.google.com/p/feed-normalizer/">feed-normalizer</a>) in
a Rails application, and it performed reasonably well. But it's still
much slower than regexes. Here's a (totally unfair) sample benchmark
comparing Hpricot to a comparable (though buggy) regular expression
(see below for a link to the benchmark gist):</p>
<pre><code>For an html snippet 2374 bytes long ...
user system total real
regex * 1000 0.160000 0.010000 0.170000 ( 0.182207)
hpricot * 1000 5.740000 0.650000 6.390000 ( 6.401207)
it took an average of 0.0064 seconds for Hpricot to parse and operate on an HTML snippet 2374 bytes long
For an html snippet 97517 bytes long ...
user system total real
regex * 10 0.100000 0.020000 0.120000 ( 0.122117)
hpricot * 10 3.190000 0.300000 3.490000 ( 3.502819)
it took an average of 0.3503 seconds for Hpricot to parse and operate on an HTML snippet 97517 bytes long
</code></pre>
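<p>Output in that shape comes from Ruby's standard <code>Benchmark</code> module. The real harness is in the gist linked below; what follows is only a sketch of the regex half, with a made-up stand-in for the HTML snippet.</p>

```ruby
require "benchmark"

# Hypothetical stand-in for the 2374-byte snippet; the real inputs
# live in the benchmark gist.
html = "<p><a href='/one'>One</a></p>\n" * 50
n = 1000

results = Benchmark.bm(16) do |x|
  x.report("regex * #{n}") do
    n.times { html.gsub(/<a\s+(.*)>(.*)<\/a>/i, '<a \1><span>\2</span></a>') }
  end
  # An HTML parser gets its own x.report block here, doing the same
  # job on the same string, so the rows are directly comparable.
end
```

<p>The string is built before the timed block starts, so only in-memory work is measured, not file or network IO.</p>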
<p>So, historically, I haven't used Hpricot everywhere I could have, and
that's because I was overly cautious about performance.</p>
<h2>Get On With It, Already</h2>
<p>Oooooh, if only there was a library with libxml2's speed and Hpricot's
API. Then maybe people wouldn't keep trying to use regular expressions
where an HTML parser is needed.</p>
<p>Oh wait, there is. Everyone, meet
<a href="http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/">Nokogiri</a>.</p>
<p>Check out the <a href="http://gist.github.com/25854">full benchmark</a>,
comparing the same operation (spanifying links and removing
possibly-unsafe tags) across regular expressions, Hpricot and
Nokogiri:</p>
<pre><code>For an html snippet 2374 bytes long ...
user system total real
regex * 1000 0.160000 0.010000 0.170000 ( 0.182207)
nokogiri * 1000 1.440000 0.060000 1.500000 ( 1.537546)
hpricot * 1000 5.740000 0.650000 6.390000 ( 6.401207)
it took an average of 0.0015 seconds for Nokogiri to parse and operate on an HTML snippet 2374 bytes long
it took an average of 0.0064 seconds for Hpricot to parse and operate on an HTML snippet 2374 bytes long
For an html snippet 97517 bytes long ...
user system total real
regex * 10 0.100000 0.020000 0.120000 ( 0.122117)
nokogiri * 10 0.310000 0.020000 0.330000 ( 0.322290)
hpricot * 10 3.190000 0.300000 3.490000 ( 3.502819)
it took an average of 0.0322 seconds for Nokogiri to parse and operate on an HTML snippet 97517 bytes long
it took an average of 0.3503 seconds for Hpricot to parse and operate on an HTML snippet 97517 bytes long
</code></pre>
<p>Wow! Nokogiri parsed and modified blog-sized HTML snippets in under 2
milliseconds! This performance, though still significantly slower than
regular expressions, is still fast enough for me to consider using it
in a web application server.</p>
<p>Hell, that's as fast as (faster than, actually) BlueCloth or RedCloth can
render Markdown or Textile of similar length. If you can justify using
<em>those</em> in your web application, you can certainly afford the overhead
of Nokogiri.</p>
<p>And as for usability, let's compare the regular expressions to the Nokogiri operations:</p>
<pre><code>html.gsub(/&lt;a\s+(.*)&gt;(.*)&lt;\/a&gt;/i, '&lt;a \1&gt;&lt;span&gt;\2&lt;/span&gt;&lt;/a&gt;') # broken regex
html.gsub(/&lt;(script|noscript|object|embed|style|frameset|frame|iframe)[&gt;\s\S]*&lt;\/\1&gt;/, '')
doc.search("a/text()").wrap("&lt;span&gt;&lt;/span&gt;")
doc.search("script","noscript","object","embed","style","frameset","frame","iframe").unlink
</code></pre>
<p>The Nokogiri version is <em>much</em> clearer. More maintainable, more robust
and, for me, just fast enough to start jamming into all kinds of
places.</p>
<h2>Where Else Can I Use Nokogiri?</h2>
<p>You can use Nokogiri anywhere you read, write or modify HTML or
XML. It's your new swiss army knife.</p>
<p>What about your test cases? <a href="http://www.merbivore.com/">Merb</a> is using
Nokogiri extensively in their controller tests, and they're
reportedly much faster than before. And those Merb dudes are S-M-R-T.</p>
<p>Have you thought about using Nokogiri::Builder to generate XML,
instead of the default Rails XML template builder? Boy, I
have. Upcoming blog post, hopefully.</p>
<p>Let me know where else you've found Nokogiri useful! Or better yet,
join the <a href="http://rubyforge.org/mailman/listinfo/nokogiri-talk">mailing list</a> and tell
the community!</p>
<h1>Nokogiri: World's Finest (XML/HTML) Saw</h1>
<p>Posted by Mike on 2008-10-31.</p>
<p>Yesterday was a big day, and I nearly missed it, since I spent nearly all of the sunlight hours at the wheel of a car. Nine hours sitting on your butt is no way to ... oh wait, that's actually how I spend every day. Just usually not in a rental Hyundai. Never mind, I digress.
</p>
<p>It was a big day because <a href='http://nokogiri.rubyforge.org/nokogiri/'>Nokogiri</a> was released. I've spent quite a bit of time over the last couple of months working with <a href='http://tenderlovemaking.com/'>Aaron Patterson</a> (of <a href='http://rubyforge.org/projects/mechanize/'>Mechanize</a> fame) on this excellent library, and so I'm walking around, feeling satisfied.
</p>
<p>"What's Nokogiri?" Good question, I'm glad I asked it.
</p>
<p>Nokogiri is the best damn XML/HTML parsing library out there in Rubyland. What makes it so good? You can search by XPath. You can search by CSS. You can search by both XPath <i>and</i> CSS. Plus, it uses <a href='http://xmlsoft.org/'>libxml2</a> as the parsing engine, <a href='http://www.xml.com/pub/a/2007/05/09/xml-parser-benchmarks-part-1.html'>so it's fast</a>. But the best part is, it's got a dead-simple interface that we shamelessly lifted from <a href='http://code.whytheluckystiff.net/hpricot/'>Hpricot</a>, everyone's favorite delightful parser.
</p>
<p>I had big plans to do a series of posts with examples and benchmarks, but right now I'm in <a href='http://www.google.com/search?q=dst+hell'>DST Hell</a> and don't have the quality time to invest.
</p>
<p>So, as I am wont to do, I'm punting. Thankfully, Aaron was his usual prolific self, and has kindly provided lots of documentation and examples:
<ul>
<li><a href='http://tenderlovemaking.com/2008/10/30/nokogiri-is-released/'>Aaron's blog post</a>
<li><a href='http://nokogiri.rubyforge.org/nokogiri/'>Documentation (RDoc)</a>
<li><a href='http://github.com/tenderlove/nokogiri/wikis'>Nokogiri-the-Wiki</a>
<li><a href='http://rubyforge.org/projects/nokogiri'>Nokogiri on Rubyforge</a>
<li><a href='http://gist.github.com/18533'>Benchmarks</a>
<li><a href='http://github.com/tenderlove/nokogiri/'>Git repository</a>
</ul>
</p>
<p>Use it in good health! Carry on.</p>
<p>P.S. Please start following Aaron on <a href='http://twitter.com/tenderlove'>Twitter</a>. :)</p>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com0tag:blogger.com,1999:blog-7609658525941194126.post-52673719459211330752008-08-26T01:28:00.000-04:002008-09-14T12:17:51.889-04:00Rails Model Firewall Mixin<p>At my company, <a href="http://www.pharos-ei.com/">Pharos</a>, we're about to
launch a new product which will contain sensitive data for multiple
firms in a single database. This is essentially a lightweight version
of our flagship product, which was built for a single client.</p>
<p>Of course, as a result, I had to refactor like crazy to get rid of the
implicit "one-firm" assumption that was built into the code and database
schemas.</p>
<p>The essential task was to add "firm_id" to each of the private table
schemas, and then make sure that all the code that accesses the model
specifies the firm in the query. Two access idioms were in wide use
(unsurprisingly):</p>
<pre><code>results = ClassName.find(:all, :conditions => [....])
</code></pre>
<p>and</p>
<pre><code>results = ClassName.find_by_entity_id_and_hour(...)
</code></pre>
<p>I was able to make minimal changes to the code by supporting the
following new idioms through a mixin (the mixin code is at the end of
the article):</p>
<pre><code>results = ClassName.find_in_firm_scope(firm_id, :all, :conditions => [....])
results = ClassName.with_firm_scope(firm_id) do |klass|
  klass.find_by_entity_id_and_hour(...)
end
</code></pre>
<p>(The second idiom I found easier to make (and the diff easier to read) than:</p>
<pre><code>ClassName.find_by_firm_id_and_entity_id_and_hour(firm_id, ...)
</code></pre>
<p>but really, that's a matter of taste.)</p>
<p>But I was still nervous. What if I missed an instance of a database
lookup that wasn't specifying firm, and as a result one client saw
another client's records? That would be a Really Bad Thing
<sup>TM</sup>, and I wanted to explicitly make sure that couldn't
happen. But how?</p>
<p>After a half hour of poking around and futzing, I came up with a
<code>find()</code>-and-friends implementation that will check <code>with_scope</code>
conditions as well as the <code>:conditions</code> parameter to the find() call:</p>
<pre><code>>> My::PrivateModel.find_by_entity_id(1)
RuntimeError: My::PrivateModel PrivateRecord find() did not specify firm_id
</code></pre>
<p>Without further ado, here's the mixin:</p>
<pre><code># lib/private_record.rb
module PrivateRecord
  def self.included(base)
    base.validates_presence_of :firm_id
    base.extend PrivateRecordClassExtendor
  end
end

module PrivateRecordClassExtendor
  # check every lookup for a firm_id before delegating to ActiveRecord
  def find_every(*args)
    check_for_firm_id(*args)
    super(*args)
  end

  # the DRY idiom here is: results = ClassName.with_firm_scope(firm) {|klass| klass.find(...) }
  # NOTE: firm is interpolated directly into the SQL, so pass a trusted integer id
  def with_firm_scope(firm, &block)
    with_scope(:find => {:conditions => "firm_id = #{firm}"}, :create => {:firm_id => firm}) do
      yield self
    end
  end

  def find_in_firm_scope(firm, *args)
    with_firm_scope(firm) do
      find(*args)
    end
  end

  private

  FIRM_ID_RE = /firm_id =/

  def check_for_firm_id(*args)
    ok = false
    # first, look for firm_id in the conditions of any enclosing with_scope
    if scoped_methods
      scoped_methods.each do |j|
        if j[:find] && j[:find][:conditions] && j[:find][:conditions] =~ FIRM_ID_RE
          ok = true
          break
        end
      end
    end
    # failing that, look for firm_id in the :conditions passed to find()
    if !ok
      args.each do |j|
        if j.is_a?(Hash) && j[:conditions]
          if (j[:conditions].is_a?(String) && j[:conditions] =~ FIRM_ID_RE) \
             or (j[:conditions].is_a?(Hash) && j[:conditions][:firm_id])
            ok = true
            break
          end
        end
      end
    end
    raise "#{self} PrivateRecord find() did not specify firm_id" if !ok
  end
end
</code></pre>
<p>The magic is all in the <code>check_for_firm_id()</code> method. To use this, simply:</p>
<pre><code>include PrivateRecord
</code></pre>
<p>and go to town.</p>
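<p>If you want to poke at the logic without booting Rails, the argument-checking half of <code>check_for_firm_id()</code> can be exercised on its own. This is a simplified sketch, not the mixin itself: <code>specifies_firm?</code> is a hypothetical name, and it ignores <code>with_scope</code> entirely.</p>

```ruby
FIRM_ID_RE = /firm_id =/

# Hypothetical standalone helper: true if any find()-style argument
# carries :conditions that mention firm_id.
def specifies_firm?(*args)
  args.any? do |arg|
    next false unless arg.is_a?(Hash) && arg[:conditions]
    conditions = arg[:conditions]
    (conditions.is_a?(String) && conditions =~ FIRM_ID_RE) ||
      (conditions.is_a?(Hash) && conditions[:firm_id])
  end
end

p specifies_firm?(:all, :conditions => "firm_id = 1 AND entity_id = 2")
p specifies_firm?(:all, :conditions => { :firm_id => 1 })
p specifies_firm?(:first, :conditions => "entity_id = 1")
p specifies_firm?(1)
```

<p>Note that, like the mixin, this only recognizes String and Hash conditions. Array-style conditions such as <code>["firm_id = ?", 1]</code> will not pass the check.</p>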
<p>Oh, and lest ye be skeptical, here are the test cases:</p>
<pre><code>require File.dirname(__FILE__) + '/../test_helper'

class PrivateModelTest < ActiveSupport::TestCase
  fixtures :isone_da_schedules

  def test_privaterecord_disallow_find_requirement
    assert_raises(RuntimeError) { My::PrivateModel.find(1) }
    assert_raises(RuntimeError) { My::PrivateModel.find_by_entity_id(1) }
    assert_raises(RuntimeError) { My::PrivateModel.find_all_by_entity_id(1) }
    assert_raises(RuntimeError) { My::PrivateModel.find(:all, :conditions => 'entity_id = 1') }
    assert_raises(RuntimeError) { My::PrivateModel.find(:first, :conditions => 'entity_id = 1') }
  end

  def test_privaterecord_allow_find_requirement
    assert_nothing_thrown { My::PrivateModel.find_in_firm_scope(1, 1) }
    assert_nothing_thrown { My::PrivateModel.with_firm_scope(1) {|k| k.find_by_entity_id(1) } }
    assert_nothing_thrown { My::PrivateModel.with_firm_scope(1) {|k| k.find_all_by_entity_id(1) } }
    assert_nothing_thrown { My::PrivateModel.find_in_firm_scope(1, :all, :conditions => 'entity_id = 0') }
    assert_nothing_thrown { My::PrivateModel.find_in_firm_scope(1, :first, :conditions => 'entity_id = 0') }
    assert_nothing_thrown { My::PrivateModel.find_by_firm_id_and_entity_id(1, 1) }
  end
end
</code></pre>
<p>Let me know in the comments if you found this at all useful! Keep coding.</p>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com1tag:blogger.com,1999:blog-7609658525941194126.post-53643560491288236012008-08-24T18:23:00.000-04:002008-08-27T12:08:46.981-04:00Freezing Deep Ruby Data Structures<p>On one of my current ruby projects, I'm reading in a YML file and
using the generated data structure as a hackish set of global
configuration settings:</p>
<pre><code>firm_1:
  departments:
    sales: 419
    executive: 999
    IT: 232
  locations:
    NY: 19
    WV: 27
    CA: 102
firm_2:
  ...
</code></pre>
<p>Because these should be treated as constants, they should not be overwritten (accidentally, of course).
I wanted to go ahead and freeze them:</p>
<pre><code>global_conf = YAML.load_file("...")
global_conf.freeze
global_conf['firm_1'] = {'foo' => 'bar'}
=> TypeError: can't modify frozen hash
</code></pre>
<p>But, as you probably know, Ruby's <code>freeze</code> doesn't affect the objects in a container.</p>
<pre><code>global_conf['firm_1']['departments'] = {'foo' => 'bar'}
=> {"foo"=>"bar"}
</code></pre>
<p>That's bad.</p>
<p>So I hacked up a quick monkeypatch (or whatever the duck punchers call it these days) to recursively freeze containers:</p>
<pre><code>#
# allow us to freeze deep data structures by recursively freezing each nested object
#
class Hash
  def deep_freeze # har, har, har
    each { |k,v| v.deep_freeze if v.respond_to? :deep_freeze }
    freeze
  end
end

class Array
  def deep_freeze
    each { |j| j.deep_freeze if j.respond_to? :deep_freeze }
    freeze
  end
end
</code></pre>
<p>After loading these patches, calling <code>deep_freeze</code> does what we want:</p>
<pre><code>global_conf = YAML.load_file("...")
global_conf.deep_freeze
global_conf['firm_1']['departments'] = {'foo' => 'bar'}
=> TypeError: can't modify frozen hash
</code></pre>
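<p>If you'd rather not reopen <code>Hash</code> and <code>Array</code>, the same idea fits in a standalone method. This is a sketch under a few assumptions: the <code>deep_freeze</code> helper below is hypothetical (it is not the monkeypatch above), it also freezes leaf values like strings, and the inline YAML stands in for the real config file.</p>

```ruby
require "yaml"

# Hypothetical helper: recursively freeze hashes, arrays and their
# leaf values without touching the core classes.
def deep_freeze(obj)
  case obj
  when Hash
    obj.each { |key, value| deep_freeze(key); deep_freeze(value) }
  when Array
    obj.each { |item| deep_freeze(item) }
  end
  obj.freeze
end

# Inline stand-in for the YAML config file.
yaml = <<~YML
  firm_1:
    name: Firm One
    departments:
      sales: 419
YML

conf = deep_freeze(YAML.load(yaml))

begin
  conf["firm_1"]["departments"]["sales"] = 0
rescue => e
  # FrozenError on current Rubies; older versions raise a different class
  puts "#{e.class}: #{e.message}"
end
```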
<p>Nice!</p>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com3tag:blogger.com,1999:blog-7609658525941194126.post-75791135203396287732008-05-22T15:23:00.000-04:002008-05-22T15:45:56.984-04:00Flash and the Firefox Reframe Problem<p>So I spent the last two days trying to figure out why Firefox insists
on reloading flash content whenever I flip around in my
tasty javascript-y tabbed interface.</p>
<p>You haven't seen this? I'm not surprised, it really only occurs if
you're embedding flash into a web page whose layout is being managed by
javascript. Some examples of UI libraries like this are
<a href="http://script.aculo.us/">Scriptaculous</a>, <a href="http://extjs.com/">Ext-JS</a>,
and my latest BSO, <a href="http://jquery.com/">jQuery</a>.</p>
<p>All of these libraries modify the style of the flash object's parent
<code><div></code> in ways (usually, <code>display:none</code>, but <code>position:absolute</code> will
do it, too) that somehow goad Firefox into helpfully reloading the
Flash from scratch. Reportedly it's not just swfobjects -- any generic
<code><object></code> or <code><embed></code>, including Java applets, will get reloaded.</p>
<p>For flash charting components (we're playing with
<a href="http://amcharts.com">amCharts</a> at my company,
<a href="http://www.pharos-ei.com">Pharos</a>), this problem is multiplied by the
fact that the flash application will re-download whatever historical
data you're trying to present, delaying the presentation and using up
more bandwidth. (Hey, how's your
<a href="http://intertwingly.net/blog/2006/06/05/Elevator-Pitch">ETag</a> support
looking?)</p>
<p>It actually took me about two hours to find what the root problem is,
and you're not going to believe it:</p>
<blockquote>
<p><a href="https://bugzilla.mozilla.org/show_bug.cgi?id=90268">https://bugzilla.mozilla.org/show_bug.cgi?id=90268</a></p>
</blockquote>
<p>This bug has been open since <a href="http://en.wikipedia.org/wiki/July_2001">July
2001</a>! That's <a href="http://www.mozilla.com/en-US/firefox/releases/0.9.html">Firefox
0.9</a>! Holy
cripey!</p>
<p>Worse, it's still not fixed, even in the brand-spanking-new <a href="http://www.mozilla.com/en-US/firefox/all-rc.html">Firefox
3</a>.</p>
<p>The good news is, there's a relatively easy way to get around this, if
your JS library is using CSS for hiding elements. What I mean by that
is, the javascript code is hiding elements by adding a class to them
(in Ext-JS, this class name defaults to <code>.x-hide-display</code>), and is not
setting <code>display:none</code> directly on your DOM elements. (You'll probably
need to look at the implementation of <code>hide()</code> and <code>show()</code> for your
specific library to know for sure.)</p>
<p>So if hiding DOM elements is done via style classes, the low-hanging
fruit is to redefine the CSS rule to look like this:</p>
<pre><code>.x-hide-display {
  display:block!important; /* overrides the display:none in the original rule */
  height:0!important;
  width:0!important;
  border:none!important;
  visibility:hidden!important;
}
</code></pre>
<p>(this is exactly what I did to make my flash charts work in Ext-JS).</p>
<p>You can probably override the <code>hide()</code> and <code>show()</code> functions in your
particular library to do something like this, as well. YMMV.</p>
<p>Now, you're saying to yourself, "Dude, you must be breaking something
else that used to depend on the <code>display:none</code> behavior." Well, you're
probably right, but I haven't found it yet. If you know, or if you
find out, let me know in the comments.</p>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com5tag:blogger.com,1999:blog-7609658525941194126.post-75347882573011240732008-05-15T17:07:00.000-04:002008-05-16T15:48:25.842-04:00jQuery UI and Closable Tabs<p>
So last week I decided (at my company, <a href="http://www.pharos-ei.com/">Pharos</a>) to dump <a href="http://extjs.com/">Ext-JS</a> in favor of <a href="http://jquery.com/">jQuery</a>.
</p><p>
The short version is that Ext-JS is hard to style with CSS, plus I was getting odd sizing of objects in my (pretty complicated) layouts that I just couldn't figure out. (The longer version has to do with how easy (hard) it is to write and find contributed extensions.)
</p><p>
Anyway, I'm getting off-topic. <a href="http://www.google.com/search?q=jquery+rocks">jQuery rocks</a>. And the <a href="http://ui.jquery.com/">jQuery-UI</a> project is really coming along, in terms of functionality. They're pushing hard to get the 1.5 release candidate out the door.
</p><p>
There are some missing pieces, though, as in any young GUI project. But because I'm betting on jQuery, I'm willing to work to make it do what I want. Until today, this meant contributing some (very) minor bugfixes.
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://3.bp.blogspot.com/_Ve7bb1LcoGY/SCyo4FJ81pI/AAAAAAAAAQg/n4RX06bIUvY/s1600-h/jquery.ui.tabs.closable-all.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="http://3.bp.blogspot.com/_Ve7bb1LcoGY/SCyo4FJ81pI/AAAAAAAAAQg/n4RX06bIUvY/s320/jquery.ui.tabs.closable-all.png" alt="" id="BLOGGER_PHOTO_ID_5200717351116134034" border="0" /></a>
</p><p>
But this afternoon, I implemented closable tabs. Check out <a href="http://dev.jquery.com/ticket/2470">jQuery trac 2470</a> for the patch and working examples (including CSS).
</p><p>
Here's a couple of screenshots showing the closable tabs in 'all' mode, and 'selected' mode. And you can play around with it on the <b><a href='http://pharos-ei.com/mike/jquery/examples/index.html'>demo page!</a></b>
</p><p>
General description:
<ul>
<li>A clickable "button" (really an A tag) appears on the tab. When the button is clicked, the tab is removed.
<li> LI tags are dynamically modified to contain a second A tag:
<pre>
<a onclick="return false;"><span>#{text}</span></a>
</pre>
<li> The #{text} snippet will be replaced by the configuration option closeText (which is '(x)' by default), and the snippet itself can be set via the configuration option closeTemplate.
</ul>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_Ve7bb1LcoGY/SCypCVJ81qI/AAAAAAAAAQo/NEBWJC5geGI/s1600-h/jquery.ui.tabs.closable-selected.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="http://4.bp.blogspot.com/_Ve7bb1LcoGY/SCypCVJ81qI/AAAAAAAAAQo/NEBWJC5geGI/s320/jquery.ui.tabs.closable-selected.png" alt="" id="BLOGGER_PHOTO_ID_5200717527209793186" border="0" /></a>
<p>
Some specifics:
<ul>
<li>New creation option closable can be set to false, 'all' or 'selected'
<ul>
<li> default is false, meaning no closable tabs.
<li> 'all' means all tabs are closable.
<li> 'selected' means only the selected tab is closable.
</ul>
<li> New creation options closeTemplate and closeText allow overriding default markup.
<li> When a tab is closable, a second A is dynamically added to the tab LI after the normal tab anchor
<ul>
<li> this tag is only added to the DOM if options.closable is non-false
<li> this tag is hidden in unselected tabs if options.closable is 'selected'
</ul>
<li> CSS / styles
<ul>
<li> Note that this patch is backwards-compatible with CSS as long as the closable option is not turned on.
<li> Close-button tag has class ui-tabs-close
<li> However, existing CSS will probably need to be modified to support the new close button.
<li> A new class, ui-tabs-tab is associated with the normal A to allow differentiation for themes/styles.
<li> see examples.tar.gz for example CSS support
</ul>
</ul>
So, if you find the code useful, let me know! It's attached to <a href="http://dev.jquery.com/ticket/2470">jQuery trac 2470</a>, along with the sample CSS and code in the snapshot. And don't forget to test drive it at the <b><a href='http://pharos-ei.com/mike/jquery/examples/index.html'>demo page!</a></b>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com4tag:blogger.com,1999:blog-7609658525941194126.post-83424971381348671762008-05-06T12:36:00.001-04:002009-09-14T12:33:24.322-04:00Managing Git Submodules With git.rake<p><em>Update 2008-05-21: Tim Dysinger and Pat Maddox pointed out that git submodules are inherently not well-suited for frequently updated projects. <strong>Read the comments for more details</strong>, and please use submodules with caution on projects where you can't guarantee a shared repository has not changed between 'pull' and 'push' operation.</em>
<p>Today I'm releasing git.rake into the wild under an open-source license. It's a rakefile for managing multiple git submodules in a shared-server development environment.</p>
<p>We've been using it internally at my company, <a href='http://www.pharos-ei.com'>Pharos Enterprise Intelligence</a>, for the last 5 months and it's been a huge timesaver for us. Read below for a detailed description of the features and its use.</p>
<p>The code is being released under the MIT license and the git repository is being hosted on <a href='http://github.com'>github</a>. Take a look:</p>
<blockquote>
<a href='http://github.com/flavorjones/git-rake'>http://github.com/flavorjones/git-rake</a>
</blockquote>
<h3>What git.rake Is</h3>
<p>A set of rake tasks that will:</p>
<ul>
<li><p>Keep your superproject in synch with multiple submodules, and vice
versa. This includes branching, merging, pushing and pulling to/from a
shared server, and committing. (Biff!)</p></li>
<li><p>Keep a description of all changes made to submodules in the commit
log of the superproject. (Bam!)</p></li>
<li><p>Display the status of each submodule and the superproject in an
easily-scannable representation, suppressing what you don't want or
need to see. (Pow!)</p></li>
<li><p>Execute arbitrary commands in each repository (submodule and
superproject), terminating execution if something fails. (Whamm!)</p></li>
<li><p>Configure a rails project for use with git. (Although, you've seen
that elsewhere and are justifiably unimpressed.)</p></li>
</ul>
<h3>Prerequisites</h3>
<p>If you're not sure how to add a submodule to your repo, or you're not
sure what a submodule is, take a quick trip over to <a href="http://git.or.cz/gitwiki/GitSubmoduleTutorial">the Git Submodule
Tutorial</a>, and then
come back. In fact, even if you ARE familiar with submodules, it's
probably worth reviewing.</p>
<h3>The Problem We're Trying to Solve Here</h3>
<p>Let's start with stating our basic assumptions:</p>
<ol>
<li>you're using a shared repository (like github)</li>
<li>you're actively developing in one or more submodules</li>
</ol>
<p>This model of development can get very tedious very quickly if you
don't have the right tools, because every time you decide to
"checkpoint" and commit your code (either locally or up to the shared
server), you have to:</p>
<ul>
<li>iterate through your submodules, doing things like:
<ul>
<li>making sure you're on the right branch,</li>
<li>making sure you've pulled changes down from the server,</li>
<li>making sure that you've committed your changes,</li>
<li>and pushed all your commits</li>
</ul></li>
<li>and then making sure that your superproject's references to the
submodules have also been committed and pushed.</li>
</ul>
<p>If you do this a few times, you'll see that it's tedious and
error-prone. You could mistakenly push a version of the superproject
that refers to a <em>local</em> commit of a submodule. When people try to
pull that down from the server, all hell will break loose because that
commit won't exist for them.</p>
<p>Ugh! This is monkey work. Let's automate it.</p>
<h3>Simple Solution</h3>
<p>OK, fixing this issue sounds easy. All we have to do is:</p>
<ul>
<li>develop some primitives for iterating over the submodules (and
optionally the superproject),</li>
<li>and then throw some actual functionality on top for sanity checking, pulling,
pushing and committing.</li>
</ul>
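<p>As a rough sketch of what such primitives might look like (this is not the actual git.rake code; <code>submodule_paths</code> and <code>for_each_repo</code> are hypothetical names, and it assumes the submodules are listed in a standard <code>.gitmodules</code> file):</p>

```ruby
# Hypothetical helper: pull the submodule paths out of the text of a .gitmodules file.
def submodule_paths(gitmodules_text)
  gitmodules_text.scan(/^\s*path\s*=\s*(.+?)\s*$/).flatten
end

# Hypothetical helper: run a command in each directory, bailing out at the
# first failure. system() returns false on a non-zero exit status and nil
# if the command could not be run at all.
def for_each_repo(dirs, *command)
  dirs.each do |dir|
    ok = Dir.chdir(dir) { system(*command) }
    raise "#{command.join(' ')} failed in #{dir}" unless ok
  end
end

# Made-up .gitmodules content for illustration.
gitmodules = <<~CFG
  [submodule "vendor/plugins/core"]
    path = vendor/plugins/core
    url = git://example.com/core.git
  [submodule "vendor/plugins/pharos_library"]
    path = vendor/plugins/pharos_library
    url = git://example.com/pharos_library.git
CFG

puts submodule_paths(gitmodules)
```

<p>From the root of a superproject, something like <code>for_each_repo(submodule_paths(File.read(".gitmodules")), "git", "status")</code> would then visit each submodule in turn and stop at the first one where the command fails.</p>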
<h3>The Tasks</h3>
<p>git-rake presents a set of tasks for dealing with the submodules:</p>
<pre><code> git:sub:commit # git commit for submodules
git:sub:diff # git diff for submodules
git:sub:for_each # Execute a command in the root directory of each submodule.\
Requires CMD='command' environment variable.
git:sub:pull # git pull for submodules
git:sub:push # git push for submodules
git:sub:status # git status for submodules
</code></pre>
<p>And the corresponding tasks that run for the submodules PLUS the superproject:</p>
<pre><code> git:commit # git commit for superproject and submodules
git:diff # git diff for superproject and submodules
git:for_each # Run command in all submodules and superproject. \
Requires CMD='command' environment variable.
git:pull # git pull for superproject and submodules
git:push # git push for superproject and submodules
git:status # git status for superproject and submodules
</code></pre>
<p>It's worth noting here that most of these tasks do pretty much just
what they advertise, in some cases less, and certainly nothing more
(well, maybe a sanity check or two, but no destructive actions).</p>
<p>The exception is <code>git:commit</code>, which depends on <code>git:update</code>, and that has
some pixie dust in it. More on this below.</p>
<p>That leaves only the following specialty tasks to be explained:</p>
<pre><code> git:configure # Configure Rails for git
git:update # Update superproject with current submodules
</code></pre>
<p>The first is simple: configuration of a rails project for use with
git.</p>
<p>The other, <code>git:update</code>, does two powerful things:</p>
<ol>
<li><p>(Only if on branch 'master') Submodules are pushed to the shared
server. This guarantees that the superproject will not have any
references to local-only submodule commits.</p></li>
<li><p>For each submodule, retrieve the git-log for all uncommitted (in
the superproject) revisions, and jam them into a superproject commit
message.</p></li>
</ol>
<p>Here's an example of such a superproject commit message:</p>
<pre><code> commit 17272d53c298bd6a8ccee6528e0bc0d62104c268
Author: Mike Dalessio <mike@csa.net>
Date: Mon May 5 20:48:13 2008 -0400
updating to latest vendor/plugins/pharos_library
> commit f4dbbce6177de4b561aa8388f3fa9f7bf015fa0b
> Author: Mike Dalessio <mike@csa.net>
> Date: Mon May 5 20:47:46 2008 -0400
>
> git:for_each now exits if any of the subcommands fails.
>
> commit 6f15dee8c52ced20c98eef63b3f3fd1c29d91bbf
> Author: Mike Dalessio <mike@csa.net>
> Date: Fri May 2 13:58:17 2008 -0400
>
> think i've got the tempfile handling correct now. awkward, but right.
>
</code></pre>
<p>Excellent! Not only did <code>git:update</code> automatically generate a useful log
message for me (indicating that we're updating to the latest submodule
version), but it's also <strong>embedding original commit logs</strong> for all the
changes included in that commit! That makes it much easier to find a
specific submodule commit in the superproject commit log.</p>
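<p>The message format shown above is easy to approximate. Here's a minimal sketch, assuming you already have the submodule's log text in hand (say, from running <code>git log</code> over the revisions the superproject hasn't recorded yet). <code>superproject_commit_message</code> is a hypothetical name, not the function git.rake actually uses, and the blank line after the subject is an assumption.</p>

```ruby
# Hypothetical helper: build a superproject commit message that quotes
# the submodule's log, one "> " prefix per line.
def superproject_commit_message(submodule_path, submodule_log)
  quoted = submodule_log.lines.map { |line| "> #{line.chomp}".rstrip }
  (["updating to latest #{submodule_path}", ""] + quoted).join("\n") + "\n"
end

# Log text borrowed from the example above.
log = <<~LOG
  commit f4dbbce6177de4b561aa8388f3fa9f7bf015fa0b
  Author: Mike Dalessio <mike@csa.net>
  Date:   Mon May 5 20:47:46 2008 -0400

      git:for_each now exits if any of the subcommands fails.
LOG

puts superproject_commit_message("vendor/plugins/pharos_library", log)
```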
<h3>A Note on Branching and Merging</h3>
<p>Note that there are no tasks for handling branching and merging. This
is intentional! It could be very dangerous to try to read your mind
about actions on branches, and frankly, I'm just not up to it today.</p>
<p>For example, let's say I invented a task to copy the current branch
<code>master</code> to a new branch <code>foo</code> (the equivalent of <code>git checkout -b foo
master</code>) in all submodules, but one of the submodules already has a
branch named <code>foo</code>!</p>
<p>Do we reduce this action to a simple <code>git checkout foo</code> for that
submodule? That could yield unexpected results if we a) forgot we had
a branch named <code>foo</code> and b) that branch is very different from the
<code>master</code> we expected to copy.</p>
<p>Well, then -- we can delete (or rename) the existing <code>foo</code> branch and
follow that up by copying <code>master</code> to <code>foo</code>. But then we're silently
renaming (or deleting) branches that a) could be upstream on the
shared server or b) we intended to keep around, but forgot to
git-stash.</p>
<p>In any case, my point is that it can get complicated, and so I'm
punting. If you want to copy branches or do simple checkouts, you
should use the <code>git:for_each</code> command.</p>
<h3>Everyday Use of git.rake</h3>
<p>In my day job, I've taken the vendor-everything approach and
refactored lots of common code (across clients) into plugins, which
are each a git submodule. My current project has 14 submodules, of
which I am actively coding in probably 5 to 7 at any one time. (Plenty
of motivation for creating git.rake right there.)</p>
<p>Let's say I've hacked for an hour or two and am ready to commit to
my local repository. Let's first take a look at what's changed:</p>
<pre><code> $ rake git:status
All repositories are on branch 'master'
/home/mike/git-repos/demo1/vendor/plugins/core: master, changes need to be committed
# modified: app/models/user_mailer.rb
# public/images/mail_alert.png (may need to be 'git add'ed)
WARNING: vendor/plugins/core needs to be pushed to remote origin
/home/mike/git-repos/demo1/vendor/plugins/pharos_library: master, changes need to be committed
# deleted: tasks/rake/git.rake
</code></pre>
<p>You'll notice first of all that, despite having 14 submodules, I'm
only seeing output for the ones that need commits, and even that
output is minimal, listing only the specific files and not all the
cruft in the original message. It tells me that all submodules are on
the same branch. It's smart enough to tell me that a file may need to
be git-added. It will even alert me when a repo needs to be pushed to
the origin.</p>
<p>I'll have to manually <code>cd</code> to the submodule and git-add that one
file, but once that's done, I can commit my changes by running:</p>
<pre><code> $ rake git:commit
</code></pre>
<p>which will run <code>git commit -a -v</code> for each submodule, fire up the
editor for commit messages along the way, push each submodule to the
shared server, and then automagically create verbose commit logs for
the superproject.</p>
<p>To pull changes from the shared server:</p>
<pre><code> $ rake git:pull
</code></pre>
<p>When you run this command, you'll notice that the output is filtered,
so if no changes were pulled, you'll see no output. Silence is golden.</p>
<p>To push?</p>
<pre><code> $ rake git:push
</code></pre>
<p>Not only will this be silent if there's nothing to push, but the rake
task is smart enough to not even attempt to push to the server if
master is no different from origin/master. So it's silent and fast.</p>
<p>Let's say I want to copy the current branch, <code>master</code>, to a new
branch, <code>working</code>.</p>
<pre><code> $ rake git:for_each CMD='git checkout -b working master'
</code></pre>
<p>If the command fails for any submodules, the rake task will terminate
immediately.</p>
<p>Merging changes from 'working' back into 'master' for every submodule
(and the superproject)?</p>
<pre><code> $ rake git:for_each CMD='git checkout master'
$ rake git:for_each CMD='git merge working'
</code></pre>
<h3>What git.rake Doesn't Do</h3>
<p>A couple of things that come quickly to mind that git.rake should
probably do:</p>
<ul>
<li><p>Push to the shared server for ANY branch that we're tracking from a
remote branch.</p></li>
<li><p>Be more intelligent about when we push to the server. Right now, the
code pushes submodules to the shared server every time we want to
commit the superproject. We might be able to get away with only
pushing the submodules when we push the superproject.</p></li>
<li><p>Parsing the output from various 'git' commands is prone to breakage
if the git crew starts modifying some of the strings.</p></li>
<li><p>There should probably be some unit/functional tests. See previous
item.</p></li>
</ul>
<p>Anyway, the code is all up on github. Go hack it, and send back patches!</p>Mikehttp://www.blogger.com/profile/01750555903559042849noreply@blogger.com15tag:blogger.com,1999:blog-7609658525941194126.post-51547933268318089782008-04-21T16:08:00.000-04:002008-05-06T13:23:23.300-04:00JS development on IE is busted.<p>My git commit for this afternoon, following 3 hours of debugging and work, contained the following description:</p>
<blockquote>IE7 fixes. DAMN that browser is busted.</blockquote>
<p>Look, I'm not going to go off on a rant, but there are lots of things that can be done to make debugging Javascript in the browser easier, and Microsoft (and the Windows community) has done exactly none of them.</p>
<b>1. Javascript console</b>
<p>Hello? I'd like to see what the error is, and where it's happening. By default, all that IE gives you is the Gray Box of Doom that tells you the problem is on line 24696, but won't tell you <i>which file</i> it's referring to.</p>
<a onblur="try {parent.deselectBloggerImageGracefully();} catch(e) {}" href="http://4.bp.blogspot.com/_Ve7bb1LcoGY/SAz20U0stVI/AAAAAAAAAPg/XsPlm6vQUGc/s1600-h/gray-box-of-doom.png"><img style="margin: 0pt 0pt 10px 10px; float: right; cursor: pointer;" src="http://4.bp.blogspot.com/_Ve7bb1LcoGY/SAz20U0stVI/AAAAAAAAAPg/XsPlm6vQUGc/s320/gray-box-of-doom.png" alt="" id="BLOGGER_PHOTO_ID_5191795849254712658" border="0" /></a>
<p>A quick Google query for <a href="http://www.google.com/search?q=ie7+javascript+console">IE7 javascript console</a> does a good job at showing the general level of pain about this out there.</p>
<p>Firefox has a basic Javascript console built in. Open source 1, Microsoft 0.</p>
<b>2. Javascript debugger</b>
<p><a href="http://www.microsoft.com/downloads/details.aspx?FamilyID=2f465be0-94fd-4569-b3c4-dffdf19ccd99&displaylang=en">Microsoft Script Debugger</a> is the only standalone tool available, and it's no longer supported by MS. The other options require installation of either Front Page or Visual Studio. Puh-lease.</p>
<p><a href="http://www.getfirebug.com/">Firebug</a> is free for Firefox. Open source 2, Microsoft 0.</p>
<p>I did find a nice tool called <a href="http://www.debugbar.com/">DebugBar</a>, but it's only available freely for personal use. Even when I test-drove it, though, most functionality doesn't work properly for dynamically-created DOM elements. So, anything you've created or updated via AJAX calls is not going to be debuggable by DebugBar. Lose! This is basically everything that excellent javascript frameworks and libraries like <a href="http://extjs.com/">ExtJS</a>, <a href="http://dojotoolkit.org/">Dojo</a> and <a href="http://script.aculo.us/">Scriptaculous</a> have been working towards for years.</p>
<b>3. Basic EcmaScript extensions</b>
<p>Array.forEach() doesn't work? That's been around since <a href="http://developer.mozilla.org/en/docs/Core_JavaScript_1.5_Reference:Objects:Array:forEach">JavaScript 1.6!</a> That's right, IE7 still doesn't implement any of the crafty Array iterator methods.</p>
<p>I'm just going to point you at <a href="http://erik.eae.net/archives/2006/04/26/23.23.02/">this terrific blog entry</a> detailing the changelist for Microsoft Javascript support since 2001. (Hint: the changelist is empty.)</p>
<p>Got that? In seven years, IE has not improved its Javascript support <i>one whit</i>. Where were you in 2001?</p>
<p>Just. Effing. Boggling.</p>
<p>At the end of all that, which quite frankly made me more dumb than when I started, I found myself asking the question: "Can I get away without supporting IE in my product?"</p>
<p>The realistic answer is obvious, but doesn't the fact that I'm asking the question in the first place tell you that something is seriously busticated?</p>
<p>The details behind the Pit of Despair Known As Internet Explorer have been covered in <a href="http://alex.dojotoolkit.org/?p=536">way more detail</a> (and by more knowledgeable people) than I can hope to do. I'm just adding my voice to the chorus of "WTF?"s that are already out there.</p>
<p>If anyone knows of better tools for debugging a rich javascript application on IE7, puh-lease let me know.</p>
<h2>(Re-) Starting Up</h2>
<p>Posted by Mike on 2008-04-17.</p>
<p>Hey there. It's been a while. Sorry about that. Thankfully, the interweb (that's you!) hasn't gone anywhere.</p>
<p>I've recently started up my own software company with a close friend from college, so I figured I'd resurrect my blog to record for posterity how the startup is going. We're doing some interesting software development (by some people's standards, anyway), so there'll be some articles in that vein, as well as anecdotes about running the business, deep thoughts on existential topics like "What kind of monkey is best?" (answer: they're <em>all</em> the best) and just plain old me-being-me (my Mom tells me I'm funny).</p>
<p>My company is <a href="http://www.pharos-ei.com/">Pharos Enterprise Intelligence</a>, and just this week our alpha test site (invitation only) went live with <a href="http://engineyard.com/">Engine Yard</a>. I'll talk more about the product, the technology and our business model in later posts. You'll just have to wait.</p>
<p>And to top it off, our public-facing <a href="http://www.pharos-ei.com/">intarnets site</a> went live this week. If you really loved me, you'd subscribe to our <a href="http://www.pharos-ei.com/?q=rss.xml">RSS feed</a>.</p>
<h2>Evidence of Things Not Seen</h2>
<p>Posted by Mike on 2006-10-31.</p>
<p>OK, so earlier today I was complaining to my friend Jordan about the <a href="http://www.amazon.com/West-Wing-Complete-Sixth-Season/dp/B000EGEJI4/sr=8-3/qid=1163719069/ref=pd_bbs_sr_3/102-0983709-0892908?ie=UTF8&s=dvd">"West Wing: Sixth Season" DVD set.</a> (Yes, the same Jordan who promised years ago he'd run the Marathon with me, but has since completely reneged on the deal.)</p>
<p>I was hoping to be able to get through season 6 during my treadmill workouts. Without commercials, each episode runs about 42 minutes, which is about how long my workouts are taking.</p>
<p>I got through some of season 5 during my previous brief stint of jogging (in March). Because my treadmill is a little loud, I turned on subtitles so as to not miss any of the dialogue.</p>
<p>So here's the problem with the season 6 DVDs: no English subtitles.</p>
<p>At first, you might think, "So what?" Well, let's stop and consider a couple of things.</p>
<p>1) They talk fast in the West Wing.</p>
<p>Ok, so Toby would roll his eyes and correct me (he talks <span style="font-style: italic;">quickly</span>), but you know what I mean. Aaron Sorkin practically invented the walk-and-talk, and sometimes you just miss something. Subtitles help with that. A lot. (And yes, fanboy, I know Sorkin didn't write season 6.)</p>
<p>2) They talk about obscure aspects of government.</p>
<p>Sorkin is a policy wonk. So are the current writers for the show. Sometimes you miss what the hell they're talking about <span style="font-style: italic;">now </span>because you're still trying to absorb some fact from 15 seconds ago. Like why the Secretary of Defense is trying to torpedo a uranium transfer from the Republic of Georgia for budgetary reasons. Subtitles help with that, too.</p>
<p>3) The plot and character development are ENTIRELY dialogue-driven.</p>
<p>In the West Wing, we hardly ever get to see anything <span style="font-style: italic;">actually happen</span>. We usually hear third-hand about what happened somewhere far from the White House. Sometimes we hear the CIA director talk about how it happened. Then some guys from Foggy Bottom talk some more about what to do about it. Sometimes people talk loudly. But mostly they talk quickly. While walking around. About obscure aspects of government.</p>
<p>4) One of the recurring characters is played by an Oscar-winning actress <a href="http://www.imdb.com/name/nm0559144/">who also happens to be deaf.</a></p>
<p>Savor the irony.</p>
<p>What logic would lead someone to include subtitles for French and Spanish, but not English, on a North American release? Boggling, just boggling. And it's interfering with my running.</p>
<p>Well, not so much "interfering" as "making a little less fun." But as Josh would say, that's an important ... thing.</p>
<h2>Day Zero</h2>
<p>Posted by Mike on 2006-10-30.</p>
<p>I'm totally out of shape, and now I have given myself exactly two years to pull it all together so I can run the 2008 New York City Marathon. Yes, it's only 734 days until the marathon, and I'm going to record how I'm doing right here, for the next two years.</p>
<p>You'll see at least a couple of running themes in this diary:</p>
<p><span style="font-weight: bold;">1) Handling the time commitment</span></p>
<p>Training for a marathon isn't something to be taken lightly. It will be an enormous time commitment, which means I'll have less time to spend with my wife. For some people that may be a side bonus -- but I really enjoy being with my sweetie, so I consider this a sacrifice.</p>
<p>On the other hand, if I stay healthy, I'll live longer, which will mean <span style="font-style: italic;">more </span>time with my wife in the long run. I don't want to be penny-wise and pound-foolish. So I'm going to be keeping track of how much time I spend exercising. I'm really hoping to be surprised when I look at time investment vs. immediate health benefits. Which brings me to ...</p>
<p><span style="font-weight: bold;">2) Physical health</span></p>
<p>Physical health encompasses a lot of factors. Weight loss and body fat are the most obvious -- I'll be tracking these daily, as measured by my cheap-o bathroom scale. I'll also be tracking my resting pulse rate and blood pressure (when it gets checked, which will be very sporadically).</p>
<p>I fully expect to get injured at some point. The constant pounding of a daily run is bound to cause problems. The trick will be to manage my training schedule to minimize the risk of injury, and to allow myself to heal when I get injured. In the beginning, this is going to mean slowly ramping up my mileage and intensity to allow my body to adjust to the beating. I'll be reading up on injury prevention, and hopefully will have something more intelligent to say on this soon.</p>
<p><span style="font-weight: bold;">3) Races</span></p>
<p>In order to qualify for the NYC Marathon, you have to run in a certain number of races organized by the New York Road Runners' Club. I'll be writing about my training leading up to each race, race-day routine, race strategy, how I finished and how I felt doing it.</p>
<p><span style="font-weight: bold;">4) Personal Notes (Yawn)</span></p>
<p>I'll also be trying to keep a log of miscellaneous facts. "How I'm Feeling" is one thing I'll be logging. For example, I've had mild back pain for the last few months. Will running affect this? I'll let you know.</p>
<p>I'll also be paying attention to such earth-shaking information as: when do I get hungry? How much am I eating? How much sleep am I getting? Do the workouts "feel" difficult or easy?</p>
<p>Prepare to be bored. Unless you're a grad student in physiology. Or a nutritionist.</p>
<h2>Marathon: Introduction</h2>
<p>Posted by Mike on 2006-10-30.</p>
<p>I blame it all on the Guinness.</p>
<p>Many years ago, when I was in my mid-twenties, living and working in the city, <a href="http://bloomberg.com/">hunched over my keyboard</a> for long hours every day and drinking uncountable pints of stout in <a href="http://newyork.citysearch.com/profile/7158623/">smoky bars</a> every night, when I was full of myself and knew no limits and conquered all obstacles, when irrational exuberance was the phrase of the day and <a href="http://web.archive.org/web/20001120123500/www.survivorsucks.com/haiku.html"> Survivor</a> was cutting-edge television entertainment, I rashly made a promise to myself: someday, I would complete the <a href="http://www.nycmarathon.org/">New York City Marathon</a>.</p>
<p>Yes, that was many years ago. Before I <a href="http://www.webmd.com/hw/muscle_problems/hw124403.asp">blew out my knee</a> and began showing grey in my beard, before I got married and moved to Jersey, before I gained 30 pounds and started eating too much cheese, this seemed like a really good idea. I'm not kidding. Sounds crazy, right? Like I said, I've chosen to blame it on the Guinness.</p>
<p>At some point, I realized that the probability of running the marathon was approaching zero unless I started some serious training. I hadn't done any prolonged physical exercise in years. Sure, there are lots of reasons to get healthy, mostly related to, um, not dying. But even that's tough to rationalize at 5am or at the end of a long day spent hunched over your keyboard. If only there were some way to get encouragement and support from someone going through the same thing ...</p>
<p>Then it hit me, like a ton of <a href="http://www.imdb.com/title/tt0137523/">Fight Club soap</a>: A support group! Super idea. I enlisted my friends in the effort immediately.</p>
<p>Or, at least I tried to. My buddy Jordan swore he would run it with me. Of course, that was back when he was in <span style="font-style: italic;">his</span> mid-twenties, so he can't really be held accountable for that decision.</p>
<p>(Jordan was also drinking Guinness at the time. If I had paid attention to that detail, you might be reading posts about the 2006 NYC Marathon, instead of the 2008 version. More on this later.)</p>
<p>Instead, I got older and rounder. But I remembered what I looked like, and felt like, in my younger days. Periodically I would resurrect the idea of running the Marathon with some of my friends. I was met with kind indulgence -- like I was the beauty queen on a reality adventure show: adorable, precocious, and no way in hell going to make it very far. It looked as if the Marathon might never happen.</p>
<p>And then, in a textbook case of serendipity, I stumbled upon the answer (which should have been obvious from the day Jordan made his drunken oath): I talked up the idea with my friends again, but this time I did it <span style="font-style: italic;">while we were on a pub crawl</span>.</p>
<p>Ah, the wonders of a pub crawl. In particular, we were at <a href="http://swiftbarnyc.com/">Swift's Hibernian Lounge</a> on the Lower East Side of Manhattan. I brought up the race, expecting to be dismissed again. But Lo and Behold! a small group agreed to take part. I was skeptical. But all my friends are thirty-something and therefore take responsibility for even their stupid decisions.</p>
<p>(Yes, I'll grant you: a pub crawl seems like a boorish and immature way to spend an afternoon. But we're polite, responsible people, I swear. I'll discuss the pub crawl in a later post, but for now, let's agree that it's no more or less boorish and immature than your video games, <a href="http://en.wikipedia.org/wiki/Make_Love,_Not_Warcraft"> dwarf warrior</a>, and we'll move on.)</p>
<p>Within three days, all five of us had joined the <a href="http://www.nyrr.org/">New York Road Runners' Club</a> with a goal of running the 2008 New York City Marathon. This was such an unbelievable chain of events that I decided the two-year run-up to the run-down must be recorded for posterity. Otherwise, who would believe it?</p>
<h2>Attention Yankee Fans: You Can't Be My Friend</h2>
<p>Posted by Mike on 2004-10-21.</p>
<p>True story:</p>
<p>This morning, I'm standing in line for my bagel when a Yankee Fan walks by, whistling. "You waiting in line?" I nod. He looks at my hat, the Boston "B" staring him in the face. "Sox Fan, huh? Nice job." And he winks at me, smiling.</p>
<p>What the ... ?</p>
<p>I have some serious moral objections to this "Good show, old chap!" public display of affection for the enemy. For years I watched the Yankees beat the Red Sox, in the most painful and inhumane ways devised by man, and I hated every minute of it. I had to listen to the chants of "1918" and "Boston Sucks!" every time I went to the Stadium. I had random people knock my Boston cap off my head on the street. Now, all of a sudden, the Yankee Fan is telling me, "Hey, I didn't mean it. Come here, let me give you a noogie. Let's be friends."</p>
<p>I repeat: What the ... ?</p>
<p>Now, when I say I hate the Yankees, I don't necessarily mean that I hate Jorge Posada (although I do). I really mean that I hate the Yankee Fan. So much so that I sometimes have trouble reconciling my hatred with the fact that some of my friends, who are otherwise completely decent human beings, inexplicably root for the Yankees.</p>
<p>Mixed with the hate, though, there was always fear and loathing. I was terrified to see Jeter come up to the plate in late innings. I had nightmares about Mo Rivera, complete with a Metallica soundtrack and everything. And I never, ever, bet against them.</p>
<p>But today it's a different story. This series was cathartic, for two reasons:</p>
<ol>
<li>the Sox beat the Yanks</li>
<li>they did it in the most painful and embarrassing way anybody has ever seen in baseball, ever. Ouch.</li>
</ol>
<p>So, cleansed, I walked out into the Great Big World this morning, fearing no team or fan, with the mystique of the Yankees lying in smoking ruins in the Bronx. Afraid no more, I'm now confident that the ghosts have been cast out and that the Sox may now commence a decade of dominance in the AL East. And that's a nice feeling.</p>
<p>But that fear of the opponent isn't gone. Fandom is a zero-sum game. Like an immutable law of thermodynamics, fan anxiety can't be destroyed, it can only be converted -- and I can see it on the face of every Yankee Fan on the train and on the street. The smile, the wink: it's a cover for the paralyzing fear that for their team, which was so good to them for so long, the beginning of the end has finally come.</p>
<p>It's like the Germans saying, "Hey, Allies, nice job there with that Normandy thing, you guys really deserved to win one," all the while calling frantically for another Panzer division and calculating routes of retreat.</p>
<p>So: I am not buying this phony-baloney hail-fellow-well-met schtick.</p>
<p>Now I'm torn as to how I should react to the Yankee Fan standing in front of me, mugging like a chimp. My options, as far as I can tell, are:</p>
<ol type='a'>
<li>ignore the Yankee Fan completely,</li>
<li>tell him exactly how much I hate him and his team, and how long I've been waiting for EXACTLY this chance to tell him all about it, punctuated with four-letter words,</li>
<li>start a rousing chant of "Yankees Suck!" or</li>
<li>act smugly superior.</li>
</ol>
<p>In the end, I went with (d), and wiped the grin off his face with a patronizing, "You'll get 'em next year."</p>