Leap second 2009
Delays
I’m sitting in the airport at O’Hare after the longest plane taxi ride I can remember. It doesn’t really matter how long it took since the flight left late, arrived late, and I have even longer to wait to get on a flight to Seattle. I was supposed to be in Seattle this morning, but a mix-up on tickets (that was my own fault) meant I couldn’t leave until this evening. If things go very well, I’ll get to sleep by midnight Pacific time. Plus someone called me on the phone today to ask a favor and started yelling when I didn’t say yes.
This is as bad as my life gets.
Meanwhile, in Mumbai …
Meanwhile, in the Congo …
Meanwhile, in Sudan …
Meanwhile, in Detroit …
I don’t want to sound maudlin, but I feel like I have to record this to get out of my snit. Maybe I’ll be able to sleep on the plane, and maybe I’ll feel great tomorrow.
Blogging?
Some have proclaimed the “end of blogging,” and it’s certainly the case that I haven’t written as much for this weblog recently. I do a lot of the teaching at our church, and more recently have been trying to write out my teaching notes and to send them to people who are interested or just unable to attend on a given Sunday. It’s a lot easier to do a teaching from an outline than to write everything down so that it’s semi-coherent, and that takes a significant amount of time. We’re trying to be more deliberate about posting these teachings on the church website as well as on Facebook. And that, too, takes time.
And about Facebook and Twitter. The 140-character limit of a Twitter ‘tweet’ is just right for a lot of the statusey things I write, and often attracts comments (both on Facebook and in person) that posting on the weblog doesn’t. For example, yesterday I twitted: “Thanks to Brad & Karen Bahler & others, I got to sing ‘Hoboken style’ with the David & Johnny Lee in Indiana yesterday,” and got two comments, one from a friend in Canada (“What exactly is Hoboken style”) and one from California (“Oooo… you lucky dog!”). And, writing code takes a lot of time and effort.
So, I find myself writing either much longer things than I typically blog about as church teaching, or much shorter, conversational tweets on Facebook and Twitter. But I like the blog format, and am somewhat inspired to be more deliberate about it. A weblog post about blogging doesn’t really count, though.
I’d better go off and think of something to say.
Tweet creed
Because of the call for a Tweet Creed (Christian belief in 140 characters or less):
We follow Jesus, who lived, died, and rose again to save us. He tells us to love God and to love our neighbor as ourselves.
The corrosive nature of political contributions
From the Washington Post, “Big donors drive Obama’s money edge”:
Lost in the attention given to Obama’s Internet surge is that only a quarter of the $600 million he has raised has come from donors who made contributions of $200 or less, according to a review of his FEC reports. That is actually slightly less, as a percentage, than President Bush raised in small donations during his 2004 race, although Obama has pulled from a far larger number of donors.
Mark my words: this will cause problems for President Obama.
Kalamazoo to Sitka?
I’ve been thinking about traveling to Sitka, Alaska using my frequent flyer miles. But here’s what Northwest suggests a trip would look like there-and-back-again:
And that’s just to get to Juneau! I guess I won’t be going to Sitka. Unless Sarah Palin is offering rides.
The gospel bin

[Amazon.com] MP3 Albums > Christian & Gospel > Country Gospel > Sort Low to High
- God’s Gonna Ride a Harley
- Golden Bowls of Compassion
- Radiant Redbird
- When Santa Claus Met Jesus
- Pale Rider
- Time is a Space
- Mama was Judged
- Tha Chronicles of Junglistic Hip Hop (EP)
- Jesus Is My Spotter - the Race Car Driver’s Anthem
- Orca Chantress Beluga Shaman
No customer reviews yet. Be the first!
Who knew? Balloons in the movies
I had no idea balloons played such a role in the movies: see VSL.
100 most common English words
First, take this fun quiz.
And then …
parallel line-oriented file processing
At work, I’ve been doing a lot of line-oriented file processing, for example, of the tab-separated value files produced by the Freebase project (downloads). This is similar in spirit to Tim Bray’s ‘wide finder’ project, and I’ve leveraged his popularity to find a useful utility created by Preston L. Bannister called “feed-workers” that implements the ‘map’ part of map-reduce (but over a large file, rather than a large set of files).
Initial tests look good; for example, a nearly threefold speedup on a processing loop over the 81 million lines in the Freebase tsv file.
$ time ./feed-workers -n 8 -r /usr/bin/ruby -s ~/just_names.rb /bfd/dv/freebase_download/current/freebase-datadump-quadruples.tsv > /tmp/n1
real 4m20.682s
user 13m52.671s
sys 0m42.477s
$ time cat /bfd/dv/freebase_download/current/freebase-datadump-quadruples.tsv | ruby ~/just_names.rb > /tmp/n2
real 11m58.470s
user 11m32.207s
sys 0m27.628s
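As a quick check on that speedup figure, here is a small Ruby sketch that converts the `real`, `user`, and `sys` values printed by `time` above into seconds and compares the two runs. The helper name `seconds` and the variable names are made up for this sketch; none of it is part of feed-workers.

```ruby
# Convert a duration as printed by `time` (e.g. "4m20.682s") into seconds.
# `seconds` is a helper invented for this sketch.
def seconds(duration)
  match = duration.match(/\A(\d+)m(\d+(?:\.\d+)?)s\z/)
  raise ArgumentError, "unexpected duration: #{duration}" unless match
  match[1].to_i * 60 + match[2].to_f
end

# Timings copied from the two runs above.
parallel = { real: seconds("4m20.682s"),  user: seconds("13m52.671s"), sys: seconds("0m42.477s") }
serial   = { real: seconds("11m58.470s"), user: seconds("11m32.207s"), sys: seconds("0m27.628s") }

# Wall-clock speedup of the feed-workers run over the single pipeline.
speedup = serial[:real] / parallel[:real]

# Average number of cores kept busy: CPU time divided by wall-clock time.
busy_cores = (parallel[:user] + parallel[:sys]) / parallel[:real]

puts format("speedup: %.2fx", speedup)
puts format("average busy cores in the parallel run: %.1f", busy_cores)
```

By these numbers the speedup is a little under 2.8x, with between three and four cores busy on average during the parallel run.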
O(log(N)) array insertion in Ruby
>> require 'bdb'
>> x = BDB::Btree.open('/tmp/foo.db', nil, 'w+', {'set_bt_compare' => lambda {|a,b| (a.to_i) <=> (b.to_i)}})
=> #
>> (0..9).to_a.sort_by{rand}.each{|i| x[i] = i};true
=> true
>> x.keys.map{|i| i.to_i}
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
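For comparison, here is a pure-Ruby sketch of keeping a plain Array sorted on insert. It is not what the BDB B-tree does internally: `Array#bsearch_index` finds the slot in O(log(N)) comparisons, but `Array#insert` still has to shift the elements after it, so each insert is O(N) overall. The method name `sorted_insert` is made up for illustration.

```ruby
# Insert `value` into the already-sorted `array`, keeping it sorted.
# `sorted_insert` is a hypothetical helper, not part of the bdb gem.
def sorted_insert(array, value)
  # Index of the first element >= value, or nil if every element is smaller.
  index = array.bsearch_index { |existing| existing >= value } || array.size
  array.insert(index, value)
end

sorted = []
(0..9).to_a.shuffle.each { |i| sorted_insert(sorted, i) }
p sorted
```

The B-tree earns the O(log(N)) in the title because it never has to shift a long run of elements; the Array version only gets the lookup that cheaply.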
A few personal notes …
It’s been too long since my last update, but life has been busy, especially with the Microsoft purchase. But a few notes:
- I enjoyed singing Sacred Harp at the Michiana and Kalamazoo annual singings this weekend, and James Nelson-Gingerich gave me the *first* copy of the print version of the 26th edition of the Harmonia Sacra for my work on the Harmonia Sacra website.
- I’m looking forward to a family reunion of all my brothers (five of us!) this coming weekend.
- I’ve been off to California a couple of times to meet about the Microsoft purchase, and I got my first “Microsoft Live!” tee-shirt.
- Summer in Michigan is a wonderful thing this year.
- We just had our one-year anniversary of living in ‘the new house.’ So I guess it isn’t the new house, especially since we finally sold the old one.
Thoughts on the Microsoft acquisition
(The usual disclaimers: my opinion only, not my current or future employers’.)
When Powerset began a couple of years ago, a lot of commentators called us, and still do call us, a would-be Google killer. This, despite repeated comments by senior staff that this wasn’t what we were about. As a company, Google is hard to beat. Our goal was audacious, but not that audacious. Our goal was to build a better search experience: to use natural language technology to provide better search results, both by having a better understanding of web documents as well as user queries.
But natural language technology has always only been part of the mix. We have, from the beginning, seen ourselves as doing “keywords plus”; that is, we have always planned to do what the other search engines do (keyword search, link analysis, blah, blah, blah), but add on top of this signals coming from parsing and semantic understanding. For example, we’d like to do as good a job as Google (say) on queries like ‘powerset microsoft’, but do even better on queries such as ‘Who acquired Powerset?’ and ‘Which company did Microsoft just buy?’ and everything in between.
What I didn’t realize when I joined the company is how some of the same technology would create innovations in the user interface, too. Powerset’s ‘Factz’ are a nice addition to the standard search page, and our ‘snippets’ are the best in the business. When I first typed in ‘stars of BSG’ in the Powerset search box, I was floored by the beauty of the results.
So I think we met our audacious goal: a better search experience. Microsoft seems to think so; after all, they bought the company.
And here’s the thing: we were bought by Microsoft. Microsoft’s market cap is still 90 billion dollars greater than Google’s. If anyone is able to capitalize a little ol’ startup like Powerset to make us a big player in search, it’s Microsoft. In fact, it’s clear (to me at least) we have a new mission, which is just the old mission the pundits wrongly labeled us with at the start: As a search company, our mission is now to beat Google.
Interesting times ahead.
Who acquired Powerset?
“Buy a house, sell a home?”
This might not be worth a post, but anyway:
Arnold Zwicky has a post at Language Log on “home” vs. “house” in (American) English, citing (among other things) the commentators in the Merriam-Webster’s Dictionary of English Usage saying:
A final note on home ownership: MWDEU reports that “a number of commentators have remarked on the tendency to buy a home and sell a house”.
I was curious to see whether this ‘tendency’ was real, so I checked the Google 3-gram data (see announcement). I think for this tendency to be real, the following should be true:
(1) buy a home >> buy a house
(2) sell a house >> sell a home
(3) (buy a home/buy a house) >> (sell a home/sell a house)
Here are the bare facts:
buy a home : 328,584
buy a house: 235,019
sell a home: 193,088
sell a house: 25,632
and the ratios:
buy a home/buy a house: 1.40
sell a home/sell a house: 7.53
buy/sell ratio of ratios 0.19
(buy a home+sell a home)/(buy a house+sell a house): 2.00
As you can see, (1) is true, but (2) and (3) are not (2 and 3 are related, of course). According to the Google data, ‘home’ is twice as likely as ‘house’ in both these contexts, and the ‘sell a home/sell a house’ ratio is much greater than the ‘buy a home/buy a house’ ratio.
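For anyone who wants to re-run the arithmetic, here is a short Ruby sketch that recomputes the ratios from the four counts quoted above and checks the three conditions. The variable names are my own; only the counts come from the Google data.

```ruby
# Google 3-gram counts quoted above.
counts = {
  "buy a home"   => 328_584,
  "buy a house"  => 235_019,
  "sell a home"  => 193_088,
  "sell a house" => 25_632
}

# Ratio of 'home' to 'house' for each verb.
buy_ratio  = counts["buy a home"].fdiv(counts["buy a house"])
sell_ratio = counts["sell a home"].fdiv(counts["sell a house"])

# Ratio of ratios, and the overall 'home' to 'house' ratio.
ratio_of_ratios = buy_ratio / sell_ratio
home_vs_house = (counts["buy a home"] + counts["sell a home"])
                  .fdiv(counts["buy a house"] + counts["sell a house"])

puts format("buy a home/buy a house:   %.2f", buy_ratio)
puts format("sell a home/sell a house: %.2f", sell_ratio)
puts format("buy/sell ratio of ratios: %.2f", ratio_of_ratios)
puts format("home/house overall:       %.2f", home_vs_house)

# The three conditions the 'tendency' would need to satisfy.
puts "(1) buy a home > buy a house:   #{buy_ratio > 1}"
puts "(2) sell a house > sell a home: #{sell_ratio < 1}"
puts "(3) buy ratio > sell ratio:     #{buy_ratio > sell_ratio}"
```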
It must be all those realtors (oh, excuse me, ‘REALTOR®s’).
Mark’d
A response to Mark’s tagging me.
Seven factz about me:
- Will Fitzgerald didn’t use a computer until he was 23. It had cards.
- It is difficult for Will Fitzgerald to refer to himself in the third person.
- Despite his descriptivist linguistic training, Will Fitzgerald tries to distinguish ‘between’ and ‘among’.
- Will Fitzgerald was known as “Bill” until he was 33.
- Will Fitzgerald has half of a house named in his honor. He doesn’t know why.
- Will Fitzgerald once made his ESL class erupt in laughter by referring to ‘green Jewesses’ instead of ‘green beans’. (judias verdes vs. judías verdes).
- Will Fitzgerald (who has moved around a lot) has been a member or active participant of a Methodist church (United Methodist, Roseville), a Southern Baptist church (Calvary Baptist, Roseville), an independent fundamentalist church (First Church, Wellston), a Reformed Church in America congregation (University Reformed, East Lansing), an Evangelical Presbyterian church (Evangelical Presbyterian, Carbondale, now affiliated with the Presbyterian Church in America), a Spanish Baptist church (Iglesia Evangélica Bautista de Gracia, Barcelona), a Presbyterian USA church (North Presbyterian, Kalamazoo), a Christian Reformed church (Immanuel CRC, Kalamazoo), a Mennonite USA/Brethren bi-affiliated church (Reba Place Church, Evanston, now just Mennonite-affiliated), two independent Christian communities (Reba Place Fellowship, Evanston and Church of the Sojourners, San Francisco), a Canadian Anglican church (Christ Church Cathedral, Fredericton) and two Mennonite USA churches (Pine Grove, Battle Creek and Kalamazoo Mennonite Fellowship). He has more or less made up his mind.
That was fun, Mark. Rather than tag seven others, I present to you this fish tank. Click on the tank to feed the fish.
The Book of Psalms
Ockham’s Razor is Dull
It’s all (well, mostly) about representation. Peter Turney:
[F]iguring out how to represent the problem is 95% of the work. By the time you have the representation right, the tool that you use to finish the remaining 5% is not terribly important.
How many ways to win the election with nothing to spare?
Over at FiveThirtyEight, the following ‘homework assignment’ was given:
How many unique ways are there to acquire at least 270 electoral votes without any excess?
I figured it would be a ‘large’ number, but I was surprised at the actual total: 51,199,463,116,367 (or fifty-one trillion and change), about 2.3% of all possible combinations. (This exact number is based on the simplifying assumption of treating Maine and Nebraska as giving up their electoral votes in the same way as all the other states.) The answer was given by Isabel Lugo, a mathematician.
If you blindly checked all the possibilities at a rate of 1000 per second, it would take over 700 centuries. That’s older than John McCain.
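The 2.3% figure is easy to sanity-check in Ruby. This sketch assumes “all possible combinations” means every subset of 51 winner-take-all units (the 50 states plus DC), which is my reading of the simplifying assumption above rather than something the homework states; the variable names are made up.

```ruby
# Figure quoted above for the number of no-excess winning combinations.
minimal_wins = 51_199_463_116_367

# Assumption: 50 states plus DC, each a single winner-take-all unit,
# so every unit is either in or out of a candidate's column.
units = 51
all_combinations = 2**units

# Share of all combinations that win with nothing to spare.
share = 100.0 * minimal_wins / all_combinations
puts format("%d of %d combinations (%.1f%%)", minimal_wins, all_combinations, share)
```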
Reviewing my Prediction—I was wrong.
I was wrong. And I’m very glad.
