Archive for the ‘Science and Tech’ Category

Driving while using a cell phone

Monday, August 31st, 2009

There are two questions: Is driving while talking on a cell phone dangerous? And is it as dangerous as driving while drunk? After reviewing some earlier studies and looking at a new Virginia Tech study, I’d say the answer is a definite yes to the first question, and a probably not ...

Zipf’s law and city size

Thursday, May 21st, 2009

Olivia Judson writes a column for the New York Times, and she had a guest column, "Math and the City," by Steven Strogatz, a math professor at Cornell. Strogatz writes: One of the pleasures of looking at the world through mathematical eyes is that you can see certain patterns that would ...

Language Identification: A Computational Linguistics Primer

Saturday, April 25th, 2009

Slides and results from a talk I gave at Kalamazoo College on language identification. My co-worker at Powerset, Chris Biemann, has a nice paper on Unsupervised Language Identification.

Things that are wrong with Python (3): destructive functions

Thursday, February 12th, 2009

Python's sort and append functions mutate the sequence they work on. This is wrong. Rather than write

    x = list.append(1).sort()

you have to write:

    x = list[:]
    x.append(1)
    x.sort()
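A minimal sketch of the complaint and one workaround, assuming a modern Python interpreter. The variable names are made up for illustration. The built-in sorted() and list concatenation with + both return new lists, so they can be composed in one expression without touching the original.

```python
# Illustrative names only; nothing here comes from the original post.
original = [3, 1, 2]

# list.append mutates in place and returns None, so it cannot be chained.
append_result = original[:].append(1)  # None (the copy is modified, then discarded)

# sorted() and + each build a new list and leave `original` untouched.
x = sorted(original + [1])

print(append_result)  # None
print(x)              # [1, 1, 2, 3]
print(original)       # [3, 1, 2]
```

This does not change the behavior the post objects to; it only shows that a non-destructive spelling exists alongside the destructive one.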

Things that are wrong with Python (2): return values

Thursday, February 12th, 2009

Python doesn't return values by default. This is wrong. The last value should always be returned.
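A small illustration of what this means in practice, with hypothetical function names: a function whose last expression is not preceded by return hands back None rather than the value of that expression.

```python
def last_value_dropped(a, b):
    a + b  # evaluated and then discarded: there is no implicit return


def last_value_returned(a, b):
    return a + b  # the value has to be returned explicitly


implicit = last_value_dropped(1, 2)
explicit = last_value_returned(1, 2)

print(implicit)  # None
print(explicit)  # 3
```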

Things That Are Wrong with Python (1): True and False

Thursday, February 12th, 2009

The following values are all false in the Python programming language: None; False; numbers equal to 0 (0, 0L, 0.0, 0j); empty sequences, e.g., "", (), [], array.array('i'); empty maps, e.g., {}; and any object that defines a __nonzero__ or __len__ method (returning False or len=0, respectively). This is a Thing that is Wrong with Python. (The Right Way: false is ...
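The list above can be checked directly. This is a sketch for Python 3, which differs from the Python 2 of the post in two ways: the long literal 0L no longer exists, and __nonzero__ has been renamed __bool__. The two classes are hypothetical examples, not anything from the original post.

```python
import array


class EmptyBag:
    """Hypothetical container that is falsy because its length is 0."""

    def __len__(self):
        return 0


class AlwaysFalse:
    """Hypothetical object that is falsy via __bool__ (Python 2: __nonzero__)."""

    def __bool__(self):
        return False


falsy_values = [
    None, False,
    0, 0.0, 0j,                       # numbers equal to 0
    "", (), [], array.array('i'),     # empty sequences
    {},                               # empty map
    EmptyBag(), AlwaysFalse(),        # user-defined falsy objects
]

# True only if every value in the list is treated as false.
all_false = not any(bool(v) for v in falsy_values)
print(all_false)  # True
```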

Epoch odometer

Thursday, February 12th, 2009

My epoch odometer.

Leap second 2009

Wednesday, December 31st, 2008



parallel line-oriented file processing

Thursday, July 31st, 2008

At work, I've been doing a lot of line-oriented file processing, for example, of the tab-separated value files produced by the Freebase project (downloads). This is similar in spirit to Tim Bray's 'wide finder' project, and I've leveraged his popularity to find a useful utility created by Preston L. Bannister ...

O(log(N)) array insertion in Ruby

Thursday, July 24th, 2008

>> require 'bdb'
>> x = BDB::Btree.open('/tmp/foo.db', nil, 'w+', {'set_bt_compare' => lambda {|a,b| (a.to_i) <=> (b.to_i)}})
=> #
>> (0..9).to_a.sort_by{rand}.each{|i| x[i] = i};true
=> true
>> x.keys.map{|i| i.to_i}
=> [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

How many ways to win the election with nothing to spare?

Tuesday, June 10th, 2008

Over at FiveThirtyEight, the following 'homework assignment' was given: How many unique ways are there to acquire at least 270 electoral votes without any excess? I figured it would be a 'large' number, but I was surprised at the actual total: 51,199,463,116,367 (or, fifty-one trillion and change). About 2.3% of all possible ...
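To make the question concrete, here is a brute-force sketch on toy data. It assumes one reading of "without any excess": a combination counts if it reaches the threshold but would fall below it after dropping any single member, which it is enough to check for the smallest member. The function name, the toy vote counts, and the threshold of 7 are all made up for illustration. Enumerating every subset like this is exponential, so it would not be feasible for the real electoral map, and it is not a claim about how the total above was computed.

```python
from itertools import combinations


def count_minimal_wins(votes, threshold):
    """Count combinations that reach `threshold` with no removable member.

    Assumed reading of 'without any excess': removing the smallest member
    drops the total below the threshold, so removing any member would too.
    """
    count = 0
    for size in range(1, len(votes) + 1):
        for combo in combinations(votes, size):
            total = sum(combo)
            if total >= threshold and total - min(combo) < threshold:
                count += 1
    return count


# Toy example: three 'states' with 3, 4 and 5 votes, and 7 needed to win.
toy_votes = [3, 4, 5]
toy_count = count_minimal_wins(toy_votes, 7)
print(toy_count)  # 3: the pairs (3, 4), (3, 5) and (4, 5)
```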

The evolution of a Ruby programmer

Thursday, June 5th, 2008

# The evolution of a Ruby programmer

def sum(list)
  total = 0
  for i in 0..list.size-1
    total = total + list[i]
  end
  total
end

def sum(list)
  total = 0
  list.each do |item|
    total += item
  end
  ...

Positive Predictive Value

Thursday, April 10th, 2008

Mark "Language Log" Liberman is taking Steven D. "Freakonomics" Levitt to task for either misunderstanding the language of statistics, or the underlying statistics theory itself. In a blog post, "Medicine and Statistics Don't Mix," Levitt tells the story of friends of his who spent $5,000 on Preimplantation Genetic Diagnosis (PGD) -- ...

OpenDMAP paper

Tuesday, March 25th, 2008

For the few who might be interested: OpenDMAP: An open source, ontology-driven concept analysis engine, with applications to capturing knowledge regarding protein transport, protein interactions and cell-type-specific gene expression (PDF). OpenDMAP advances the performance standards for extracting protein-protein interaction predications from the full texts of biomedical research articles. Furthermore, this ...

UAVs at Wired Science

Thursday, March 13th, 2008

A nice Wired Science video on UAVs (unmanned aerial vehicles), including some video of the UAV helicopter I worked on at NASA. I miss doing that cool stuff. Of course, I like doing the cool stuff I'm working on now. Autonomous flying search engines! Yes, that's the ticket! (via lemonodor, who's ...

Powerset in Ruby

Monday, March 3rd, 2008

I actually needed to use a powerset function (set of all subsets of a set) today in some Ruby testing code I was writing. So I share it with you:

class Array
  def powerset
    if empty?
      [[]]
    else ...
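For comparison, the same idea can be sketched in Python. This is not a translation of the Ruby method above, whose body is cut off here; it builds the subsets with itertools.combinations, one subset size at a time. The function name powerset is chosen to match the post, and everything else is an assumption.

```python
from itertools import chain, combinations


def powerset(items):
    """Return all subsets of `items` as a list of lists, smallest first."""
    items = list(items)
    subsets = chain.from_iterable(
        combinations(items, size) for size in range(len(items) + 1)
    )
    return [list(subset) for subset in subsets]


print(powerset([]))      # [[]]
print(powerset([1, 2]))  # [[], [1], [2], [1, 2]]
```

As in the Ruby version, the powerset of an empty list is a list containing one empty list, and a list of n items yields 2**n subsets.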

Glitches

Monday, February 25th, 2008

Back when I was a child, I had two choices for watching moving images: watching live television or going to the movies. Now, we have a large array of choices, and this weekend we used quite a few of them, but it was interesting (and frustrating) to see the large ...

More better spec for Arc

Saturday, February 2nd, 2008

A DESCRIBE form now creates a procedure, which, when run, returns a set of results, which can be printed. Also, errors in test forms are caught. See spec.arc. Can an HTML format be far behind?

(= test-basics
   (describe "Basic ARC list functions"
     (prolog ...

First experiment with Arc

Friday, February 1st, 2008

I decided to try to write a simple but useful program in Arc, the programming language recently released by Paul Graham. One thing I've liked about using Ruby is a testing framework called Rspec, in which you describe the behavior of your program in a kind of narrative form. So I ...

arc is here

Tuesday, January 29th, 2008

arc> (map [+ _ 1] '(1 2 3))
(2 3 4)
arc> `(arc is ,@'(finally) here)
(arc is finally here)

arclanguage.org

I should have mentioned: via lemonodor.