blog post : lawsuit :: book : movie
Any developers who've found themselves wanting to match users with their congressman (while simultaneously not wanting to pay hundreds of dollars to do so) should head over to the EchoDitto blog, where I just open-sourced a database scraped from the house.gov servers.
I'll be interested to see if we get any C&D letters. There aren't any terms of use attached to the house.gov DB, so far as I know. And, in general, materials produced by Congress can't be copyrighted. However, it seems possible to me that house.gov actually got its data from another vendor (who in turn produced it by chewing through the monstrously big and freely available census tract databases). So it may just be a database that house.gov licenses and isn't able to give away.
Here's where it gets tricky: my understanding is that, in the US, you can copyright a database in its entirety, or the content of a database, but not individual facts contained in a database. For example, a database containing a collection of poems and their authors' names could be copyrighted as a whole, and the poems themselves could also be protected by copyright. But if you learn from the database that author A wrote poem B, you can freely redistribute that factual information.
In this case there's nothing within the database that can be copyrighted: it's just district numbers, zip codes and state names. And I'm not grabbing the database in its entirety. Instead, I have a specific list of zip codes, freely obtained from a third party, and I look each one up in the congressional database, one at a time. What I've taken is a subset of the database, a collection of uncopyrightable facts, rather than the entire thing.
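For the curious, the one-at-a-time gathering described above could be sketched roughly like this. To be clear, this is an illustration and not the actual scraper: the `lookup` function, the `build_district_table` name, the optional delay and the placeholder data are all assumptions made up for the example, and nothing here talks to house.gov.

```python
import time
from typing import Callable, Iterable, List, Optional, Tuple

# Hypothetical lookup type: takes one zip code and returns (state, district),
# or None if the source has nothing for that zip code.
Lookup = Callable[[str], Optional[Tuple[str, int]]]


def build_district_table(
    zip_codes: Iterable[str],
    lookup: Lookup,
    delay_seconds: float = 0.0,
) -> List[Tuple[str, str, int]]:
    """Look up each zip code individually and collect (zip, state, district) rows."""
    rows = []
    seen = set()
    for zip_code in zip_codes:
        if zip_code in seen:
            continue  # skip duplicates in the third-party list
        seen.add(zip_code)
        result = lookup(zip_code)
        if result is not None:
            state, district = result
            rows.append((zip_code, state, district))
        if delay_seconds:
            time.sleep(delay_seconds)  # be polite to whatever is being queried
    return rows


# Placeholder data standing in for the real source; these values are fake.
FAKE_SOURCE = {"00001": ("XX", 1), "00002": ("XX", 2)}


def fake_lookup(zip_code: str) -> Optional[Tuple[str, int]]:
    return FAKE_SOURCE.get(zip_code)


table = build_district_table(["00001", "00002", "00001", "99999"], fake_lookup)
```

The point of the sketch is that the output only ever contains rows for zip codes I already had on my own list, each one a bare fact, rather than a wholesale copy of whatever sits behind the lookup.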
So, to summarize: I don't think I'll be sued. But we'll see. Also: intellectual property law in this country is seriously fucked up. But then, you already knew that.

Comments
I agree - you should be in the clear. I think if someone did try to sue they would lose. And god bless our IP laws...
just kidding
I wonder if there are lawyers running searches on Technorati for bloggers talking about "C&D letters" -- then approaching the copyright holders in hopes of landing a freelance litigation gig? Maybe I'm just paranoid.
Nice work on the database, though. You are a credit to our democracy.
There were a host of copyright cases concerning telephone books back in the day, but damned if I can remember exactly how they turned out. Individual phone numbers are non-copyrightable facts, but their collection and arrangement might be copyrightable.
Yellow pages are subject to copyright, white pages aren't.
So why don't the phone book people just print everything on yellow?