More on the Republicans' Open Data Letter
April 29, 2011 - by Donny Shaw
As David mentioned earlier, the House Republican leadership’s letter directing the Clerk of the House to improve how they release legislative data online is a big deal. It means that the House is serious about catching up with the standards and expectations of modern information users, both developers and consumers. It’s also a sign that Congress is becoming more comfortable with loosening its grip on information about its activities and beginning to appreciate the value of unleashing it as public data into the wilds of the internet.
Currently, information about bills must be liberated from Congress’s antiquated publishing system through an ad hoc network of tools designed by citizen hackers. Congress first publishes bills in a non-machine-readable, non-bulk format on the Library of Congress’ THOMAS website, though sometimes they publish bills on the websites of committees or individual members instead. Everything that goes onto THOMAS as a stand-alone bill is then scraped by GovTrack and made available in an open format to developers. OpenCongress and dozens of other sites then pull from GovTrack and begin running their own processes. The whole process takes more than 24 hours, which means that under the House’s weak three-calendar-day rule bills can be (and are) voted on before they are publicly available in open formats. And everything that’s not a stand-alone bill on THOMAS (e.g. regular-order amendments on THOMAS, manager’s amendments from committees, substitutes for leadership, etc.) stays buried in the Congressional Record or hidden somewhere on one of hundreds of scattered websites where only lobbyists and other D.C. insiders can readily find it.
Publishing bills in open, machine-readable formats will make it easier for the public to review bills before they are voted on. More importantly, though, it will mean that the open government community around the U.S. Congress can move forward from getting access to the data to exploring new reuses and remixing that will enhance the value of the information and feed democracy. Congress is the people’s lawmaking platform; congressional information should be a platform for the people, who are online, to do what they will without any restrictions beyond the laws themselves.
Under the data enhancements being pushed forward by the Republicans’ letter, all legislative items to be voted on or debated by any House body should be made available in bulk XML from a single source. Right now, Congress draws a false distinction in how it publishes legislation, treating stand-alone bills introduced in regular order (i.e. via the hopper) differently from everything else (amendments, substitutes, committee marks, rules, etc.). If this distinction is allowed to continue and only stand-alone bills are included in this data enhancement, then anything that Congress wants to do without public oversight will be done outside of stand-alone bills. This is already a problem (remember when that loophole letting AIG get huge bonuses right after receiving TARP funds was inserted into the stimulus conference committee report?), and enhancing the transparency of some legislative items but not others would only make it worse.