Congressional Data is Defective By Design
September 16, 2009 - by David Moore
This week’s launch of the “Baucus Bill”— aka “America’s Healthy Future Act”, aka one “mark” of many versions of the health care reform bill currently in the Senate — is a perfect illustration of how the U.S. Congress’ current bill-to-law process remains fundamentally inaccessible and harmful to the public discourse.
This past Wed. morning, Sen. Baucus’ office released a 223-page .pdf version of this much-anticipated, widely-covered piece of legislation. As these bills go, it was written in relatively plain language, and there was a lot of context signaled in the buildup to its release. But as of Wed. afternoon, the document is officially available only in this bulky, intimidating, insufficiently accessible .pdf file. Seriously, if I were going to design a system that discouraged citizens from engaging with the substance of a bill, one that ensured that only D.C. insiders & lawyers & health care lobbyists would read through all 223 pages and compare different options, I would release it as a locked-down .pdf, and not in an open-source, user-friendly online version with built-in peer-to-peer communication features.
OpenCongress has posted what we believe is the only web page with the full text of the “Baucus Bill” for sharing, searching, and review, as well as permalinks to the top-line Titles and Sections. Read it here: America’s Healthy Future Act of 2009. This is definitely a work-in-progress, one whose HTML will need to be cleaned up manually until it’s sufficiently readable, which is a major drag, if you feel me. (We already did what we could to pick off the low-hanging fruit in converting the .pdf to decently clean HTML, but obvs. there’s a long way to go yet. Sub-optimal though it is, it’s the best available version on the web that we’ve seen.) Even given this quick-fast implementation, there remains a significant lack of standards-based web tools that allow users to dive in, read/skim/research, comment, compare, search, and jump around the full text of this version of the legislation, as you can elsewhere on OpenCongress (e.g., the official bill text of the House version of the major health care reform bill, H.R. 3200).
The reason is that the “Baucus Bill” is only a “mark”, not yet an official Senate bill, which means (to summarize reductively) that the digital text that constitutes the .pdf does not make its way off internal government web servers to THOMAS, the Library of Congress’ official website for legislative information, and in turn does not make its way to government transparency web resources such as GovTrack and OpenCongress. Before that happens, this mark of the health care bill needs to be reconciled with other Senate committee versions of the same, and the reconciled bill will then be put forward for consideration by the U.S. Senate as a whole. Health care reform is leading news coverage & blog analysis of American politics right now and this is a major document in the mix, yet there’s no widely-recognized, user-friendly resource for online examination by the public at large. You should have better access to this info! You should have, at your fingertips, immediate, unrestricted digital access to the full text of any piece of legislation the very moment it’s released publicly by Congress.
This is punishingly ridiculous. Congress could immediately take steps to make all publicly-relevant legislative data comply with the community-derived Eight Principles of Open Government Data. (This could include the workings of Congressional committees and these “marks”, even if they’re not yet official bills!) That is to say, bill info from Congress could and should be available today in real time, free of charge, open-source, and licensed openly, via such open-standards technologies as XML, APIs, and regular bulk data downloads. But as things stand, again, neither OpenCongress (as government watchdog) nor you (as an interested member of the public, affected citizen, and consumer of news/blog media) has a sufficiently open and user-friendly way to read this bill other than a bulky .pdf. It’s been this way ever since we launched OpenCongress in 2007: preliminary or “draft” versions of bills are released as .pdfs, not in a machine-readable, structured-data way that would make it easier to distribute them widely to the public for review and input. Even once they’re published on THOMAS, because Congressional data standards aren’t up to par, there is an unavoidable lag of anywhere from 6 to 12 hours before bill info appears on GovTrack and then on OpenCongress. It doesn’t have to be this way, but it is, because the U.S. Congress has not moved aggressively to address these shortcomings. (This is a longer story, worth more examination in the future.)
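To make the “machine-readable, structured-data” point concrete, here is a minimal sketch in Python of what becomes trivial once bill text arrives as structured XML instead of a .pdf: listing sections, building a permalink for each one, and running a full-text search. The markup, element names, attribute names, and base URL below are all made up for illustration; they are assumptions, not the actual schema or addresses used by Congress, THOMAS, GovTrack, or OpenCongress.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified bill markup. The element and attribute names
# are illustrative assumptions, not an actual Congressional schema.
SAMPLE_BILL_XML = """
<bill title="Example Act of 2009">
  <section id="sec-1" number="1">
    <heading>Short title</heading>
    <text>This Act may be cited as the Example Act of 2009.</text>
  </section>
  <section id="sec-2" number="2">
    <heading>Definitions</heading>
    <text>In this Act, the term "Secretary" means the Secretary of Health.</text>
  </section>
</bill>
"""


def list_sections(xml_text, base_url):
    """Return one dict per section, with a permalink built from its id."""
    root = ET.fromstring(xml_text.strip())
    sections = []
    for sec in root.findall("section"):
        sections.append({
            "number": sec.get("number"),
            "heading": sec.findtext("heading", default="").strip(),
            "text": sec.findtext("text", default="").strip(),
            # Anchor-style permalink: base URL plus the section's id.
            "permalink": f"{base_url}#{sec.get('id')}",
        })
    return sections


def search_sections(sections, term):
    """Case-insensitive full-text search over headings and section text."""
    needle = term.lower()
    return [s for s in sections
            if needle in s["heading"].lower() or needle in s["text"].lower()]


if __name__ == "__main__":
    # The base URL is a placeholder, not a real bill page.
    secs = list_sections(SAMPLE_BILL_XML, "https://example.org/bill/text")
    for s in secs:
        print(s["number"], s["heading"], s["permalink"])
```

With a .pdf, each of these steps first requires scraping and manual cleanup; with structured data, they are a few lines of standard-library code that any watchdog site or interested citizen could run the moment the text is released.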
The current Congressional process for publishing data is, to borrow a phrase from the Free Software Foundation, Defective By Design. As we see in many proprietary, top-down systems affecting the public interest, it’s insistently closed-off. Congress’ process for distributing legislative info is fundamentally broken. It could and should be fixed relatively easily, starting now. Whether or not you support the Baucus markup or the House version of the health care reform bill, we hope you agree that the public has a right to read this important iteration & political volley in the process.
Since the .pdf was released this morning, the OpenCongress team has been working on creating a preliminary page on our site with a more readable version of the full text of the .pdf, but with such a long document, the process of posting clean HTML markup is not trivial. We’ve posted it here, but the best we can do is to post the text as a static web page: until it’s published through THOMAS, it won’t have the useful comment threading and paragraph-by-paragraph permalinking features that our official bill text viewer offers on other OC bill pages. To be sure, once the Senate version of the health care reform bill is officially reconciled & released, it will then be published on THOMAS in full, and will arrive shortly thereafter on OC for community action. In any case, the status quo system by which Congress releases .pdfs (for real!! A .pdf !?!? This is September 2009 !!) is an inadequate way for the public to access such important legislation. For the time being, for more info, we refer you to OpenGovData, the inspirational Carl Malamud’s recent speech at the Gov 2.0 conference, ReadTheBill ("ReadTheBill.org’s mission is to strengthen our democracy by making sure elected officials and citizens have the chance to read and understand legislation") and, as the badge proudly displays at the very bottom of every page on our site, The Open House Project and, quite relevantly, The Open Senate Project.
In the near future, OpenCongress will kick off more prominent & focused civic efforts to promote truly open government data, meeting nothing less than the very best practices of openness & transparency. We believe this is a necessary, easily accomplished and politically vital component of building public knowledge about Congress (the open Web makes it possible; we’re not talking about, say, string theory here). It’s eminently feasible to design a more widely-engaging, open-to-collaborative-input-and-scrutiny process for writing legislation, rather than handing over minute-to-minute influence to lobbyists. Until then, feel free to contact your Members of Congress (join or log in to “My OpenCongress” to easily find and write your elected officials) and encourage them to endorse the 8 Principles of Open Gov’t Data today. Much more to come here on OC re: the Baucus Bill and the debate over how to accomplish health care reform. Please stay tuned by subscribing to our RSS feed or by sharing this post via Twitter & Facebook. If you happen to work on Capitol Hill or have a stake in making government data widely accessible to facilitate political engagement, let us know what you think: firstname.lastname@example.org