Liberate OpenGovData Now
February 1, 2012 - by David Moore
Update: 5pm ET – as the #LDTC conference grinds to a close, I’m sorry to report that my (unfortunately) cynical prediction turned out to be the case – officials from the LoC & GPO (more details to come) refused to embrace serious movement on the bulk data task force that was mandated back in 2009. In other words, typical D.C. bureaucratic inertia & gridlock, complete blinders to the huge public demand on OpenCongress & across the open Web for raw legislative data & government info.
Part of Congress’s job should be to empower third party developers who are a permanent part of the infrastructure that brings legislative data to a huge slice of the public. By ignoring the public’s and Congress’s calls for bulk legislative data, administrators are ignoring part of what it means to be a responsible steward of public data. [Full post – please share.]
As a professional Congressional watchdog since ’07, one who is extremely cynical about systemic corruption & gridlock in Congress already, it is a slightly bizarre experience to find myself flush with visceral populist (i.e., public-benefit Web booster) anger about the unaccountable bureaucrats on the panel who refused to recognize the urgency of liberating #opengovdata for OC users & the public. I arrived with every expectation that the conference would not make significant strides towards the Principles of OpenGovData, and unfortunately that turned out to be the case.
Daniel Schuman of Sunlight pressed the foot-dragging gov’t officials to accept a (long-overdue) meeting (!!) with some of the most knowledgeable #opengov developers, e.g. Josh Tauberer & Eric Mill of Sunlight Labs, so that’s the next immediate step – although in their excruciating slow-walk bureaucrat turf-defending jargon, they agreed only to be “in dialogue” and to “seek input” – literally refusing to acknowledge that they were in violation of a clear Congressional directive, one that is also extremely popular around the Web and has wide-ranging social & economic benefits (well-documented – I mean, this is ridiculous). Take a look at Reddit Politics or TPM or HuffPo or RedState or Congress Matters – there’s a massive audience & demand for Congressional information. It is shocking that the officials today (names forthcoming) refuse to move with real determination towards bulk data access for the public. More grassroots efforts to come from PPF on demanding liberation of legislative data as a bare minimum first step towards #opengov & #deliberativedemocracy. For more info & context, follow Alex Howard, Jim Harper, Harlan Yu, and the #ldtc hashtag. Despite some good incremental steps, ultimately another disappointing D.C. event for the open data cause & OpenCongress mission – I know, we know, as one would expect. Onwards from this letdown.
It’s 2012 – we don’t have hover skateboards, and we don’t have #opengov. We could have the latter, at least, in the here & now, benefiting every American, if the systemically corrupt U.S. Congress were capable of reforming itself (which it is currently, unfortunately, not).
I’m writing this on the train from NYC to D.C., en route to the Conference on Legislative Data & Transparency to be held Thursday, Feb. 2nd – agenda here, webcast live here reportedly. As usual with these types of events, I’m here to rep for radical transparency & deliberative democracy.
House staffers (currently under GOP majority) will seek to gather warm plaudits from the #opengov community, but it’s less clear whether the LoC & Committee on Rules will move determinedly towards liberating raw legislative data that projects like OpenCongress could use. Of course, the primary reason that the legislative process remains closed-off – whether under Democratic or Republican House leadership – is that the arcane, inscrutable process benefits the majority party currently in power, which in turn accrues leverage & benefits in campaign contributions. (Right, “Escape From New York” art, w/ evocations of forceful liberation, you know… it’s past time, come on everyone. We have the technology.)
We’ll keep hammering this until we achieve real-world electoral reforms & Rules Committee staffers & others openly acknowledge why Congress is fundamentally broken: the systemic corruption of campaign donations and lack of full public financing of elections. We’re facing major empirical social ills in the form of a lack of consumer demand, as-yet-unpunished fraud in the banking & housing industries, catastrophic climate degradation, and more – and the U.S. Congress is incapable of bipartisan action towards enacting ameliorating bills. More recent evidence of Congressional helplessness includes the farcical rush-to-vote, refusal-of-expert tech testimony of the SOPA/PIPA push (thankfully slowed – for now – by the American Censorship coalition, of which PPF is a founding member) and recently the bipartisan rush on the STOCK Act (a just-fine common-sense bill on individual investments that purposely does nothing to address the systemic corruption of how bills move through Congress or serious campaign finance reform).
Towards some balance in the judgement here, pros and cons, carrots & sticks, you know how it shakes out: my compliments to the Congressional staff & government employees & civil servants & leading members of Congress who convened this meeting for their interest in this issue. It appears to be in good faith and as such represents a step forward. There’s also been some slight but notable progress in legislative XML under 112th Congress’ House GOP leadership, as summarized below, as well as ongoing positive consultation with the Open House Project (coordinated by our partners the Sunlight Foundation), the more-or-less-unsuccessful-and-now-defunct Bulk Data Task Force, plus shout-out to Jim Harper of the Cato Institute, who has been toiling admirably at the heavy lifting of tech specs - get at him for more info on that effort. Unlike those who live & work in the Beltway, however, I don’t feel obliged to judge progress on legislative transparency by the usual D.C. standards. I can demand, on behalf of the OpenCongress user community, immediate bulk-data access to primary source data on legislation, and then aggressive steps towards constructing a robust open API for THOMAS (and as a potential nearer-term step, full legislation in XML data formats). Anything short of that (and realistically, there’s not a puncher’s chance of sufficient progress) means this conference, like others before it, would fail to reach the Principles of Open Government Data. Full compliance w/ these community-generated principles is a necessary (but not even sufficient) condition of #opengov, in my view.
Under the U.S. Constitution, the work of the federal legislative branch has (as designed, in theory, if not nec. in practice) the most significant effect on the laws & public policy that shape our daily lives (cit. E-Klein). And yet, the U.S. Congress refuses to release legislative data to the public on the open Web in ways that are compliant with the ever-evolving Principles of Open Government Data. To be reductive (more detail available upon request and – to be fair – to be discussed at this conference, certainly, see agenda), official bill info lives on closed-off gov’t servers to which #opengov developers & the public do not have access (read-only file permissions, to be sure) until it’s posted on various head-scratchingly-poorly-designed websites, e.g. the GPO & THOMAS & CRS & LIMS & LIS & LLOC et al. This has been and continues to be indefensible stonewalling, and anyone who claims to have an understanding of open-source technology and also claims to support open-gov for transparency & accountability must support the immediate liberation of open data.
PPF’s fundamental premise (in OpenCongress, OpenGovernment and other software for civic engagement) is that public data should be fully public. Full stop. Transparency is a basic virtue of any sane, modern system of representative democracy in our constitutional republic. It increases public accountability, mitigates systemic corruption, reveals government waste, and encourages public trust & civic engagement with the political process. All these factors contribute to improved public policy outcomes and a greater national happiness – for evidence supporting these claims, see Prof. Beth Noveck’s Wiki Government, Open Government & Alex Howard‘s #opengov coverage & blogging by O’Reilly, research by @participatory, David Eaves, Robert Richards, and many others (see allied orgs. in footer of our homepage). Open data is widely accepted now as #opengov best practice, but it’s sorely lacking in robust practice.
If governments at the federal, state, county, and municipal levels around the country were even a tad forward-thinking, they’d be rushing to embrace the suites of new open technology:
- bulk data access & an API for #opengovdata;
- Open311 integration, for constituents to report non-emergency community issues;
- legislative tracking & constituent feedback through web apps like PPF’s OpenGovernment.org;
- documents published in DocumentCloud;
- more tools, as published in the Civic Commons Marketplace;
- even, dare we dream, an API for constituent communication, resulting in a more deliberative democratic process that’s now possible with tech tools we’re developing together in the commons. (This is promising, Code For America + Chitown.)
As it is, we’re seeing a landscape of governments starved of resources for improving their own lot or investing in vital tech infrastructure, and too much slavishness towards legacy consultants & solution providers at the expense of prioritizing open data on the open Web. As we’ve seen over the past decade, with a lack of eager embracing of the Internet, governments will likely move too slowly – then pretend to be “with it” by embracing some lousy commercial social media service as an #opengov fig-leaf – then slow-walk some decent one-quarter-measures for a while (like MADISON) until the next election cycle. Come on everyone, it’s 2012, let’s get an API for Congress and then go from there to some serious cleaning-house systemic campaign-finance reform to fight back against corporate control of the political process.
Brief history of the #opengov community’s failure to compel the Library of Congress & offices like House Rules Committee to give us our data: our non-profit PPF conceived of OpenCongress as a valuable public resource during the 2004 federal elections. Simply put, it was too difficult to browse, search, track, and understand bills in Congress on THOMAS. We began building a web app that aggregated official government data (from our valued & cornerstone-important data partner GovTrack.us, which is and was obliged largely to scrape THOMAS for data updates) with news & blog coverage (from Google News & Blog Search), campaign contributions (from the very-necessary OpenSecrets), a daily Blog covering Congress in plain language, and public comment forums (in the style of Slashdot & other p2p communities). (Left: political caricature by Thomas Nast against systemic corruption.)
In 2006, PPF obtained funding from the Sunlight Foundation to begin building OpenCongress, and launched publicly in February 2007. In November 2007, a working group of #opengov advocates gathered in Sebastopol, CA w/ O’Reilly Media to draft the Principles of OpenGovData - participants at bottom of page here, including Donny & me from PPF, as well as Prof. Lessig & Carl Malamud & Josh Tauberer & many others. Four and a half years later, we’re still waiting & working.
As PPF developed OC as a free, libre, and open-source not-for-profit web app, we added engagement features, as well as new data sources: more campaign finance analysis from MAPLight, video from Metavid, a semantic MediaWiki previously called Congresspedia, issue group ratings from VoteSmart, social media mentions, Wikipedia bios, Bing News results, streaming online video from official Congressional YouTube hubs, Contact-Congress features, and more. But data access remains stuck in the past.
Writing today in 2012, the process for obtaining bill data on OC is, sadly, status quo. Bills appear on government sites, they’re obtained by GovTrack and others, and then sent via automatic processes to OpenCongress and others – sometimes many hours after they’re first available online at some far-flung primary source. Before that, they’re often published in useless (actually, intentionally-closed-off) .PDF formats, which our small non-profit team has struggled over the years to liberate into open standards. We’ve had tools & demand for years to move without delay towards liberation.
PPF gave up some mild praise of the MADISON project around Rep. Issa‘s (R-CA) OPEN Bill – though guys, to be honest, just give us the data and let us design the site & its user interface, as this thing isn’t about to win any design compliments. Think of it like a “market solution” type of approach – let the market decide the best interface, but put it out there on the level playing field of the open Web. Last month’s long-awaited unveiling of Docs.House.Gov is a remarkable & praiseworthy step forward – big-ups to XML of upcoming legislation – but it’s not comprehensive enough for us to change the basic processes by which OpenCongress obtains data. Our demands for true legislative transparency are as follows:
I. Bulk data access – with this, OpenCongress could mash-up previous versions of bills and facilitate research of roll calls & sponsorship & other factors – and most importantly, we’d be sure we were coordinated via RSYNC network protocol (or even HTTP, golly) with the primary government publishing source. This can be arranged without delay and the Library of Congress should move aggressively towards offering this level of exhaustive read-only access.
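To make the bulk-access demand concrete, here is a minimal sketch of what staying in sync over plain HTTP could look like from a mirror's side. It assumes the publisher offers a manifest mapping each file name to a SHA-256 checksum. That manifest format, the file names, and the function names are all hypothetical illustrations, not anything the Library of Congress currently provides.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def plan_sync(manifest: dict[str, str], mirror_dir: Path) -> list[str]:
    """List the manifest entries that are missing or stale in the local mirror.

    `manifest` maps a relative file name to the digest the publisher
    reports for it (a hypothetical format, assumed for illustration).
    """
    to_fetch = []
    for name, digest in sorted(manifest.items()):
        local = mirror_dir / name
        # Fetch when the file is absent or its contents no longer match.
        if not local.is_file() or sha256_of(local) != digest:
            to_fetch.append(name)
    return to_fetch
```

A mirror would download only the names `plan_sync` returns, which is roughly the bookkeeping rsync does on its own when that protocol is available.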
II. Legislative XML feeds – this would be a fine middle step for bringing up-to-the-minute bill info into OC and enabling timely tracking of legislative actions. The just-launched XML feeds don’t apply, to my understanding, to any pre-2012 bills (OC has data going back to the 109th U.S. Congress via GovTrack), so significantly re-engineering OC just for this feed isn’t quite a priority. We’ll certainly incorporate it into our pages though, it’s pretty cool (but, again, painfully rudimentary & insufficient given available & well-documented & elsewhere-ubiquitous technology).
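For a rough sense of how little code it takes to consume such a feed, here is a sketch that pulls the latest action for each bill out of a small XML document. The element and attribute names (`bills`, `bill`, `title`, `action`, `date`) and the sample bills are invented for illustration. They are an assumption, not the actual Docs.House.Gov schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical feed; the real House XML schema will differ.
SAMPLE_FEED = """\
<bills congress="112">
  <bill number="EX-1">
    <title>Example Bill One</title>
    <action date="2012-01-31">Introduced</action>
    <action date="2012-02-01">Referred to committee</action>
  </bill>
  <bill number="EX-2">
    <title>Example Bill Two</title>
    <action date="2012-02-01">Introduced</action>
  </bill>
</bills>
"""


def latest_actions(feed_xml: str) -> list[dict[str, str]]:
    """Extract one record per bill: number, title, and most recent action."""
    root = ET.fromstring(feed_xml)
    records = []
    for bill in root.findall("bill"):
        actions = bill.findall("action")
        if not actions:
            continue  # nothing to track yet for this bill
        # ISO-8601 dates sort correctly as plain strings.
        latest = max(actions, key=lambda a: a.get("date", ""))
        records.append({
            "number": bill.get("number", ""),
            "title": bill.findtext("title", default=""),
            "action": latest.text or "",
            "date": latest.get("date", ""),
        })
    return records
```

Calling `latest_actions(SAMPLE_FEED)` yields one dictionary per bill, ready to drop into a tracking page.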
III. Open API – this has been done at the state level! It can be done in D.C., if we’re capable of pulling ourselves out of the swamp. See the pioneering work by the New York State Senate’s Open Initiative – while, clearly, THOMAS would have different data fields and likely would need to be slightly more complex, the NY Senate’s Open Legislation API offers a starting point for development of one for the U.S. Congress. It’s actually pretty easy for a non-programmer to parse the different offerings there and get a meaningful sense of how an outside developer (civic or commercial) would pick different data fields to display on his or her website. It’s a typically American tragicomedy that we have the ability to implement this widely- and directly-useful technology, but lack the political will to do so. This development roadmap by Andrew Hoppin, former CTO, is one of the most important #opengov blog posts of the last couple years. NY Senate was able to push through these amazing reforms only b/c of a confluence of circumstances – generally speaking, NY Senate is one of the few among the 99 U.S. state legislative chambers (NE is unicameral yo) that practice true openness, and even that political momentum has been under attack by local changes in the political winds.
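To illustrate the "pick different data fields" idea, here is a sketch of the kind of read-only lookup an open API might expose: ask for a bill by identifier, optionally name the fields you want, and get JSON back. The store, the field names, and the `get_bill` function are hypothetical. This is not the NY Senate's Open Legislation API, just an assumed shape for discussion.

```python
import json
from typing import Optional

# Hypothetical in-memory store standing in for the official bill database.
BILLS = {
    "EX-1": {
        "id": "EX-1",
        "title": "Example Bill One",
        "sponsor": "Rep. Example",
        "status": "introduced",
    },
}


def get_bill(bill_id: str, fields: Optional[list[str]] = None) -> str:
    """Return a JSON document for one bill, optionally limited to `fields`."""
    bill = BILLS.get(bill_id)
    if bill is None:
        return json.dumps({"error": "not found", "id": bill_id})
    if fields:
        # Keep only the requested fields that actually exist on the record.
        bill = {key: bill[key] for key in fields if key in bill}
    return json.dumps(bill, sort_keys=True)
```

An outside developer who only wants a headline and a status for a widget would call `get_bill("EX-1", ["title", "status"])` and ignore the rest.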
The above addresses broad data access & formats – but for a greater (yet basic) degree of #opengov transparency, how about requiring version control of legislation? That is to say, reductively, requiring that legislative assistants & staffers & even lobbyists & members of Congress themselves have unique logins for openly writing, editing, and commenting on draft legislation. To be sure, staffers could still have off-the-record private conversations of the expected political-calculus and/or horse-trading sort – but for the public record, language or policy ideas cribbed from lobbyists could be programmatically & easily identified as such. Which wouldn’t necessarily be a pejorative thing at all – but would be a basic accountability measure. To be reductive, if a member’s staffer adds funding for a bridge for her district, observers of the bill could find the individuals responsible and bring it to other parties for input. Constituents could be continually polled with ranked-choice voting about specific, significant sections of legislation. The details matter, as we know – facing down a federal election year in which the major health-care reform bill is likely to be an issue both in the Supreme Court and the Presidential campaign, wouldn’t it be helpful to know which individual staffers took responsibility for which sections of the bill, and which lobbyists & interest groups can be identified as crafting specific language? This isn’t even to address the benefits of metadata linking campaign donations to specific bill provisions, budget & spending transparency, real-time financial disclosure of senators’ campaign contributions (imagine that), and other realistic-yet-lofty ideas. (At right: OC’s custom Message Builder for Contact-Congress features, free of charge & in open-source code to contribute libre-licensed social wisdom back to the public commons.)
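The attribution idea above can be sketched in a few lines, in the spirit of a "blame" view in any version control system. This assumes each saved draft is stored whole alongside the login that saved it. The `Revision` type, the logins, and the sample provisions are hypothetical, and a real system would need far more (sections, amendments, audit trails).

```python
import difflib
from dataclasses import dataclass


@dataclass
class Revision:
    author: str       # the unique login that saved this version
    lines: list[str]  # full text of the draft, one provision per line


def attribute_lines(history: list[Revision]) -> list[tuple[str, str]]:
    """Pair each line of the final draft with the login that introduced it."""
    if not history:
        return []
    attributed = [(history[0].author, line) for line in history[0].lines]
    for rev in history[1:]:
        old_lines = [line for _, line in attributed]
        matcher = difflib.SequenceMatcher(a=old_lines, b=rev.lines, autojunk=False)
        updated = []
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged text keeps whoever introduced it earlier.
                updated.extend(attributed[i1:i2])
            else:
                # Inserted or replaced text belongs to this revision's author;
                # deleted text contributes nothing (j1 == j2).
                updated.extend((rev.author, line) for line in rev.lines[j1:j2])
        attributed = updated
    return attributed
```

Given a history where one login adds a bridge-funding line and another later rewrites the definitions, the result lists each surviving line next to the login responsible for its current wording.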
Congressional staff may say “we’re working with you”, but unless they’re moving determinedly towards a modern open API for THOMAS this year, we don’t share the same vision. They’ll cite expert practitioners & issues with data publishing from their end, and we can cite same from open-source community on opening up – e.g., CivicCommons, Sunlight Policy Dep’t (shouts Daniel & John), our own team & others.
The art-deco atmosphere of the conference tomorrow – and assurances that more data is coming, it’s coming, maybe after the election – will be nice enough, but I call on the #opengov community to become more vociferous and use the word “demand” – as in “demand for our users, demand for the public interest” – in calling for full & immediate #opengovdata. But with the House GOP leadership pursuing an explicit strategy of gridlock & hyper-partisan symbolic bills until after the Nov. 2012 elections, real data liberation is unlikely. Here’s to looking forward to comprehensive electoral reforms such as the following: score voting, non-partisan re-districting, right-to-vote laws, real-time financial disclosure of campaign contributions & lobbyist meetings, strong ethics reforms to prevent D.C.‘s endemic revolving door problem, and full public financing of elections to elect a Congress that will finally give the public access to its data. Maybe even a discussion about the undemocratic nature of the U.S. Senate and the increasingly abused filibuster process. This may seem ambitious or impatient, but this isn’t a negotiation with our public servants & elected officials – this is about liberating public legislative data via an open API and proceeding to enhancements for constituent communication (coupled with fair elections & ethics rules with teeth) for a living, breathing deliberative democracy. The public interest needs more persuasive, creative, aggressive defenders.
Questions, thoughts, comments welcome: david at opencongress.org. PPF is a 501c3 non-profit organization – we incorporated as a public charity because we believe it is the best possible foundation for reforming our contemporary representative democracy – organizations with the highest level of public-mission written into their charters just behave differently & more positively, we’ve found – public donations (tax-exempt, by the way) go directly towards paying our (considerable) server costs and keeping OpenCongress on the Web: donate, or become a recurring donor of $10 a month to support our work on OC. We’re building user-friendly Web interfaces for civic engagement and we foresee a heartening surge of demand in a saner future – help us grow.
Update 12pm ET: support true #opengov legislation, like the Public = Online Act (HR 1348) from the Sunlight Foundation & allies. Email your members of Congress on OC to support it. More info on the rich public resource Transparency Hub on the OC Wiki. In the previous 111th Congress, per John Wonderlich of Sunlight Policy, there was a great bulk data access bill, H.R. 6289 (111th) – it needs a new sponsor in the 112th, though as per the above, not much is going to happen (unfort.) until after the elections, with the House in gridlock mode. Surf along w/ my updates on the popular micro-publishing service.