THOMAS bulk data access
From OpenCongress Wiki
A general reduction of $1,051,000 has also been taken. The GPO,<br/>in consultation with the Joint Committee on Printing, should review those materials which are non-legislative in nature now being<br/>charged against this appropriation and determine the extent to<br/>which House or Senate can provide direct reimbursement or reduce<br/>the need for such material.
</div><div style="margin-left: 40px"><br/></div></div></div></div></div>
=== H. Rept. [http://www.gpo.gov/fdsys/pkg/CRPT-104hrpt733/pdf/CRPT-104hrpt733.pdf 104-733] (accompanying P.L. 104-53, Legislative Branch Appropriations Act, 1997) ===
<div style="margin-left: 40px">Amendment No. 23: Deletes a provision proposed by the Senate</div><div style="margin-left: 40px">regarding an electronic information system. The managers on the</div><div style="margin-left: 40px">part of the House and Senate agree that the Congressional Research</div><div style="margin-left: 40px">Service, upon the request of the Senate Committee on Rules</div><div style="margin-left: 40px">and Administration, and in consultation with the Secretary of the</div><div style="margin-left: 40px">Senate and the heads of the appropriate offices and agencies of the</div><div><div style="margin-left: 40px">legislative branch, shall coordinate the development of an electronic</div><div style="margin-left: 40px">congressional legislative information and document retrieval</div><div style="margin-left: 40px">system to provide for the legislative information needs of the Senate</div><div style="margin-left: 40px">through the exchange and retrieval of information and documents</div><div style="margin-left: 40px">among legislative branch offices and agencies. The managers<br/></div><div style="margin-left: 40px">on the part of the House and the Senate also agree that the Library</div><div style="margin-left: 40px">of Congress shall assist the Congressional Research Service</div><div style="margin-left: 40px">in supporting the Senate in this effort, and shall provide technical</div><div style="margin-left: 40px">staff and resources as may be necessary.</div></div>
=== S. Rept. [http://www.gpo.gov/fdsys/pkg/CRPT-105srpt16/pdf/CRPT-105srpt16.pdf 105-16] (accompanying Supplemental Appropriations and Rescissions Act, 1997) ===
<div style="margin-left: 40px">The Committee recommends the transfer of $5,000,000 from</div><div style="margin-left: 40px">funds available under the heading “Senate” to the Secretary of the</div><div style="margin-left: 40px">Senate, to be available through September 30, 2000, for development</div><div style="margin-left: 40px">and implementation of a comprehensive, Senatewide legislative</div><div style="margin-left: 40px">information system [LIS]. The accounts from which the transfers</div><div style="margin-left: 40px">occur are contingent upon the approval of the Committee on</div><div style="margin-left: 40px">Appropriations. Pursuant to section 8 of the Legislative Branch Appropriations</div><div style="margin-left: 40px">Act, 1997, the Secretary is required to develop and implement</div><div style="margin-left: 40px">LIS under the oversight of the Committee on Rules and</div><div style="margin-left: 40px">Administration.</div>
=== H. Rept. [http://www.gpo.gov/fdsys/pkg/CRPT-105hrpt196/pdf/CRPT-105hrpt196.pdf 105-196] (accompanying H.R. 2209, Legislative Branch Appropriations Bill, 1998) ===
<div style="margin-left: 40px">An open exchange of technology, projects, plans and developments</div><div style="margin-left: 40px">is crucial to the success of a legislative branch wide information</div><div style="margin-left: 40px">system. It is expected, therefore, that the following organizations</div><div style="margin-left: 40px">will continue to participate and assist in all the efforts of the</div><div style="margin-left: 40px">Clerk of the House and the Secretary of the Senate: the Library of</div><div style="margin-left: 40px">Congress, the Government Printing Office, House Information Resources,</div><div style="margin-left: 40px">the Senate Computer Center, the General Accounting Office,</div><div style="margin-left: 40px">the Congressional Budget Office, and the Architect of the Capitol.</div><div style="margin-left: 40px"><br/></div><div><div style="margin-left: 40px">The Committee on House Oversight and the Senate Committee</div><div style="margin-left: 40px">on Rules and Administration have begun a process to develop a</div><div style="margin-left: 40px">common information dissemination system. The Legislative Information</div><div style="margin-left: 40px">System (LIS) being developed by the Congressional Research</div><div style="margin-left: 40px">Service and the Library of Congress, when completed, will</div><div style="margin-left: 40px">replace the retrieval functions for legislative information systems</div><div style="margin-left: 40px">currently being operated by House Information Resources (HIR).</div><div style="margin-left: 40px">The Library and CRS must devote sufficient resources to accomplish</div><div style="margin-left: 40px">the following during FY1998:</div><div style="margin-left: 40px"><br/></div><div style="margin-left: 80px">Provide comparable functionality so that legacy retrieval systems</div><div style="margin-left: 80px">can be retired by 12/31/98;</div><div style="margin-left: 80px"><br/></div><div style="margin-left: 80px">Improve the productivity of Congressional staff by making significant</div><div style="margin-left: 80px">progress in implementing previously identified high</div><div style="margin-left: 80px">priority functionality; and</div><div style="margin-left: 80px"><br/></div><div style="margin-left: 80px">Improve the accuracy, usability, and timeliness of legislative</div><div style="margin-left: 80px">information retrieval.</div></div>
=== H. Rept. [http://www.gpo.gov/fdsys/pkg/CRPT-112hrpt511/pdf/CRPT-112hrpt511.pdf 112-511] (accompanying HR 5882, Legislative Branch Appropriations Bill for 2013) ===
<div><div style="margin-left: 40px">During the hearings this year, the Committee heard testimony<br/></div><div style="margin-left: 40px">on the dissemination of congressional information products in Extensible</div><div style="margin-left: 40px">Markup Language (XML) format. XML permits data to be</div><div style="margin-left: 40px">reused and repurposed not only for print output but for conversion</div><div style="margin-left: 40px">into ebooks, mobile web applications, and other forms of content delivery</div><div style="margin-left: 40px">including data mashups and other analytical tools. The</div><div><div style="margin-left: 40px">Committee has heard requests for the increased dissemination of congressional</div><div style="margin-left: 40px">information via bulk data download from non-governmental</div><div style="margin-left: 40px">groups supporting openness and transparency in the legislative</div><div style="margin-left: 40px">process. While sharing these goals, the Committee is also</div><div style="margin-left: 40px">concerned that Congress maintains the ability to ensure that its</div><div style="margin-left: 40px">legislative data files remain intact and a trusted source once they</div><div style="margin-left: 40px">are removed from the Government’s domain to private sites.</div><div style="margin-left: 40px"><br/></div><div style="margin-left: 40px">The GPO currently ensures the authenticity of the congressional</div><div style="margin-left: 40px">information it disseminates to the public through its Federal Digital</div><div style="margin-left: 40px">System and the Library of Congress’s THOMAS system by the</div><div style="margin-left: 40px">use of digital signature technology applied to the Portable Document</div><div style="margin-left: 40px">Format (PDF) version of the document, which matches the</div><div style="margin-left: 40px">printed document. The use of this technology attests that the digital</div><div style="margin-left: 40px">version of the document has not been altered since it was authenticated</div><div style="margin-left: 40px">and disseminated by GPO. At this time, only PDF files</div><div style="margin-left: 40px">can be digitally signed in native format for authentication purposes.</div><div style="margin-left: 40px">There currently is no comparable technology for the application</div><div style="margin-left: 40px">and verification of digital signatures on XML documents.</div><div style="margin-left: 40px">While the GPO currently provides bulk data access to information</div><div style="margin-left: 40px">products of the Office of the Federal Register, the limitations on</div><div style="margin-left: 40px">the authenticity and integrity of those data files are clearly spelled</div><div style="margin-left: 40px">out in the user guide that accompanies those files on GPO’s Federal</div><div style="margin-left: 40px">Digital System.</div><div style="margin-left: 40px"><br/></div><div style="margin-left: 40px">The GPO and Congress are moving toward the use of XML as</div><div style="margin-left: 40px">the data standard for legislative information. The House and Senate</div><div style="margin-left: 40px">are creating bills in XML format and are moving toward creating</div><div style="margin-left: 40px">other congressional documents in XML for input to the GPO.</div><div style="margin-left: 40px"><br/></div><div style="margin-left: 40px">At this point, however, the challenge of authenticating downloads<br/></div><div style="margin-left: 40px">of bulk data legislative data files in XML remains unresolved, and</div><div style="margin-left: 40px">there continues to be a range of associated questions and issues:</div><div style="margin-left: 40px">Which Legislative Branch agency would be the provider of bulk</div><div style="margin-left: 40px">data downloads of legislative information in XML, and how would</div><div style="margin-left: 40px">this service be authorized. How would “House” information be differentiated</div><div style="margin-left: 40px">from “Senate” information for the purposes of bulk data</div><div style="margin-left: 40px">downloads in XML? What would be the impact of bulk downloads</div><div style="margin-left: 40px">of legislative data in XML on the timeliness and authoritativeness</div><div style="margin-left: 40px">of congressional information? What would be the estimated</div><div style="margin-left: 40px">timeline for the development of a system of authentication for bulk</div><div style="margin-left: 40px">data downloads of legislative information in XML? What are the</div><div style="margin-left: 40px">projected budgetary impacts of system development and implementation,</div><div style="margin-left: 40px">including potential costs for support that may be required</div><div style="margin-left: 40px">by third party users of legislative bulk data sets in XML, as well</div><div style="margin-left: 40px">as any indirect costs, such as potential requirements for Congress</div><div style="margin-left: 40px">to confirm or invalidate third party analyses of legislative data</div><div style="margin-left: 40px">based on bulk downloads in XML? Are there other data models or</div><div style="margin-left: 40px">alternative that can enhance congressional openness and transparency</div><div style="margin-left: 40px">without relying on bulk data downloads in XML?</div><div style="margin-left: 40px"><br/></div><div style="margin-left: 40px">The Committee directs the establishment of a task force composed</div><div style="margin-left: 40px">of staff representatives of the Library of Congress, the Congressional</div><div style="margin-left: 40px">Research Service, the Clerk of the House, the Government</div><div style="margin-left: 40px">Printing Office, and such other congressional offices as may</div><div style="margin-left: 40px">be necessary, to examine these and any additional issues it considers</div><div style="margin-left: 40px">relevant and to report back to the Committee on Appropriations</div><div style="margin-left: 40px">of the House and Senate.<br/></div></div></div>
== Congressional Hearings ==
== Documents and Reports Prepared by Congress and Legislative Branch Support Agencies ==
<div>Duplication Among Legislative Tracking Systems: Findings, A Report Prepared by the Library of Congress for the House and Senate Appropriations Committees Pursuant to House Report 103-517 and House Report 104-141, July 14, 1995</div><div><br/></div><div>[http://democrats.rules.house.gov/archives/theplan.htm A Plan for a New Legislative Information System for the United States Congress], Prepared by the Library of Congress, February 16, 1996</div><div><br/></div><div><div>The Legislative Information System Strategic Objective Report, FY2012</div></div><div><br/></div>
Revision as of 21:23, September 18, 2012
Introduction
This wiki gathers information concerning public bulk access to information stored on THOMAS, a comprehensive Internet-accessible database that makes federal legislative information available to the public at no cost. THOMAS is operated by the Library of Congress and was launched in January of 1995 at the inception of the 104th Congress.
Quick Facts
- At least twice as many people access congressional legislative information through third-party sources as directly through the THOMAS website. Major third-party sources include GovTrack.us, OpenCongress.org, and Sunlight's Congress app for Android.
- Providing “bulk access to data” means releasing an entire database for use by others.
- GPO currently publishes 6 datasets in bulk (including the Federal Register); Data.gov (launched March 2010) has 400,000 datasets; New Jersey and New Hampshire publish legislative data in bulk.
- A coalition of organizations issues the major Open House Report calling on Congress to "embrace structured data by publishing the status of legislation and other information to the Web not only as it is now, but also in structured data formats." (May 2007) (http://bit.ly/HkPycb)
- The Explanatory Statement accompanying the Committee Print of the House Committee on Appropriations for Public Law 111-9 (March 2009) articulates Congress' support for bulk access to legislative information. (http://1.usa.gov/I2UvJG p. 1770)
- In 2008, the Library of Congress said it expected to report, within the first part of the calendar year, on the resources necessary to supply the public with raw legislative data. It established a bulk data task force that has never completed its deliberations. (http://bit.ly/A4c5le)
- Rep. Bill Foster introduced HR 6289 (in the 111th Congress), which would require some legislative data to be made available in bulk and would create a THOMAS advisory committee. (Sep. 2010) (http://1.usa.gov/HZthAp)
- Congressional Facebook Hackathon endorses bulk access to legislative data as an action item: "Release Structured Machine-Readable Legislative Data: Providing legislative data in a bulk format to enable third-party developers to create more dynamic interfaces for legislative information." (November 2011) (http://1.usa.gov/ygzQpl)
- 30 organizations and companies call for bulk access to legislative data and the creation of an advisory committee. (April 6, 2012)
Blog Posts
- "Looking Forward to the THOMAS Beta Website" by Daniel Schuman (9/14/2012)
- "How to #FreeTHOMAS: A report on implementing bulk access" by Daniel Schuman et al (8/24/2012)
- "Rep. Honda Speaks on Bulk Access on the House Floor" by Daniel Schuman (6/8/2012)
- "Major Transparency Milestone in Bulk Access Statement" by Daniel Schuman (6/6/2012)
- "Issa amendment denied, but leadership supports bulk access" by Matt Rumsey (6/6/2012)
- "Issa Offers #FreeTHOMAS Amendment to Leg Approps Bill" by Daniel Schuman (6/5/2012)
- "Media Spotlight on Congress Stalling Open Access to Legislation" by Nicko Margolies (6/5/2012)
- "Bulk Access Language Tweaked by Approps" by Daniel Schuman (6/5/2012)
- "#FreeTHOMAS" by Daniel Schuman (6/4/2012)
- "Bulk Access Developments after the H. Approps Hearing" by Daniel Schuman (6/1/2012)
- "THOMAS Talking Points" by Daniel Schuman (5/30/2012)
- "Appropriators May Undercut Legislative Transparency" by Daniel Schuman and Eric Mill (5/30/2012)
- "Full Committee Markup on Leg Approps Set for Thursday" by Daniel Schuman (5/24/2012)
- "Will the House's Leg Spending Bill Match Its Transparency Priorities?" by Daniel Schuman (5/24/2012)
- "Two Steps Forward on Improving Public Access to Legislative Information" by Daniel Schuman (5/18/2012)
- "Appropriators Should Consider Public Access to Leg Info at Friday Mark-up" by Daniel Schuman (5/17/2012)
- "News Without Transparency: House Passes Bridge Bill After an Earmark Debate" by Matt Rumsey and Melanie Buck (5/10/2012)
- "Improve Public Access to Legislative Information" by Daniel Schuman (4/10/2012)
- "Help improve public access to Congressional/legislative information #FDLP" by James Jacobs (3/28/2012)
- "GovTrack Users Want Better Transparency From Congress" by Josh Tauberer (3/16/2012)
- "Tell Congress to Open Up" by Nicole Aro (3/12/2012)
- "Your Government Transparency 'To-Do'" by Jim Harper (3/12/2012)
- "Partners in Data Transparency: Parliaments and Non-Profits" by Daniel Schuman (3/1/2012)
- "Put THOMAS on the Fast Track" by Daniel Schuman (2/9/2012)
- "Benchmarks for Measuring Success for Legislative Data Transparency" by Daniel Schuman (2/2/2012)
- "Bulk Data at the House Legislative Data Conference" by John Wonderlich (2/2/2012)
- "Liberate OpenGovData Now" by David Moore (2/1/2012)
- "In #HackWeTrust - The House of Representatives Opens Its Doors to Transparency Through Technology" by Daniel Schuman (12/8/2011)
- "House Holding Wonk-a-thon on Public Access to Congressional Info This Thursday" by Daniel Schuman (12/5/2011)
- "Sunlight Testimony: Bulk Access to THOMAS and Access to CRS Reports" by Daniel Schuman (12/5/2011)
- "Read the Bill 2.0" by Daniel Schuman (11/14/2010)
- "Rep. Foster Introduces Bill To Improve THOMAS" by Daniel Schuman (9/30/2010)
- "Apps for THOMAS: 3 Wishes" by Daniel Schuman (7/29/2010)
- "Birds of a Feather: What's in the DISCLOSE Bills" by Daniel Schuman (5/3/2010)
- "Tip of the Hat to THOMAS" by Daniel Schuman (1/6/2010)
- "House Leg Branch Appropriations Review" by John Wonderlich (6/27/2009)
- "Legislative Databases recommendation makes it to House Leg Branch Appropriations markup" by Josh Tauberer (4/14/2008)
- "Congressman Honda on the Open House cause" by Josh Tauberer (2/1/2008)
- Discussion on the Open House Project email list (link) (11/14/2007)
- "Mash-ups for government transparency" by Josh Tauberer (1/25/2007)
- "Finding Bills Online" by Paul Blumenthal (1/9/2007)
Policy Documents and Gov't Resources
Government Resources
- "House Leaders Back Bulk Access to Legislative Information" Speaker Boehner Press Office (6/5/2012)
- Amendment Offered to H.R. 5882, by Rep. Issa (R-CA) (6/5/2012)
- House of Representatives Adopts Standards for Electronic posting of House and committee documents and data (committee resolution as PDF) (document naming conventions as PDF) (December 2011)
- House of Representatives launches transparency portal docs.house.gov (December 2011)
- "Annual Report of the Congressional Research Service of the Library of Congress for Fiscal Year 2009" (January 2010). See page 20.
- "Remarks from the Public Printer of the United States" (October 19, 2009)
- Library of Congress letter to Committee on House Administration on THOMAS (4/31/2008)
Civil Society Organization Resources
- 30 Organizations Send Letters to Appropriators and Rulemakers regarding bulk access to THOMAS (April 10, 2012)
- Comments Submitted for the Record by Joshua Tauberer for House Committee on Appropriations Subcommittee on the Legislative Branch regarding bulk data for legislative information (February 6, 2012)
- Comments Submitted for the Record by the Sunlight Foundation for the House Committee on Appropriations Subcommittee on the Legislative Branch Hearing (February 6, 2012)
- Comments Submitted for the Record by the Sunlight Foundation for the House Committee on Appropriations Subcommittee on the Legislative Branch Hearing Regarding Bulk Access to THOMAS data (May 11, 2011)
- Open House Project Report: "Congressional Information & the Internet: A Collaborative Examination of the House of Representatives and Internet Technology" Chapter 3: Legislation Database (May 8, 2007)
News Stories
- "In Support of Legislative Transparency" Google Public Policy Blog (6/15/2012)
- "Federal News Minute" WNEW-FM - Washington, D.C. (6/8/2012)
- "Congressional data may soon be easier to use online" Washington Post (6/8/2012)
- "Rep. Crenshaw backs down, loses control over bulk data issue" GovTrack.us (6/7/2012)
- "A week ago, we wrote about Congress..." Skimmer Hat (6/6/2012)
- "Free THOMAS!" Fierce Government (6/5/2012)
- "House Appropriations trims legislative agencies budget request" Fierce Government (6/5/2012)
- "Report May Hinder Goal of Open Congress" Roll Call (6/5/2012)
- "Of, By and For: A Short Legislative History of THOMAS, the Spirit of the Law, and Elle Woods" Lulu in the Library (6/5/2012)
- "Rep. Crenshaw thinks American public can’t be trusted with overseeing Congress" GovTrack.us (6/4/2012)
- "Can we stop talking about accountability for a minute? Please?" Legal Information Institute - Cornell University Law School (6/2/2012)
- "House Appropriators May Limit Public Availability of Pending Bills" Slashdot (6/1/2012)
- "For Transparency Advocates, the Honeymoon with House Republicans May Be Over" Tech President (6/1/2012)
- "Hill may freeze THOMAS in digital past" Washington Examiner (5/31/2012)
- "Transparency group decries legislative data bulk download prohibition" Fierce Government IT (5/31/2012)
- "Congress Refuses to #FreeTHOMAS Open Congress" Open Congress (5/17/2012)
- "Open government advocates seek greater access to congressional data" Federal News Radio.Com (4/16/2012)
- "GovTrack users want better transparency from Congress" GovTrack.us (4/16/2012)
- "US Agency Takes 'Private' Approach to Streamlining IT Procurement" E-Commerce Times (4/14/2012)
- "Your Government Transparency 'To Do'" Washington Watch (4/12/2012)
- "Transparency Groups Call for THOMAS bulk downloads" Fierce Government IT (4/11/2012)
- "Transparency Groups Say THOMAS website is outdated" Federal Computer Week (4/10/2012)
- "An API for Federal Legislation? Congress Wants Your Opinion" Threat Level (3/5/2009)
- "Congressional Data Mining: Coming Soon?" Mother Jones (3/5/2009)
- "Bulk Data Downloads: A Breakthrough in Government Transparency" O'Reilly Radar (3/4/2009)
- "Lawmakers favor outside access to legislative data" Government Executive (1/23/2008)
Additional Resources
- "Government: Do you really need an API" by Eric Mill (3/21/2012)
- Sites that use GovTrack Data (list)
- THOMAS RSS feeds (link)
- How often is THOMAS updated (link)
- Josh Tauberer on Civic Technology (link)
- House of Representatives Adopts Standards for Electronic posting of House and committee documents and data (committee resolution as PDF) (document naming conventions as PDF)
- House of Representatives launches transparency portal docs.house.gov
- Library of Congress letter to Committee on House Administration on THOMAS (4/31/2008)
The History of THOMAS Generally
- "Congress on the Internet: New Web Server Organizes Online Information" Library of Congress Information Bulletin (1/25/1995) - Announces the creation of THOMAS and includes introductory remarks at Jan. 5 launch event by then-Speaker Gingrich
- "Access to Government Information on the Internet" Interpersonal Computing and Technology Journal (10/1993) - Discusses the precursor to THOMAS, the Library of Congress Information System (LOCIS)
- "The Hill on the Net: Congress Enters the Information Age," by Chris Casey (1996) - Has history of creation of THOMAS.
States that provide bulk access to legislative data
- New Hampshire
- New Jersey
- The Sunlight Foundation scrapes and provides bulk access to legislative data for 50 of 50 states
Historical Resources on the Development of Congress' Legislative Information Systems
Legislative Language and Committee Reports
H. Rept. 103-517 (accompanying P.L. 103-283, Legislative Branch Appropriations Act, 1995)
S. Rept. 104-114 (accompanying Legislative Branch Appropriation 1996)
H. Rept. 104-141 (accompanying the Legislative Branch Appropriations Bill, 1996)
methods for increasing electronic printing of House documents. The
proposal should be coordinated with the House entities (such as
committees, legislative and law revision counsels, etc.) who require
of Assistant Clerk (FEC) has been eliminated in a reorganization
of the Clerk’s office; funds for that position, therefore, have not
been provided. Funds for subscriptions to the U.S. Code have also
been deleted from the Clerk’s budget. For those Members who require
funds. Alternatively, the Code is available in the House library, at
the Library of Congress, on Internet through the ‘Thomas’ connection,
ROM which is available from the Government Printing Office.
Closed captioning funds are not provided since the Committee has
been told that the contract will be renewed with FY 1995 funds.
Also, funds for contracting out stenographic reporting of Committee
hearings are provided in the Clerk’s budget ($800,000, a savings of
$300,000 below the amount provided in FY 1995.) It should be
noted that funding for the U.S. Code, stenographic contracting, and
newspaper subscriptions have formerly been carried in the ‘
control the use of these funds, but does the ordering or contracting
as a service to other House offices, a more convenient administrative
to determine their continued need and to fully inform the ultimate
consumers of their actual cost.
H. Rept. 104-212 (accompanying Legislative Branch Appropriations 1996)
H. Rept. 104-657 (accompanying Legislative Branch Appropriations bill 1997)
♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦
The Committee has included $81,669,000 for printing and binding of congressional documents at the Government Printing Office
for use by Congress and by-law programs. The amount provided reflects a savings of $1,050,000 by converting the permanent, bound
Congressional Record to a CD–ROM format. The daily Congressional Record will continue to be distributed in the formats preferred by the recipients, i.e., paper or microfiche, and is also available electronically via the widely-accessible Internet distribution
network through the THOMAS system and the GPO ACCESS network. The permanent, paper-based bound Record, which is delayed
in production by 8 years at the present time, is a perfect candidate
for electronic format. Each set costs almost $12,000 to print and
bind, and is made available in limited quantities. CD–ROM’s can
be provided at a fraction of this cost and will be very flexible research tools in library or office settings, where the bound paper
sets are normally utilized. The bill provides $100,000 for a more
limited number of printed, permanent Records which can be produced from the less expensive CD–ROM format data base setup.
These copies can be distributed at the direction of the Joint Committee on Printing. For those offices and institutions that cannot
do without paper copies, CD–ROM’s can be printed by commercial
printing establishments at a much smaller cost than current charges
against the Congressional printing and binding appropriation.
The Committee has been informed that the conversion to CD–ROM
will expedite the availability of the permanent version of the
Congressional Record by several years, thereby making it available
much sooner than the current 8-year delay. The GPO is directed
to develop a plan that will minimize the time necessary to distribute this record of House and Senate debate. The plan should include the objectives and a time line for achieving the time savings.
Also, the GPO, in consultation with the Library of Congress, should plan to make the CD–ROM version of the permanent Record available on Internet to the broadest possible audience.
Both plans should be presented in the fiscal year 1998 budget
submission.
A general reduction of $1,051,000 has also been taken. The GPO,
in consultation with the Joint Committee on Printing, should review those materials which are non-legislative in nature now being
charged against this appropriation and determine the extent to
which House or Senate can provide direct reimbursement or reduce
the need for such material.
H. Rept. 104-733 (accompanying P.L. 104-53, Legislative Branch Appropriations Act, 1997)
S. Rept. 105-16 (accompanying Supplemental Appropriations and Rescissions Act, 1997)
H. Rept. 105-196 (accompanying H.R. 2209, Legislative Branch Appropriations Bill, 1998)
S. Rept. 105-47 (accompanying S. 1019, Legislative Branch Appropriations for FY ending Sep. 30 1998)
♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦ ♦
S. Rept. 105-204 (accompanying Legislative Branch Appropriations Act, 1999)
H. Conf. Rept. 105-734 (accompanying HR 4112, Legislative Branch Appropriations Bill for FY ending Sept. 30, 1999)
H. Rept. 106-635 (accompanying the Legislative Branch Appropriations Bill 2001)
S. Rept. 107-37 (accompanying S. 1172, Legislative Branch Appropriations 2002)
H. Rept. 110-98 (accompanying Legislative Branch Appropriations Bill, 2008)
Improved Access to Roll Call Information.—The Committee believes
the public could benefit from more easily accessible roll call
information. To that end, the Committee requests that the Chief
Administrative Officer work with the Clerk of the House and the
Library of Congress to study how, within the public House of Representatives
website and the THOMAS website, a joint system
might be developed to allow roll call searches by specific word, and
report back to the Committee on Appropriations of the House by
December 1, 2007.
Joint Explanatory Statement, House Committee on Appropriations, Omnibus Act, 2009 (accompanying H.R. 1105 / Public Law 111-8, Omnibus Appropriations Act of 2009)
See Book G, explanatory statement on Congressional Research Service Salaries and Expenses, the paragraph starting with the phrase "Public Access to Legislative Data" (or page 10 of this PDF) (March 2009).
H. Rept. 112-511 (accompanying HR 5882, Legislative Branch Appropriations Bill for 2013)
Congressional Hearings
Documents and Reports Prepared by Congress and Legislative Branch Support Agencies
Ideas for Upgrading THOMAS
Top Suggestions
- Bulk Access to THOMAS data
- Incorporate open data principles
Meta Suggestions
- Have regular roundtable discussions with members of the public and government to discuss ideas for improving THOMAS
- Create THOMAS users group (email discussion?)
- Programmer access page: for XML access, RSS feeds, email sign ups, etc.
- Work to improve parsability of all search results; more structured data
- All bills in XML
- Single page (no pagination) that lists every bill in Congress with status; updated daily on a new page (for scraping); preferably in a feed or XML format
- Create and make public unique IDs for commonly used entities (or draw upon those created by others)
- List of all Committee and Subcommittee Members
- Incorporate Senate Amendments (See S Res. 562)
- Consider redesign of site (look at LIS, GovTrack, OpenCongress for ideas + public)
- Provide more detailed history of how THOMAS came to be
Specific Suggestions
- Make Public Laws Searchable by law number and by name
- Allow for bill alerts system (email) for bills and topics
- Add short name of bill to weekly top 5 (plus link to archives)
- Allow highlighting of "hot" bills -- where there's some kind of legislative action
- Word/Phrase vs. Bill Number
  - have search box handle both;
  - allow search of entire bill text
  - make selection of phrase vs number sticky
- Improve "related bills" -- run comparison of bill summaries/ text -- both in this Congress and over past Congresses
- Make it easier to trace bills through the process, especially when there is a substitute
  - e.g., HR 3200 became HR 3590
- Is legislation searchable by CRS tags? (Make available list of tags). Add tags to each bill, so can search for related bills.
- Organize front page of THOMAS around what's going on today in Congress; with info on yesterday and upcoming
- Permalink: "save" on share/save tab is confusing; perhaps make its own link
- Daily Digest -- when sending email, include contents of daily digest, not just link
- Increase size of search fields
- 3 organizing links:
  - what's going on today -- running info from floor embedded into THOMAS
  - what happened yesterday
  - what's upcoming this week
- order plain language search for bills by topic + frequency and tags
- Is search boolean?
  - want to be able to eliminate terms from search (the "not" function, e.g. Israel not steve)
- When a search result includes a calendar, link to it automatically
Fun Suggestions
- Create twitter account to tweet whenever a bill is introduced (see OLRC) or goes to committee, enacted, etc.; tweet top five viewed bills
- Mobile version
