*[http://www.speaker.gov/press-release/house-leaders-back-bulk-access-legislative-information "House Leaders Back Bulk Access to Legislative Information"] Speaker Boehner Press Office (6/5/2012)
*<div>[http://www.rules.house.gov/amendments/ISSA112xml65121559155915.pdf Amendment Offered to H.R. 5882], by Rep. Issa (R-CA) (6/5/2012)<br/></div>
*<div>House of Representatives Adopts Standards for Electronic posting of House and committee documents and data ([http://1.usa.gov/vyiRdV committee resolution as PDF]) ([http://1.usa.gov/y4HjO6 document naming conventions as PDF]) (December 2011)<br/></div>
*<div>"[http://assets.sunlightfoundation.com.s3.amazonaws.com/policy/papers/crs09_annrpt.pdf Annual Report of the Congressional Research Service of the Library of Congress for Fiscal Year 2009]" (January 2010). See page 20.<br/></div>
*<div>"[http://www.fdlp.gov/home/repository/doc_view/1089-public-printer-remarks Remarks from the Public Printer of the United States]" (October 19, 2009)<br/></div>
*<div class="notranslate">Library of Congress letter to Committee on House Administration on THOMAS ([http://www.scribd.com/doc/94063191/Library-of-Congress-letter-to-Committee-on-House-Administration-on-THOMAS 4/31/2008])<br/></div>
This wiki gathers information concerning public bulk access to information stored on THOMAS, a comprehensive Internet-accessible database that makes federal legislative information available to the public at no cost. THOMAS is operated by the Library of Congress and was launched in January of 1995 at the inception of the 104th Congress.
Quick Facts
At least twice as many people access congressional legislative information through third-party sources as directly through the THOMAS website. Major third-party sources include GovTrack.us, OpenCongress.org, and Sunlight's Congress app for Android.
Providing “bulk access to data” means releasing an entire database for use by others.
GPO currently publishes 6 datasets in bulk (including the Federal Register); Data.gov (launched March 2010) has 400,000 datasets; New Jersey and New Hampshire publish legislative data in bulk.
A coalition of organizations issues the Open House Project Report, calling on Congress to "embrace structured data by publishing the status of legislation and other information to the Web not only as it is now, but also in structured data formats." (May 2007) (http://bit.ly/HkPycb)
The Explanatory Statement accompanying the Committee Print of the House Committee on Appropriations for Public Law 111-8 (March 2009) articulates Congress' support for bulk access to legislative information. (http://1.usa.gov/I2UvJG p. 1770)
In 2008, the Library of Congress said it expected to report, within the first part of the calendar year, on the resources necessary to supply the public with raw legislative data. It established a bulk data task force, which has never completed its deliberations. (http://bit.ly/A4c5le)
Rep. Bill Foster introduced HR 6289 (111th Congress), a bill that would require some legislative data to be made available in bulk and would create a THOMAS advisory committee. (Sep. 2010) (http://1.usa.gov/HZthAp)
Congressional Facebook Hackathon endorses bulk access to legislative data as an action item: "Release Structured Machine-Readable Legislative Data: Providing legislative data in a bulk format to enable third-party developers to create more dynamic interfaces for legislative information." (November 2011) (http://1.usa.gov/ygzQpl)
30 organizations and companies call for bulk access to legislative data and the creation of an advisory committee. (April 6, 2012)
Library of Congress letter to Committee on House Administration on THOMAS (4/31/2008)
Civil Society Organization Resources
30 Organizations Send Letters to Appropriators and Rulemakers regarding bulk access to THOMAS (April 10, 2012)
Comments Submitted for the Record by Joshua Tauberer for House Committee on Appropriations Subcommittee on the Legislative Branch regarding bulk data for legislative information (February 6, 2012)
Open House Project Report: "Congressional Information & the Internet: A Collaborative Examination of the House of Representatives and Internet Technology" Chapter 3: Legislation Database (May 8, 2007)
"Access to Government Information on the Internet," Interpersonal Computing and Technology Journal (10/1993). Discusses the precursor to THOMAS, the Library of Congress Information System (LOCIS)
The Senate Committee on Rules and Administration and House
Oversight Committee, per the recommendation of the Secretary of
the Senate and the Clerk of the House, have approved the establishment
of a data standards program, including standard generalized
markup language [SGML] for data interchange of legislative
information. The purpose of this program is to ensure that the
preparation and exchange of legislative information is made more
efficient through the use of data standards. Once published these
standards will be used by all legislative branch agencies, including
GPO in transmitting and producing information which is utilized
in the legislative process. The Secretary of the Senate and Clerk
of the House will be responsible for updating and maintaining and
publishing the data interchange standards for legislative information.
S. Rept. 105-204 (accompanying Legislative Branch Appropriations Act, 1999)
In the conference report (H. Rept. 104–733) accompanying the
fiscal year 1997 legislative branch appropriation bill (Public Law
104–197), the Congressional Research Service was directed to coordinate,
and the Library of Congress was directed to provide technical
support, for the development of a legislative information retrieval
system to serve the Senate.
The Senate has undertaken a major program to rebuild its systems
for creating and managing its legislative information. Although
this program is going to take a number of years to complete,
the Senate is already realizing benefits from this program.
The Secretary of the Senate, with the technical support of the Sergeant
at Arms, is providing Senate offices floor amendments electronically
minutes after being introduced on the floor.
The retrieval system being designed and maintained to provide
a comprehensive legislative resource by the CRS and supported by
the Library is proving to be a valued recourse for Senate and congressional
office. CRS and the Library are, therefore, directed to
continue their development of the legislative retrieval system for
the Senate and provide an annual report outlining the strategic objective
of this initiative.
H. Conf. Rept. 105-734 (accompanying HR 4112, Legislative Branch Appropriations Bill for FY ending Sept. 30, 1999)
The conferees agree with language in the House report directing
the Library to develop measurements of the extent of the collections
security problem and with language in the Senate report urging
the Library to continue efforts to assist the Senate with a legislative
information retrieval system.
H. Rept. 106-635 (accompanying the Legislative Branch Appropriations Bill 2001)
Information security is a collective responsibility within the legislative
branch. The Clerk of the House in consultation with the Secretary
of the Senate shall consult with all legislative branch entities
that create or store legislative information in electronic form
and prepare standards and procedures for ensuring the security of
such information as well as for establishing a process to routinely
assess risks to the security of legislative information.
The Clerk in consultation with the Secretary shall submit proposals
for standards and procedures for approval to the Committee
on House Administration and the Senate Committee on Rules and
Administration, respectively, on a date to be specified by those
Committees. Upon approval, the Clerk, the Secretary, and the legislative
branch entities shall provide their plans to the House Committee
on Appropriations and Senate Committee on Appropriations.
The Library of Congress and the Government Printing Office
shall work with the Clerk and the Secretary to test, develop, and
implement, no later than January 3, 2001, systems that will enable
them to confirm the authenticity of such legislative information.
S. Rept. 107-37 (accompanying S. 1172, Legislative Branch Appropriations 2002)
The Committee recommends an appropriation of $8,571,000 for
expenses of the Office of the Secretary. The Committee has included
$7,000,000 for the Legislative Information System Augmentation
Project.
Joint Explanatory Statement, House Committee on Appropriations, Omnibus Act, 2009 (accompanying H.R. 1105 / Public Law 111-8, Omnibus Appropriations Act of 2009)
See Book G, explanatory statement on Congressional Research Service Salaries and Expenses, the paragraph starting with the phrase "Public Access to Legislative Data" (or page 10 of this PDF) (March 2009).
Public Access to Legislative Data.--There is support for enhancing public access to legislative documents, bill status, summary information, and other legislative data through more direct methods such as bulk data downloads and other means of no-charge digital access to legislative databases. The Library of Congress, Congressional Research Service, and Government Printing Office and the appropriate entities of the House of Representatives are directed to prepare a report on the feasibility of providing advanced search capabilities. This report is to be provided to the Committees on Appropriations of the House and Senate within 120 days of the release of Legislative Information System 2.0.
H. Rept. 112-511 (accompanying HR 5882, Legislative Branch Appropriations Bill for 2013)
During the hearings this year, the Committee heard testimony
on the dissemination of congressional information products in Extensible
Markup Language (XML) format. XML permits data to be
reused and repurposed not only for print output but for conversion
into ebooks, mobile web applications, and other forms of content delivery
including data mashups and other analytical tools. The Committee
has heard requests for the increased dissemination of congressional
information via bulk data download from non-governmental
groups supporting openness and transparency in the legislative
process. While sharing these goals, the Committee is also
concerned that Congress maintains the ability to ensure that its
legislative data files remain intact and a trusted source once they
are removed from the Government’s domain to private sites.
The GPO currently ensures the authenticity of the congressional
information it disseminates to the public through its Federal Digital
System and the Library of Congress’s THOMAS system by the
use of digital signature technology applied to the Portable Document
Format (PDF) version of the document, which matches the
printed document. The use of this technology attests that the digital
version of the document has not been altered since it was authenticated
and disseminated by GPO. At this time, only PDF files
can be digitally signed in native format for authentication purposes.
There currently is no comparable technology for the application
and verification of digital signatures on XML documents.
While the GPO currently provides bulk data access to information
products of the Office of the Federal Register, the limitations on
the authenticity and integrity of those data files are clearly spelled
out in the user guide that accompanies those files on GPO’s Federal
Digital System.
The GPO and Congress are moving toward the use of XML as
the data standard for legislative information. The House and Senate
are creating bills in XML format and are moving toward creating
other congressional documents in XML for input to the GPO.
At this point, however, the challenge of authenticating downloads
of bulk data legislative data files in XML remains unresolved, and
there continues to be a range of associated questions and issues:
Which Legislative Branch agency would be the provider of bulk
data downloads of legislative information in XML, and how would
this service be authorized. How would ‘‘House’’ information be differentiated
from ‘‘Senate’’ information for the purposes of bulk data
downloads in XML? What would be the impact of bulk downloads
of legislative data in XML on the timeliness and authoritativeness
of congressional information? What would be the estimated
timeline for the development of a system of authentication for bulk
data downloads of legislative information in XML? What are the
projected budgetary impacts of system development and implementation,
including potential costs for support that may be required
by third party users of legislative bulk data sets in XML, as well
as any indirect costs, such as potential requirements for Congress
to confirm or invalidate third party analyses of legislative data
based on bulk downloads in XML? Are there other data models or
alternative that can enhance congressional openness and transparency
without relying on bulk data downloads in XML?
The Committee directs the establishment of a task force composed
of staff representatives of the Library of Congress, the Congressional
Research Service, the Clerk of the House, the Government
Printing Office, and such other congressional offices as may
be necessary, to examine these and any additional issues it considers
relevant and to report back to the Committee on Appropriations
of the House and Senate.
Documents and Reports Prepared by Congress and Legislative Branch Support Agencies
Duplication Among Legislative Tracking Systems: Findings, A Report Prepared by the Library of Congress for the House and Senate Appropriations Committees Pursuant to House Report 103-517 and House Report 104-141, July 14, 1995
The Legislative Information System Strategic Objective Report, FY2012
Ideas for Upgrading THOMAS
Top Suggestions
Bulk Access to THOMAS data
Incorporate open data principles
Meta Suggestions
Have regular roundtable discussions with members of public and government to discuss ideas for improving THOMAS
Create THOMAS users group (email discussion?)
Programmer access page: for XML access, RSS feeds, email sign ups, etc.
Work to improve parsability of all search results; more structured data
All bills in XML
Single page (no pagination) that lists every bill in Congress with status; updated daily on a new page (for scraping); preferably in a feed or XML format
Create and make public unique IDs for commonly used entities (or draw upon those created by others)
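To illustrate the suggestions above, here is a minimal sketch of how a third-party developer might consume a single-page XML listing of bills keyed by unique IDs. Everything about the format is an assumption made for illustration: the element names (bills, bill, title, status), the ID scheme, the status values, and the sample data are hypothetical and do not describe anything THOMAS publishes.

```python
import xml.etree.ElementTree as ET

# Hypothetical single-page bill listing. The schema, IDs, and status values
# are placeholders invented for this sketch, not a real THOMAS format.
SAMPLE_LISTING = """\
<bills>
  <bill id="example-hr1">
    <title>Example House Bill</title>
    <status>INTRODUCED</status>
  </bill>
  <bill id="example-s2">
    <title>Example Senate Bill</title>
    <status>REPORTED</status>
  </bill>
</bills>
"""


def parse_bill_listing(xml_text):
    """Return a dict mapping each bill's unique ID to its title and status."""
    root = ET.fromstring(xml_text)
    bills = {}
    for bill in root.findall("bill"):
        bills[bill.get("id")] = {
            "title": bill.findtext("title", default=""),
            "status": bill.findtext("status", default="UNKNOWN"),
        }
    return bills


def changed_since(previous, current):
    """Return sorted IDs of bills that are new or whose status differs
    between two daily snapshots (both in parse_bill_listing's format)."""
    return sorted(
        bill_id
        for bill_id, info in current.items()
        if bill_id not in previous
        or previous[bill_id]["status"] != info["status"]
    )


if __name__ == "__main__":
    for bill_id, info in parse_bill_listing(SAMPLE_LISTING).items():
        print(bill_id, info["status"], info["title"])
```

With stable unique IDs and a daily snapshot, comparing two days' listings (as changed_since does) is enough to detect status changes without scraping paginated search results.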