StateRep.Me Blog

By Christopher M Brown

StateRep.Me would not be possible without great open data available from places like the Sunlight Foundation's OpenStates project. In the spirit of giving back to the civic hacking and open data movement, we are opening up some of the new data we've collected and crunched to the wider community so that others can hopefully put it to good use.

Version 0.10 of the StateRep.Me API makes available the ideology scores and press releases for state legislators in PA. We hope to add more original data for the API going forward. We will not be adding data already available through existing APIs such as OpenStates to StateRep.Me's API, focusing instead on giving new data to the community.

Press Releases

When putting together StateRep.Me, we wanted to add to the data already out there. One hole in existing APIs and data on state legislators was press releases. We agree with research in political science that press releases can serve as a useful metric for legislators' priorities, so we knew it would be important to collect them.

However, getting this data was not easy. No central repository of state legislators' personal homepages or press releases exists. Not only did we have to scrape the press releases, we also had to find all the personal pages where legislators post them. Scraping these websites posed a number of challenges. While there is some consistency in the layout of homepages and the placement of press releases on them, there are many edge cases that required special care when writing web scrapers. Additionally, some of the HTML on these pages is crude and hard to parse. To get an idea of what it took to grab the data, check out the web-scraping code we used on GitHub.

There are a total of 15,908 press releases available through the API as of December 13, 2012. The earliest press release in our database is from January 14, 2001; however, we can only be certain that our data covers 2011-2012 comprehensively because we only grabbed press releases from members serving in the 2011-2012 legislative session.

The database itself will be updated nightly. There are bound to be errors and inconsistencies that we are not aware of yet, so if you find some, we would appreciate it if you let us know via e-mail or on the issue tracker on StateRep.Me's GitHub account.

Ideology Scores

Another area where we saw an opening to add value was using voting records to measure ideology. With sometimes more than 1,000 votes taken in a single legislative session, making sense of what it all means can be a daunting task. Drawing on research in political science, we used a common method to scale and determine the ideology of legislators based on their voting records.

I'm going to leave a more detailed explanation of the statistical methods, shortcomings, and advantages of the method we decided to use to a later blog post. However, if you are interested in seeing the code used to produce these scores, it is available on GitHub. For the ambitious, you can check out the linked academic articles above as well.

For StateRep.Me, we use these scores to produce the plots that display the distribution of ideology and to determine the conservative or liberal ranking of state legislators. We have plans to estimate ideology and polarization for previous legislative sessions as well.

Talk is Cheap, Show me the Code!

The API is constructed using the TastyPieAPI app for Django. This made it easy to set up and will hopefully enable us to add more functionality as we develop StateRep.Me further.

The search URL is fairly straightforward and is similar to many other APIs publicly available.

staterep.me/api/v1/[DATA-TYPE]/?[SEARCH PARAMETERS]

The results are returned as JSON objects (we hope to add other functionality as well). There is a limit of 20 search results per page, with a link provided to the next page of results if there are more than 20.
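As a rough illustration of the URL pattern above, here is a minimal Python sketch that assembles a search URL from a data type and a set of filters. The `build_url` helper and the `http://` scheme are assumptions made for this example, not part of the API itself, and the filter names in the usage line are borrowed from the example query later in this post.

```python
from urllib.parse import urlencode

# Assumption: the post gives the host without a scheme; http:// is a guess.
BASE_URL = "http://staterep.me"


def build_url(data_type, **filters):
    """Return /api/v1/[DATA-TYPE]/?[SEARCH PARAMETERS] for the given filters."""
    url = "{}/api/v1/{}/".format(BASE_URL, data_type)
    if filters:
        # urlencode keeps the keyword order and leaves underscores untouched.
        url += "?" + urlencode(filters)
    return url


print(build_url("preferences", chamber__startswith="l", party__startswith="D"))
```

Keeping the filters as keyword arguments means the double-underscore lookups can be written exactly as they appear in the query string.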

There are 2 data types:

  • preferences
  • press_releases

It is possible to filter results for preferences by chamber (u=State Senate, l=House), OpenStates legislator ID (e.g. PAL000001), or party (R=Republican, D=Democrat). Each search will return a JSON object with 2 top-level keys:

  • meta provides information on the search and its results
  • objects contains the actual data you are probably interested in
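Since each page holds at most 20 results, collecting a full result set means following the link to the next page until there is none. The sketch below is one way this might look in Python; it assumes the link lives in `meta["next"]` as a path relative to the host (as in the example responses in this post) and that the site is served over `http://`. The names `fetch_json` and `iter_objects` are hypothetical helpers, not part of the API.

```python
import json
from urllib.parse import urljoin
from urllib.request import urlopen

# Assumption: the post gives the host without a scheme; http:// is a guess.
BASE_URL = "http://staterep.me"


def fetch_json(url):
    """Download one page and decode its JSON body (needs network access)."""
    with urlopen(url) as response:
        return json.loads(response.read().decode("utf-8"))


def iter_objects(first_url, fetch=fetch_json):
    """Yield every record across all pages by following meta["next"]."""
    url = first_url
    while url:
        page = fetch(url)
        yield from page["objects"]
        next_link = page["meta"]["next"]  # None (JSON null) on the last page
        url = urljoin(BASE_URL, next_link) if next_link else None
```

Passing the `fetch` function in as an argument makes it easy to try the loop against canned pages before pointing it at the live API.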

Get all Democrats in the House:

/api/v1/preferences/?chamber__startswith=l&party__startswith=D

Result:


    {
       "meta":{
          "limit":20,
          "next":"/api/v1/preferences/?chamber__startswith=l&party__startswith=D&limit=20&offset=20",
          "offset":0,
          "previous":null,
          "total_count":90
       },
       "objects":[
          {
             "chamber":"lower",
             "ideology":-0.753263832506344,
             "legid":"PAL000053",
             "party":"Democratic"
          },
        ...
          {
             "chamber":"lower",
             "ideology":-0.849487427764698,
             "legid":"PAL000088",
             "party":"Democratic"
          }
       ]
    }
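Once the records are in hand, ordering legislators by score takes only a few lines. This is a minimal sketch using the two records shown above; `rank_by_ideology` is a hypothetical helper, and it simply sorts from the lowest score to the highest. The post does not say which end of the scale is liberal or conservative, so treat any interpretation of the ordering as an assumption (both Democrats in the sample happen to have negative scores).

```python
def rank_by_ideology(records):
    """Return the records sorted from lowest to highest ideology score."""
    return sorted(records, key=lambda record: record["ideology"])


# The two records shown in the example response.
sample = [
    {"chamber": "lower", "ideology": -0.753263832506344,
     "legid": "PAL000053", "party": "Democratic"},
    {"chamber": "lower", "ideology": -0.849487427764698,
     "legid": "PAL000088", "party": "Democratic"},
]

for position, record in enumerate(rank_by_ideology(sample), start=1):
    print(position, record["legid"], record["ideology"])
```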

When searching for press releases, you can filter on two fields:

  • date - you are able to search for press releases before, after, or between dates
  • OpenStates legislator ID

Results will include the full text (if available), date, title, URL, and OpenStates legislator ID.

An example query with results is below.

Find all press releases from the legislator with ID PAL000002 between August 1, 2011 and August 1, 2012.

 /api/v1/press_releases/?pr_date__gt=2011-08-01&pr_date__lt=2012-08-01&pr_legid=PAL000002

Results:

    {
       "meta":{
          "limit":20,
          "next":"/api/v1/press_releases/?pr_legid=PAL000002&pr_date__gt=2011-08-01&pr_date__lt=2012-08-01&limit=20&offset=20",
          "offset":0,
          "previous":null,
          "total_count":37
       },
       "objects":[
          {
             "pr_date":"2012-02-22",
             "pr_legid":"PAL000002",
             "pr_text":"Adjust  Text Size\n\n  \n  \n\nFor Immediate Release\n\nFebruary 22, 2012\n\nContact: Jon  Hopcraft\n\n(717) 787-2637\n\n(570) 773-0891\n\n#\n\n# Resources to Fight Blight at Your Fingertips\n\nMAHANOY CITY ...",
             "pr_title":"Resources to Fight Blight at Your Fingertips",
             "pr_url":"http://senatorargall.com/press/2012/0212/022212.htm"
          },
          {
             "pr_date":"2012-03-05",
             "pr_legid":"PAL000002",
             "pr_text":"Adjust  Text Size\n\n  \n  \n\nFor Immediate Release\n\nMarch 5, 2012\n\nContact: Jon  Hopcraft\n\n(717) 787-2637\n\n(570) 773-0891\n\n# Argall Bill Strengthens Downtown Location Law\n\nHARRISBURG Legislation strengthening existing law requiring state agencies  to\nbe located in a downtown unanimously passed the Senate Appropriations\nCommittee today.\u00a0 Senator Argall's Senate Bill 276 reinforces Act 32 of 2000,\nthe original Downtown Location  ...",
             "pr_title":"Argall Bill Strengthens Downtown Location Law",
             "pr_url":"http://senatorargall.com/press/2012/0312/030512.htm"
          },
        ...
       ]
    }
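To avoid hand-typing dates, the same query can be assembled from Python `date` objects. The sketch below is only an illustration: `press_release_query` is a hypothetical helper, and the parameter names (`pr_date__gt`, `pr_date__lt`, `pr_legid`) are taken from the example query above. Note that in Django-style lookups `__gt` and `__lt` are strict comparisons, so the two boundary dates themselves are excluded; whether the API also accepts inclusive lookups is not something this post states.

```python
from datetime import date
from urllib.parse import urlencode


def press_release_query(legid, after, before):
    """Build the path and query string for one legislator's press releases
    dated strictly after `after` and strictly before `before`."""
    params = [
        ("pr_date__gt", after.isoformat()),   # e.g. 2011-08-01
        ("pr_date__lt", before.isoformat()),
        ("pr_legid", legid),
    ]
    return "/api/v1/press_releases/?" + urlencode(params)


print(press_release_query("PAL000002", date(2011, 8, 1), date(2012, 8, 1)))
```

The printed path matches the example query above, and `date.isoformat()` takes care of the zero-padded YYYY-MM-DD format used in the results.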

Feedback Welcome

If you have questions or comments, let us know! You can contact Chris Brown on Twitter @notthatbreezy or through e-mail. You can also leave a comment on this blog post or on our GitHub issue tracker.

Let us know if you'd like to see additional features or data and we'll do our best to help you out!
