Friday, August 26, 2011

Example #12 - Pull data out of an API response

Let's take a look at how you might access specific values from the JSON-formatted response from an API request.

To get some data to play around with, we'll send one API request for the metadata of the views in the Chicago data portal.  The response, in JSON, is a set of metadata structured as a list (a SODA View[]).  Each element of the list contains the metadata for one view.  That metadata (a SODA View) is, in turn, represented as a number of key-value pairs.  Some of the values are themselves sets of nested key-value pairs.  You can look at the output from snippet #4 to see the metadata for one view.

The example code performs eight actions that illustrate some of the ways to access and use the data returned from an API request.  You can generalize these to data from other API requests.

Example Code

#
# Example12.py - Shows code to access data in an API response
#
import json
import urllib2
import pprint
import time

# Get metadata for views in the Chicago data portal
fileHandle = urllib2.urlopen("http://data.cityofchicago.org/api/views.json")
views = json.load(fileHandle)

# 1 - Get the number of views that were returned
numViews = len(views)
print '\nNUMBER OF VIEWS RETURNED = %s\n' % numViews

# 2 - Access one of the metadata values.  Get the name of the first view.
viewName = views[0]['name']
print 'NAME OF FIRST VIEW = %s\n' % viewName

# 3 - Iterate through the views and print the name of each view
print 'NAMES OF ALL %s VIEWS DOWNLOADED:' % numViews
for view in views:
    print view['name']

# 4 - Print all metadata for first view, both the keys and values, including all nested pairs 
print '\nFORMATTED DUMP OF THE FIRST VIEW SHOWING ALL KEYS AND VALUES:'
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(views[0])

# 5 - Iterate thru first view and print the keys.  Does not print nested keys
print '\nHIGH-LEVEL FIELDS IN A VIEW:'
for key in views[0].iterkeys():
    print key

# 6 - Access a nested key-value pair.  Print the name of the view owner.
ownerName = views[0]['owner']['displayName']
print '\nNAME OF THE OWNER OF THE FIRST VIEW %s\n' % ownerName

# 7 - Get and print a time value.  It's in seconds since the epoch.  
secsOwnerProfileModified =  views[0]['owner']['profileLastModified']
print 'OWNER OF FIRST VIEW LAST MODIFIED HIS/HER USER PROFILE %s SECONDS SINCE JANUARY 1, 1970\n' % secsOwnerProfileModified

# 8 - Now print the time value in local timezone.
localtimeOwnerProfileModified = time.ctime(secsOwnerProfileModified)
print 'OWNER LAST MODIFIED HIS/HER USER PROFILE ON %s\n' %localtimeOwnerProfileModified

Explanations of each snippet
For each numbered snippet, you'll see the few lines of code, an explanation, and the portion of the output generated by that code.

Convert JSON and load an array
# Get metadata for views in the Chicago data portal
fileHandle = urllib2.urlopen("http://data.cityofchicago.org/api/views.json")
views = json.load(fileHandle)
To start, get that array of view metadata.  First, make the API request to the city of Chicago data portal and ask for the metadata for all views.  Second, use json.load() to read the data coming back from the API, parse the JSON, and place the elements in the views list.  The remaining eight numbered snippets work with that list.
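
If you'd like to see what json.load() does without hitting the portal, here is a minimal sketch.  It assumes Python 3 syntax (the listing above is Python 2), and the JSON string is a made-up, heavily trimmed stand-in for the real response, not actual portal data.

```python
import io
import json

# Hypothetical, trimmed stand-in for the API response (not real portal data)
sample_json = (
    '[{"name": "Sample View A", "owner": {"displayName": "Sample Owner"}},'
    ' {"name": "Sample View B", "owner": {"displayName": "Sample Owner"}}]'
)

# json.load() reads from any file-like object, much like the handle from urlopen()
file_handle = io.StringIO(sample_json)
views = json.load(file_handle)

print(type(views).__name__)     # the JSON array becomes a Python list
print(type(views[0]).__name__)  # each JSON object becomes a Python dict
print(len(views))
```

The point is only that a JSON array comes back as a list and each JSON object as a dictionary, which is why the later snippets can index and iterate the way they do.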

Get the number of views returned
# 1 - Get the number of views that were returned
numViews = len(views)
print '\nNUMBER OF VIEWS RETURNED = %s\n' % numViews
How many individual views were sent back?  We can get the length of the list to find out.  The result shows there are 50 elements.  Hey, first discovery: with no parameters, the API defaults to returning 50 of the 603 currently available views.



Get a value from one view
# 2 - Access one of the metadata values.  Get the name of the first view.
viewName = views[0]['name']
print 'NAME OF FIRST VIEW = %s\n' % viewName
This may be something you'll be doing for an app.  You can pull out the individual strings that are the values for different pieces of the metadata.  Here, we are asking for the value associated with the key called name for the zeroth element of the views list.  Translation:  get the name of the first view.
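
One thing to watch for in an app: indexing with a key that isn't there raises a KeyError.  Here is a small sketch of one way to guard against that with dict.get().  It assumes Python 3 syntax, and the sample data and the description key are made up for illustration; they are not taken from the portal.

```python
# Hypothetical sample data standing in for the parsed API response
views = [
    {"name": "Sample View A", "description": "An example description"},
    {"name": "Sample View B"},  # this made-up view has no description key
]

# Plain indexing works when the key is present
first_name = views[0]["name"]

# .get() returns a default instead of raising KeyError when the key is missing
second_description = views[1].get("description", "(no description)")

print(first_name)
print(second_description)
```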


Iterate through all of the views
# 3 - Iterate through the views and print the name of each view
print 'NAMES OF ALL %s VIEWS DOWNLOADED:' % numViews
for view in views:
    print view['name']
Here's how you can step through the list of views and get a value for the name key in each view.  The output is a nice summary of the views from the portal.
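
If you want to keep the names around rather than just print them, a list comprehension is one option.  This is a sketch in Python 3 syntax using made-up sample data in place of the real views list.

```python
# Hypothetical sample data standing in for the parsed API response
views = [
    {"name": "Sample View A"},
    {"name": "Sample View B"},
    {"name": "Sample View C"},
]

# Collect the name of every view into a new list
names = [view["name"] for view in views]

# Print a numbered summary, starting the count at 1
for position, name in enumerate(names, start=1):
    print("%d. %s" % (position, name))
```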


Print all metadata for a view
# 4 - Print all metadata for first view, both the keys and values, including all nested pairs
print '\nFORMATTED DUMP OF THE FIRST VIEW SHOWING ALL KEYS AND VALUES:'
pp = pprint.PrettyPrinter(indent=4)
pp.pprint(views[0])
Want the details of the metadata for a view?  Here's everything we have for the first view.  pprint outputs each key-value pair on its own line.  The nested pairs are indented four spaces.  You can use this for other API responses to see everything that you receive in a layout that is reasonably easy to read.  The u'...' prefix means the value is a Unicode string.
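
If you'd rather capture the formatted text (say, to write it to a log file) instead of printing it right away, pprint.pformat() returns it as a string.  A minimal sketch in Python 3 syntax follows; the sample view and its keys are invented for illustration, and the width value is just a choice that forces this small example to wrap.

```python
import pprint

# Hypothetical sample view (not real portal data)
view = {
    "name": "Sample View A",
    "owner": {"displayName": "Sample Owner", "screenName": "sample_owner"},
    "tags": ["sample", "example"],
}

# pformat() returns the same text that pprint() would print
formatted = pprint.pformat(view, indent=4, width=40)
print(formatted)
```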


Iterate through one view
# 5 - Iterate thru first view and print the keys.  Does not print nested keys
print '\nHIGH-LEVEL FIELDS IN A VIEW:'
for key in views[0].iterkeys():
    print key
We iterated through all views and printed one value from each view.  Now, let's go inside one view and iterate through it.  We'll step through all the key-value pairs and just print the key.  We aren't going deeper into the nested key-value pairs.  For example, the key owner has as its value a whole set of key-value pairs.  We'll just print out owner and move on.  The result is a nice list of all the data elements in the View data type.  Note that it differs from the reference material on the SODA site; that documentation does not match what the API actually returns.
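
To spot which top-level keys hold nested sets of key-value pairs, you could check each value's type as you go.  Here is a sketch in Python 3 syntax (where iterating a dictionary directly replaces iterkeys()); the sample view is made up and much smaller than a real one.

```python
# Hypothetical sample view (not real portal data)
view = {
    "id": "abcd-1234",
    "name": "Sample View A",
    "owner": {"displayName": "Sample Owner", "id": "wxyz-5678"},
}

nested_keys = []
for key in sorted(view):
    value = view[key]
    if isinstance(value, dict):
        # The value is itself a dictionary of key-value pairs
        nested_keys.append(key)
        print("%s (nested, %d keys)" % (key, len(value)))
    else:
        print(key)
```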


Get a nested key-value pair
# 6 - Access a nested key-value pair.  Print the name of the view owner.
ownerName = views[0]['owner']['displayName']
print '\nNAME OF THE OWNER OF THE FIRST VIEW %s\n' % ownerName
Now, let's dig into one of those nested dictionaries.  The owner key has a value that consists of nine key-value pairs.  Here's how you can get one of those.  The code goes to the first view, then to the owner key, and then to the displayName key within the owner's value.  Translation:  what's the name of the owner of the first view?
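
Chained indexing fails with a KeyError if any level is missing.  One way to handle that is to chain .get() calls with an empty dictionary as the fallback.  This is a sketch in Python 3 syntax; the helper owner_display_name() and the sample data are hypothetical, and whether real views can actually lack an owner is an assumption made only for the example.

```python
# Hypothetical sample data; the second view deliberately has no owner
views = [
    {"name": "Sample View A", "owner": {"displayName": "Sample Owner"}},
    {"name": "Sample View B"},
]


def owner_display_name(view, default="(unknown)"):
    """Return the owner's displayName, or a default if either level is missing."""
    return view.get("owner", {}).get("displayName", default)


for view in views:
    print("%s is owned by %s" % (view["name"], owner_display_name(view)))
```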


Get a time value
# 7 - Get and print a time value.  It's in seconds since the epoch. 
secsOwnerProfileModified =  views[0]['owner']['profileLastModified']
print 'OWNER OF FIRST VIEW LAST MODIFIED HIS/HER USER PROFILE %s SECONDS SINCE JANUARY 1, 1970\n' % secsOwnerProfileModified
Some of the data elements are timestamps.  This example and the next one look at those.  We'll work with the element that represents the time when the owner of the first view last modified his or her user profile.  These values are represented as the number of seconds since January 1, 1970, which is also called the epoch.  This snippet prints out that value.


Convert time in seconds to local time
# 8 - Now print the time value in local timezone.
localtimeOwnerProfileModified = time.ctime(secsOwnerProfileModified)
print 'OWNER LAST MODIFIED HIS/HER USER PROFILE ON %s\n' %localtimeOwnerProfileModified
Seeing the total number of seconds isn't very useful to us (it is, however, very useful for comparing times or computing the difference between times).  Here we'll use time.ctime() to create a string that has the day, month, time of day, and year in the local timezone.  Now it's easier to understand when the view owner's profile was modified.
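
Here is a short sketch of that "comparing times" idea, plus a way to pick your own date layout with time.strftime().  It uses Python 3 syntax and two made-up timestamps rather than values from the portal, and it formats in UTC with time.gmtime() so the result doesn't depend on your machine's timezone.

```python
import time

# Hypothetical timestamps in seconds since the epoch (not taken from the portal)
earlier = 1314316800                          # 2011-08-26 00:00:00 UTC
later = earlier + 3 * 24 * 60 * 60 + 3600     # three days and one hour later

# Subtracting two epoch values gives the elapsed time in seconds
elapsed = later - earlier
days, remainder = divmod(elapsed, 86400)
hours = remainder // 3600
print("%d days and %d hours apart" % (days, hours))

# strftime() lets you choose the layout; gmtime() converts the seconds to UTC
formatted = time.strftime("%Y-%m-%d %H:%M:%S", time.gmtime(earlier))
print(formatted)
```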


Final comments
Remember, if you run this code, your output may differ from what's shown.  When you run it, the portal may have a different set of views with different values.
