Working in Python
Overview
Teaching: 30 min
Exercises: 10 min
Questions
What does it look like to make API requests in Python and work with data in the responses?
Objectives
Take a look at a variety of ways to make requests and get responses of different types.
Now how do we make API requests in Python? Let’s explore some live notebooks:
Example with ERDDAP: CSV to data frame
Let’s go through an example using the ERDDAP API to get a data table as CSV and get it into a pandas data frame:
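As a rough sketch of what such a notebook might do, the snippet below parses ERDDAP-style CSV text with pandas. ERDDAP's .csv responses put the column names on the first line and the units on the second, so the units row is skipped. The sample text (column names and values) is made up for illustration so that the example runs offline; with a real server you could pass a tabledap .csv URL to pd.read_csv instead.

```python
import io

import pandas as pd

# Made-up sample in the shape of an ERDDAP .csv response:
# line 1 holds column names, line 2 holds units, data rows follow.
sample_csv = """time,latitude,longitude,sea_water_temperature
UTC,degrees_north,degrees_east,degree_C
2020-01-01T00:00:00Z,44.5,-124.3,10.2
2020-01-01T01:00:00Z,44.5,-124.3,10.1
"""

# skiprows=[1] drops the units row so numeric columns parse as numbers.
df = pd.read_csv(io.StringIO(sample_csv), skiprows=[1], parse_dates=["time"])

print(df.dtypes)
print(df.head())
```

Skipping the units row matters: if it were left in, every column would be read as text rather than as numbers.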
Example with ERDDAP: JSON to data frame
Let’s go through another example using the ERDDAP API to get the same data as in the last example, except this time we will receive the data as JSON and get it into a data frame.
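A minimal sketch of the JSON route, using only the standard library. ERDDAP's .json responses wrap the data in a "table" object holding column names and a list of rows; the sample below imitates that shape with made-up values so it runs offline.

```python
import json

# Made-up sample in the shape of an ERDDAP .json response.
sample_json = """
{
  "table": {
    "columnNames": ["time", "latitude", "sea_water_temperature"],
    "columnTypes": ["String", "double", "double"],
    "columnUnits": ["UTC", "degrees_north", "degree_C"],
    "rows": [
      ["2020-01-01T00:00:00Z", 44.5, 10.2],
      ["2020-01-01T01:00:00Z", 44.5, 10.1]
    ]
  }
}
"""

table = json.loads(sample_json)["table"]

# Pair each row's values with the column names to get one dict per row.
records = [dict(zip(table["columnNames"], row)) for row in table["rows"]]

for record in records:
    print(record)
```

A list of dicts like records can be handed to pandas.DataFrame(records) to build the data frame; pandas.DataFrame(table["rows"], columns=table["columnNames"]) gives the same result.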
Working with dictionaries
In the previous episode we saw that some APIs will return data formatted as JSON, including names (or keys) and values associated with them.
Since we would ultimately like to work with data from these APIs in Python, it would be nice if Python had a data structure that behaved similarly. In the Software Carpentry introduction to Python, we learned about lists, which are ordered collections of things, indexed by their position in the ordering. What we would like here is similarly a collection, but rather than having ordering and indexing by position, instead we would like elements to have an arbitrary index of our choice.
In fact, Python has such a collection built into it; it is called a dict (short for dictionary). Let’s construct one now, to hold data from the Mayo Clinic about caffeine levels in various beverages.
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47, 'cola': 24, 'energy drink': 29}
We see here that the dict is created within curly braces {}, and contains keys and corresponding values separated by a :, with successive pairs being separated by a , like in a list.
Again, similarly to a list, we can access elements of the dict with square brackets []. For example, to get the number of mg of caffeine per serving of coffee, we could use the following:
print("Coffee has", caffeine_mg_per_serving['coffee'], "mg of caffeine per serving")
Coffee has 96 mg of caffeine per serving
We can also replace elements in the same way that we can for a list. For instance, you may have spotted that the value for 'cola' is incorrect. Let’s fix that now.
caffeine_mg_per_serving['cola'] = 22
print(caffeine_mg_per_serving)
{'coffee': 96, 'tea': 47, 'cola': 22, 'energy drink': 29}
One thing that we can’t do for lists is create new elements by indexing with []. But dicts let us do that as well:
caffeine_mg_per_serving['green tea'] = 28
print(caffeine_mg_per_serving)
{'coffee': 96, 'tea': 47, 'cola': 22, 'energy drink': 29, 'green tea': 28}
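To make the contrast with lists concrete, here is a small sketch. The drinks list is introduced only for this comparison and is not part of the lesson's data.

```python
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47}
drinks = ['coffee', 'tea']  # hypothetical list, for comparison only

# Assigning to a key that does not exist yet adds a new entry to a dict.
caffeine_mg_per_serving['green tea'] = 28
print(caffeine_mg_per_serving)

# Assigning past the end of a list raises IndexError instead.
try:
    drinks[2] = 'green tea'
except IndexError as error:
    print("IndexError:", error)

# Lists grow with .append() rather than by indexing.
drinks.append('green tea')
print(drinks)
```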
Ordering
Python dicts historically were not ordered: you would not be guaranteed to get back results in the same order that you put them in. In more recent versions of Python (3.7 and later), dicts do preserve the order in which items were inserted, so 'green tea', having been added most recently, appears at the end.
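A short sketch of what insertion order means in practice, assuming Python 3.7 or later: updating an existing key leaves it where it was, while a key that is deleted and added again moves to the end.

```python
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47, 'cola': 24}

# Updating an existing key keeps its original position.
caffeine_mg_per_serving['cola'] = 22
# A new key goes to the end.
caffeine_mg_per_serving['green tea'] = 28
order_after_update = list(caffeine_mg_per_serving)
print(order_after_update)

# Removing a key and adding it again moves it to the end.
del caffeine_mg_per_serving['coffee']
caffeine_mg_per_serving['coffee'] = 96
order_after_readd = list(caffeine_mg_per_serving)
print(order_after_readd)
```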
What if we want to know whether we can use a particular key? In a list, this is simple, as we can check whether a particular index is less than the length of the list. With a dict, we use the in keyword to check whether a particular key is present:
'coffee' in caffeine_mg_per_serving
True
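The check matters because indexing with a key that is not present raises a KeyError. A brief sketch of both behaviours, using a key that is deliberately absent:

```python
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47}

drink = 'hot chocolate'

# Indexing with a missing key raises KeyError.
try:
    print(caffeine_mg_per_serving[drink])
except KeyError as error:
    print("No entry for", error)

# Checking with `in` first avoids the exception.
if drink in caffeine_mg_per_serving:
    message = f"{drink}: {caffeine_mg_per_serving[drink]} mg"
else:
    message = f"{drink}: no data"
print(message)
```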
Alternatively, if we want to get an element of the dict and use a default value if the key isn’t found, we can use the .get() method:
print(caffeine_mg_per_serving.get("coffee", 0))
print(caffeine_mg_per_serving.get("hot chocolate", 0))
96
0
(If you don’t specify the default value, then Python uses None for keys that are not found.)
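The sketch below shows the None default and one place where .get() is handy with API data: chaining calls to walk nested JSON-like structures. The response dict is a made-up stand-in for a parsed JSON reply.

```python
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47}

# With no default, .get() returns None for a missing key.
missing = caffeine_mg_per_serving.get('hot chocolate')
print(missing is None)

# .get() only looks the key up; it does not add it to the dict.
print('hot chocolate' in caffeine_mg_per_serving)

# Chained .get() calls with empty defaults walk nested data without
# raising KeyError when a level is absent (made-up response).
response = {'table': {'columnNames': ['time', 'temperature']}}
rows = response.get('table', {}).get('rows', [])
print(rows)
```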
Now, a particularly useful thing to do with a list is to loop over it. What happens when we loop over a dict?
for item in caffeine_mg_per_serving:
    print(item)
coffee
tea
cola
energy drink
green tea
Looping (or otherwise iterating) over a dict in fact loops over its keys. This matches what the in keyword does; it would be strange for the two to look at different aspects of the dict. But sometimes we may want to use the values as well as the keys in a loop. We could index back into the dict via the key, but that is repetitive. We can instead use the .items() method of the dict:
for drink, quantity in caffeine_mg_per_serving.items():
    print(drink.capitalize(), "contains", quantity, "mg of caffeine per serving")
Coffee contains 96 mg of caffeine per serving
Tea contains 47 mg of caffeine per serving
Cola contains 22 mg of caffeine per serving
Energy drink contains 29 mg of caffeine per serving
Green tea contains 28 mg of caffeine per serving
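Beyond .items(), a dict also offers .values() for the values alone, and because iterating yields keys, built-ins such as max() can work on it directly. A small sketch with a shortened copy of the data:

```python
caffeine_mg_per_serving = {'coffee': 96, 'tea': 47, 'cola': 22}

# .values() iterates over the values only.
average = sum(caffeine_mg_per_serving.values()) / len(caffeine_mg_per_serving)
print("Average:", average, "mg")

# Iterating over the dict yields keys, so max() with key=...get
# returns the key whose value is largest.
strongest = max(caffeine_mg_per_serving, key=caffeine_mg_per_serving.get)
print("Most caffeine:", strongest)

# A dict comprehension over .items() builds a new, filtered dict.
over_40 = {drink: mg for drink, mg in caffeine_mg_per_serving.items() if mg > 40}
print(over_40)
```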
What’s in a key?
In this episode, we have used strings as keys, as this is what we’re most likely to see when working with JSON. This is not a Python restriction, however. We can use any “hashable” type as a dict key; this includes strings, numbers, and tuples of hashable values, among other types.
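A brief illustration of non-string keys, with made-up station names and coordinates. Lists are mutable and therefore not hashable, so using one as a key raises a TypeError.

```python
# Tuples as keys: (latitude, longitude) -> hypothetical station name.
stations = {(44.5, -124.3): 'station A', (46.2, -123.9): 'station B'}
print(stations[(44.5, -124.3)])

# Integers as keys: HTTP status code -> reason phrase.
status_text = {200: 'OK', 404: 'Not Found'}
print(status_text[404])

# A list cannot be used as a key because it is not hashable.
try:
    bad = {[44.5, -124.3]: 'station A'}
except TypeError as error:
    print("TypeError:", error)
```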
dicts of functions
What will the following code do?
import numpy as np

operations = {
    'min': np.min,
    'max': np.max
}

def process(array, operation):
    return operations[operation](array)

print(process([1, 4, 7, 2, -3], 'min'))
When might this kind of behaviour be useful?
Try adjusting the example so that 'mean' and 'std' also work as you might expect.
Solution
This looks up the named function in the dictionary and calls it on the array, so the example prints -3. This could be useful when you want to allow the user to decide what functionality is desired at run-time, perhaps in a configuration file. For example, a choice of inversion algorithms or fitting functions could be offered.
To add other functions, the operations dict could be adjusted as:
operations = {
    'min': np.min,
    'max': np.max,
    'mean': np.mean,
    'std': np.std
}
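For reference, here is a runnable sketch of the same pattern using only the standard library in place of NumPy. This is a substitution for illustration, not the lesson's own solution: the built-in min and max and the statistics module stand in for the np functions, and the check for unknown names is an addition.

```python
import statistics

# Standard-library stand-ins for the NumPy functions in the exercise.
# statistics.pstdev is the population standard deviation, which is
# what np.std computes by default.
operations = {
    'min': min,
    'max': max,
    'mean': statistics.mean,
    'std': statistics.pstdev,
}


def process(values, operation):
    """Apply the named operation to values, rejecting unknown names."""
    if operation not in operations:
        raise ValueError(
            f"Unknown operation {operation!r}; choose from {sorted(operations)}"
        )
    return operations[operation](values)


data = [1, 4, 7, 2, -3]
for name in operations:
    print(name, process(data, name))
```

Checking the name before the lookup turns a bare KeyError into a message that lists the valid choices, which helps when the operation name comes from a user or a configuration file.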
Deeper dive on working in Python
See [Introduction to the Web and Online APIs: Requests](https://edbennett.github.io/web-novice/04-requests/index.html)
Key Points