Created by Nathan Kelber for JSTOR Labs under Creative Commons CC BY License
For questions/comments/improvements, email nathan.kelber@ithaka.org.


Python Basics 4#

Description: This lesson describes the basics of writing your own functions, including importing modules, parameters and arguments, return values, and local and global scope.

This lesson concludes with a description of popular Python packages and directions for installing them in Constellate.

This is part 4 of 5 in the series Python Basics that will prepare you to do text analysis using the Python programming language.

Use Case: For Learners (Detailed explanation, not ideal for researchers)

Difficulty: Beginner

Completion Time: 90 minutes

Knowledge Required:

Knowledge Recommended: None

Data Format: None

Libraries Used: time to make the computer wait a few seconds; random to generate random numbers

Research Pipeline: None


Functions#

We have used several Python functions already, including print(), input(), and range(). You can identify a function by the fact that it ends with a set of parentheses () where arguments can be passed into the function. Depending on the function (and your goals for using it), a function may accept no arguments, a single argument, or many arguments. For example, when we use the print() function, a string (or a variable containing a string) is passed as an argument.
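
As a small sketch of that range, the calls below pass zero, one, and several arguments to built-in functions. The variable names are only for illustration.

```python
# Calling built-in functions with different numbers of arguments
print()                        # no arguments: prints an empty line
print('Hello')                 # one argument: a string
print('Hello', 'world')        # two arguments: printed with a space between them

greeting = 'Hello'
print(greeting)                # a variable containing a string works too

numbers = list(range(3))       # range() with one argument
evens = list(range(0, 10, 2))  # range() with three arguments: start, stop, step
print(numbers)
print(evens)
```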

Functions are a convenient shorthand, like a mini-program, that makes our code more modular. We don’t need to know all the details of how the print() function works in order to use it. Functions are sometimes called “black boxes”, in that we can put an argument into the box and a return value comes out. We don’t need to know the inner details of the “black box” to use it. (Of course, as you advance your programming skills, you may become curious about how certain functions work. And if you work with sensitive data, you may need to peer in the black box to ensure the security and accuracy of the output.)

Libraries and Modules#

While Python comes with many functions, there are thousands more that others have written. Adding them all to Python would create mass confusion, since many people could use the same name for functions that do different things. The solution then is that functions are stored in modules that can be imported for use. A module is a Python file (extension “.py”) that contains the definitions for the functions written in Python. These modules (individual Python files) can then be collected into even larger groups called packages and libraries. Depending on how many functions you need for the program you are writing, you may import a single module, a package of modules, or a whole library.

The general form of importing a module is: import module_name
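
To illustrate why separate modules help, here is a minimal sketch using two modules from Python’s standard library, math and cmath, which each define a function named sqrt(). They are not otherwise used in this lesson; they appear here only because they share a function name.

```python
import math   # functions for real numbers
import cmath  # functions for complex numbers

# Both modules define sqrt(), but the module name tells Python which one we mean
real_root = math.sqrt(16)
complex_root = cmath.sqrt(-16)

print(real_root)     # 4.0
print(complex_root)  # 4j
```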

You may recall that in the “Getting Started with Jupyter Notebooks” lesson, we imported the time module and used the sleep() function to wait 5 seconds.

# A program that waits five seconds then prints "Done"

import time # We import all the functions in the `time` module

print('Waiting 5 seconds...')
time.sleep(5) # We run the sleep() function from the time module using `time.sleep()`
print('Done')
Waiting 5 seconds...
Done

We can also just import the sleep() function without importing the whole time module. The syntax is:

from module import function

# A program that waits five seconds then prints "Done"

from time import sleep # We import just the sleep() function from the time module


print('Waiting 5 seconds...')

sleep(5) # Notice that we just call the sleep() function, not time.sleep()
print('Done')
Waiting 5 seconds...
Done
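
Both import styles give access to the same function; only the name we use to call it differs. A quick sketch to check this (the 0.1-second pauses are chosen only to keep the example fast):

```python
import time
from time import sleep

# Both names refer to the very same function object
same_function = time.sleep is sleep
print(same_function)

time.sleep(0.1)  # called through the module name
sleep(0.1)       # called directly
print('Done')
```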

Writing a Function#

In the above examples, we called a function that was already written. However, we can also create our own functions!

The first step is to define the function before we call it. We use a function definition statement followed by a function description and a code block containing the function’s actions:

def my_function():
    """Description of what the function does"""
    python code to be executed

After the function is defined, we can call on it to do us a favor whenever we need by simply executing the function like so:

my_function()

Once defined, a function can be called as many times as we want without having to rewrite its code. In the example below, we create a function called complimenter_function and then call it.

# Create a complimenter function
def complimenter_function():
    """prints a compliment""" # Function description (docstring)
    print('You are looking great today!')

After you define a function, don’t forget to call it to make it do the work!

# Give a compliment by calling the function
complimenter_function()
You are looking great today!

Ideally, a function description (also called a docstring) should specify the data that the function takes and whether it returns any data. The triple quote notation can use single or double quotes, and it allows the description string to span multiple lines. If you would like to see a function’s description, you can use the help() function to check it out.
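
As a sketch of the multi-line form, the hypothetical function below (farewell_function is not part of the lesson’s examples) uses triple single quotes and states what goes in and what comes out. Python stores this text on the function as `__doc__`, which is the text help() displays.

```python
def farewell_function():
    '''Print a short farewell message.

    Takes in: nothing
    Returns: nothing; the message is printed
    '''
    print('See you next time!')

farewell_function()

# The description is stored on the function itself
description = farewell_function.__doc__
print(description)
```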

# Examining the description (docstring) of our function
# Note that the parentheses are not included with complimenter_function
help(complimenter_function)
Help on function complimenter_function in module __main__:

complimenter_function()
    prints a compliment
# Try using help() to read the definition for the sleep function

Parameters vs. Arguments#

When we write a function definition, we can define a parameter to work with the function. We use the word parameter to describe the variable in parentheses within a function definition:

def my_function(input_variable):
    """Takes in X and returns Y"""
    do this task

In the pseudo-code above, input_variable is a parameter because it is being used within the context of a function definition. When we actually call and run our function, the actual variable or value we pass to the function is called an argument.

# Change the complimenter function to give user-dependent compliment
def complimenter_function(user_name):
    """Takes in a name string, prints a compliment with the name"""
    print(f'You are looking great today, {user_name}!')
# Pass an argument to a function
complimenter_function('Sam')
You are looking great today, Sam!

Arguments can be passed in based on parameter order (positional arguments), or they can be passed explicitly by parameter name using an = (keyword arguments). (This could be useful if we wanted to pass an argument for the 10th parameter without passing arguments for the nine parameters defined before it, which is possible when those parameters have default values.)

# Pass an argument with =
# user_name is the parameter, 'Sam' is the argument
complimenter_function(user_name='Sam')
You are looking great today, Sam!
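
A sketch of passing an argument by name while skipping earlier parameters. It assumes a hypothetical function, custom_complimenter, whose parameters all have default values written with = in the definition; neither the function nor its defaults come from the lesson.

```python
def custom_complimenter(user_name='friend', adjective='great', punctuation='!'):
    """Takes in up to three strings, prints a compliment built from them"""
    print(f'You are looking {adjective} today, {user_name}{punctuation}')

custom_complimenter()                         # every parameter uses its default
custom_complimenter('Sam')                    # positional: fills user_name
custom_complimenter(punctuation='?')          # keyword: skips the first two parameters
custom_complimenter('Sam', punctuation='!!')  # positional and keyword together
```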

In the above example, we passed a string into our function, but we could also pass a variable. Try this next. Since the complimenter_function has already been defined, you can call it in the next cell without defining it again.

# Ask the user for their name and store it in a variable called name
# Then call the complimenter_function and pass in the name variable

A variable passed into a function could contain a list or dictionary.

# A list of names
list_of_names = ['Jenny', 'Pierre', 'Hamed']

def greet_a_list(names):
    """takes a list of names and prints out a greeting for each name"""
    for name in names:
        print(f'Hi {name}!')

greet_a_list(list_of_names)
Hi Jenny!
Hi Pierre!
Hi Hamed!

The Importance of Avoiding Duplication#

Using functions makes it easier for us to update our code. Let’s say we wanted to change our compliment. We can simply change the function definition one time to make the change everywhere. See if you can change the compliment given by our complimenter function.

# Create a complimenter function that gives compliment
def complimenter_function(user_name):
    """Takes in a name string, prints a compliment with the name"""
    print(f'You are looking great today, {user_name}!')
# Give a new compliment by calling the function
name = input('What is your name? ')
complimenter_function(name)

friend = input('Who is your friend? ')
complimenter_function(friend)
You are looking great today, John!
You are looking great today, Jane!

By changing our function definition just one time, we changed how our program behaves every time the function is called. If our program were large, it might call our custom function hundreds of times. If the code were instead copied into each of those places, we would need to change every copy!

Generally, it is good practice to avoid duplicating code so that a change does not have to be made in multiple places. When programmers edit their code, they may spend time deduplicating (getting rid of code that repeats). This makes the code easier to read and maintain.
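
As a minimal sketch of deduplicating, the first version below repeats the same two lines for each person, while the second wraps them in a function. The function name welcome_reader and the messages are made up for illustration.

```python
# Before: the same two lines are repeated for each name
print('Welcome, Jenny!')
print('Your lesson is ready.')
print('Welcome, Pierre!')
print('Your lesson is ready.')

# After: the repeated lines live in one function
def welcome_reader(reader_name):
    """Takes in a name string, prints a two-line welcome"""
    print(f'Welcome, {reader_name}!')
    print('Your lesson is ready.')

for reader_name in ['Jenny', 'Pierre']:
    welcome_reader(reader_name)
```

In the second version, changing the wording of the second line takes one edit instead of one edit per person.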


Coding Challenge! < / >

In the next cell, try writing a function that accepts a dictionary as an argument. Use a flow control statement to print out all the names and occupations for the contacts.


# A dictionary of names and occupations
contacts = {
 'Amanda Bennett': 'Engineer, electrical',
 'Bryan Miller': 'Radiation protection practitioner',
 'Chris Garrison': 'Planning and development surveyor',
 'Debra Allen': 'Intelligence analyst'}

# Define and then call your function here

Function Return Values#

Whether or not a function takes an argument, it will always return a value. If we do not specify that return value in our function definition, it is automatically set to None, a special value like the Boolean True and False that simply means null or nothing. (None is not the same thing as, say, the integer 0.) We can specify the return value ourselves by writing a return statement in the function’s code block, optionally inside a flow control statement.

If you don’t write a return statement in your function, a None value will be returned.

# Find out the returned value for the following function

def complimenter_function(user_name):
    """Takes in a name string, prints a compliment with the name"""
    print(f'You are looking great today, {user_name}!')

print(complimenter_function('Sam'))
You are looking great today, Sam!
None

Instead of automatically printing inside the function, the better approach is to return a string value and let the user decide whether to print it or do something else with it. Ideally, our function description (its docstring) should indicate what goes into the function and what is returned by the function.

# Adding a return statement
def complimenter_function(user_name):
    """Takes in a name string, returns a compliment with the name""" # We are returning now
    return f'You are looking great today, {user_name}!'

compliment = complimenter_function('Sam')
print(compliment)
You are looking great today, Sam!

Returning the string allows the programmer to use the output instead of just printing it automatically. This is usually the better practice.
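
For example, once the compliment is returned as a string, the caller can transform it or collect several compliments before printing anything. A short sketch, repeating the returning version of complimenter_function so the cell runs on its own:

```python
def complimenter_function(user_name):
    """Takes in a name string, returns a compliment with the name"""
    return f'You are looking great today, {user_name}!'

# Transform the returned string before printing it
shouted = complimenter_function('Sam').upper()
print(shouted)

# Collect several compliments in a list instead of printing right away
compliments = []
for name in ['Jenny', 'Pierre', 'Hamed']:
    compliments.append(complimenter_function(name))
print(len(compliments))
```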

A function can also contain multiple return statements, with flow control deciding which one runs. Let’s write a function for telling fortunes. We can call it fortune_picker and it will accept a number (1-6) then return a string for the fortune.

# A fortune-teller program that contains a function `fortune_picker`
# `fortune_picker` accepts an integer (1-6) and returns a fortune string

def fortune_picker(fortune_number): # A function definition statement that has a parameter `fortune_number`
    """takes an integer (1-6) and returns a fortune string"""
    if fortune_number == 1:
        return 'You will have six children.'
    elif fortune_number == 2:
        return 'You will become very wise.'
    elif fortune_number == 3:
        return 'A new friend will help you find yourself.'
    elif fortune_number == 4:
        return 'A great fortune is coming to you.'
    elif fortune_number == 5:
        return 'That promising venture... it is a trap.'
    elif fortune_number == 6: 
        return 'Sort yourself out then find love.'

fortune = fortune_picker(3) # return a fortune string and store it in fortune
print(fortune)
A new friend will help you find yourself.
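
Note that fortune_picker has no return statement for numbers outside 1-6, so a call such as fortune_picker(7) falls through every branch and returns None. One way to handle that, sketched here with a shortened, hypothetical function called safe_fortune_picker and a made-up fallback message, is a final else branch:

```python
def safe_fortune_picker(fortune_number):
    """takes an integer and returns a fortune string, or a fallback message"""
    if fortune_number == 1:
        return 'You will have six children.'
    elif fortune_number == 2:
        return 'You will become very wise.'
    else:
        # Reached for any number that has no fortune assigned
        return 'No fortune for that number.'

print(safe_fortune_picker(2))
print(safe_fortune_picker(7))
```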

In our example, we passed the argument 3, which returned the string 'A new friend will help you find yourself.' To change the fortune, we would have to pass a different integer into the function. To make our fortune-teller random, we could import the function randint(), which chooses a random integer between two integers (both endpoints included). We pass the two integers as arguments separated by a comma.
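
To see the range of values randint() can produce, the sketch below rolls many times and reports the smallest and largest results. The number of rolls, 1000, is an arbitrary choice.

```python
from random import randint

rolls = []
for _ in range(1000):
    rolls.append(randint(1, 6))  # each roll is an integer from 1 to 6, inclusive

print(min(rolls))  # smallest value rolled
print(max(rolls))  # largest value rolled
```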

# A fortune-teller program that uses a random integer

from random import randint # import the randint() function from the random module

def fortune_picker(fortune_number): # A function definition statement that has a parameter `fortune_number`
    """takes an integer (1-6) and returns a fortune string"""
    if fortune_number == 1:
        return 'You will have six children.'
    elif fortune_number == 2:
        return 'You will become very wise.'
    elif fortune_number == 3:
        return 'A new friend will help you find yourself.'
    elif fortune_number == 4:
        return 'A great fortune is coming to you.'
    elif fortune_number == 5:
        return 'That promising venture... it is a trap.'
    elif fortune_number == 6: 
        return 'Sort yourself out then find love.'

random_number = randint(1, 6) # Choose a random number between 1 and 6 and assign it to a new variable `random_number`
fortune = fortune_picker(random_number) # Return a fortune string

print('Your fortune is: ')
print(fortune)
    
while True:
    print('Would you like another fortune?')
    repeat_fortune = input()
    if repeat_fortune == 'yes' or repeat_fortune == 'Yes':
        random_number = randint(1, 6) 
        print(fortune_picker(random_number))
        continue
    else:
        print('I have no more fortunes to share.')
        break
Your fortune is: 
Sort yourself out then find love.
Would you like another fortune?
I have no more fortunes to share.

Coding Challenge! < / >

Try writing a function that accepts a name entered by the user and returns that person’s occupation. You can use the .get() method to retrieve the relevant occupation, such as:

contacts.get('Amanda', 'No contact with that name')

Remember, the second string will be returned if the name ‘Amanda’ is not in our dictionary.
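
A small sketch of how .get() behaves, using a made-up dictionary of office extensions rather than the contacts dictionary from the challenge:

```python
# A hypothetical dictionary for illustration
extensions = {'Front desk': 100, 'Library': 204}

found = extensions.get('Library', 'No extension listed')
missing = extensions.get('Cafeteria', 'No extension listed')

print(found)    # the key exists, so its value is returned
print(missing)  # the key is missing, so the second argument is returned
```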


# A program that returns the occupation when users supply a given name

# A dictionary of names and occupations
contacts = {
 'Amanda': 'Engineer, electrical',
 'Bryan': 'Radiation protection practitioner',
 'Christopher': 'Planning and development surveyor',
 'Debra': 'Intelligence analyst'}

# The function definition and program

Local and Global Scope#

We have seen that functions make maintaining code easier by avoiding duplication. One of the most dangerous areas for duplication is variable names. As programming projects become larger, the possibility that a variable will be re-used goes up. This can cause weird errors in our programs that are hard to track down. We can alleviate the problem of duplicate variable names through the concepts of local scope and global scope.

We use the phrase local scope to describe what happens within a function. The local scope of a function may contain local variables, but once the function has completed, the local variables and their contents are erased.

On the other hand, we can also create global variables that persist at the top level of the program and can also be read within the local scope of a function.

Ideally, Python programs should limit the number of global variables and create most variables in a local scope. This keeps variables that could otherwise conflict confined to the functions where they are used and then discarded.

# Demonstration of global variable being used in a local scope
# The program crashes when a local variable is used in a global scope
global_string = 'global'

def print_strings():
    print('We are in the local context:')
    local_string = 'local'
    print(global_string)
    print(local_string)
    

print_strings()
We are in the local context:
global
local

The code above defines a global variable global_string with the value of ‘global’. A function, called print_strings, then defines a local variable local_string with a value of ‘local’. When we call the print_strings() function, it prints the global variable and then the local variable.

# The function has closed, now the local string has been discarded
print('We are now in the global context: ')
print(global_string)
print(local_string)
We are now in the global context: 
global
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[21], line 4
      2 print('We are now in the global context: ')
      3 print(global_string)
----> 4 print(local_string)

NameError: name 'local_string' is not defined

After the print_strings() function completes, we try to print both variables in a global scope. The program prints global_string but stops with a NameError when trying to print local_string, because that variable no longer exists in the global scope.

It’s a good practice not to name a local variable the same thing as a global variable. If we define a variable with the same name in a local scope, it becomes a local variable within that scope. Once the function is closed, the global variable retains its original value.

# A demonstration of global and local scope using the same variable name
# print(string) prints two different results
string = 'global'

def share_strings():
    string = 'local'
    print(string)

share_strings()
local
# print the string variable in the global context
print(string)
global
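
One way to keep variables local, sketched below with a hypothetical function add_exclamation, is to pass values in as arguments and hand results back with return instead of reading or changing global variables inside the function.

```python
def add_exclamation(text):
    """Takes in a string, returns the string with '!' added"""
    result = text + '!'  # `result` is a local variable
    return result

message = 'global'
excited = add_exclamation(message)

print(message)  # the global variable is unchanged
print(excited)  # the returned value is stored in a new global variable
```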

Installing a Python Package in Constellate#

If you would like to install a package that is not in Constellate, we recommend using the pip installer with packages from the Python Package Index. In a code cell insert the following code:

!pip install package_name

Replace package_name with the name of the package you would like to install. The exclamation point indicates the line should be run as a terminal command.

Refer to the package’s documentation for guidance.

# Install Scrapy
!pip install scrapy
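
To check whether a package is already available before installing it, one option is the find_spec() function from Python’s standard importlib.util module. This is a sketch rather than a Constellate-specific feature; the helper name is_installed is made up, and json is used only because it ships with Python.

```python
from importlib.util import find_spec

def is_installed(package_name):
    """Takes in a package name string, returns True if Python can find it"""
    return find_spec(package_name) is not None

print(is_installed('json'))    # part of the standard library
print(is_installed('scrapy'))  # True only if Scrapy has been installed
```

Keep in mind that the name used with import is sometimes different from the name used with pip, so check the package’s documentation if the result is unexpected.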

Lesson Complete#

Congratulations! You have completed Python Basics 4. There is one more lesson in Python Basics:

  • Python Basics 5

Start Next Lesson: Python Basics 5#

Exercise Solutions#

Here are a few solutions for exercises in this lesson.

# A dictionary of names and occupations
contacts = {
 'Amanda Bennett': 'Engineer, electrical',
 'Bryan Miller': 'Radiation protection practitioner',
 'Chris Garrison': 'Planning and development surveyor',
 'Debra Allen': 'Intelligence analyst'}

# Define and then call your function here

def print_contacts(contacts_names):
    """Prints out all the contacts in a contacts dictionary"""
    for name, occupation in contacts_names.items():
        print(name.ljust(15), '|', occupation)

print_contacts(contacts)
Amanda Bennett  | Engineer, electrical
Bryan Miller    | Radiation protection practitioner
Chris Garrison  | Planning and development surveyor
Debra Allen     | Intelligence analyst
# A dictionary of names and occupations
contacts = {
 'Amanda': 'Engineer, electrical',
 'Bryan': 'Radiation protection practitioner',
 'Christopher': 'Planning and development surveyor',
 'Debra': 'Intelligence analyst'}


def occupation_finder(name):
    """Allows a user to find the occupation of a particular contact"""
    return contacts.get(name, 'No contact with that name')
    
while True:
    print('Enter a name to look up an occupation (or enter quit):')
    name = input()
    if name == 'quit':
        print('Shutting down..')
        break
    else:
        print(occupation_finder(name))
        continue
Enter a name to look up an occupation (or enter quit):
Engineer, electrical
Enter a name to look up an occupation (or enter quit):
Shutting down..