<img align="left" src="https://ithaka-labs.s3.amazonaws.com/static-files/images/tdm/tdmdocs/CC_BY.png"><br />

Created by [Nathan Kelber](http://nkelber.com) for [JSTOR Labs](https://labs.jstor.org/) under [Creative Commons CC BY License](https://creativecommons.org/licenses/by/4.0/)<br />
____

# Python Basics 5

**Description:**
This notebook focuses on strings, preparing learners to use:
* Escape characters
* String methods

**Knowledge Required:** 
* [Getting Started with Jupyter Notebooks](../../../getting-started-with-jupyter.ipynb)
* [Python Basics 1](./python-basics-1.ipynb)
* [Python Basics 2](./python-basics-2.ipynb)
* [Python Basics 3](./python-basics-3.ipynb)
* [Python Basics 4](./python-basics-4.ipynb)

___

## The `print()` function



Often when working with strings, we use the `print()` function. A deeper understanding of `print()` will help us work with strings more flexibly.

### Escape characters

Python strings can be written with single or double quotes. If the string itself contains a single quote character, wrapping it in single quotes ends the string early and causes an error, so double quotes are the safer choice. Try running the next code cell to see the error:

In [1]:
# Print out a string using single or double quotes
string = 'Hello World: Here's a string.'
print(string)

SyntaxError: unterminated string literal (detected at line 2) (3745037211.py, line 2)

An easy solution would be to use double quotes, such as:
> string = "Hello World: Here's a string."

The use of double quotes keeps Python from ending the string prematurely. But what if your string contains both single and double quotes? Escape characters help us insert certain characters into a string. An escape character begins with a `\`. For example, we could insert a single quote into a string surrounded by single quotes by using an escape character.

In [2]:
# Print out a single quote in a Python string
string = 'There\'s an escape character in this string.'
print(string)

There's an escape character in this string.


The backslash character `\` in front of the single quote tells Python not to end the string prematurely. Of course, this opens a new question: How do we create a string with a backslash? The answer is another escape character using two backslashes.

In [3]:
# Print a backslash using an escape character
string = 'Adding a backslash \\ requires an escape character.'
print(string)

Adding a backslash \ requires an escape character.


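Another option is to use a raw string, which treats each backslash as a literal character instead of the start of an escape character. A raw string starts with an `r` before the opening quote, much like the `f` prefix of an f-string (covered later in this lesson).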

In [4]:
string = r'No escape characters \ here'
print(string)

No escape characters \ here
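
Raw strings are most useful when a string contains many backslashes. The sketch below uses a made-up Windows-style path (the path itself is only an assumption for illustration) to show that the escaped version and the raw version produce the same string.

```python
# A hypothetical Windows-style path, used only for illustration
escaped_path = 'C:\\Users\\name\\notes.txt'   # every backslash is escaped
raw_path = r'C:\Users\name\notes.txt'         # raw string: backslashes kept as typed

print(escaped_path)
print(raw_path)

# Both spellings produce exactly the same string
print(escaped_path == raw_path)
```

The raw version is easier to read because it looks the same in the code as it does when printed.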


Escape characters do more than add quotes and backslashes. They also handle string formatting such as tabs and new lines.

|Code|Result|
|---|---|
|`\'`| ' |
|`\\`| \ |
|`\t`| tab |
|`\n`| new line|

In [5]:
# Print out a string with two lines


In [6]:
# Print out a string with a tab


___
<h3 style="color:red; display:inline">Try it! &lt; / &gt; </h3>

**Can you print a string with a new line? How about a tab?**
___

The newline escape character `\n` can hurt readability when a string spans many lines. Consider this string containing four lines of a Shakespeare sonnet.

In [7]:
string = 'Shall I compare thee to a summer’s day?\nThou art more lovely and more temperate:\nRough winds do shake the darling buds of May,\nAnd summer’s lease hath all too short a date;\n'
print(string)

Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;



A more readable option is to create a string with triple quotes (three single quotes or three double quotes). A triple-quoted string can span several lines, and any line breaks and tabs typed inside it are kept as part of the string.

In [8]:
# Print out Shakespeare's Sonnet 18
string = """Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature’s changing course untrimm'd;
But thy eternal summer shall not fade,
Nor lose possession of that fair thou ow’st;
Nor shall death brag thou wander’st in his shade,
When in eternal lines to time thou grow’st:
    So long as men can breathe or eyes can see,
    So long lives this, and this gives life to thee."""

print(string)


Shall I compare thee to a summer’s day?
Thou art more lovely and more temperate:
Rough winds do shake the darling buds of May,
And summer’s lease hath all too short a date;
Sometime too hot the eye of heaven shines,
And often is his gold complexion dimm'd;
And every fair from fair sometime declines,
By chance or nature’s changing course untrimm'd;
But thy eternal summer shall not fade,
Nor lose possession of that fair thou ow’st;
Nor shall death brag thou wander’st in his shade,
When in eternal lines to time thou grow’st:
    So long as men can breathe or eyes can see,
    So long lives this, and this gives life to thee.
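
A triple-quoted string stores its line breaks as the same newline character that `\n` creates. The short sketch below, using made-up text, compares the two spellings. It also uses the built-in `repr()` function, which shows escape characters instead of interpreting them.

```python
# A string written with the \n escape character
escaped = 'Line one\nLine two'

# The same text written as a triple-quoted string
triple = """Line one
Line two"""

# Both strings contain the same characters, including the newline
print(escaped == triple)

# repr() reveals the newline stored inside the triple-quoted string
print(repr(triple))
```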


### Formatted strings (f-strings)

An f-string can help us insert a variable inside of a string. Consider this example where a print function must concatenate three strings:

In [9]:
# Greeting a user with a concatenated string
username = input('Hi. What is your name? ')

print('Hello ' + username + '!')

Hello John!


We used the `+` operator twice to concatenate `username` between the strings `'Hello '` and `'!'`. A simpler method would be to use an f-string. Similar to the way a raw string begins with an **r** `r'string'`, the formatted string begins with an **f** `f'string'`. The variable to be concatenated is then included in curly brackets `{}`.

In [10]:
# Print the username inside a formatted string
print(f'Hello {username}!')

Hello John!
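
An f-string is also convenient when the variable is not a string. The sketch below uses hypothetical values (`title` and `lesson_number` are assumptions for illustration) to compare concatenation, which needs `str()` to convert the integer first, with an f-string, which converts the value for us.

```python
# Hypothetical values used only for illustration
title = 'Python Basics'
lesson_number = 5

# Concatenation needs str() to turn the integer into a string first
concatenated = 'This is ' + title + ' ' + str(lesson_number) + '.'

# An f-string converts the value inside {} to text automatically
formatted = f'This is {title} {lesson_number}.'

print(concatenated)
print(formatted)
```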


### Using `print()` with a `sep` or `end` argument
The `print()` function can accept additional arguments such as `sep` or `end`. These can help format a string appropriately for output. The `print()` function can also print several objects at once when they are passed as arguments separated by commas.

In [11]:
# Print multiple objects with a single print() statement
string1 = 'Hello'
string2 = 'World'
string3 = '!'

print(string1, string2, string3)

Hello World !


Notice that the `print()` function defaults to include a single space separator between the objects it prints. We can change this default separator by using the `sep` parameter.

In [12]:
# Use a plus as a separator
print(string1, string2, string3, sep='+')

Hello+World+!


We can even remove the separator by specifying an empty string.

In [13]:
# Specify an empty string for no separation
print(string1, string2, string3, sep='')

HelloWorld!
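
The separator does not have to be a single character. As a small sketch reusing the same three strings, the `sep` argument below is first a longer string and then the newline escape character.

```python
string1 = 'Hello'
string2 = 'World'
string3 = '!'

# A separator can be longer than one character
print(string1, string2, string3, sep=' -> ')

# An escape character also works: each object lands on its own line
print(string1, string2, string3, sep='\n')
```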


The `print()` function also adds a new line at the end of its output by default. This is specified by the default argument `end='\n'`.

In [14]:
# Two strings printed on separate lines
print('Hello')
print('World')

Hello
World


___
<h3 style="color:red; display:inline">Try it! &lt; / &gt; </h3>

**Keeping both `print()` functions above, can you print the outputs on the same line?**
___

## String slices and methods

### String slices
The characters of a string can also be indexed and sliced like the items of a list. 

In [15]:
# Using a string index
string = 'Python Basics'
string[0]

'P'

In [16]:
# Slicing a string
string = 'Python Basics'
string[0:6]

'Python'
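
A few more slice forms are worth trying. This sketch reuses the string from above and assumes only standard Python indexing, which works the same way on lists.

```python
string = 'Python Basics'

# Leaving out the start means "from the beginning"
first_word = string[:6]

# Leaving out the end means "to the end of the string"
second_word = string[7:]

# A negative index counts backward from the end
last_character = string[-1]

print(first_word)
print(second_word)
print(last_character)
```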

We can use flow control on a string the same way we would with a list.

In [17]:
# Use a for loop on the string
# To print each character except any letter 'o'
string = 'Hello World'


___
<h3 style="color:red; display:inline">Try it! &lt; / &gt; </h3>

**Can you use a `for` loop to print each character of the string `Hello World` without printing the letter 'o'?**
___

### String methods

There are a variety of methods for manipulating strings. 


|Method | Purpose | Form |
|---|---|---|
|.lower()| change the string to lowercase | string.lower()|
|.upper()| change the string to uppercase | string.upper()|
|.join()| joins together a list of strings | ' '.join(string_list)|
|.split()| splits strings apart | string.split()|
|.replace()| replaces characters in a string | string.replace(oldvalue, newvalue)|
|.rjust(), .ljust(), .center()| pad out a string | string.rjust(5)|
|.rstrip(), .lstrip(), .strip()| strip out whitespace | string.rstrip()|


All of the characters in a string can be lowercased with `.lower()` or uppercased with `.upper()`.

In [18]:
# Lowercase a string
string = 'Hello World'
string.lower()

'hello world'

These methods do not change the original string, but they return a string that can be saved to a new variable.

In [19]:
# The original string is unchanged
print(string)

# The returned string can be assigned to a new variable
new_string = string.upper()
print(new_string)

Hello World
HELLO WORLD


A string can be split on any character, or sequence of characters, passed into `.split()`. By default, strings are split on any whitespace, including spaces, new lines, and tabs.

In [20]:
# Splitting a string on white space
string = 'This string will be split on whitespace.'
string.split()

['This', 'string', 'will', 'be', 'split', 'on', 'whitespace.']

In [21]:
# Splitting a phone string based on the '-' character
phone_string = '313-555-3434'
phone_string.split('-')

['313', '555', '3434']

Similarly, lists of strings can be joined together by passing them into `.join()`. A joining string must be specified before the `.join()`, even if it is the empty string `''`.

In [22]:
# List of strings joined together
name_list = ['Sam', 'Delilah', 'Jordan']
', '.join(name_list)

'Sam, Delilah, Jordan'
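
The `.split()` and `.join()` methods are often used together. As a minimal sketch, the phone string from the earlier cell can be split on one character and joined back together with another.

```python
phone_string = '313-555-3434'

# Split the string into a list of parts on the '-' character
parts = phone_string.split('-')

# Join the parts back together with a different separator
dotted = '.'.join(parts)

print(parts)
print(dotted)
```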

The `.strip()` method will strip leading and trailing whitespace (including spaces, tabs, and new lines) from a string. Remember, these changes will not affect the original string, but they can be assigned to a new variable.

In [23]:
# Stripping leading and trailing whitespace from a string
string = '    Python Basics '
string.strip()

'Python Basics'

It is also possible to only strip whitespace from the right or left of a string.

In [24]:
# Stripping leading whitespace from the left side of a string
string = '    Python Basics '
string.lstrip()

'Python Basics '

Characters in a string can be replaced with other characters using the `.replace()` method.

In [25]:
# Replacing characters in a string with .replace()
string = 'Hello world'
string.replace('l', 'x')

'Hexxo worxd'

In [26]:
# Removing characters from a string
# using .replace with an empty string
string = 'Hello! World!'
string.replace('!', '')

'Hello World'
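
Because each of these methods returns a new string, the calls can be chained one after another. The sketch below uses a made-up messy string (an assumption for illustration) and cleans it in a single line.

```python
# A hypothetical messy string used only for illustration
raw_title = '   PYTHON basics!  '

# Each method returns a new string, so the calls can be chained:
# strip whitespace, then lowercase, then remove the '!'
clean_title = raw_title.strip().lower().replace('!', '')

print(clean_title)

# The original string is still unchanged
print(repr(raw_title))
```

The methods run from left to right, so the order matters when one step affects the next.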

Finally, strings can be justified (or padded out) with leading characters, trailing characters, or both. The first argument sets the total width of the resulting string. By default, strings are padded with spaces, but another character can be specified by passing a second argument.

In [27]:
# Left justifying a string
string1 = 'Hello'
string2 = 'world!'

print(string1.ljust(10) + string2)

Hello     world!


In [28]:
# Left justifying a string with pluses
string1 = 'Hello'
string2 = 'world!'

print(string1.ljust(10, '+') + string2)

Hello+++++world!


In [29]:
# Right justifying a string
string1 = 'Hello'
string2 = 'world!'

print(string1 + string2.rjust(10))

Hello    world!


In [30]:
# Center a string
string = 'Hello world!'

print('|' + string.center(20) + '|')

|    Hello world!    |


In [31]:
# Center a string with pluses
string = 'Hello world!'

print('|' + string.center(20, '+') + '|')

|++++Hello world!++++|
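
One common use of a fill character is padding numbers with zeros so they all have the same width. The page numbers below are hypothetical values chosen only for illustration.

```python
# Hypothetical page numbers used only for illustration
page_numbers = ['7', '42', '108']
padded_numbers = []

for page in page_numbers:
    # Pad each number on the left with zeros to a total width of 3
    padded_numbers.append(page.rjust(3, '0'))

print(padded_numbers)
```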


In [32]:
# Printing a dictionary of contacts in neat columns
contacts = {
 'Amanda Bennett': 'Engineer, electrical',
 'Bryan Miller': 'Radiation protection practitioner',
 'Christopher Garrison': 'Planning and development surveyor',
 'Debra Allen': 'Intelligence analyst'}

print('Name', 'Occupation')
for name, occupation in contacts.items():
    print(name, occupation)

Name Occupation
Amanda Bennett Engineer, electrical
Bryan Miller Radiation protection practitioner
Christopher Garrison Planning and development surveyor
Debra Allen Intelligence analyst


___
<h3 style="color:red; display:inline">Try it! &lt; / &gt; </h3>

**Can you clean up the printing of the contacts dictionary?**
___

### Checking string contents

There are a variety of ways to verify the contents of a string. These return a Boolean `True` or `False` and are useful for flow control. For example, we can check whether a particular sequence of characters is inside of a string with the `in` and `not in` operators.

In [33]:
# Check whether a set of characters can be found in a string
string = 'Python Basics'
'Basics' in string

True
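
The sketch below shows the `not in` operator and one way these checks might be used for flow control. Note that the check is case-sensitive, so lowercasing the string first is one possible way to ignore capitalization.

```python
string = 'Python Basics'

# `in` is case-sensitive: 'basics' is not the same as 'Basics'
exact_match = 'basics' in string
print(exact_match)

# `not in` returns the opposite Boolean
missing = 'Java' not in string
print(missing)

# Lowercasing first makes the check ignore capitalization
if 'basics' in string.lower():
    print('Found it!')
```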

The following string methods also return Boolean `True` or `False` values.

|Method | Purpose | Form |
|---|---|---|
|.startswith(), .endswith()| returns `True` if the string starts/ends with another string | string.startswith('abc')|
|.isupper(), .islower()| returns `True` if all letters in the string are uppercase/lowercase | string.isupper()|
|.isalpha()| returns `True` if string is only letters and not blank | string.isalpha()|
|.isalnum()| returns `True` if string is only letters or numbers and not blank | string.isalnum()|
|.isdigit()| returns `True` if string is only numbers and not blank | string.isdigit()|

In [34]:
# Checking if a string starts 
# with a particular set of characters

string = 'Python Basics'
string.startswith('Python')

True

In [35]:
# Checking if a string is lowercased
string = 'python basics'
string.islower()

True

In [36]:
# Checking if a string is only alphabetic characters
string = 'PythonBasics'
string.isalpha()

True

In [37]:
# Checking if a string is only
# alphabetic characters and numbers
string = 'PythonBasics5'
string.isalnum()

True

In [38]:
# Checking if a string is only numbers
string = '50'
string.isdigit()

True

The `.isdigit()` method checks each character to verify it is a digit, such as 0-9. It will return `False` if the string contains a negative sign (-) or a decimal point (.).
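
As a minimal sketch of how this might be used, the loop below checks a few hypothetical inputs (assumptions for illustration) and only converts a string with `int()` when `.isdigit()` returns `True`.

```python
# Hypothetical inputs used only for illustration
inputs = ['50', '-50', '5.0', '']
whole_numbers = []

for text in inputs:
    if text.isdigit():
        # Safe to convert: the string contains only digits
        whole_numbers.append(int(text))
    else:
        print(repr(text), 'is not made up of digits only')

print(whole_numbers)
```

Only `'50'` passes the check. The negative number, the decimal number, and the empty string are all rejected.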

___
<h2 style="color:red; display:inline">Coding Challenge! &lt; / &gt; </h2>

**Use flow control on the staff list below to print the name of every person with the first name 'Patricia' or the last name 'Mitchell'.**
___

In [39]:
# A list of staff members
staff = ['Tara Richards',
 'Tammy French',
 'Justin Douglas',
 'Lauren Marquez',
 'Aaron Wilson',
 'Dennis Howell',
 'Brandon Reed',
 'Kelly Baker',
 'Justin Howard',
 'Sarah Myers',
 'Vanessa Burgess',
 'Timothy Davidson',
 'Jessica Lee',
 'Christopher Miller',
 'Lisa Grant',
 'Ryan Chan',
 'Gary Carson',
 'Anthony Mitchell',
 'Jacob Turner',
 'Jennifer Bonilla',
 'Rachel Gonzalez',
 'Patricia Clark',
 'Richard Pearson',
 'Glenn Allen',
 'Jacqueline Gallagher',
 'Carlos Mcdowell',
 'Jeffrey Harris',
 'Danielle Mitchell',
 'Sarah Craig',
 'Vernon Vasquez',
 'Anthony Burton',
 'Erica Bryant',
 'Patricia Walker',
 'Karen Brown',
 'Terri Walker',
 'Michelle Knight',
 'Kathleen Douglas',
 'Debbie Estrada',
 'Jennifer Brewer',
 'Taylor Rodriguez',
 'Lisa Turner',
 'Julie Hudson',
 'Christina Cox',
 'Nancy Patrick',
 'Patricia Mosley',
 'Nicholas Gordon',
 'Wanda Vasquez',
 'Jason Lopez',
 'Anna Mitchell',
 'Tyler Perez']

___
<h2 style="color:red; display:inline">Coding Challenge! &lt; / &gt; </h2>

**Print all the first names in the staff list. The `.split()` method would be useful.**
___

In [40]:
# Print all the first names in the staff list


___
<h2 style="color:red; display:inline">Coding Challenge! &lt; / &gt; </h2>

**Use flow control on the staff list to add every staff member to the colleagues list below. You will need to split the first and last names apart using the `.split()` method. You will also need to create a dictionary for each staff member and append that dictionary to the colleagues list. Print out the first names of all colleagues.**
___

In [41]:
colleagues = [
    {'first_name': 'Ada', 'last_name': 'Lovelace'},
    {'first_name': 'Charles', 'last_name': 'Babbage'}
]

# Print the first name of the first colleague
print(colleagues[0]['first_name'])

Ada


___
## Lesson Complete
Congratulations! You've completed the *Python Basics* series. 

Considering the amount of material in *Python Basics 1-5* there's a good chance you won't retain it all. That's okay. Programmers often need to look up things to accomplish a task they haven't done in a while, particularly if it is in a language they don't often use. When you're working on a project, you can always come back to these lessons as reference materials. In other words, you've learned an incredible amount, so don't be surprised if it doesn't all stick at first.

If you want to help yourself retain what you've learned, the best way is to start putting it into practice. Try your hand at creating some small Python projects and recognize that the things you've learned here will cement with time and practice. When you do forget a particular thing&mdash;as we all do&mdash;a quick web search often turns up some useful examples.


### Start An Intermediate Python Skills Lesson: 
* [Python Intermediate 1](../intermediate/python-intermediate-1.ipynb)
* [Pandas 1](../pandas/pandas-1.ipynb)

### Start a Text Analysis Lesson:
* [Exploring Metadata](../pandas/exploring-metadata.ipynb) 


### Coding Solutions

Here are a few solutions for exercises in this lesson.
___
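The first two *Try it!* exercises can be approached as sketched below. This is only one possible solution, and the example text is made up for illustration.

```python
# A string with a new line
two_lines = 'First line\nSecond line'
print(two_lines)

# A string with a tab
tabbed = 'Name:\tAda'
print(tabbed)

# Keep both print() calls but put the output on one line
# by replacing the default end='\n' with a space
print('Hello', end=' ')
print('World')
```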

In [42]:
# Using a for loop on a string
string = 'Hello World'

for character in string:
    if character != 'o':
        print(character, end='')

Hell Wrld

In [43]:
# Printing a dictionary of contacts in neat columns
contacts = {
 'Amanda Bennett': 'Engineer, electrical',
 'Bryan Miller': 'Radiation protection practitioner',
 'Christopher Garrison': 'Planning and development surveyor',
 'Debra Allen': 'Intelligence analyst'}

print('Name'.ljust(22), 'Occupation')
print('|'.center(44, '-'))
for name, occupation in contacts.items():
    print(name.ljust(20), '|', occupation)

Name                   Occupation
---------------------|----------------------
Amanda Bennett       | Engineer, electrical
Bryan Miller         | Radiation protection practitioner
Christopher Garrison | Planning and development surveyor
Debra Allen          | Intelligence analyst


In [44]:
# Print all staff names with the first name Patricia or last name Mitchell
for name in staff:
    if name.startswith('Patricia') or name.endswith('Mitchell'):
        print(name)

Anthony Mitchell
Patricia Clark
Danielle Mitchell
Patricia Walker
Patricia Mosley
Anna Mitchell


In [45]:
# Print all the first names in the staff list
for name in staff:
    print(name.split()[0])

Tara
Tammy
Justin
Lauren
Aaron
Dennis
Brandon
Kelly
Justin
Sarah
Vanessa
Timothy
Jessica
Christopher
Lisa
Ryan
Gary
Anthony
Jacob
Jennifer
Rachel
Patricia
Richard
Glenn
Jacqueline
Carlos
Jeffrey
Danielle
Sarah
Vernon
Anthony
Erica
Patricia
Karen
Terri
Michelle
Kathleen
Debbie
Jennifer
Taylor
Lisa
Julie
Christina
Nancy
Patricia
Nicholas
Wanda
Jason
Anna
Tyler


In [46]:
# Add the first and last names of the staff list to the colleagues list

colleagues = [
    {'first_name': 'Ada', 'last_name': 'Lovelace'},
    {'first_name': 'Charles', 'last_name': 'Babbage'}
]

# Add staff names to colleagues list
for name in staff:
    colleague_dict = {}
    first_name = name.split()[0]
    last_name = name.split()[1]
    colleague_dict['first_name'] = first_name
    colleague_dict['last_name'] = last_name
    colleagues.append(colleague_dict)

for entry in colleagues:
    print(entry['first_name'])
    
# Verify all the colleagues were added    
#from pprint import pprint
#pprint(colleagues)

Ada
Charles
Tara
Tammy
Justin
Lauren
Aaron
Dennis
Brandon
Kelly
Justin
Sarah
Vanessa
Timothy
Jessica
Christopher
Lisa
Ryan
Gary
Anthony
Jacob
Jennifer
Rachel
Patricia
Richard
Glenn
Jacqueline
Carlos
Jeffrey
Danielle
Sarah
Vernon
Anthony
Erica
Patricia
Karen
Terri
Michelle
Kathleen
Debbie
Jennifer
Taylor
Lisa
Julie
Christina
Nancy
Patricia
Nicholas
Wanda
Jason
Anna
Tyler
