Programming Workshop for Beginners

Day 2: Functions & Data Analysis

Sunday, June 24, 2018

University of Waterloo

Schedule

10:00 - 11:00: Work
11:00 - 11:10: Break
11:10 - 12:30: Work

12:30 - 13:30: Lunch
13:30 - 17:00: Work & Project

Slides available online at:

Or a shorter link (it takes you to the same place):

Memory refresher

  • yesterday we learned about different data types (lists, etc.)
  • if-then statements
  • for loops
  • and we wrote our own quiz!

Your questions from yesterday

  • indexing
  • input
  • for loops

Indexing

In [2]:
a = "kitty"

In [3]:
b = a[2:4]

In [4]:
print(b)
tt
In [5]:
c = a + " " + "kat"
In [6]:
print(c)
kitty kat

In [8]:
d = c[:4]
e = c[4:]
print(d)
print(e)
kitt
y kat
In [9]:
print(c)
kitty kat

In [10]:
f = c[-1]
print(f)
t
In [11]:
g = c[-5:-2]
print(g)
y k
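
As a recap of the slicing rules used above, here is a small sketch with the same string. The comments describe standard Python behaviour: the start index is included, the stop index is excluded, and negative indices count from the end. The variable names are made up for this recap.

```python
c = "kitty kat"

# c[start:stop] includes start and excludes stop
first_four = c[:4]     # characters at positions 0, 1, 2, 3
rest = c[4:]           # position 4 to the end
last = c[-1]           # the last character
middle = c[-5:-2]      # 5th-from-last up to, but not including, 2nd-from-last

print(first_four, rest, last, middle, sep=" | ")

# Splitting at the same index loses nothing
print(first_four + rest == c)
```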

Input

input() does two things at once:

  1. displays the prompt to the user
  2. takes what the user enters and stores it as a string variable
In [1]:
answer_1 = input("What's the capital city of Canada?")
answer_2 = input("When did the Cold War end?")
What's the capital city of Canada?Ottawa
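
Because input() always gives back a string, a number typed by the user has to be converted before doing arithmetic with it. The sketch below uses a plain variable as a stand-in for what input() would return, so it runs without waiting for the keyboard. The name typed_text and the value "24" are made up for illustration.

```python
# Stand-in for: typed_text = input("How old are you?")
typed_text = "24"

print(type(typed_text))   # a str, even though it looks like a number

age = int(typed_text)     # convert the text to a whole number
print(age + 1)            # now arithmetic works
```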

For loops

In [14]:
names = ["Alan", "Irish", "Ryan", "Sajed", "Stacy"]
counter = 0

for mentor in names:
    print("High five " + mentor)
    counter = counter + 1

print("High-fived " + str(counter) + " mentors!")
High five Alan
High five Irish
High five Ryan
High five Sajed
High five Stacy
High-fived 5 mentors!

In [1]:
names = ["Mariah", "Sajed", "Sean"]
counter = 0

for mentor in names:
    print("High five " + mentor)
    counter = counter + 1

print("High-fived " + str(counter) + " mentors!")
High five Mariah
High five Sajed
High five Sean
High-fived 3 mentors!

Today: functions give us superpowers!

  • Learn what functions are and how they are used, then write our own functions
  • Use functions other people wrote
    • NumPy and Matplotlib
    • Simple data analysis: how does the amount of chocolate influence overall happiness?

Project: Analyze historical temperature values in Canada

Functions

Functions we have used already

  • input(q): display the message in q and store whatever the user types in a variable
  • type(x): ask Python about the data type of x
  • len(x): ask Python to tell us the length of a variable x
  • print(x): print x to the screen

Functions

  • a piece of code that we might want to run more than a few times
  • a nice way to package commands so we don't have to write them over and over again
  • our first function: Saying hello!

Making a function

This goes into your script

In [1]:
def say_hello():
    print("Hello, world!")

Using a function

In ipython, run the script first

In [2]:
say_hello()
Hello, world!

Anatomy of a function
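
The parts of a function can be labelled directly in code. This is a sketch using a made-up function, greet, which is not part of the workshop material.

```python
def greet(name):                      # def, the function name, parameters in brackets, a colon
    message = "Hello, " + name + "!"  # the indented lines are the body
    return message                    # return hands a value back to the caller

greeting = greet("Ophelia")           # calling the function with an argument
print(greeting)
```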

Exercise: Love letter

Doubt thou the stars are fire;
Doubt that the sun doth move;
Doubt truth to be a liar;
But never doubt I love.
O dear Ophelia, I am ill at these numbers.

Quiz

What is the output of the following code?

def increase_by_one(number):
    new = number + 1
    print(new)

increase_by_one(6)

a) Nothing

b) 6

c) 7

Example: Converting kilograms to pounds

In [2]:
def kg_to_lb(weight_kg):
    weight_lb = weight_kg * 2.2
    return weight_lb
In [3]:
answer = kg_to_lb(2)
print(answer)
4.4

Note: function inside another function

We can use the output of one function as the input of another function.

In [4]:
answer = kg_to_lb(2)
print(answer)
4.4
In [5]:
print(kg_to_lb(2))
4.4

Exercise: pounds to kilograms

Write a function that converts pounds to kilograms. 1 lb = 0.454 kg, or 1 kg = 2.2 lb.

def lb_to_kg(weight_lb):
    weight_kg = weight_lb / 2.2
    return weight_kg

OR

def lb_to_kg(weight_lb):
    weight_kg = weight_lb * 0.454
    return weight_kg

How do we know if this function is correct?

In [8]:
print(lb_to_kg(4.4))
Out[8]:
2.0

(If you used the 0.454 version and see something like 1.9976000000000003, that's also fine.)

In [9]:
kg_to_lb(2)
Out[9]:
4.4

Note: print vs. return

In [3]:
def increase_by_one(number):
    new = number + 1
    print(new)
In [4]:
value = increase_by_one(6)
7
In [5]:
print(value)
None
In [6]:
def increase_by_one(number):
    new = number + 1
    return new
In [7]:
value = increase_by_one(6)
In [8]:
print(value)
7

What caused the difference?

In [11]:
def increase_by_one(number):
    new = number + 1
    return new
In [12]:
value = increase_by_one(6)
In [13]:
print(value)
7

In [17]:
def increase_by_one(number):
    new = number + 1
    print(new)
In [18]:
value = increase_by_one(6)
7
In [19]:
print(value)
None
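
One way to see the difference: a function that only prints shows the value on screen but hands back None, while a function that returns gives the value to the caller so it can be stored and reused. The two names below (print_plus_one and return_plus_one) are made up so that both versions can exist side by side.

```python
def print_plus_one(number):
    print(number + 1)           # shows the value; nothing is returned, so the result is None

def return_plus_one(number):
    return number + 1           # hands the value back to the caller

printed = print_plus_one(6)     # prints 7 right away
returned = return_plus_one(6)   # prints nothing

print(printed)                  # None
print(returned + 1)             # a returned value can be used in later calculations
```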

Exercise: counting consonants in a word

Write a function called count_consonants that can be used in the following way.

In [25]:
word = 'morphogenesis'
count_consonants(word)
Out[25]:
8

Try not to look at the hint. If you really want a hint, press the downward arrow in the bottom right corner.

Hint

Think about how we wrote count_vowels yesterday.

Quiz

What will the following line return?

count_consonants('ivana')

a) Error
b) 2
c) '2'
In [29]:
count_consonants('ivana')
Out[29]:
2

What will the following line return?

count_consonats('')

a) Error
b) 0
c) '0'
In [27]:
count_consonats('')
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-27-2e12d6f07702> in <module>()
----> 1 count_consonats('')

NameError: name 'count_consonats' is not defined

What will the following line return?

count_consonants(100)

a) Error
b) 3
c) 100
In [30]:
count_consonants(100)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-30-5b835d6cc9df> in <module>()
----> 1 count_consonants(100)

<ipython-input-22-4fc31d8eae71> in count_consonants(word)
      2     vowels = ["a", "e", "i", "o", "u", "A", "E", "I", "O", "U"]
      3     count = 0
----> 4     for letter in word:
      5         if letter not in vowels:
      6             count += 1

TypeError: 'int' object is not iterable
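
The traceback above shows lines 2 to 6 of count_consonants. A complete version consistent with those lines might look like the sketch below. The def line and the return line are not visible in the traceback, so they are assumptions. Note that this version counts every character that is not a vowel, so spaces or digits inside a string would be counted too.

```python
def count_consonants(word):
    vowels = ["a", "e", "i", "o", "u", "A", "E", "I", "O", "U"]
    count = 0
    for letter in word:
        if letter not in vowels:
            count += 1
    return count   # assumed: this line is not shown in the traceback

print(count_consonants("morphogenesis"))
print(count_consonants("ivana"))
print(count_consonants(""))
```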

Multiple arguments

In [31]:
def smaller(a, b):
    minimum = a
    if b < a:
        minimum = b
    return minimum
In [32]:
small_number = smaller(5, 2)
print(small_number)
2
In [33]:
print(smaller(-1, 10))
-1

Caution

In [34]:
print(smaller(9))
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-34-bfe8226e0010> in <module>()
----> 1 print(smaller(9))

TypeError: smaller() missing 1 required positional argument: 'b'

Exercise

Write a function that takes 2 strings and returns the longer one. When the two strings have the same length, return the first string.

In [35]:
def longer_string(stringA, stringB):
    if len(stringA) >= len(stringB):
        return stringA
    else:
        return stringB

There are countless ways to write a program. Your program is correct as long as it produces the correct output.

In [36]:
longer_string("cilantro", "mint")
Out[36]:
'cilantro'
In [37]:
longer_string("cheddar", "mozzarella")
Out[37]:
'mozzarella'
In [38]:
longer_string("poutine", "pizza")
Out[38]:
'poutine'
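
As one more way to write the same program, here is a sketch using Python's built-in max with key=len. When several items are equally long, max returns the first one, so ties still go to the first string. The name longer_string_v2 is made up.

```python
def longer_string_v2(stringA, stringB):
    # key=len tells max to compare the strings by their length
    return max(stringA, stringB, key=len)

print(longer_string_v2("cilantro", "mint"))
print(longer_string_v2("cheddar", "mozzarella"))
print(longer_string_v2("abc", "xyz"))   # same length: the first string wins
```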

Using functions written by others

Using functions other people wrote

  • Ready-to-use functions are stored in packages called libraries
    • When we want to use a certain set of functions, we import that library
  • Usually you would install what you need on your computer (not everyone wants to bake cookies!)

Packages we will use today

  • NumPy: for dealing with sequences of numbers

  • Matplotlib: for plotting these sequences

NumPy

  • A collection of functions (and a bunch of other things) for working with numbers in Python
  • Numbers in NumPy are stored in sequences called NumPy arrays
In [25]:
import numpy as np
In [26]:
numbers = np.arange(10)
print(numbers)
[0 1 2 3 4 5 6 7 8 9]
In [27]:
type(numbers)
Out[27]:
numpy.ndarray
In [28]:
np.zeros(5)
Out[28]:
array([0., 0., 0., 0., 0.])
In [29]:
np.ones(50)
Out[29]:
array([1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.,
       1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1., 1.])

NumPy arrays can look very much like lists, but they are used for different things:

In [30]:
array = np.arange(5)
li = [0, 1, 2, 3, 4]

print(array, ' is ', type(array))
print(li, 'is', type(li))
[0 1 2 3 4]  is  <class 'numpy.ndarray'>
[0, 1, 2, 3, 4] is <class 'list'>

They are handled much like lists:

In [34]:
print('The first element of numbers is:', numbers[0])
print('The last element of numbers is:', numbers[-1])
print('Numbers has', len(numbers), 'elements')
The first element of numbers is: 0
The last element of numbers is: 9
Numbers has 10 elements
In [35]:
numbers[2:5]
Out[35]:
array([2, 3, 4])

...but they are also different in many important ways

In [36]:
array+array
Out[36]:
array([0, 2, 4, 6, 8])
In [37]:
li+li
Out[37]:
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
In [38]:
array*5
Out[38]:
array([ 0,  5, 10, 15, 20])
In [39]:
li*5
Out[39]:
[0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4, 0, 1, 2, 3, 4]

NumPy functions for arrays

In [40]:
mean = np.mean(numbers)
smallest_element = np.min(numbers)
print('Mean:', mean)
print('Smallest element:', smallest_element)
Mean: 4.5
Smallest element: 0

Exercise

How would you get the largest element?
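
One possible answer, following the same pattern as np.min above, uses np.max. The variable name largest_element is made up.

```python
import numpy as np

numbers = np.arange(10)

largest_element = np.max(numbers)
print('Largest element:', largest_element)
```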

Looking for help

It would be difficult (impossible?) to guess all the available functions and what they do. To see which functions are available, press Tab after the dot in IPython (for example, after typing np.).

Getting help

To understand how to use a function, type ? after the function name. For example, to see how to round a number using np.round:

In [41]:
np.round?

Download the data for today!

  1. Go here: https://github.com/uwpyb/materials/tree/master/lectures
  2. You should find the chocolate.csv file there. Click on it!
  3. A short preview of the data in a table will appear in a new page. On that page, click on "Raw" and when a new white page opens with the data, right click "Save as..." (or "Save Page as..." on some computers) and select the workshop folder on your desktop to save the file.

What are these .csv files?

  • CSV stands for comma-separated values (so we have two text files containing some data separated with commas)
  • our data set: how does happiness depend on the number of chocolate bars

Let's peek into the files!

Loading the data with NumPy

Using the loadtxt function from NumPy:

In [19]:
chocolate = np.loadtxt('chocolate.csv', delimiter=',', skiprows=1)

Parameters in this function:

  • delimiter: tell NumPy that the data in the file is separated with commas
  • skiprows: tell NumPy to skip the first row in the file
In [20]:
print(chocolate)
[[ 0.  0.]
 [ 1.  9.]
 [ 2. 16.]
 [ 3. 21.]
 [ 4. 24.]
 [ 5. 25.]
 [ 6. 24.]
 [ 7. 21.]
 [ 8. 16.]
 [ 9.  9.]
 [10.  0.]]
In [22]:
type(chocolate)
Out[22]:
numpy.ndarray
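
To try loadtxt without downloading anything, the sketch below reads from a small in-memory text file instead of a file on disk. The header names and the three rows are made up and are not the contents of chocolate.csv.

```python
import io
import numpy as np

# Made-up CSV text: one header line, then comma-separated numbers
fake_file = io.StringIO("bars,happiness\n0,0\n1,9\n2,16\n")

# delimiter=',' splits each line at the commas; skiprows=1 skips the header line
data = np.loadtxt(fake_file, delimiter=',', skiprows=1)

print(data)
print(data.shape)
```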

Finding out the number of elements

In [43]:
len(chocolate)
Out[43]:
11

But we also have columns, so instead we can use .shape (notice no brackets () at the end of shape):

In [45]:
chocolate.shape
Out[45]:
(11, 2)
In [46]:
shape = chocolate.shape
In [47]:
print('Number of rows', shape[0])
print('Number of columns', shape[1])
Number of rows 11
Number of columns 2
In [19]:
print("The first row:", chocolate[0])
The first row: [ 0.  0.]
In [20]:
print("The first five rows:")
print(chocolate[0:5])
The first five rows:
[[  0.   0.]
 [  1.   9.]
 [  2.  16.]
 [  3.  21.]
 [  4.  24.]]

NumPy arrays are similar...

In [26]:
print("The first eight rows:")
chocolate[0:8]
The first eight rows:
Out[26]:
array([[  0.,   0.],
       [  1.,   9.],
       [  2.,  16.],
       [  3.,  21.],
       [  4.,  24.],
       [  5.,  25.],
       [  6.,  24.],
       [  7.,  21.]])
In [48]:
chocolate[2:11]
Out[48]:
array([[ 2., 16.],
       [ 3., 21.],
       [ 4., 24.],
       [ 5., 25.],
       [ 6., 24.],
       [ 7., 21.],
       [ 8., 16.],
       [ 9.,  9.],
       [10.,  0.]])
In [49]:
chocolate[2:]
Out[49]:
array([[ 2., 16.],
       [ 3., 21.],
       [ 4., 24.],
       [ 5., 25.],
       [ 6., 24.],
       [ 7., 21.],
       [ 8., 16.],
       [ 9.,  9.],
       [10.,  0.]])

Exercises

How would you get the following rows?

a) The first two rows:

[[ 0.,  0.],
[ 1.,  9.]]

b) The last two rows:

[[  9.,   9.],
 [ 10.,   0.]]

c) Seventh to ninth row (count the rows if you are confused about this one):

[[  6.,  24.],
[  7.,  21.],
[  8.,  16.]]

Answers

In [41]:
chocolate[:2]  # you can also write chocolate[0:2], but shorter is better!
Out[41]:
array([[ 0.,  0.],
       [ 1.,  9.]])
In [39]:
chocolate[-2:]
Out[39]:
array([[  9.,   9.],
       [ 10.,   0.]])
In [40]:
chocolate[6:9]
Out[40]:
array([[  6.,  24.],
       [  7.,  21.],
       [  8.,  16.]])

...but we also have columns

But how do we get columns? Use a comma to separate rows from columns!

In [49]:
chocolate[3]  # the fourth row
Out[49]:
array([  3.,  21.])
In [50]:
print("Fourth row, first column:", chocolate[3, 0])
Fourth row, first column: 3.0
In [51]:
print("Fourth row, second column", chocolate[3, 1])
Fourth row, second column 21.0

More columns!

Get all rows and the first column:

In [50]:
chocolate[:, 0]
Out[50]:
array([ 0.,  1.,  2.,  3.,  4.,  5.,  6.,  7.,  8.,  9., 10.])

And all rows and the second column:

In [51]:
chocolate[:, 1]
Out[51]:
array([ 0.,  9., 16., 21., 24., 25., 24., 21., 16.,  9.,  0.])
In [52]:
number_of_chocolates = chocolate[:, 0]
print(number_of_chocolates)
[ 0.  1.  2.  3.  4.  5.  6.  7.  8.  9. 10.]
In [53]:
happiness = chocolate[:, 1]
print("Happiness is:", happiness)
Happiness is: [ 0.  9. 16. 21. 24. 25. 24. 21. 16.  9.  0.]
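
The row and column rules can also be tried on a tiny array typed in by hand. The values are the first three rows printed earlier; the name small is made up.

```python
import numpy as np

small = np.array([[0., 0.],
                  [1., 9.],
                  [2., 16.]])

print(small[1])       # second row
print(small[1, 1])    # second row, second column
print(small[:, 0])    # all rows, first column
print(small[-1, :])   # last row, all columns
```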

Exercise

  1. Use NumPy to find the maximal happiness value with chocolate bars

  2. Find the average number of chocolate bars in this data set

In [54]:
print("Maximal happiness:", np.max(happiness))
Maximal happiness: 25.0
In [55]:
print("Average number of chocolates:", np.average(number_of_chocolates))
Average number of chocolates: 5.0

More stuff with arrays in the project!

Next: Plotting!

Matplotlib

  • our friend for visualizing data in Python!
In [32]:
import matplotlib.pyplot as plt

Plotting squares

In [58]:
line = np.arange(5)
print('Line:', line)
squared_line = line**2
print('Squared line:', squared_line)
Line: [0 1 2 3 4]
Squared line: [ 0  1  4  9 16]
In [59]:
plt.figure()
plt.plot(line, squared_line)
plt.show()
In [60]:
plt.figure()
plt.title('Squares')
plt.plot(line, squared_line)
plt.xlabel('x')
plt.ylabel('x squared')
plt.show()

Plotting: happiness over chocolate

In [61]:
chocolate = np.loadtxt('chocolate.csv', delimiter=',', skiprows=1)
number_of_chocolates = chocolate[:, 0]
happiness = chocolate[:, 1]

Plotting: happiness over chocolate

In [62]:
plt.figure()
plt.plot(number_of_chocolates, happiness)
plt.xlabel('Number of chocolates')
plt.ylabel('Happiness')
plt.savefig('chocolate.png')
plt.show()

Another kind of plot (just for fun!)

In [63]:
plt.figure()
plt.bar(number_of_chocolates, happiness)
plt.show()

Exercise

  1. Add labels to the plot
  2. Find a way to change the color of the bars (use help, either Google or "?" in IPython)
  3. Make the bars thinner
  4. Harder: Change the background color

(Hint for 4: plt.subplot(111, facecolor='black'))

In [64]:
plt.figure()
plt.subplot(111, facecolor='yellow')
plt.bar(number_of_chocolates, happiness, color='cornflowerblue', width=0.5)
plt.show()
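
A possible answer to part 1 of the exercise adds axis labels to the bar chart. The sketch below types in the values printed earlier instead of loading the file, and selects the non-interactive Agg backend so that it runs without opening a window. Both choices are assumptions made to keep the example self-contained.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: draw in memory instead of opening a window
import matplotlib.pyplot as plt
import numpy as np

# Values typed in by hand from the output shown earlier
number_of_chocolates = np.arange(11)
happiness = np.array([0, 9, 16, 21, 24, 25, 24, 21, 16, 9, 0])

plt.figure()
plt.subplot(111, facecolor='yellow')
bars = plt.bar(number_of_chocolates, happiness, color='cornflowerblue', width=0.5)
plt.xlabel('Number of chocolates')   # label for the horizontal axis
plt.ylabel('Happiness')              # label for the vertical axis
```

In IPython, leave out the matplotlib.use line and finish with plt.show() to see the figure.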

Even more fancy plots

Project time!

The project: https://github.com/uwpyb/materials/blob/master/projects/Day2_DataAnalysis.ipynb

Or a shorter link: https://goo.gl/7aRpZZ

We'll be back with demos and a short wrap-up discussion (i.e., what next?) at around 16:15!

What's next?

  • We will send you more information about the next steps

    • there are many things we did not cover, or mention
  • Take advantage of CS courses at the university: CS 116 also uses Python (prerequisite: CS 115)

Thanks!

Lunches and coffee breaks provided by:

  • Women in Computer Science @ UW and Python Software Foundation
  • And many <3 people all over the world who write programs for free, so other people can use them and do awesome things!

Help us to continue doing what we do by filling out the short survey! (you'll receive the link today)

Graphics attributions

Front graphics, cookies image: Graphics Provided by www.Vecteezy.com