Humans Problem
You've developed a machine learning model that classifies images. For each image, it outputs the candidate labels that have a non-negligible probability, along with those probabilities.
import pandas as pd

predictions = pd.DataFrame.from_dict({
    'preds': {
        12: "{'dog': 0.55, 'cat': 0.25, 'squirrel': 0.2}",
        41: "{'telephone pole': 0.8, 'tower': 0.1, 'stick': 0.1}",
        43: "{'man': 0.65, 'woman': 0.33, 'monkey': 0.02}",
        46: "{'waiter': 0.45, 'waitress': 0.30, 'newspaper': 0.15, 'cat': 0.10}",
        49: "{'nurse': 0.50, 'doctor': 0.50}",
        72: "{'baseball': 0.8, 'basketball': 0.15, 'football': 0.05}",
        91: "{'woman': 0.62, 'man': 0.28, 'elephant': 0.10}"
    }
})
print(predictions)
# preds
# 12 {'dog': 0.55, 'cat': 0.25, 'squirrel': 0.2}
# 41 {'telephone pole': 0.8, 'tower': 0.1, 'stick': 0.1}
# 43 {'man': 0.65, 'woman': 0.33, 'monkey': 0.02}
# 46 {'waiter': 0.45, 'waitress': 0.30, 'newspaper': 0.15, 'cat': 0.10}
# 49 {'nurse': 0.50, 'doctor': 0.50}
# 72 {'baseball': 0.8, 'basketball': 0.15, 'football': 0.05}
# 91 {'woman': 0.62, 'man': 0.28, 'elephant': 0.10}
Each row in predictions holds the predicted labels for a different image.
Insert a column called prob_human that gives the probability that each image shows a human. You can use the following list of strings to identify human labels.
humans = ['doctor', 'man', 'nurse', 'teacher', 'waiter', 'waitress', 'woman']
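One possible approach, sketched under a few assumptions: each value in preds is a string that looks like a Python dict, so it can be parsed with ast.literal_eval, and the labels for one image are mutually exclusive (their probabilities sum to 1 in the data above), so the probabilities of the human labels can simply be added. The helper name prob_human_from_string is made up for this illustration, and the DataFrame is a three-row subset of the one above.

```python
import ast

import pandas as pd

# Three-row subset of the predictions shown above, to keep the sketch short.
predictions = pd.DataFrame.from_dict({
    'preds': {
        12: "{'dog': 0.55, 'cat': 0.25, 'squirrel': 0.2}",
        43: "{'man': 0.65, 'woman': 0.33, 'monkey': 0.02}",
        49: "{'nurse': 0.50, 'doctor': 0.50}",
    }
})
humans = ['doctor', 'man', 'nurse', 'teacher', 'waiter', 'waitress', 'woman']


def prob_human_from_string(pred_string, human_labels):
    """Hypothetical helper: sum the probabilities of the human labels."""
    # Assumption: the string is a valid Python dict literal of label -> probability.
    label_probs = ast.literal_eval(pred_string)
    return sum(prob for label, prob in label_probs.items() if label in human_labels)


human_labels = set(humans)  # a set makes the membership test cheap
predictions['prob_human'] = [
    prob_human_from_string(pred_string, human_labels)
    for pred_string in predictions['preds']
]
print(predictions)
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for parsing strings like these. If the labels were not mutually exclusive, adding the probabilities would overstate the result, and a different calculation would be needed.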
Prevent pandas from truncating print(predictions)
When you print(predictions), the output might get truncated: by default, pandas cuts off long strings and ends them with "...".
To prevent this, set display.max_colwidth to None.
pd.set_option('display.max_colwidth', None)
print(predictions)
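If you'd rather not change the setting for the whole session, a possible alternative is pd.option_context, which applies an option only inside a with block and restores the previous value afterwards. This is a sketch, not part of the original problem; the one-row DataFrame and the variable names are just for illustration.

```python
import pandas as pd

# One long row from the data above, enough to trigger truncation by default.
predictions = pd.DataFrame.from_dict({
    'preds': {
        46: "{'waiter': 0.45, 'waitress': 0.30, 'newspaper': 0.15, 'cat': 0.10}",
    }
})

# Remember the current setting so we can check that it is restored later.
width_before = pd.get_option('display.max_colwidth')

with pd.option_context('display.max_colwidth', None):
    # Inside the block, column contents are shown in full.
    shown_inside = repr(predictions)
    print(predictions)

# Outside the block, the previous setting is back in effect.
width_after = pd.get_option('display.max_colwidth')
print(predictions)
```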