Defraud The Investors Problem
You've developed a model that predicts the probability that a house for sale can be flipped for a profit. Your model isn't very good, as indicated by its predictions on historical data.
import numpy as np
rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)
print(targets)
print(preds)
# [ True False False ... False True False]
# [ 0.23 0.17 0.50 ... 0.87 0.30 0.53]
Your investors want to see these results, but you're afraid to share them. You devise the following algorithm to make your predictions look better without looking artificial.
Step 1:
Choose 5 random indexes (without replacement)
Step 2:
Perfectly reorder the prediction scores at these indexes
to optimize the accuracy of these 5 predictions
For example
If you had these prediction scores and truths
indexes: [   0,     1,    2,     3,    4]
scores:  [ 0.3,   0.8,  0.2,   0.6,  0.3]
truths:  [True, False, True, False, True]
and you randomly selected indexes 1, 2, and 4, you would reorder their scores like this.
indexes:    [   0,     1,    2,     3,    4]
old_scores: [ 0.3,   0.8,  0.2,   0.6,  0.3]
new_scores: [ 0.3,   0.2,  0.3,   0.6,  0.8]
truths:     [True, False, True, False, True]
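This boosts your accuracy rate from 0% to 40%: with a 0.5 cutoff, the predictions at indexes 1 and 4 are now correct, while the other three are still wrong.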
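A quick way to check the before-and-after accuracy of this small example is to recompute it directly. The sketch below assumes a score of at least 0.5 counts as a True prediction; the variable names `old_acc` and `new_acc` are only illustrative.

```python
import numpy as np

# Values copied from the worked example
truths = np.array([True, False, True, False, True])
old_scores = np.array([0.3, 0.8, 0.2, 0.6, 0.3])
new_scores = np.array([0.3, 0.2, 0.3, 0.6, 0.8])

# Assumption: a score >= 0.5 is treated as a True prediction
old_acc = np.mean((old_scores >= 0.5) == truths)
new_acc = np.mean((new_scores >= 0.5) == truths)

# Positions whose score changed (should be the selected indexes)
changed = np.flatnonzero(old_scores != new_scores)

print(f"before: {old_acc:.0%}, after: {new_acc:.0%}, changed: {changed}")
```

Only the scores at the selected indexes move; every other score stays where it was.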
Help
Here's some code to help you evaluate the accuracy of your predictions before and after your changes.
def accuracy_rate(preds, targets):
    return np.mean((preds >= 0.5) == targets)
# Accuracy before finagling
accuracy_rate(preds, targets) # 0.3
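One possible way to carry out Steps 1 and 2 is sketched below. It is a minimal sketch, not the only approach: `finagle` is a hypothetical helper name, and the idea of giving the lowest selected scores to the False targets and the highest to the True targets is an assumption about what "perfectly reorder" means at a 0.5 cutoff. The indexes chosen depend on the state of `rng`, so no specific result is claimed here.

```python
import numpy as np

def accuracy_rate(preds, targets):
    return np.mean((preds >= 0.5) == targets)

def finagle(preds, targets, rng, k=5):
    """Hypothetical helper: reorder the scores at k random positions."""
    new_preds = preds.copy()
    # Step 1: choose k distinct indexes (without replacement)
    idx = rng.choice(len(preds), size=k, replace=False)
    # Step 2: order the chosen positions so False targets come first and
    # True targets last, then hand out the chosen scores in ascending order
    order = np.argsort(targets[idx], kind="stable")
    new_preds[idx[order]] = np.sort(preds[idx])
    return new_preds, idx

# Same setup as the problem statement
rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)

new_preds, idx = finagle(preds, targets, rng)
print("chosen indexes:", idx)
print("accuracy before:", accuracy_rate(preds, targets))
print("accuracy after: ", accuracy_rate(new_preds, targets))
```

Because the current arrangement of the selected scores is itself one possible ordering, the reordered accuracy can never be lower than the original; it may be unchanged if the random draw happens to pick scores that are already arranged as well as possible.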