Ben Gorman

You've developed a model that predicts the probability that a 🏠 house for sale can be flipped for a profit. Your model isn't very good, as indicated by its predictions on historical data.

import numpy as np
 
rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)
 
print(targets)
print(preds)
# [ True False False ... False True False]
# [ 0.23  0.17  0.50 ...  0.87 0.30  0.53]

Your investors want to see these results, but you're afraid to share them. You devise the following algorithm to make your predictions look better without looking artificial.

Step 1: 
  Choose 5 random indexes (without replacement)
 
Step 2: 
  Perfectly reorder the prediction scores at these indexes 
  to optimize the accuracy of these 5 predictions

For example

If you had these prediction scores and truths

indexes: [   0,     1,    2,     3,    4]
scores:  [ 0.3,   0.8,  0.2,   0.6,  0.3]
truths:  [True, False, True, False, True]

and you randomly selected indexes 1, 2, and 4 (three indexes rather than five, to keep the example small), you would reorder their scores like this.

indexes:    [   0,     1,    2,     3,    4]
old_scores: [ 0.3,   0.8,  0.2,   0.6,  0.3]
new_scores: [ 0.3,   0.2,  0.3,   0.6,  0.8]
truths:     [True, False, True, False, True]

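This boosts your accuracy rate from 0% to 40%: treating a score of at least 0.5 as a prediction of True, the predictions at indexes 1 and 4 are now correct.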
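
The before-and-after accuracy of this small example can be checked directly. The sketch below applies the same 0.5 threshold as the `accuracy_rate` helper in the Help section; the variable names `old_accuracy` and `new_accuracy` are illustrative.

```python
import numpy as np

truths = np.array([True, False, True, False, True])
old_scores = np.array([0.3, 0.8, 0.2, 0.6, 0.3])
new_scores = np.array([0.3, 0.2, 0.3, 0.6, 0.8])

# A score of at least 0.5 counts as a prediction of True
old_accuracy = np.mean((old_scores >= 0.5) == truths)
new_accuracy = np.mean((new_scores >= 0.5) == truths)

print(old_accuracy, new_accuracy)  # 0.0 0.4
```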

Help

Here's some code to help you evaluate the accuracy of your predictions before and after your changes.

def accuracy_rate(preds, targets):
    return np.mean((preds >= 0.5) == targets)
 
# Accuracy before finagling
accuracy_rate(preds, targets)  # 0.3
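
One possible way to carry out the two steps and compare the accuracy before and after is sketched below. This is only an illustration under stated assumptions, not necessarily the intended solution: the function name `finagle`, its `idx` parameter, and the second generator seeded with 0 are all hypothetical choices. The idea is to sort the chosen scores and hand the smallest ones to the False targets and the largest ones to the True targets.

```python
import numpy as np


def accuracy_rate(preds, targets):
    return np.mean((preds >= 0.5) == targets)


def finagle(preds, targets, idx):
    """Return a copy of preds with the scores at idx reordered (hypothetical helper)."""
    idx = np.asarray(idx)

    # Positions within idx, with False targets first and True targets last
    order = np.argsort(targets[idx], kind="stable")

    # Smallest chosen scores go to the False targets, largest to the True targets
    new_preds = preds.copy()
    new_preds[idx[order]] = np.sort(preds[idx])
    return new_preds


# Same data as in the problem statement
rng = np.random.default_rng(123)
targets = rng.uniform(low=0, high=1, size=20) >= 0.6
preds = np.round(rng.uniform(low=0, high=1, size=20), 2)

# Step 1: choose 5 distinct indexes (the seed here is an arbitrary assumption)
idx = np.random.default_rng(0).choice(len(preds), size=5, replace=False)

# Step 2: reorder the scores at those indexes
new_preds = finagle(preds, targets, idx)

print(accuracy_rate(preds, targets))      # accuracy before finagling
print(accuracy_rate(new_preds, targets))  # accuracy after finagling
```

Because leaving the chosen scores where they are is one of the possible orderings, the accuracy can never go down. It can stay the same, though, if the chosen predictions were already as accurate as any reordering allows.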
