Tinder Coach Problem
You operate an online business called tindercoach.com where you give people advice on their Tinder profiles. You have a dictionary of visits indicating how many times each visitor_id visited each page on your site.
import random
import string
from collections import defaultdict
Npages = 10
Nvisitors = 10
Nvisits = 100
random.seed(2357)
visitor_ids = list(set(random.randint(1000, 9999) for i in range(Nvisitors)))
pages = list(set('tindercoach.com/' + ''.join(random.choices(string.ascii_lowercase, k=3)) for i in range(Nvisits)))
visits = defaultdict(int)
for i in range(Nvisits):
    key = (random.choice(visitor_ids), random.choice(pages))
    visits[key] += 1
print(visits)
# defaultdict(<class 'int'>, {
# (3654, 'tindercoach.com/bgr'): 1,
# (1443, 'tindercoach.com/nky'): 1,
# (3654, 'tindercoach.com/wpb'): 1,
# ...,
# (3181, 'tindercoach.com/jam'): 1,
# (5502, 'tindercoach.com/cjp'): 1,
# (5502, 'tindercoach.com/tjk'): 1
# })
Convert visits into a Compressed Sparse Column (CSC) matrix where element (i, j) stores the number of times visitor i visited page j.
Then print the sub-matrix showing how many times visitors 1443, 6584, and 7040 visited pages tindercoach.com/chl, tindercoach.com/nky, and tindercoach.com/zmr.
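As a rough illustration of what the CSC layout looks like, here is a minimal pure-Python sketch. It is not a reference solution. The names toy_visits, to_csc and lookup are hypothetical, and the counts in toy_visits are invented for the example rather than taken from the real visits dictionary. Visitors are mapped to row indices and pages to column indices by sorting, which is only one possible convention.

```python
# Invented stand-in for `visits`: (visitor_id, page) -> visit count.
# These counts are made up for illustration only.
toy_visits = {
    (1443, 'tindercoach.com/nky'): 2,
    (7040, 'tindercoach.com/chl'): 1,
    (1443, 'tindercoach.com/zmr'): 1,
    (6584, 'tindercoach.com/chl'): 3,
}

# Assumed convention: sorted visitor ids become rows, sorted pages become columns.
row_ids = sorted({v for v, _ in toy_visits})
col_ids = sorted({p for _, p in toy_visits})
row_of = {v: i for i, v in enumerate(row_ids)}
col_of = {p: j for j, p in enumerate(col_ids)}


def to_csc(visits, row_of, col_of):
    """Return (data, indices, indptr) describing the matrix in CSC layout."""
    # Order the nonzero entries by column, then by row within each column.
    entries = sorted((col_of[p], row_of[v], n) for (v, p), n in visits.items())
    data = [n for _, _, n in entries]       # nonzero values, column by column
    indices = [i for _, i, _ in entries]    # row index of each stored value
    indptr = [0] * (len(col_of) + 1)        # column j occupies data[indptr[j]:indptr[j + 1]]
    for j, _, _ in entries:
        indptr[j + 1] += 1                  # count the entries in each column
    for j in range(len(col_of)):
        indptr[j + 1] += indptr[j]          # running total gives the column offsets
    return data, indices, indptr


def lookup(csc, i, j):
    """Return element (i, j), or 0 if it is not stored."""
    data, indices, indptr = csc
    for k in range(indptr[j], indptr[j + 1]):
        if indices[k] == i:
            return data[k]
    return 0


csc = to_csc(toy_visits, row_of, col_of)

# Sub-matrix for the requested visitors (rows) and pages (columns).
want_visitors = [1443, 6584, 7040]
want_pages = ['tindercoach.com/chl', 'tindercoach.com/nky', 'tindercoach.com/zmr']
sub = [[lookup(csc, row_of[v], col_of[p]) for p in want_pages]
       for v in want_visitors]

for row in sub:
    print(row)
```

With SciPy available, the same (data, indices, indptr) triple can be passed to scipy.sparse.csc_matrix together with a shape argument, and the sub-matrix can then be selected by indexing with the row and column index lists.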