Given a dataset of concerts
, count the number of concerts per (artist, venue)
, per yearmonth
. Make the resulting table be a wide table - one row per yearmonth
with a column for each unique (artist, venue)
pair. Use the cross product of the artists
and venues
Series to determine which (artist, venue)
pairs to include in the result.
import numpy as np
import pandas as pd
generator = np.random.default_rng( 1357 )
artists = [ 'Mouse Rat' , 'Binary Brothers' , 'Lady Gigabyte' , 'The Rolling Sums' ]
venues = [ 'Super Dome' , 'STAPLES Center' , 'Madison Square Garden' ]
concerts = pd.DataFrame({
'artist' :generator.choice([ 'Mouse Rat' , 'Binary Brothers' , 'Lady Gigabyte' , 'Just-in-Timeberlake' ], size = 200 , replace = True ),
'venue' :generator.choice(venues + [ 'Red Rocks' ], size = 200 , replace = True ),
'date' :pd.to_datetime( '2020-01-01' ) + pd.to_timedelta(generator.choice( 365 , size = 200 , replace = True ), unit = 'D' ),
}).drop_duplicates().convert_dtypes()
print (concerts)
# artist venue date
# 0 Binary Brothers STAPLES Center 2020-04-10
# 1 Binary Brothers STAPLES Center 2020-07-31
# 2 Mouse Rat Madison Square Garden 2020-06-21
# 3 Mouse Rat Red Rocks 2020-04-17
# 4 Binary Brothers Red Rocks 2020-10-05
# .. ... ... ...
# 195 Lady Gigabyte Red Rocks 2020-08-10
# 196 Mouse Rat Madison Square Garden 2020-06-30
# 197 Binary Brothers Super Dome 2020-05-02
# 198 Just-in-Timeberlake Madison Square Garden 2020-06-21
# 199 Binary Brothers Madison Square Garden 2020-04-14
#
# [200 rows x 3 columns]
Solution 1¶
This content is gated
Subscribe to one of the products below to gain access
Solution 2¶
This content is gated
Subscribe to one of the products below to gain access