You own a national restaurant chain called Applewasps. To increase sales, you decide to launch a multi-regional television marketing campaign.
At the end of the campaign you have a table of commercials
indicating when and where each commercial aired, and a table of sales
indicating when and where customers generated sales.
import numpy as np
import pandas as pd
generator = np.random.default_rng( 5555 )
regions = [ 'north' , 'south' , 'east' , 'west' ]
commercials = pd.DataFrame({
'commercial_id' : range ( 10 ),
'region' : generator.choice(regions, size = 10 ),
'date_time' : pd.to_datetime( '2020-01-01' ) + pd.to_timedelta(generator.integers( 240 , size = 10 ), unit = 'h' )
})
sales = pd.DataFrame({
'sale_id' : range ( 10 ),
'region' : generator.choice(regions, size = 10 ),
'date_time' : pd.to_datetime( '2020-01-01' ) + pd.to_timedelta(generator.integers( 240 , size = 10 ), unit = 'h' ),
'revenue' : np.round(generator.normal( loc = 20 , scale = 5 , size = 10 ), 2 )
})
print (commercials)
# commercial_id region date_time
# 0 0 west 2020-01-10 12:00:00
# 1 1 north 2020-01-10 16:00:00
# 2 2 south 2020-01-09 01:00:00
# 3 3 east 2020-01-10 19:00:00
# 4 4 south 2020-01-08 22:00:00
# 5 5 east 2020-01-03 02:00:00
# 6 6 south 2020-01-07 15:00:00
# 7 7 west 2020-01-05 22:00:00
# 8 8 east 2020-01-03 04:00:00
# 9 9 west 2020-01-05 04:00:00
print (sales)
# sale_id region date_time revenue
# 0 0 west 2020-01-05 08:00:00 20.14
# 1 1 east 2020-01-08 22:00:00 22.98
# 2 2 south 2020-01-07 21:00:00 22.98
# 3 3 west 2020-01-05 17:00:00 16.82
# 4 4 west 2020-01-02 12:00:00 20.47
# 5 5 east 2020-01-10 09:00:00 26.93
# 6 6 north 2020-01-08 19:00:00 20.25
# 7 7 south 2020-01-01 08:00:00 23.38
# 8 8 south 2020-01-01 17:00:00 25.74
# 9 9 south 2020-01-10 22:00:00 22.28
In order to analyze the performance of each commercial, map each sale to the commercial that aired prior to the sale, in the same region.
Solution¶
This content is gated
Subscribe to one of the products below to gain access