python - Duplicate row based on value in different column

I have a dataframe of transactions. Each row represents a transaction of 2 items (think of a transaction of 2 event tickets or something). I want to duplicate each row based on the quantity sold.

Here's example code:

    import pandas as pd

    # dictionary of transactions
    d = {'1': ['20', 'nyc', '2'], '2': ['30', 'nyc', '2'], '3': ['5', 'nyc', '2'],
         '4': ['300', 'la', '2'], '5': ['30', 'la', '2'], '6': ['100', 'la', '2']}

    columns = ['price', 'city', 'quantity']

    # create dataframe and rename columns
    df = pd.DataFrame.from_dict(data=d, orient='index')
    df.columns = columns

This produces a dataframe that looks like this:

    price    city    quantity
    20       nyc     2
    30       nyc     2
    5        nyc     2
    300      la      2
    30       la      2
    100      la      2

So in the case above, each row would be transformed into 2 duplicate rows. If the 'quantity' column were 3, that row would be transformed into 3 duplicate rows.

First, I recreated the data using integers instead of text. I also varied the quantity so that the problem is easier to understand.

    import pandas as pd

    d = {1: [20, 'nyc', 1], 2: [30, 'nyc', 2], 3: [5, 'sf', 3],
         4: [300, 'la', 1], 5: [30, 'la', 2], 6: [100, 'sf', 3]}

    columns = ['price', 'city', 'quantity']

    # create dataframe and rename columns
    df = pd.DataFrame.from_dict(data=d, orient='index').sort_index()
    df.columns = columns

    >>> df
       price city  quantity
    1     20  nyc         1
    2     30  nyc         2
    3      5   sf         3
    4    300   la         1
    5     30   la         2
    6    100   sf         3
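If the data already exists as text, as in the question, converting the columns may be simpler than recreating the dictionary. This is a minimal sketch, assuming every price and quantity string is a valid integer; the name `df_str` is only illustrative and not part of the original post.

```python
import pandas as pd

# Same string-valued data as in the question.
d = {'1': ['20', 'nyc', '2'], '2': ['30', 'nyc', '2'], '3': ['5', 'nyc', '2'],
     '4': ['300', 'la', '2'], '5': ['30', 'la', '2'], '6': ['100', 'la', '2']}

df_str = pd.DataFrame.from_dict(data=d, orient='index')
df_str.columns = ['price', 'city', 'quantity']

# Convert the numeric columns from text to integers (assumes clean input;
# astype raises ValueError if a string is not a valid integer).
df_str = df_str.astype({'price': int, 'quantity': int})
```

After the conversion, `quantity` can be passed directly to `range` or used as a repeat count.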

I then created a new dataframe using a nested list comprehension.

    # emit one copy of each row per unit of quantity
    df_new = pd.DataFrame([df.loc[idx]
                           for idx in df.index
                           for _ in range(df.loc[idx, 'quantity'])]).reset_index(drop=True)

    >>> df_new
        price city  quantity
    0      20  nyc         1
    1      30  nyc         2
    2      30  nyc         2
    3       5   sf         3
    4       5   sf         3
    5       5   sf         3
    6     300   la         1
    7      30   la         2
    8      30   la         2
    9     100   sf         3
    10    100   sf         3
    11    100   sf         3
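The same duplication can also be sketched without a Python-level loop by repeating index labels with `Index.repeat` and selecting them with `.loc`. This is an alternative to the list comprehension, not the method used above. It assumes the index labels are unique and that `quantity` holds non-negative integers. The name `df_repeated` is only illustrative.

```python
import pandas as pd

d = {1: [20, 'nyc', 1], 2: [30, 'nyc', 2], 3: [5, 'sf', 3],
     4: [300, 'la', 1], 5: [30, 'la', 2], 6: [100, 'sf', 3]}

df = pd.DataFrame.from_dict(data=d, orient='index').sort_index()
df.columns = ['price', 'city', 'quantity']

# Repeat each index label 'quantity' times, then select the rows by label.
# Assumes a unique index; duplicate labels would select extra rows.
repeated_labels = df.index.repeat(df['quantity'])
df_repeated = df.loc[repeated_labels].reset_index(drop=True)
```

Because whole columns are selected at once, each column keeps its original dtype, and a row with a quantity of 0 simply drops out of the result.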

Abbruzzese
