Concatenate strings from several rows using Pandas groupby

The Pandas DataFrame.groupby() method splits the data into groups based on some criteria. Abstractly, grouping provides a mapping from row labels to group names: every row is assigned to the group whose key it matches.
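To make the idea of a grouping concrete, here is a minimal sketch. The DataFrame contents (the names and branches) are hypothetical and invented purely for illustration. The `groups` attribute shows the inverse view of the mapping: each group name together with the row labels assigned to it.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Asha", "Ravi", "Meena"],
    "branch": ["CSE", "ECE", "IT", "ME", "CSE"],
})

# .groups maps each distinct Name to the index labels of its rows
groups = df.groupby("Name").groups

# Convert the index objects to plain lists so they are easy to read
group_rows = {name: list(labels) for name, labels in groups.items()}
print(group_rows)
```

Rows 0 and 2 share the name "Asha", so they land in the same group; the same holds for rows 1 and 3 with "Ravi".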

To concatenate strings from several rows using DataFrame.groupby(), perform the following steps:

  1. Group the data with DataFrame.groupby() on the column(s) that identify which rows belong together, then select the column whose values you need to concatenate.
  2. Concatenate the strings in each group with the join function, applying it to that column through transform() and a lambda function.
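The two steps above can be sketched on a small in-memory DataFrame, so the result is visible without a CSV file. The data is hypothetical and only the column names Name and branch follow the examples in this article. Note that transform() returns one value per original row, which is why duplicate rows are dropped afterwards.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Name": ["Asha", "Ravi", "Asha", "Ravi", "Meena"],
    "branch": ["CSE", "ECE", "IT", "ME", "CSE"],
})

# Step 1: group by Name and select the branch column
# Step 2: join the strings of each group; transform broadcasts the
#         joined string back to every row of that group
df["branch"] = df.groupby("Name")["branch"].transform(lambda x: " ".join(x))

# Rows of the same group are now identical, so keep one per group
result = df.drop_duplicates().reset_index(drop=True)
print(result)
```

Each name ends up on a single row, with its branch values joined by spaces in their original row order.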

Example 1: We will concatenate the branch values of all rows that have the same Name.

Python3

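# import the pandas library
import pandas as pd

# read the csv file (it must contain the Name and branch columns)
df = pd.read_csv("Book2.csv")

# join the branch strings of each Name group with a space;
# transform writes the joined string back to every row of the group
df['branch'] = df.groupby(['Name'])['branch'].transform(lambda x: ' '.join(x))

# rows that are identical in every column are now duplicates, so drop them
df = df.drop_duplicates()

# show the dataframe
print(df)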


Output:

Example 2: Pandas groupby works on multiple columns as well. Here we group on the Name and year columns, so branch values are concatenated only for rows that match on both.

Python3

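# import the pandas library
import pandas as pd

# read the csv file (it must contain the Name, year and branch columns)
df = pd.read_csv("Book1.csv")

# join the branch strings of each (Name, year) group with a space
df['branch'] = df.groupby(['Name', 'year'])['branch'].transform(
    lambda x: ' '.join(x))

# rows that are identical in every column are now duplicates, so drop them
df = df.drop_duplicates()

# show the dataframe
print(df)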


Output:

[Figure: Groupby on multiple columns]
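Since the contents of the CSV file are not shown here, the following sketch repeats the multi-column grouping on a small in-memory DataFrame. The rows are hypothetical and invented for illustration; only the column names Name, year and branch come from the example above.

```python
import pandas as pd

# Hypothetical sample data, invented for illustration only
df = pd.DataFrame({
    "Name": ["Asha", "Asha", "Asha", "Ravi", "Ravi"],
    "year": [2019, 2019, 2020, 2019, 2019],
    "branch": ["CSE", "IT", "ME", "ECE", "EEE"],
})

# Group on both Name and year, then join the branch strings per group
df["branch"] = df.groupby(["Name", "year"])["branch"].transform(
    lambda x: " ".join(x))

# Keep one row per (Name, year) group
result = df.drop_duplicates().reset_index(drop=True)
print(result)
```

Because year is part of the grouping key, the 2019 and 2020 rows for "Asha" stay separate instead of being merged into one string.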



Last Updated : 20 Mar, 2024