Open In App

How to Count Observations by Group in Pandas?

Last Updated : 24 Feb, 2021
Improve
Improve
Like Article
Like
Save
Share
Report

In real data science projects, you’ll be dealing with large amounts of data and trying things over and over, so for efficiency, we use the Groupby concept. Groupby concept is really important because its ability to aggregate data efficiently, both in performance and the amount code is magnificent. Groupby mainly refers to a process involving one or more of the following steps they are:

  • Splitting: It is a process in which we split data into groups by applying some conditions on datasets.
  • Applying: It is a process in which we apply a function to each group independently.
  • Combining: It is a process in which we combine different datasets after applying groupby and results into a data structure.

Syntax : groupby(by=None, axis=0, level=None, as_index=True, sort=True, group_keys=True, squeeze=False, **kwargs)

Parameters :

  • by : mapping, function, str, or iterable
  • axis : int, default 0
  • level : If the axis is a MultiIndex (hierarchical), group by a particular level or levels
  • as_index : For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output
  • sort : Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. groupby preserves the order of rows within each group.
  • group_keys : When calling apply, add group keys to index to identify pieces
  • squeeze : Reduce the dimensionality of the return type if possible, otherwise return a consistent type

Returns : GroupBy object 

Here, we use a simple dummy dataframe which is shown below:

Also, we use some methods to count the observations by the group in Pandas which are explained below with examples.

Example 1: Using group.count (Count By One Variable)

In this example, we will use group.count() method which counts the total number of members in each group.

Python3




# import libraries
import pandas as pd
  
#create pandas DataFrame
df = pd.DataFrame({'Name': ['Arun', 'Arun', 'Bhuvi', 'Bhuvi',
                            'Bhuvi', 'Chandan', 'Chandan'],
                     
                   'Department':['CSE', 'IT', 'CSE', 'CSE',
                                 'IT', 'IT', 'CSE'],
                     
                   'Funds': [1100, 800, 700, 600, 600, 500, 1200]})
  
# create a group using groupby
group = df.groupby("Department")
  
# count the observations
group.count()


Output: 

Example 2: Using group.size (Count By Multiple Variables)

In this example, we will use group.size() method which counts the number of entries/rows in each group. 

Python3




# import libraries
import pandas as pd
  
#create pandas DataFrame
df = pd.DataFrame({'Name': ['Arun', 'Arun', 'Bhuvi', 'Bhuvi'
                            'Bhuvi', 'Chandan', 'Chandan'],
                     
                   'Department':['CSE', 'IT', 'CSE', 'CSE'
                                 'IT', 'IT', 'CSE'],
                     
                   'Funds': [1100, 800, 700, 600, 600, 500, 1200]})
  
# create a group using groupby
group = df.groupby(['Name', 'Department'])
  
# size of group to count observations
group = group.size()
  
# make a column name 
group.reset_index(name='Observation')


Output : 



Like Article
Suggest improvement
Share your thoughts in the comments

Similar Reads