How to Calculate the Sum by Group in R?

Last Updated : 21 Feb, 2022

In this article, we will see how to calculate the sum of a column by group in the R programming language, using base R, dplyr, and data.table.

Data for Demonstration

R

# creating data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy',
                         'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# view dataframe
df


Output:

Sub    Marks    Add_on
Math    8    3
Math    2    1
Phy    4    9
Phy    9    4
Phy    9    7
Che    7    8
Che    1    2

Method 1: Using the aggregate() function in Base R

The aggregate() function computes summary statistics of the data by group, such as the mean, min, max, or sum.

Syntax: aggregate(dataframe$aggregate_column, list(dataframe$group_column), FUN)  

where

  • dataframe is the input dataframe.
  • aggregate_column is the column to be aggregated in the dataframe.
  • group_column is the column whose values define the groups.
  • FUN is the function applied to each group, such as sum, mean, min, or max.

R

# creating data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy',
                         'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# sum of Marks for each subject
aggregate(df$Marks, list(df$Sub), FUN=sum)

# sum of Add_on for each subject
aggregate(df$Add_on, list(df$Sub), FUN=sum)


Output:

Group.1    x
Che    8
Math    10
Phy    22

Group.1    x
Che    10
Math    4
Phy    20
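
The two calls above can also be combined into one. The following is a minimal sketch using the formula interface of aggregate(), with cbind() on the left-hand side to select the columns to sum. The variable name totals is only illustrative and is not part of the original example.

```r
# Recreate the demonstration data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy',
                         'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# Sum both numeric columns per subject in a single call;
# cbind() lists the columns to aggregate, Sub defines the groups
totals <- aggregate(cbind(Marks, Add_on) ~ Sub, data = df, FUN = sum)

totals
```

With the formula interface the result keeps the original column names (Sub, Marks, Add_on) instead of Group.1 and x, which tends to make the output easier to read.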

Method 2: Using the dplyr package

With dplyr, first group the data with group_by(), then apply a summarise function (here summarise_at()) with the operation to perform on each group.

R

library(dplyr)

# group by subject, then sum Marks within each group;
# the result column is called "name"
df %>%
  group_by(Sub) %>%
  summarise_at(vars(Marks),
               list(name = sum))


Output:

Sub    name
Che    8
Math    10
Phy    22
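
To sum several columns at once with dplyr, one option is summarise() combined with across(). This is a sketch that assumes dplyr 1.0.0 or later is installed, since across() was introduced in that release. The variable name totals is illustrative.

```r
library(dplyr)

# Recreate the demonstration data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy',
                         'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# Group by subject and apply sum() to both numeric columns;
# across() keeps the original column names in the result
totals <- df %>%
  group_by(Sub) %>%
  summarise(across(c(Marks, Add_on), sum))

totals
```

The result is a tibble with one row per subject and the columns Sub, Marks, and Add_on.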

Method 3: Using data.table

The data.table package can calculate the sum of marks for each subject through its by argument.

R

library(data.table)

# convert data frame to data table (modifies df in place)
setDT(df)

# find sum of Marks for each subject
df[ ,list(sum=sum(Marks)), by=Sub]


Output:

Sub sum
Math 10
Phy 22
Che 8
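
The same pattern extends to several columns. The sketch below uses data.table's .SD and .SDcols to apply sum() to both Marks and Add_on. It builds a separate table with as.data.table() so the original data frame is left unchanged. The names dt and totals are illustrative.

```r
library(data.table)

# Recreate the demonstration data frame
df <- data.frame(Sub = c('Math', 'Math', 'Phy', 'Phy',
                         'Phy', 'Che', 'Che'),
                 Marks = c(8, 2, 4, 9, 9, 7, 1),
                 Add_on = c(3, 1, 9, 4, 7, 8, 2))

# Copy into a data.table rather than converting df in place
dt <- as.data.table(df)

# .SD holds the columns named in .SDcols for each group;
# lapply() applies sum() to each of them
totals <- dt[, lapply(.SD, sum), by = Sub, .SDcols = c("Marks", "Add_on")]

totals
```

As in the single-column example, the groups appear in order of first occurrence (Math, Phy, Che) rather than alphabetically.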

