Open In App

How to read a XLSX file with multiple Sheets in R?

Improve
Improve
Like Article
Like
Save
Share
Report

In this article, we are going to see how to read an XLSX file with multiple Sheets in R Language. There are various external packages in R used to read XLSX files with multiple sheets.

File Used:

Method 1: Using readxl package

The readxl package in R is used to import and read Excel workbooks in R, which can be used to easily work and modify the .xslsx sheets. It can be installed and loaded into the R working space using the following syntax  : 

install.packages("readxl")

Initially, the excel_sheets() method is invoked to fetch all the worksheet names contained in the Excel workbook, with the specified file path. 

excel_sheets(path)

The lapply() method in R is used to apply a function (either user-defined or pre-defined) to a set of components contained within an R list or data frame. The lapply( ) method returns an object of the same length as that of the input object. 

Syntax: lapply( obj , FUN)

Arguments:

obj – The object to apply the function on 

FUN – The function to be applied  over different components of the object obj. 

The FUN is the read_excel method of this package store, which is used to read the contents of the specified sheet name into a tibble, which is a tabular-like structure used to store data in fixed rows and columns. The lapply method applies the read_excel method over every sheet of the workbook. 

Syntax: read_excel(path, sheet)

Arguments:

path – The file path

sheet – The sheet name to read

The tibble objects returned by the read_excel method can be converted to the data frame again using the lapply method and specifying the function as.data.frame() which converts every object into a data frame. These data frames can be assigned the corresponding sheet names for better clarity using the in-built R names() method. 

names (df) <- new-name

Code:

R




# importing required packages
library(readxl)    
multiplesheets <- function(fname) {
   
  # getting info about all excel sheets
  sheets <- readxl::excel_sheets(fname)
  tibble <- lapply(sheets, function(x) readxl::read_excel(fname, sheet = x))
  data_frame <- lapply(tibble, as.data.frame)
    
  # assigning names to data frames
  names(data_frame) <- sheets
    
  # print data frame
  print(data_frame)
}
  
# specifying the path name
path <- "/Users/mallikagupta/Desktop/Gfg.xlsx"
multiplesheets(path)


Output:

$Sheet1
ID Name      Job 
1  1    A Engineer 
2  2    B       CA 
3  3    C      SDE 
4  4    D       CA 
5  5    E      SDE  
$Sheet2   
Post Likes 
1    A    23 
2    B    34 
3    C    56 
4    D    78

Method 2: Using rio package.

The rio package is used to stimulate quick and easy data import and export operations to be performed in R. Rio makes deductions about the file format itself which can be used to read files easily. 

install.packages("rio")

The import() and export() methods in R determine the data structure of the specified file extension. The method import_list() imports a list of data frames from a multi-object file, for instance, an Excel workbook or an R zipped file. 

Syntax: import_list(file)

Arguments : 

file – The file name of the Excel workbook to access

The column and row names are retained while reading the output of the Excel workbook.  The sheet names are also accessible during the read. 

R




# specifying the path name
path <- "/Users/mallikagupta/Desktop/Gfg.xlsx"
  
# importing the required library
library(rio)
  
# reading data from all sheets
data <- import_list(path)
  
# print data
print (data)


Output:

$Sheet1
ID Name      Job 
1  1    A Engineer 
2  2    B       CA 
3  3    C      SDE 
4  4    D       CA 
5  5    E      SDE  
$Sheet2   
Post Likes 
1    A    23 
2    B    34 
3    C    56 
4    D    78

Method 3: Using openxlsx Package.

The openxlsx package in R is used to create and manipulate Excel files by providing a high-level interface for reading and writing as well as modifying the worksheets in the specified workbook. The package can be loaded and installed into the working space using the following syntax : 

install.packages("openxlsx")

The getSheetNames( ) method of this package is used to return the worksheet names contained within an xlsx file.

getSheetNames(file)

The lapply() method in R is used to apply a function (either user-defined or pre-defined) to a set of components contained within an R list or data frame. The lapply( ) method returns an object of the same length as that of the input object. 

Syntax: lapply( obj , FUN)

Arguments:

obj – The object to apply the function on 

FUN – The function to be applied over different components of the object obj. 

After the retrieval of different sheet names using the previous method, the function read.xlsx is applied over each of the sheets of the workbook. The read.xlsx method is used to read data from an Excel file or Workbook object into an R data.frame object over the specified file path. 

In this case, the lapply() method takes as input the sheet names and returns the corresponding data frames belonging to each sheet of the workbook.  Then these data frames are assigned the sheet names using the names() method in R.

Code:

R




# specifying the path name
path <- "/Users/mallikagupta/Desktop/Gfg.xlsx"
  
# importing the required library
library(openxlsx)
  
# getting data from sheets
sheets <- openxlsx::getSheetNames(path)
data_frame <- lapply(sheets, openxlsx::read.xlsx, xlsxFile=path)
  
# assigning names to data frame
names(data_frame) <- sheets
  
# printing the data
print (data_frame)


Output:

$Sheet1
ID Name      Job 
1  1    A Engineer 
2  2    B       CA 
3  3    C      SDE 
4  4    D       CA 
5  5    E      SDE  
$Sheet2   
Post Likes 
1    A    23 
2    B    34 
3    C    56 
4    D    78


Last Updated : 06 Jun, 2021
Like Article
Save Article
Previous
Next
Share your thoughts in the comments
Similar Reads