Reading specific columns of a CSV file using Pandas
Last Updated: 03 Dec, 2023
CSV files are widely used to store tabular data, and they often contain columns that are irrelevant to a given analysis. This article shows how to read only specific columns from a CSV file in Python.
Let us see how to read specific columns of a CSV file using Pandas. This can be done with the pandas.read_csv() function. Pass the path of the CSV file as the first argument and the list of columns to keep through the usecols keyword argument. The function returns a DataFrame that contains only those columns.
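The call described above can be sketched without any file on disk. In this minimal example the CSV text, the names, and the values are made up for illustration, and io.StringIO stands in for a real file path.

```python
import io
import pandas as pd

# Hypothetical stand-in for a CSV file on disk; names and values are made up.
csv_text = """Name,Hours,IQ,Scores
Asha,2.5,101,21
Ben,5.1,98,47
Chen,3.2,110,27
"""

# The file (or file-like object) comes first; usecols is passed by keyword.
df = pd.read_csv(io.StringIO(csv_text), usecols=["IQ", "Scores"])

print(df.columns.tolist())  # ['IQ', 'Scores']
print(df.shape)             # (3, 2)
```

Only the two requested columns are parsed into the DataFrame; 'Name' and 'Hours' are skipped while reading.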
Read Specific Columns From CSV File
Below are some examples by which we can read specific columns of a CSV file using Pandas.
Read All Columns of a CSV File
In this example, the Pandas library is imported, and the code reads the entire content of the “student_scores2.csv” file into a DataFrame ‘df’ using Pandas. The printed output displays the entire dataset for further examination.
Python3
import pandas as pd
# Read every column of the file into a DataFrame
df = pd.read_csv("student_scores2.csv")
print(df)
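Before choosing columns, it can help to see which column names a file has without loading its rows. One possible way, sketched here on made-up in-memory data rather than the article's file, is to pass nrows=0 so that only the header is parsed.

```python
import io
import pandas as pd

# Hypothetical CSV content; the article's own file is not reproduced here.
csv_text = "Name,Hours,IQ,Scores\nAsha,2.5,101,21\nBen,5.1,98,47\n"

# nrows=0 parses the header row only, so no data rows are loaded.
header = pd.read_csv(io.StringIO(csv_text), nrows=0)
available = header.columns.tolist()

print(available)    # ['Name', 'Hours', 'IQ', 'Scores']
print(len(header))  # 0
```

The resulting list can then be used to decide what to pass to usecols.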
Read Specific Columns of a CSV File Using read_csv()
In this example, the Pandas library is imported, and the code uses it to read only the ‘IQ’ and ‘Scores’ columns from the “student_scores2.csv” file, storing the result in the DataFrame ‘df’. The printed output displays the selected columns for analysis.
Python3
import pandas as pd
# Read only the 'IQ' and 'Scores' columns
df = pd.read_csv("student_scores2.csv", usecols=["IQ", "Scores"])
print(df)
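usecols also accepts integer positions instead of names, which can be handy when the header is long or awkward to type. The sketch below uses made-up data in which 'IQ' and 'Scores' happen to sit at positions 2 and 3; the positions in a real file may differ.

```python
import io
import pandas as pd

# Hypothetical CSV content; column positions are assumptions for illustration.
csv_text = "Name,Hours,IQ,Scores\nAsha,2.5,101,21\nBen,5.1,98,47\n"

# Positions are zero-based: 0=Name, 1=Hours, 2=IQ, 3=Scores.
df_by_position = pd.read_csv(io.StringIO(csv_text), usecols=[2, 3])

print(df_by_position.columns.tolist())  # ['IQ', 'Scores']
```

Selecting by name is usually more robust, because positions silently point at different columns if the file layout changes.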
Read Specific Columns of a CSV File Using usecols
In this example, the Pandas library is imported, and the code reads the ‘Hours’, ‘Scores’, and ‘Pass’ columns from the “student.csv” file using Pandas. The resulting DataFrame ‘df’ displays the selected columns for further analysis.
Python3
import pandas as pd
# Read only the 'Hours', 'Scores', and 'Pass' columns
df = pd.read_csv("student.csv", usecols=["Hours", "Scores", "Pass"])
print(df)
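If a name in the usecols list does not exist in the file, read_csv raises a ValueError with the default parser. When that is a concern, usecols can instead be given a callable that is evaluated against each column name. The following sketch uses made-up data that deliberately lacks a 'Pass' column.

```python
import io
import pandas as pd

# Hypothetical data: note that there is no "Pass" column here.
csv_text = "Name,Hours,IQ,Scores\nAsha,2.5,101,21\nBen,5.1,98,47\n"
wanted = ["Hours", "Scores", "Pass"]

# A list containing a name that is absent from the header raises ValueError.
error_message = None
try:
    pd.read_csv(io.StringIO(csv_text), usecols=wanted)
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# A callable is applied to each column name; columns for which it
# returns True are kept, so absent names are simply never matched.
df = pd.read_csv(io.StringIO(csv_text), usecols=lambda name: name in wanted)
print(df.columns.tolist())  # ['Hours', 'Scores']
```

The callable form trades strictness for tolerance: a misspelled column name no longer fails loudly, so it is worth checking the resulting columns afterwards.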
In this example, the Pandas library is imported, and the code reads the ‘Survived’ and ‘Pclass’ columns from the “titanic.csv” file using Pandas. The resulting DataFrame ‘df’ displays the selected columns for analysis.
Python3
import pandas as pd
# Read only the 'Survived' and 'Pclass' columns
df = pd.read_csv("titanic.csv", usecols=["Survived", "Pclass"])
print(df)
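One detail worth knowing is that the order of names in usecols is ignored: the DataFrame keeps the column order of the file. If a particular order is needed, the columns can be reindexed after reading. The sketch below assumes a made-up three-column excerpt, not the real Titanic file.

```python
import io
import pandas as pd

# Hypothetical excerpt; the real file has more columns and rows.
csv_text = "PassengerId,Survived,Pclass\n1,0,3\n2,1,1\n3,1,3\n"
requested = ["Pclass", "Survived"]

# usecols picks the columns, but the result follows the file's order.
df = pd.read_csv(io.StringIO(csv_text), usecols=requested)
print(df.columns.tolist())  # ['Survived', 'Pclass']

# Indexing with the same list puts the columns in the requested order.
df = df[requested]
print(df.columns.tolist())  # ['Pclass', 'Survived']
```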