Scrape LinkedIn Profiles without login using Python
Last Updated: 31 Jan, 2024
In this article, we'll explore how to scrape public LinkedIn profiles with Python without logging in. Using the requests and BeautifulSoup libraries, we fetch a public profile page and extract a few details from its HTML, such as the profile name, designation, followers count, and description.
Scrape LinkedIn Profiles without login using Python
Below is a step-by-step implementation for scraping LinkedIn profiles without login using Python.
Create a Virtual Environment
First, create and activate a virtual environment using the commands below (the activation path shown is for Windows PowerShell):
python -m venv env
.\env\Scripts\activate.ps1
Install Necessary Library
Next, install the required third-party libraries, requests and bs4. The re module is part of Python's standard library and does not need to be installed.
pip install requests
pip install bs4
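To confirm that the installation worked inside the virtual environment, a quick check like the following can help. This is a minimal sketch, not part of the original steps: the helper name installed_version is hypothetical, and it assumes Python 3.8 or newer, where importlib.metadata is available in the standard library.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution)
    except PackageNotFoundError:
        return None


# "beautifulsoup4" is the distribution that provides the bs4 import name.
for name in ("requests", "beautifulsoup4"):
    print(f"{name}: {installed_version(name) or 'not installed'}")
```

If either line reports "not installed", the virtual environment is probably not the one that pip installed into.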
Import the Library
Then import requests, re, and BeautifulSoup (from bs4) so they are available in the following steps.
Python3
import requests
import re
from bs4 import BeautifulSoup
Implement the Logic
In the code below, the 'scrape_linkedin_profiles' function extracts information from a given LinkedIn profile URL. Using the 'requests' library and a predefined 'User-Agent' header for guest access, it fetches the HTML content. It then uses 'BeautifulSoup' to parse the HTML and extract the profile name, designation, followers count, and description. Each value is read only if the relevant HTML element is present; otherwise a default "not found" message is used. The extracted details are printed, and if the request does not return status code 200, an error message with the status code is shown instead.
Python3
def scrape_linkedin_profiles(url):
    # Send a simple guest User-Agent so the request is made without a login
    headers = {
        "User-Agent": "Guest",
    }
    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, 'html.parser')

        # Locate the elements that hold each piece of information
        title_tag = soup.find('title')
        designation_tag = soup.find('h2')
        followers_tag = soup.find('meta', {"property": "og:description"})
        description_tag = soup.find('p', class_='break-words')

        # Fall back to a default message when an element is missing
        name = title_tag.get_text(strip=True).split("|")[0].strip() if title_tag else "Profile Name not found"
        designation = designation_tag.get_text(strip=True) if designation_tag else "Designation not found"
        followers_match = re.search(r'\b(\d[\d,.]*)\s+followers\b', followers_tag["content"]) if followers_tag else None
        followers_count = followers_match.group(1) if followers_match else "Followers count not found"
        description = description_tag.get_text(strip=True) if description_tag else "Description not found"

        print(f"Profile Name: {name}")
        print(f"Designation: {designation}")
        print(f"Followers Count: {followers_count}")
        print(f"Description: {description}")
    else:
        print(f"Error: Unable to retrieve the LinkedIn company profile. Status code: {response.status_code}")
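To see how the selectors and the followers regular expression behave without sending a request to LinkedIn, the same extraction logic can be run against a small HTML string. The sketch below is illustrative only: SAMPLE_HTML is hand-written, hypothetical markup that merely mimics the elements the function looks for, and parse_profile_html is a hypothetical helper that returns a dictionary instead of printing. Real LinkedIn pages may be structured differently.

```python
import re

from bs4 import BeautifulSoup

# Hypothetical, hand-written markup that only mimics the elements the
# scraper looks for; real LinkedIn pages are larger and may differ.
SAMPLE_HTML = """
<html>
  <head>
    <title>Example Org | LinkedIn</title>
    <meta property="og:description"
          content="Example Org | 1,234 followers on LinkedIn. We teach things.">
  </head>
  <body>
    <h2>Education</h2>
    <p class="break-words">We teach things.</p>
  </body>
</html>
"""

# Same pattern as in the function above, compiled once for reuse.
FOLLOWERS_RE = re.compile(r"\b(\d[\d,.]*)\s+followers\b")


def parse_profile_html(html):
    """Return the four fields as a dict; missing elements become None."""
    soup = BeautifulSoup(html, "html.parser")
    title_tag = soup.find("title")
    designation_tag = soup.find("h2")
    meta_tag = soup.find("meta", {"property": "og:description"})
    description_tag = soup.find("p", class_="break-words")

    # .get() avoids a KeyError if the meta tag has no content attribute.
    meta_content = meta_tag.get("content", "") if meta_tag else ""
    followers_match = FOLLOWERS_RE.search(meta_content)

    return {
        "name": title_tag.get_text(strip=True).split("|")[0].strip() if title_tag else None,
        "designation": designation_tag.get_text(strip=True) if designation_tag else None,
        "followers": followers_match.group(1) if followers_match else None,
        "description": description_tag.get_text(strip=True) if description_tag else None,
    }


profile = parse_profile_html(SAMPLE_HTML)
print(profile)
```

Returning a dictionary rather than printing makes the parsing step easy to check on saved or sample HTML. Note that the pattern only matches a plain number followed by the word "followers", so an abbreviated count such as "1.2M followers" would not be captured.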
Create Pipeline For Usage
Now we pass the target LinkedIn profile URL (here, the GeeksforGeeks LinkedIn profile) to the scraper function.
Python3
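# Assumed URL of the GeeksforGeeks LinkedIn page; replace it with your target profile
profile_url = "https://www.linkedin.com/company/geeksforgeeks/"
scrape_linkedin_profiles(profile_url)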
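The function above only checks the status code, so a timeout, a connection failure, or a malformed URL would raise an exception and stop the script. One possible way to make the fetching step more forgiving is sketched below. The helper name fetch_html and the 10-second timeout are assumptions made for illustration, not part of the original code.

```python
import requests


def fetch_html(url, timeout=10):
    """Return the page HTML as text, or None if the request fails."""
    headers = {"User-Agent": "Guest"}  # same header as the function above
    try:
        response = requests.get(url, headers=headers, timeout=timeout)
        response.raise_for_status()  # turns 4xx/5xx responses into exceptions
    except requests.RequestException as exc:
        print(f"Request failed for {url!r}: {exc}")
        return None
    return response.text


# A malformed URL fails before any network traffic is sent.
print(fetch_html("not-a-url"))
```

Because requests.RequestException is the base class for the library's timeout, connection, and HTTP errors, a single except clause covers all of them, and the caller only has to check for None before parsing.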
Output