Python | Sorting URL on basis of Top Level Domain
Last Updated :
11 May, 2020
Given a list of URLs, the task is to sort the URLs in the list by their top-level domain.
A top-level domain (TLD) is one of the domains at the highest level in the hierarchical Domain Name System of the Internet. Example – org, com, edu.
This is mostly useful when scraping pages and sorting the collected URLs by top-level domain, and it makes a handy snippet to keep around.
Input :
url = ["https://www.isb.edu", "www.google.com",
"http://cyware.com", "https://www.gst.in",
"https://www.coursera.org", "https://www.create.net",
"https://www.ontariocolleges.ca"]
Output :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net',
'https://www.coursera.org']
Explanation:
The TLDs of the URLs in the output list are in sorted order:
['.ca','.com','.com','.edu','.in','.net','.org']
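To make the explanation concrete, the sketch below extracts the text after the last dot of each URL and shows those keys before and after sorting. The names `urls` and `keys` are illustrative only, and the approach assumes each URL ends with its hostname (no path, port, or query string).

```python
# Illustrative sketch: the names below are assumptions, not part of the article's code.
urls = ["https://www.isb.edu", "www.google.com",
        "http://cyware.com", "https://www.gst.in",
        "https://www.coursera.org", "https://www.create.net",
        "https://www.ontariocolleges.ca"]

# The piece after the last dot is treated as the TLD.
keys = [u.split('.')[-1] for u in urls]

print(keys)          # keys in input order
print(sorted(keys))  # keys in the order the sorted URLs will follow
```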
Below are some ways to do the above task.
Method 1: Using sorted
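You can split each URL on '.' and pass the last piece, the TLD, to sorted() as the sort key.
# Input list of URLs
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]
# Sort key: the text after the last dot, i.e. the TLD
def tld(url):
    return url.split('.')[-1]
Output = sorted(Input, key=tld)
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)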
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
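Splitting the whole string on '.' only works when the URL ends with its hostname; a URL such as "https://www.isb.edu/index.html" would yield "html" as its key. A possible variation, sketched below, isolates the hostname first with the standard library's urllib.parse.urlsplit. The helper name `tld_of` and the sample URLs with paths and ports are assumptions made for illustration, and the "//" check is a simple heuristic for scheme-less inputs, not a complete URL validator.

```python
from urllib.parse import urlsplit

def tld_of(url):
    """Return the last label of the URL's hostname (hypothetical helper)."""
    # urlsplit only recognises a host when the URL contains "//",
    # so add one for scheme-less inputs such as "www.google.com".
    if "//" not in url:
        url = "//" + url
    host = urlsplit(url).hostname or ""
    return host.rsplit(".", 1)[-1]

# Sample data (made up for this sketch): paths and a port are present.
urls = ["https://www.isb.edu/programmes/index.html", "www.google.com",
        "http://cyware.com:8080/feed", "https://www.gst.in"]

by_tld = sorted(urls, key=tld_of)
print(by_tld)
```

Because sorted() is stable, URLs that share a TLD keep their original relative order, just as in the split-based version.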
Method 2: Using Lambda
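A lambda expresses the same key function inline, which reduces the sort to a single line.
# Input list of URLs
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]
# Sort by the text after the last dot, i.e. the TLD
Output = sorted(Input, key=lambda x: x.split('.')[-1])
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)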
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
Method 3: Using reversed
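Split each URL on '.' and reverse the resulting pieces, so that the TLD comes first in the sort key. URLs are then ordered by TLD, and URLs sharing a TLD are ordered by the preceding parts of the name.
# Input list of URLs
Input = ["https://www.isb.edu", "www.google.com", "http://cyware.com",
         "https://www.gst.in", "https://www.coursera.org",
         "https://www.create.net", "https://www.ontariocolleges.ca"]
# Sort key: the dot-separated pieces in reverse order, TLD first
def internal(string):
    return list(reversed(string.split('.')))
Output = sorted(Input, key=internal)
print("Initial list is :")
print(Input)
print("Sorted list according to TLD is :")
print(Output)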
Initial list is :
['https://www.isb.edu', 'www.google.com', 'http://cyware.com',
'https://www.gst.in', 'https://www.coursera.org',
'https://www.create.net', 'https://www.ontariocolleges.ca']
Sorted list according to TLD is :
['https://www.ontariocolleges.ca', 'www.google.com',
'http://cyware.com', 'https://www.isb.edu',
'https://www.gst.in', 'https://www.create.net', 'https://www.coursera.org']
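On the sample input all three methods print the same list, but the reversed key behaves differently when several URLs share a TLD. The sketch below uses made-up hostnames to show this: the last-piece key leaves ties in input order, while the reversed key also sorts them by the next label. The names `last_piece` and `reversed_pieces` are illustrative, not from the article.

```python
# Made-up hostnames chosen so that two of them share the "com" TLD.
urls = ["www.zeta.com", "www.alpha.com", "www.beta.org"]

def last_piece(url):
    # Same idea as Methods 1 and 2: only the TLD is compared.
    return url.split('.')[-1]

def reversed_pieces(url):
    # Same idea as Method 3: TLD first, then the labels before it.
    return list(reversed(url.split('.')))

by_tld_only = sorted(urls, key=last_piece)
by_reversed = sorted(urls, key=reversed_pieces)

print(by_tld_only)  # ties on "com" keep their input order
print(by_reversed)  # ties on "com" are ordered by "alpha" < "zeta"
```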