Sometimes, while working with a tuple matrix, we can have a problem in which many rows contain the same elements and differ only in the ordering of their tuples; it is often desired to remove such duplicates. This kind of problem can have applications in domains such as web development and day-to-day programming. Let's discuss certain ways in which this task can be performed.
Input : test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
Output : {((4, 5), (3, 2))}
Input : test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
Output : {((3, 2), (4, 6)), ((4, 5), (3, 2))}
Method #1: Using set() + tuple() + sorted() + list comprehension The combination of the above functions can be used to solve this problem. In this, we first sort each row so that reordered rows become identical, convert it to a tuple, and collect the tuples in a set, which automatically removes the duplicates.
Python3
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
print("The original list is : " + str(test_list))

# Sort each row into a canonical order, then deduplicate with a set.
res = set(tuple(sorted(sub)) for sub in test_list)
print("Tuple matrix after removal : " + str(res))

Output :
The original list is : [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
Tuple matrix after removal : {((2, 2), (4, 6)), ((3, 2), (4, 5))}
Time Complexity: O(n * m log m), where n is the number of rows in test_list and m is the row length, since each row is sorted
Auxiliary Space: O(n), as the res set can hold up to n rows
Method #2: Using frozenset() + set() + list comprehension The combination of the above functions can be used to solve this problem. In this, we perform the task of order-insensitive comparison and hashable conversion using frozenset(): a frozenset compares equal regardless of element order, so reordered rows collapse into one entry.
Python3
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
print("The original list is : " + str(test_list))

# frozenset is hashable and order-insensitive, so reordered rows collide.
res = set(frozenset(sub) for sub in test_list)
print("Tuple matrix after removal : " + str(res))
Output :
The original list is : [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]
Tuple matrix after removal : {frozenset({(4, 5), (3, 2)}), frozenset({(4, 6), (2, 2)})}
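The result here contains frozenset objects rather than tuples. If plain tuples are preferred, each frozenset can be converted back, for instance into a sorted tuple — a minimal sketch:

```python
test_list = [[(4, 5), (3, 2)], [(2, 2), (4, 6)], [(3, 2), (4, 5)]]

res = set(frozenset(sub) for sub in test_list)

# Convert each frozenset back into a sorted tuple for a tuple-only result.
as_tuples = {tuple(sorted(fs)) for fs in res}
print(as_tuples)
```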
Method #3: Using dict + tuple() + list
- Convert each sublist in the matrix to a tuple so that we can use it as a dictionary key later.
- Create an empty dictionary to store the unique tuples.
- Loop through each tuple in the matrix and check if it exists in the dictionary.
- If it doesn’t exist, add it to the dictionary.
- Convert the dictionary values back to tuples and return the set of unique tuples.
Python3
def remove_similar_rows(matrix):
    # The sorted tuple of a row serves as an order-insensitive
    # dictionary key; the first row seen for each key is kept.
    tuples = [tuple(row) for row in matrix]
    seen = {}
    for tup in tuples:
        key = tuple(sorted(tup))
        if key not in seen:
            seen[key] = tup
    return list(seen.values())


test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print(remove_similar_rows(test_list))

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print(remove_similar_rows(test_list))
Output
[((4, 5), (3, 2))]
[((4, 6), (3, 2)), ((3, 2), (4, 5))]
Time complexity: O(n)
Auxiliary Space: O(n)
Method #4: Using the numpy library to remove duplicates:
Algorithm:
1. Initialize the input list test_list.
2. Convert the list to a numpy array using np.array().
3. Use numpy's unique() function to remove duplicates by specifying the axis as 0.
4. Convert the numpy array back to a list using the tolist() function.
5. Print the unique list.
Python3
import numpy as np

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print("The original list is : " + str(test_list))

# np.unique with axis=0 removes duplicate rows (sorting them in the process).
arr = np.array(test_list)
unique_arr = np.unique(arr, axis=0)
unique_list = unique_arr.tolist()
print(unique_list)
Output:
The original list is : [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
[[[3, 2], [4, 5]], [[4, 6], [3, 2]]]
Time complexity:
The time complexity of this algorithm is O(n log n), where n is the number of elements in the input list. The reason for this is that the np.unique() function uses sorting algorithms to find the unique elements, which takes O(n log n) time in the worst-case scenario.
Space complexity:
The space complexity of this algorithm is O(n), where n is the number of elements in the input list. This is because we are creating a numpy array and a list of the same size as the input list to store the unique elements.
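Since np.unique() sorts, the surviving rows come back in sorted rather than original order. If the original row order matters, the return_index argument of np.unique() can recover the first-occurrence positions — a minimal sketch:

```python
import numpy as np

test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
arr = np.array(test_list)

# return_index yields each unique row's first position in the original
# array; sorting those positions restores the original row order.
_, first_idx = np.unique(arr, axis=0, return_index=True)
ordered = arr[np.sort(first_idx)].tolist()
print(ordered)  # [[[4, 6], [3, 2]], [[3, 2], [4, 5]]]
```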
Method 5: Using a loop to iterate over each tuple and checking if it has already been added to the unique list using a separate list.
Step-by-step approach:
- Initialize an empty list unique_list to store the unique tuples.
- Initialize an empty list seen to keep track of the tuples that have been seen already.
- Loop through each tuple in the original list test_list.
- If the tuple is not in the seen list, add it to both unique_list and seen (rows are compared as-is here, so only rows with identical tuple ordering are treated as duplicates).
- Print the unique_list.
Below is the implementation of the above approach:
Python3
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

unique_list = []
seen = []
for tpl in test_list:
    # Membership test on a list is O(n), hence the quadratic overall cost.
    if tpl not in seen:
        unique_list.append(tpl)
        seen.append(tpl)
print(unique_list)
Output
[[(4, 6), (3, 2)], [(3, 2), (4, 5)]]
Time complexity: O(n^2), where n is the length of the original list.
Auxiliary space: O(n), as we need to store the seen list.
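The quadratic cost above comes from the list membership test. Assuming each row can be reduced to a hashable key, a set-based seen container gives O(1) average lookups, and sorting the key also merges reordered rows — a minimal sketch:

```python
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

unique_list = []
seen = set()  # hashable keys allow O(1) average membership tests
for tpl in test_list:
    key = tuple(sorted(tpl))  # order-insensitive key for the row
    if key not in seen:
        unique_list.append(tpl)
        seen.add(key)
print(unique_list)  # [[(4, 6), (3, 2)], [(3, 2), (4, 5)]]
```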
Method #6: Using recursion.
Algorithm:
1. If the list is empty, return the accumulated unique list.
2. If the head of the list has not been seen yet, append it to the unique list and the seen list.
3. Recurse on the tail of the list.
4. Return the unique list.
Python3
def remove_duplicates_recursive(lst, unique_lst=None, seen=None):
    # Default to None instead of mutable lists: default arguments are
    # created only once, so [] defaults would be shared across calls.
    if unique_lst is None:
        unique_lst = []
    if seen is None:
        seen = []
    if not lst:
        return unique_lst
    if lst[0] not in seen:
        unique_lst.append(lst[0])
        seen.append(lst[0])
    return remove_duplicates_recursive(lst[1:], unique_lst, seen)


test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
unique_list = remove_duplicates_recursive(test_list)
print(unique_list)
Output
[[(4, 6), (3, 2)], [(3, 2), (4, 5)]]
Time complexity:
– The function visits each tuple in the list once, but each step also slices lst[1:] (an O(n) copy) and runs an O(n) membership test on seen.
– Therefore, the overall time complexity of the algorithm is O(n^2) in the worst case, where n is the length of the list.
Space complexity:
– We create two lists to store unique tuples and seen tuples, and the recursion adds up to n frames on the call stack, so the space complexity of the algorithm is O(n), where n is the length of the list.
Method #7: Using set() + map() + tuple():
Algorithm:
- Initialize the input list.
- Map each row to a tuple so it becomes hashable.
- Collect the tuples in a set, which removes the duplicates.
- Convert each remaining tuple back to a list.
- Print the unique list.
Python3
test_list = [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
print("The original list is : " + str(test_list))

# Tuples are hashable, so a set drops exact-duplicate rows.
tuple_set = set(map(tuple, test_list))
unique_list = [list(t) for t in tuple_set]
print(unique_list)
Output
The original list is : [[(4, 6), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]
[[(3, 2), (4, 5)], [(4, 6), (3, 2)]]
Time complexity:
The time complexity of this algorithm is O(n) on average, where n is the number of rows in the input list: each fixed-size row is hashed and inserted into the set in O(1) average time.
Auxiliary Space:
The space complexity is O(n), for the set of tuples and the resulting list of unique rows.
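Note that mapping each row straight to a tuple only merges rows with identical ordering; combining the same pattern with sorted(), as in Method #1, also merges reordered rows — a minimal sketch:

```python
test_list = [[(4, 5), (3, 2)], [(3, 2), (4, 5)], [(3, 2), (4, 5)]]

# sorted() canonicalises each row before hashing, so [(4, 5), (3, 2)]
# and [(3, 2), (4, 5)] collapse into the same set entry.
tuple_set = set(tuple(sorted(row)) for row in test_list)
unique_list = [list(t) for t in tuple_set]
print(unique_list)  # [[(3, 2), (4, 5)]]
```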