Open In App

Java Program to Implement Bitap Algorithm for String Matching

Last Updated : 22 Mar, 2023
Improve
Improve
Like Article
Like
Save
Share
Report

The Bitap Algorithm is an approximate string matching algorithm. The algorithm tells whether a given text contains a substring which is “approximately equal” to a given pattern. Here approximately equal states that if the substring and pattern are within a given distance k of each other. The algorithm begins by precomputing a set of bitmasks containing one bit for each element of the pattern. This does most of the bitwise operations, which are quick.

The Bitap Algorithm is also known as shift-or, shift-and, or Baeza Yates Gonnet Algorithm. 

Example:

Input:
Text: geeksforgeeks
Pattern: geeks
Output:
Pattern found at index: 0

Input:
Text: Youareawesome
Pattern: Youareamazing
Output:
No Match.

Approach:

  • Input the text pattern as a string.
  • We will convert this String to a simple Char Array
  • If the length is 0 or exceeding 63, we return “No Match”.
  • Now, by taking array R which complements 0.
  • Taking an array “pattern_mask” which complements 0, to 1
  • Taking the pattern as an index of “pattern_mask” then by using the and operator we and it with the result of the complement of 1L(long integer) by shifting it to left side by i times.
  • Now by running the loop till text length.
  • We now or it with R and pattern mask.
  • Now by left shifting the 1L by the length of the pattern and then the result is an and with R
  • If it is equal to zero we print it by I-len+1 else return -1
  • Output is the index of the text, where the pattern matches.

Code:

Java




// Java Program to implement Bitap Algorithm.
 
import java.io.*;
import java.io.IOException;
 
public class GFG {
    public static void main(String[] args)
        throws IOException
    {
 
        System.out.println("Bitap Algorithm!");
 
        String text = "geeksforgeeks";
 
        String pattern = "geeks";
 
        // This is an object created of the class for
        // extension of functions.
        GFG g = new GFG();
 
        // Now here we go to findPattern functions , we two
        // arguments.
        g.findPattern(text, pattern);
    }
 
    public void findPattern(String t, String p)
    {
 
        // we convert the String text to Character Array for
        // easy indexing
        char[] text = t.toCharArray();
 
        // we convert the String pattern to Character Array
        // to access each letter in the String easily.
        char[] pattern = p.toCharArray();
 
        // Index shows the function bitap search if they are
        // equal at a particular index or not
        int index = bitap_search(text, pattern);
 
        // If the pattern is not equal to the text of the
        // string approximately Then we tend to return -1 If
        // index is -1 Then we print there is No match
        if (index == -1) {
            System.out.println("\nNo Match\n");
        }
 
        else {
 
            // Else if there is a match
            // Then we print the position of the index at
            // where the pattern and the text matches.
            System.out.println(
                "\nPattern found at index: \n" + index);
        }
    }
 
    private int bitap_search(char[] text, char[] pattern)
    {
 
        // Here the len variable is taken
        // This variable accepts the pattern length as its
        // value
        int len = pattern.length;
 
        // This is an array of pattern_mask of all
        // character values in it.
 
        long pattern_mask[]
            = new long[Character.MAX_VALUE + 1];
 
        // Here the variable of being long type is
        // complemented with 1;
 
        long R = ~1;
 
        // Now if the length of the pattern is 0
        // we would return -1
        if (len == 0) {
            return -1;
        }
 
        // Or if the length of the pattern exceeds the
        // length of the character array Then we would
        // declare that the pattern is too long. We would
        // return -1
       
        if (len > 63) {
 
            System.out.println("Pattern too long!");
            return -1;
        }
 
        // Now filling the values in the pattern mask
        // We would run th eloop until the max value of
        // character Initially all the values of Character
        // are put up inside the pattern mask array And
        // initially they are complemented with zero
       
        for (int i = 0; i <= Character.MAX_VALUE; ++i)
 
            pattern_mask[i] = ~0;
 
        // Now len being the variable of pattern length ,
        // the loop is set  till there Now the pattern being
        // the index of the pattern_mask 1L means the long
        // integer is shifted to left by i times The result
        // of that is now being complemented the result of
        // the above is now being used as an and operator We
        // and the pattern_mask and the result of it
       
        for (int i = 0; i < len; ++i)
            pattern_mask[pattern[i]] &= ~(1L << i);
 
       
        // Now the loop is made to run until the text length
        // Now what we do this the R array is used
        // as an Or function with pattern_mask at index of
        // text of i
 
        for (int i = 0; i < text.length; ++i) {
 
             
          R |= pattern_mask];
 
            // Now  result of the r after the above
            // operation
            // we shift it to left side by 1 time
 
            R <<= 1;
 
            // If the 1L long integer if shifted left of the
            // len And the result is used to and the result
            // and R array
            // If that result is equal to 0
            // We return the index value
            // Index=i-len+1
           
            if ((R & (1L << len)) == 0)
 
                return i - len + 1;
        }
 
        // if the index is not matched
        // then we return it as -1
        // stating no match found.
       
        return -1;
    }
}


Output

Bitap Algorithm!

Pattern found at index: 
0


Like Article
Suggest improvement
Previous
Next
Share your thoughts in the comments

Similar Reads