Metacharacters in Java Regex
Last Updated: 16 Nov, 2021
Regex is short for regular expression, a pattern that describes a set of strings. Regular expressions are used to search, validate, and edit text. Java's regex classes live in the java.util.regex package, which must be imported before any of them can be used.
The java.util.regex package contains three main classes:
- Pattern
- Matcher
- PatternSyntaxException
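As a rough illustration of how the three classes fit together, the sketch below compiles a pattern, scans an input with a matcher, and catches the exception thrown for a malformed pattern. The class name RegexClassesDemo and its helper methods are made up for this example and are not part of the Java API.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

class RegexClassesDemo {
    // Hypothetical helper: collects every substring of input that matches regex.
    static List<String> findAll(String regex, String input) {
        Pattern pattern = Pattern.compile(regex);  // compiled form of the regex
        Matcher matcher = pattern.matcher(input);  // scans the input for matches
        List<String> hits = new ArrayList<>();
        while (matcher.find()) {
            hits.add(matcher.group());
        }
        return hits;
    }

    // Hypothetical helper: true if the regex compiles, false if its syntax is invalid.
    static boolean isValidRegex(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(findAll("[0-9]+", "a1b22c333")); // [1, 22, 333]
        System.out.println(isValidRegex("[0-9"));           // false: unclosed character class
    }
}
```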
Metacharacters
Metacharacters are like short-codes for common matching patterns.
| Regular Expression | Description |
|---|---|
| \d | Any digit, short-code for [0-9] |
| \D | Any non-digit, short-code for [^0-9] |
| \s | Any whitespace character, short-code for [ \t\n\x0B\f\r] |
| \S | Any non-whitespace character, short-code for [^\s] |
| \w | Any word character, short-code for [a-zA-Z_0-9] |
| \W | Any non-word character, short-code for [^\w] |
| \b | A word boundary |
| \B | A non-word boundary |
Usage of Metacharacters
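- Precede the metacharacter with a backslash (\). Inside a Java string literal the backslash itself must be escaped, so the regex \d is written as "\\d".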
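The doubled backslash can be confusing at first, so here is a small sketch of what the string literal actually contains. Writing "\d" with a single backslash does not compile, because \d is not a valid Java string escape. The class name EscapeDemo and the helper isSingleDigit are illustrative only.

```java
import java.util.regex.Pattern;

class EscapeDemo {
    // The Java literal "\\d" holds two characters: a backslash and the letter d.
    static final String DIGIT = "\\d";

    // Hypothetical helper: true if input is exactly one digit.
    static boolean isSingleDigit(String input) {
        return Pattern.matches(DIGIT, input);
    }

    public static void main(String[] args) {
        System.out.println(DIGIT.length());     // 2
        System.out.println(DIGIT);              // \d
        System.out.println(isSingleDigit("7")); // true
        System.out.println(isSingleDigit("d")); // false
    }
}
```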
Explanation of Metacharacters
1. Digit & Non Digit related Metacharacters: (\d, \D)
Java
import java.util.regex.*;

class GFG {
    public static void main(String[] args)
    {
        // \d matches exactly one digit
        System.out.println(Pattern.matches("\\d", "2"));
        System.out.println(Pattern.matches("\\d", "a"));
        // \D matches exactly one non-digit
        System.out.println(Pattern.matches("\\D", "a"));
        System.out.println(Pattern.matches("\\D", "2"));
    }
}
Output
true
false
true
false
Explanation
- \d matches a single digit from 0 to 9. Pattern.matches() returns true only when the entire input matches the pattern, so "2" gives true and "a" gives false.
- \D matches any single character that is not a digit, so "a" gives true and "2" gives false.
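Because Pattern.matches() tests the whole input, a lone \d rejects "22". The sketch below shows two common ways around that: a + quantifier to accept one or more digits, and Matcher.find() to pull digit runs out of longer text. DigitDemo and its methods are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class DigitDemo {
    // Hypothetical helper: true if the input consists only of digits (at least one).
    static boolean isAllDigits(String input) {
        return Pattern.matches("\\d+", input);
    }

    // Hypothetical helper: returns each run of consecutive digits found in the input.
    static List<String> extractNumbers(String input) {
        Matcher matcher = Pattern.compile("\\d+").matcher(input);
        List<String> numbers = new ArrayList<>();
        while (matcher.find()) {
            numbers.add(matcher.group());
        }
        return numbers;
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\d", "22"));   // false: \d is a single digit
        System.out.println(isAllDigits("2021"));            // true
        System.out.println(isAllDigits("16 Nov"));          // false
        System.out.println(extractNumbers("16 Nov, 2021")); // [16, 2021]
    }
}
```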
2. Whitespace and Non-Whitespace Metacharacters: (\s, \S)
Java
import java.util.regex.*;

class GFG {
    public static void main(String[] args)
    {
        // \s matches exactly one whitespace character
        System.out.println(Pattern.matches("\\s", " "));
        System.out.println(Pattern.matches("\\s", "2"));
        // \S matches exactly one non-whitespace character
        System.out.println(Pattern.matches("\\S", "2"));
        System.out.println(Pattern.matches("\\S", " "));
    }
}
Output
true
false
true
false
Explanation
- \s matches a single whitespace character such as a space, tab, or newline, so " " gives true and "2" gives false.
- \S matches any single character that is not whitespace, so "2" gives true and " " gives false.
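A typical use of \s is splitting text into tokens or normalising spacing. The following is a minimal sketch using String.split() and String.replaceAll(), both of which accept a regex. WhitespaceDemo and its method names are invented for illustration.

```java
import java.util.Arrays;

class WhitespaceDemo {
    // Hypothetical helper: splits the input on runs of whitespace.
    static String[] tokens(String input) {
        return input.trim().split("\\s+");
    }

    // Hypothetical helper: replaces each run of whitespace with a single space.
    static String collapse(String input) {
        return input.trim().replaceAll("\\s+", " ");
    }

    public static void main(String[] args) {
        String text = "  Java \t regex\nbasics ";
        System.out.println(Arrays.toString(tokens(text))); // [Java, regex, basics]
        System.out.println(collapse(text));                // Java regex basics
    }
}
```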
3. Word & Non Word Metacharacters: (\w, \W)
Java
import java.util.regex.*;

class GFG {
    public static void main(String[] args)
    {
        // \w matches exactly one word character
        System.out.println(Pattern.matches("\\w", "a"));
        System.out.println(Pattern.matches("\\w", "2"));
        System.out.println(Pattern.matches("\\w", "$"));
        // \W matches exactly one non-word character
        System.out.println(Pattern.matches("\\W", "2"));
        System.out.println(Pattern.matches("\\W", " "));
        System.out.println(Pattern.matches("\\W", "$"));
    }
}
Output
true
true
false
false
true
true
Explanation
- \w matches a single word character: a letter (upper or lower case), a digit [0-9], or an underscore. So "a" and "2" give true, while "$" gives false.
- \W matches any single character that is not a word character, so "2" gives false, while " " and "$" give true.
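The underscore is easy to forget, so this sketch checks it explicitly and also shows \W used to strip punctuation. WordCharDemo and its helpers are hypothetical names, not library methods.

```java
import java.util.regex.Pattern;

class WordCharDemo {
    // Hypothetical helper: true if the input contains only letters, digits, and underscores.
    static boolean isWordOnly(String input) {
        return Pattern.matches("\\w+", input);
    }

    // Hypothetical helper: removes every non-word character from the input.
    static String stripNonWord(String input) {
        return input.replaceAll("\\W", "");
    }

    public static void main(String[] args) {
        System.out.println(Pattern.matches("\\w", "_"));     // true: underscore is a word character
        System.out.println(isWordOnly("user_name42"));       // true
        System.out.println(isWordOnly("user-name"));         // false: hyphen is a non-word character
        System.out.println(stripNonWord("Hello, World_1!")); // HelloWorld_1
    }
}
```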
4. Word & Non-Word Boundary Metacharacters: (\b, \B)
Java
import java.util.regex.*;

class GFG {
    public static void main(String[] args)
    {
        // \b must sit between a word character and a non-word
        // character (or the start or end of the input)
        System.out.println(
            Pattern.matches("\\bGFG\\b", "GFG"));
        System.out.println(
            Pattern.matches("\\b@GFG\\b", "@GFG"));
        // \B matches wherever \b does not
        System.out.println(
            Pattern.matches("\\B@GFG@\\B", "@GFG@"));
        System.out.println(
            Pattern.matches("\\BGFG\\B", "GFG"));
    }
}
Output
true
false
true
false
Explanation:
- \b matches a position, not a character: the point between a word character (letter, digit, or underscore) and either a non-word character or the start or end of the input. In "GFG", the start and the end of the input are both next to the word character G, so both \b assertions succeed and the result is true. In "@GFG", the pattern begins with \b before @, but the start of the input followed by the non-word character @ is not a word boundary, so the result is false.
- \B matches every position that is not a word boundary. In "@GFG@", the positions before the first @ and after the last @ have no word character next to them, so both \B assertions succeed and the result is true. In "GFG", the start of the input followed by G is a word boundary, so \B fails there and the result is false.
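Boundaries are most useful with Matcher.find(), where they separate whole-word matches from matches buried inside longer words. The sketch below counts both kinds; BoundaryDemo and its method names are assumptions made for this example, and Pattern.quote() is used so the search word is treated literally.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BoundaryDemo {
    // Counts how many times regex matches somewhere in text.
    static int count(String regex, String text) {
        Matcher matcher = Pattern.compile(regex).matcher(text);
        int count = 0;
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    // Hypothetical helper: occurrences of word standing alone as a whole word.
    static int countWholeWord(String word, String text) {
        return count("\\b" + Pattern.quote(word) + "\\b", text);
    }

    // Hypothetical helper: occurrences of word strictly inside a longer word.
    static int countInsideWord(String word, String text) {
        return count("\\B" + Pattern.quote(word) + "\\B", text);
    }

    public static void main(String[] args) {
        String text = "cat concatenate cat.";
        System.out.println(countWholeWord("cat", text));  // 2
        System.out.println(countInsideWord("cat", text)); // 1
    }
}
```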
Example:
Java
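import java.util.regex.*;

class GFG {
    public static void main(String[] args)
    {
        // digit, non-digit, whitespace, non-whitespace,
        // word character, non-word character
        System.out.println(Pattern.matches(
            "\\d\\D\\s\\S\\w\\W", "1G FG!")); // true
        // 'G' is not a digit, so the first \d already fails
        System.out.println(Pattern.matches(
            "\\d\\D\\s\\S\\w\\W", "Geeks!")); // false
    }
}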