
Tokenizing a string in C++

Last Updated : 02 Jan, 2023

Tokenizing a string means splitting it into pieces (tokens) at one or more delimiters. There are many ways to tokenize a string; this article explains four of them:

Using stringstream

A stringstream associates a string object with a stream allowing you to read from the string as if it were a stream.

Below is the C++ implementation : 

C++




// Tokenizing a string using stringstream
#include <iostream>
#include <sstream>
#include <string>
#include <vector>
 
using namespace std;
 
int main()
{
     
    string line = "GeeksForGeeks is a must try";
     
    // Vector of string to save tokens
    vector <string> tokens;
     
    // stringstream class check1
    stringstream check1(line);
     
    string intermediate;
     
    // Tokenizing w.r.t. space ' '
    while(getline(check1, intermediate, ' '))
    {
        tokens.push_back(intermediate);
    }
     
    // Printing the token vector
    for (size_t i = 0; i < tokens.size(); i++)
        cout << tokens[i] << '\n';
}


Output

GeeksForGeeks
is
a
must
try

Time Complexity: O(n), where n is the length of the string.
Auxiliary Space: O(n - d), where n is the length of the string and d is the number of delimiters.
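
One caveat of the getline() approach is that two delimiters in a row yield an empty token. When the tokens are separated by arbitrary whitespace, extracting with operator>> avoids this, because it skips runs of whitespace. The following is a minimal sketch of both behaviours; the helper names split_on_char and split_whitespace are hypothetical and chosen only for illustration.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: splits on a single delimiter character with getline().
// Consecutive delimiters produce empty tokens.
std::vector<std::string> split_on_char(const std::string& line, char delim)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string piece;

    while (std::getline(in, piece, delim))
        tokens.push_back(piece);

    return tokens;
}

// Hypothetical helper: splits on whitespace with operator>>.
// Leading, trailing and repeated whitespace is skipped.
std::vector<std::string> split_whitespace(const std::string& line)
{
    std::istringstream in(line);
    std::vector<std::string> tokens;
    std::string word;

    while (in >> word)
        tokens.push_back(word);

    return tokens;
}
```

With these helpers, split_on_char("a  b", ' ') returns three tokens (the middle one empty), while split_whitespace("a  b") returns only "a" and "b".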

Using strtok()

// Splits str[] according to the given delimiters
// and returns the next token. It needs to be called
// in a loop to get all tokens. It returns NULL
// when there are no more tokens. Note that strtok()
// modifies str[] by overwriting delimiters with '\0'.
char * strtok(char str[], const char *delims);

Below is the C++ implementation : 

C++




// C/C++ program for splitting a string
// using strtok()
#include <stdio.h>
#include <string.h>
 
int main()
{
    char str[] = "Geeks-for-Geeks";
 
    // Returns first token
    char *token = strtok(str, "-");
 
    // Keep printing tokens while strtok()
    // finds more of them in str[].
    while (token != NULL)
    {
        printf("%s\n", token);
        token = strtok(NULL, "-");
    }
 
    return 0;
}


Output

Geeks
for
Geeks

 

Time Complexity: O(n), where n is the length of the string.
Auxiliary Space: O(1).

Another example of strtok():

C




// C code to demonstrate working of
// strtok
#include <string.h>
#include <stdio.h>
 
// Driver function
int main()
{
    // Declaration of string
    char gfg[100] = " Geeks - for - geeks - Contribute";
 
    // Declaration of delimiter
    const char s[4] = "-";
    char* tok;
 
    // Use of strtok
    // get first token
    tok = strtok(gfg, s);
 
    // Loop until no tokens remain
    while (tok != 0) {
        printf(" %s\n", tok);
 
        // Use of strtok
        // go through other tokens
        tok = strtok(0, s);
    }
 
    return (0);
}


Output

  Geeks 
  for 
  geeks 
  Contribute

Time Complexity: O(n), where n is the length of the string.
Auxiliary Space: O(1).
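
Because strtok() overwrites each delimiter it finds with a null character, it cannot be applied to a string literal or to the read-only buffer returned by std::string::c_str(). A possible workaround, sketched below, is to tokenize a mutable copy. The helper name split_with_strtok is hypothetical; the sketch also shows that the delimiter argument may list several characters at once.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper: tokenizes a std::string with strtok() without
// modifying the caller's string, by working on a private copy.
std::vector<std::string> split_with_strtok(const std::string& input,
                                           const char* delims)
{
    // Copy the characters into a mutable, null-terminated buffer
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::string> tokens;

    // The first call passes the buffer; later calls pass nullptr
    // so that strtok() continues where it stopped.
    for (char* tok = std::strtok(buffer.data(), delims);
         tok != nullptr;
         tok = std::strtok(nullptr, delims))
    {
        tokens.emplace_back(tok);
    }

    return tokens;
}
```

For example, split_with_strtok("Geeks-for, Geeks", "-, ") treats '-', ',' and ' ' as delimiters and returns "Geeks", "for" and "Geeks". The copy costs O(n) extra space, unlike the in-place examples above.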

Using strtok_r()

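Like strtok(), strtok_r() parses a string into a sequence of tokens. It is the reentrant version of strtok(): instead of keeping its position in hidden static state, it stores it in a variable supplied by the caller.

strtok_r() is specified by POSIX rather than by the C or C++ standard, and has the following signature: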

// The third argument saveptr is a pointer to a char * 
// variable that is used internally by strtok_r() in 
// order to maintain context between successive calls
// that parse the same string.
char *strtok_r(char *str, const char *delim, char **saveptr);

Below is a simple C++ program to show the use of strtok_r() : 

C++




// C/C++ program to demonstrate working of strtok_r()
// by splitting string based on space character.
#include<stdio.h>
#include<string.h>
 
int main()
{
    char str[] = "Geeks for Geeks";
    char *token;
    char *rest = str;
 
    while ((token = strtok_r(rest, " ", &rest)))
        printf("%s\n", token);
 
    return(0);
}


Output

Geeks
for
Geeks

Time Complexity: O(n), where n is the length of the string.
Auxiliary Space: O(1).
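
The practical benefit of the caller-supplied saveptr is that two tokenizations can be in progress at the same time, which plain strtok() does not allow. The sketch below nests one strtok_r() loop inside another to split records on ';' and then fields on ','. It assumes a POSIX system where <string.h> declares strtok_r(); the helper name split_nested and the record format are hypothetical.

```cpp
#include <string.h>   // strtok_r() is declared here on POSIX systems
#include <string>
#include <vector>

// Hypothetical helper: splits "a,b;c,d,e" into rows of fields.
std::vector<std::vector<std::string>> split_nested(const std::string& input)
{
    // Work on a mutable, null-terminated copy of the input
    std::vector<char> buffer(input.begin(), input.end());
    buffer.push_back('\0');

    std::vector<std::vector<std::string>> rows;

    // Each loop has its own state variable, so the inner loop
    // does not disturb the position of the outer one.
    char* outer_state = nullptr;
    for (char* row = strtok_r(buffer.data(), ";", &outer_state);
         row != nullptr;
         row = strtok_r(nullptr, ";", &outer_state))
    {
        std::vector<std::string> fields;
        char* inner_state = nullptr;
        for (char* field = strtok_r(row, ",", &inner_state);
             field != nullptr;
             field = strtok_r(nullptr, ",", &inner_state))
        {
            fields.emplace_back(field);
        }
        rows.push_back(fields);
    }

    return rows;
}
```

Calling split_nested("a,b;c,d,e") yields two rows: {"a", "b"} and {"c", "d", "e"}.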

Using std::sregex_token_iterator

In this method, tokenization is based on regex matches. It is a good fit when the string has to be split on several different delimiters.

Below is a simple C++ program to show the use of std::sregex_token_iterator:

C++




// CPP program for above approach
#include <algorithm>
#include <iostream>
#include <regex>
#include <string>
#include <vector>
 
/**
 * @brief Tokenize the given string according to the regex
 *        and remove the empty tokens.
 *
 * @param str the string to split
 * @param re  the regex matching the delimiters
 * @return std::vector<std::string>
 */
std::vector<std::string> tokenize(
                     const std::string& str,
                     const std::regex& re)
{
    std::sregex_token_iterator it{ str.begin(),
                             str.end(), re, -1 };
    std::vector<std::string> tokenized{ it, {} };
 
    // Additional check to remove empty strings
    tokenized.erase(
        std::remove_if(tokenized.begin(),
                            tokenized.end(),
                       [](std::string const& s) {
                           return s.size() == 0;
                       }),
        tokenized.end());
 
    return tokenized;
}
 
// Driver Code
int main()
{
    const std::string str =
        "Break string a,spaces,and,commas";
    // Delimiters: one or more whitespace characters or commas
    const std::regex re(R"([\s,]+)");
   
    // Function Call
    const std::vector<std::string> tokenized =
                           tokenize(str, re);
   
    for (const std::string& token : tokenized)
        std::cout << token << '\n';
    return 0;
}


Output

Break
string
a
spaces
and
commas

Time Complexity: O(n * d) where n is the length of string and d is the number of delimiters.
Auxiliary Space: O(n)
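
The -1 passed to std::sregex_token_iterator above selects the text between matches, so the regex describes the delimiters. Passing 0 instead selects the matches themselves, so the regex can describe the tokens. The sketch below illustrates that variant; the helper name extract_matches is hypothetical.

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: returns every substring of text that matches token_re.
// The regex object must outlive the iterators, so it is taken by reference.
std::vector<std::string> extract_matches(const std::string& text,
                                         const std::regex& token_re)
{
    // Submatch index 0 means "the whole match" rather than the gaps
    std::sregex_token_iterator first{ text.begin(), text.end(), token_re, 0 };
    std::sregex_token_iterator last;  // default-constructed end iterator

    return std::vector<std::string>(first, last);
}
```

With the regex R"(\w+)", extract_matches("Break string a,spaces,and,commas", re) returns the same six words as the program above, and no empty tokens need to be filtered out afterwards.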


