String theory

Millennial Pirate
5 min readJan 11, 2021

Strings are an important concept to cover when it comes to interview-preparation for many companies. In this article, I plan to cover the concepts of string also providing its real-life application along with the solution of a problem on LeetCode based on the above concepts. So let’s get started!

Concept:

There are two ways to hold a sequence of characters in C++. They are

  1. Using character arrays
  2. Using strings

Remember these both are not the same, the difference lies in the fact that character arrays are of the data type character whereas a string is a class that defines objects that can be represented as a string of characters. One more difference lies in the size allocation. Size allocation is static in the case of character arrays whereas it's dynamic in the case of strings. One advantage of this dynamic size allocation is there is no much wastage in memory.

Since strings are actually a class that can be instantiated with the help of objects, there are certain pre-defined member functions in strings. Let’s see some of them :

  1. push_back(char): This function is used to insert a character in the string at its end. In short, appending a character at the end of the string.
  2. pop_back(): This function is used to pop the last character of the string.
  3. length(): This function is used to find the current length of the string.
  4. begin(): This is an iterator function that points to the beginning character of the string.
  5. end(): This is also an iterator function that points to the ending character of the string.

Real-life applications:

  1. Plagiarism detection: String matching algorithms are used for plagiarism detection.
  2. Spam filters: Spam filters are used for matching to discard the spam. For classifying the messages as spam or not the keywords in the message are searched through string matching algorithms.
  3. Search engines: String matching algorithms are also used in search engines to categorize and organize the data so that the results take minimal time to get displayed.

Problem:

Problem statement:

The string "PAYPALISHIRING" is written in a zigzag pattern on a given number of rows like this: (you may want to display this pattern in a fixed font for better legibility)

P   A   H   N
A P L S I I G
Y I R

And then read line by line: "PAHNAPLSIIGYIR"

Write the code that will take a string and make this conversion given a number of rows:

string convert(string s, int numRows);

Sample input and output:

Input: s = "PAYPALISHIRING", numRows = 3
Output: "PAHNAPLSIIGYIR"

Constraints:

  • 1 <= s.length <= 1000
  • s consists of English letters (lower-case and upper-case), ',' and '.'.
  • 1 <= numRows <= 1000

Approach:

image from imgflip.com

For this question, the number of rows(numRows) is specified in the problem. We make an array of strings of size numRows. Each element of the array will store each line of the format expected. So the string “PAYPALISHIRING” becomes

P   A   H   N
A P L S I I G
Y I R

So in the string array, each element will hold the string when the output is read line by line.

So the array will contain elements

{“PAHN”, “APLSIIG”, “YIR”}

In the end, we just display all the strings in the array as a single string. Now the next point which we must think about is how to store the string elements in this format?

Let’s now see how can we store the string in such a format.

  1. Let the array of strings be arr[]
  2. Traverse the input string character by character and store the characters to their respective strings in the array arr[].
  3. Since the arr[] strings contain the subparts of the input string read row-wise, we store each element at a particular row in the output in the index with the same value as the row. This requires a little bit of dry running to understand so…
image from memegenerator.net

4. After that comes the task of updating the row. While updating the row number we must keep in mind that it must not exceed the value numRows. For this, we maintain a variable called “dir” which tells us the direction we are traversing to get the particular output format.

5. If we reach row == 0 then it must mean we are at the top row and cannot move more upwards so we make the direction down.

6. Similarly if row == numRows-1 then we cannot proceed further down so we update the direction as up.

7. Else if the current direction is down and we have not reached the bottom then go down by doing row++.

8. Else if the current direction is positive and we have not reached the top then we go more up by decrementing row.

9. Now that we have the array filled print all the strings at once.

Code:

Let’s code this approach now (please read the comments for further understanding):

Analysis of the algorithm:

We can see that storing the string in the array in the required format takes just O(n) time where n is the length of the string. After this, while storing the strings in the array as a single string even though we had taken two for loops the time complexity will still be O(n) as each character is visited once (subparts were stored as strings in the array, not the whole string). The additional space complexity will be O(n) for this problem.

Hope this article has cleared most of the concepts regarding strings. If you like this article don’t hesitate to click the clap icon and in case of any doubts or suggestion feel free to give a comment. Thank you!

--

--