Repeated Substring Pattern⚓︎
Description⚓︎
Given a string s, check if it can be constructed by taking a substring of it and appending multiple copies of the substring together.
Example 1:
- Input:
s = "abab" - Output:
true - Explanation:
It is the substring "ab" twice.
Example 2:
- Input:
s = "aba" - Output:
false
Example 3:
- Input:
s = "abcabcabcabc" - Output:
true - Explanation:
It is the substring "abc" four times or the substring "abcabc" twice.
Constraints:
1 <= s.length <= 10^4sconsists of lowercase English letters.
Solution⚓︎
Moving Match⚓︎
- Time complexity: \(O(n)\);
- Space complexity: \(O(1)\).
Easier way of writing (see reference):
KMP⚓︎
- Time complexity: \(O(n)\);
- Space complexity: \(O(n)\).
The ne is essentially the LPS (Longest Proper Prefix which is also Suffix) array used in KMP. It stores the length of the longest proper prefix that is also a proper suffix for every substring of s.
ne[i]: The length of the longest proper prefix of the substrings[1..i]which is also a proper suffix ofs[1..i].
Code Explanation:
- Initialization:
n: Length of the strings.s = ' ' + s: The stringsis modified by adding a space at the beginning. This is done to make the string 1-indexed, simplifying the algorithm's logic.vector<int> ne(n + 1, 0): Initializes thenearray of sizen+1with all elements as 0.- Building the
neArray: - The loop
for (int i = 2, j = 0; i <= n; i++)iterates over the string. while (j && s[i] != s[j + 1]) j = ne[j]: This line checks if the current character doesn't match the character in the pattern. If they don't match, we fall back to the previous position'snevalue.if (s[i] == s[j + 1]) j++: If the characters match, we incrementj.ne[i] = j: Assign the calculatednevalue for the positioni.- Checking for Repeated Substring Pattern:
int t = n - ne[n]: This calculates the length of the smallest repeating unit.return t < n && n % t == 0: Checks if the length of the repeating unit is less thannand ifnis a multiple oft.
Why "n - ne[n]" is the Length of the Repeated Substring?:
ne[n]gives the length of the longest proper prefix which is also a suffix for the entire string.- Subtracting this from the total length (
n - ne[n]) gives us the smallest segment of the string that, when repeated, could recreate the original string. - If this segment is smaller than the entire string and the total length of the string is a multiple of this segment's length, it implies that the string is composed of repeated occurrences of this segment.
Correctness of the Algorithm:
- The preprocessing part correctly computes the
nearray which captures the prefix-suffix relationship in the string. - The check
n % t == 0ensures that the entire string can be constructed by repeating this smallest segment. - Since the
nearray is constructed to reflect the longest repeating prefix-suffix pattern in the string, usingn - ne[n]to find the smallest repeating unit is logically sound.