When there is a need to find an input pattern in a string of characters, coders and programmers use the string matching algorithm. Usually, in case of a short string, python programmers prefer to use the naïve approach in which, the program checks each position in the input string for the query pattern. In case it matches, it given an output with the position number.
One of the biggest reasons why naïve string matching algorithm is used is because it is fast and yields quite accurate results. Moreover, it does not require pre-processing. In any case, we will be discussing these advantages at a later stage in this post. Let’s first understand the algorithm for pattern search using the naïve approach.
Naïve Pattern Search Algorithm
In the naïve string pattern search, the program tests the position of the input pattern P [1……i] in a string of characters T [1…..m].
Note that the length of the input text or string will always be greater than or equal to that of the pattern.
Here is the naïve pattern search algorithm for different programming languages.
pat = pattern Size
str = string size
for i = 0 to (str – pat), do
for j = 0 to pat, do
if text[i+j] ≠ pattern[j], then
break the loop
if j == pat, then
display the position of i as pattern found
This algorithm is quite an important one in computer science, as it helps give search results as an ouput.
Examples of Naïve String Matching on Python
Here is an example where the naïve pattern search approach is used in a code of python.
# Python program for Naïve String Matching
# Searching algorithm
def search(P, T):
X = len(P)
Y = len(T)
# A loop to shift P one by one */
for i in range(X – Y + 1):
j = 0
# For current index i, check
# for pattern match */
for j in range(0, X):
if (txt[i + j] != P[j]):
if (j == X – 1):
print(“Pattern found at position “, i)
# Driver Code
if __name__ == ‘__main__’:
T = “UPGRADEDUBUPGRAABUPGRADEDU”
P = “UPGRAD”
Pattern found at position 0
Pattern found at position 17
Explanation: The first position is the 0th position. Since the pattern “UPGRAD” was first spotted here, the output showed that the pattern is found at position 0.
Similarly, the next pattern was found at the position 17.
Best Case of Naïve Pattern Search
There in only one best case for naïve pattern search algorithm, unlike the two worst cases.
The best case occurs when the first character in the pattern text is nowhere in the input string.
T  = “UPGRADEDUHIJKLUPGRA”;
P  = “TUPGRA”;
And therefore, the number of matching patterns case is O(n).
Worst Case of Naïve Pattern Search
There are two worst cases in the naïve string searching approach.
- When all the characters in the pattern are the same as those in the input string.
T  = “EEEEEEEEEEEEEEEE”;
P  = “EEE”;
- When only the last character in the pattern differs from the input string.
T  = “EEEEEEEEEEED”;
P  = “EEEED”;
In such cases, the number of comparisons in O(m*(n-m+1)).
Features of Naïve String Matching Algorithm
String matching algorithm is meant for finding all the occurrences of a given pattern in a text.
Here are the top features of the algorithm.
- It is the simplest method among all to look for patterns in an input text. It checks all the characters one by one in the given string of characters.
- It finds the exact string matches – be it more or more exact occurrences of the pattern.
- It is more used when there is small text. Moreover, it does not require any pre-processing phases.
- This search method does not occupy extra space to work and look for the patterns in the string.
Also read: Data Structure & Algorithm in Python
Advantages of Naïve Pattern Search
- There are no pre-processing phases required in the naïve search approach, as its running time is equal to the matching time.
- There is no extra operating space needed.
- The comparisons of the patterns with the strings can be done in any order.
Disadvantages of Naïve String Matching
There is only one disadvantage of the naïve string matching approach, which is that it is inefficient. This is because when it has found a position, it does not use it again to find the other position. It goes back to the starting point and looks for the pattern over again. And so, it does not use the information from the previous shift again.
The naïve string matching algorithm is the most preferred approach to finding the positions of said patterns in a given text for various reasons like no pre-processing requirement, no extra space for operation, etc. However, it cannot be used for rather larger texts because of its inefficiency to perform large operations faster.
We hope that this post gave you a substantially good idea about the naïve pattern search approach in python. To learn about the uses of this approach and get a broader understanding of the topic, get in touch with the experts at upGrad. We have specially designed courses for individuals looking to expand their skillset. Reach out to us today!
If you’re interested to learn more about AI, machine learning, check out IIIT-B & upGrad’s PG Diploma in Machine Learning & AI which is designed for working professionals and offers 450+ hours of rigorous training, 30+ case studies & assignments, IIIT-B Alumni status, 5+ practical hands-on capstone projects & job assistance with top firms.
What is a naïve string-matching algorithm?
A naive string-matching algorithm is one that simply compares the two strings character by character. This naive algorithm is used by many early computer programs that implemented simple file searching functions. In other words, the strings are compared character for character and the algorithm stops once a mismatch is found. This is an inappropriate way to do string matching as it is slow and wasteful of memory. This is very inefficient since the number of strings in a text is humongous but the search query is only a few characters.
What are the limitations of naïve algorithms for string matching?
Unsatisfiability of 8-queens and related problems as NP-complete show that naïve string-matching algorithms have limitations. Naïve string-matching algorithm will not give you the solution. In case of string-matching it requires exponential time. So, if you have n strings to be matched, it will take 2n time to complete. To get around this problem an algorithm has been developed which has made the string-matching problem feasible. This algorithm, which is an exponential time algorithm, is called Aho-Corasick algorithm. This algorithm works on the principle of dynamic programming.
How can we optimize naive string-matching algorithms?
Optimization of naïve string-matching algorithms is done in two ways:
1) String database search: This is the best solution for database search. It is fast, but requires a huge budget.
2) Tries: These are a great alternative to the database, because they can be made from memory, which keeps them low-budget. You can easily represent the string in a binary tree form. Then, you just go through the tree, and check for the result. If you find that you are at the end of the tree, you have found a good match. There is no need to go back to the beginning of the tree. This algorithm is fast, but it does not allow for long strings to be compared.