23.8.2016

foto Petr Bravenec

Petr Bravenec
Twitter: @BravenecPetr
+420 777 566 384
petr.bravenec@hobrasoft.cz

Links

Wikipedia: Longest common substring problem
Stack Overflow: How to find Longest Common Substring using C++

While working on one of our projects I stumbled upon a simple task: found the station name in information sent from radio receiver and show it on display. The trouble is that in the name of the station may occur also other strings, not only the station name. The song name, for example. For simplicity I assume that the station name is the longest common string which occurs in strings sent from radio receiver.

I'm tired to invent the wheel. So my first step was to visit Google. It was told me that the problem is well known and it has also its own page on Wikipedia and of course on Stack Overflow.

Code found on Wikipedia is pretty hard to read. Link on Stack Overflow asks for C++ solution but only Java or C# code can be found here. Fortunately the Java can be rewriten to C++ easily.

Longest Common String algorithm in C++ and Qt

Download

QString longestCommonString(const QString& str1, const QString& str2) {
    #define MAX(x,y) ( (x>y) ? x : y )
    #define INDEX(x,y) ((x)+((str2size)+1)*(y))
    int str1size = str1.size();
    int str2size = str2.size();
    int *longest = new int[(str1size+1)*(str2size+1)];
    int min_index = 0;
    int max_index = 0;

    for (int i=0; i < str1size + 1; i++) {
        longest[INDEX(0,i)]=0;
        }

    for (int i=0; i < str2size + 1; i++) {
        longest[INDEX(i,0)]=0;
        }

    for (int i=0; i<str2size; i++) {
        for (int j=0; j<str1size; j++) {
            int tmp_min = j;
            int tmp_max = j;
            int tmp_offset = 0;
            if (str2[i] == str1[j]) {
                while (tmp_offset <= i && tmp_offset <= j &&
                       str2[i-tmp_offset] == str1[j-tmp_offset]) {
                    tmp_min -= 1;
                    tmp_offset += 1;
                    }
                }

            tmp_min += 1;
            int length = tmp_max - tmp_min + 1;
            int tmp_max_length = MAX(longest[INDEX(i,j)], 
                                    MAX(longest[INDEX(i+1,j)], 
                                    longest[INDEX(i,j+1)]));
            if (length > tmp_max_length) {
                min_index = tmp_min;
                max_index = tmp_max;
                longest[INDEX(i+1,j+1)] = length;
              } else {
                longest[INDEX(i+1,j+1)] = tmp_max_length;
                }
            }
        }

    delete longest;
    return str1.mid(
            min_index, 
            ((max_index >= str1size-1) ? str1size : max_index + 1) - min_index 
            );
}
Hobrasoft s.r.o. | Contact