1

Topic: About string comparison

This fragment of code:

    static Level convertFromString(const char* levelStr) {
        if ((strcmp(levelStr, "GLOBAL") == 0) || (strcmp(levelStr, "global") == 0)) return Level::Global;
        if ((strcmp(levelStr, "DEBUG") == 0) || (strcmp(levelStr, "debug") == 0)) return Level::Debug;
        if ((strcmp(levelStr, "INFO") == 0) || (strcmp(levelStr, "info") == 0)) return Level::Info;
        if ((strcmp(levelStr, "WARNING") == 0) || (strcmp(levelStr, "warning") == 0)) return Level::Warning;
        if ((strcmp(levelStr, "ERROR") == 0) || (strcmp(levelStr, "error") == 0)) return Level::Error;
        if ((strcmp(levelStr, "FATAL") == 0) || (strcmp(levelStr, "fatal") == 0)) return Level::Fatal;
        if ((strcmp(levelStr, "VERBOSE") == 0) || (strcmp(levelStr, "verbose") == 0)) return Level::Verbose;
        if ((strcmp(levelStr, "TRACE") == 0) || (strcmp(levelStr, "trace") == 0)) return Level::Trace;
        return Level::Unknown;
    }

is compiled into this code (examples for GCC and Intel). The Intel compiler went its own way in this case. But, as we can see, nobody actually calls strcmp().
Now let's change this code so that it uses memcmp() instead of strcmp():

    #define mymemcmp(l, r) memcmp(l, r, sizeof(r) - 1)

    Level convertFromString(const char* levelStr) {
        if ((mymemcmp(levelStr, "GLOBAL") == 0) || (mymemcmp(levelStr, "global") == 0)) return Level::Global;
        if ((mymemcmp(levelStr, "DEBUG") == 0) || (mymemcmp(levelStr, "debug") == 0)) return Level::Debug;
        if ((mymemcmp(levelStr, "INFO") == 0) || (mymemcmp(levelStr, "info") == 0)) return Level::Info;
        if ((mymemcmp(levelStr, "WARNING") == 0) || (mymemcmp(levelStr, "warning") == 0)) return Level::Warning;
        if ((mymemcmp(levelStr, "ERROR") == 0) || (mymemcmp(levelStr, "error") == 0)) return Level::Error;
        if ((mymemcmp(levelStr, "FATAL") == 0) || (mymemcmp(levelStr, "fatal") == 0)) return Level::Fatal;
        if ((mymemcmp(levelStr, "VERBOSE") == 0) || (mymemcmp(levelStr, "verbose") == 0)) return Level::Verbose;
        if ((mymemcmp(levelStr, "TRACE") == 0) || (mymemcmp(levelStr, "trace") == 0)) return Level::Trace;
        return Level::Unknown;
    }

Result. GCC now compares the string as if it were integer types. That is, in pseudocode, something like this:

    int64 v0 = *(int64 *)"GLOBAL", v1 = *(int64 *)levelStr;
    if (v0 == v1) ...

Intel is being clever again. In some of the ifs it calls memcmp(), and in others it does the same as GCC. Apparently it depends on the length of the constant string...
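The integer comparison that GCC emits can be written out by hand in well-defined C++. This is only a sketch of the idea, under an assumption the post does not make: the input is already in an 8-byte, zero-padded buffer, so loading 8 bytes cannot run past the end of the string. The names load8 and equalsGlobal are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Load 8 bytes as one integer. memcpy avoids the aliasing and alignment
// problems of the *(int64 *) cast used in the pseudocode.
static std::uint64_t load8(const char* p) {
    std::uint64_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}

// Assumption: buf holds the input zero-padded to exactly 8 bytes.
bool equalsGlobal(const char (&buf)[8]) {
    static const char kGlobal[8] = "GLOBAL";  // 'G','L','O','B','A','L',0,0
    return load8(buf) == load8(kGlobal);
}
```

Because the terminating zero and the padding take part in the comparison, this is an exact match, not the prefix match that memcmp(l, r, sizeof(r) - 1) performs.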
Whether it is correct to use memcmp() with constant lengths is a separate question, but we can also do it like this:

    #define mymemcmp(l, s, r) ((s) == sizeof(r) - 1 && 0 == memcmp((l), (r), (s)))

    Level convertFromString(const char* levelStr) {
        std::size_t s = strlen(levelStr);
        if ((mymemcmp(levelStr, s, "GLOBAL")) || (mymemcmp(levelStr, s, "global"))) return Level::Global;
        if ((mymemcmp(levelStr, s, "DEBUG")) || (mymemcmp(levelStr, s, "debug"))) return Level::Debug;
        if ((mymemcmp(levelStr, s, "INFO")) || (mymemcmp(levelStr, s, "info"))) return Level::Info;
        if ((mymemcmp(levelStr, s, "WARNING")) || (mymemcmp(levelStr, s, "warning"))) return Level::Warning;
        if ((mymemcmp(levelStr, s, "ERROR")) || (mymemcmp(levelStr, s, "error"))) return Level::Error;
        if ((mymemcmp(levelStr, s, "FATAL")) || (mymemcmp(levelStr, s, "fatal"))) return Level::Fatal;
        if ((mymemcmp(levelStr, s, "VERBOSE")) || (mymemcmp(levelStr, s, "verbose"))) return Level::Verbose;
        if ((mymemcmp(levelStr, s, "TRACE")) || (mymemcmp(levelStr, s, "trace"))) return Level::Trace;
        return Level::Unknown;
    }

Result. I will not consider the question of efficiency, both because I am not competent enough in assembly matters and because I am lazy.

I arrived at these tests after reading the code of this project. I have always been surprised by constructions like this: branching based on string comparison, where sometimes a whole series of unnecessary comparisons is performed before the needed branch is reached. I would solve it roughly like this:

    Level convertFromString(const char* levelStr) {
        if (!levelStr) return Level::Unknown;
        switch (*levelStr) {
            case 'G': case 'g': return Level::Global;
            case 'D': case 'd': return Level::Debug;
            case 'I': case 'i': return Level::Info;
            case 'W': case 'w': return Level::Warning;
            case 'E': case 'e': return Level::Error;
            case 'F': case 'f': return Level::Fatal;
            case 'V': case 'v': return Level::Verbose;
            case 'T': case 't': return Level::Trace;
            default: return Level::Unknown;
        }
    }

2

Re: About string comparison

The code with memcmp reads past the end of the array when levelStr is short. The code with strcmp is broken too, since gLoBAl no longer matches.

If you want optimization, there is no need to dive into assembly right away. You can check the length of levelStr; if it falls within the range of lengths of the names being checked, the contents of levelStr can be copied in uppercase into a buffer and safely compared once against each name, in the same case, using memcmp. If there are many names and they are long, it may make sense to use something smarter than a plain search.

And does this piece need optimizing at all? It may well be called just once in the whole lifetime of the application.
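The length check plus uppercase copy described above could look roughly like this. It is a minimal sketch, not code from the project: the Level enumerator list is assumed from the names used in the thread, and the function name convertViaUppercaseCopy, the Entry table, and the kMinLen/kMaxLen bounds are hypothetical.

```cpp
#include <cctype>
#include <cstddef>
#include <cstring>

// Assumed definition; the thread never shows the real enum.
enum class Level { Global, Debug, Info, Warning, Error, Fatal, Verbose, Trace, Unknown };

Level convertViaUppercaseCopy(const char* levelStr) {
    struct Entry { const char* name; std::size_t len; Level level; };
    static const Entry kEntries[] = {
        {"GLOBAL", 6, Level::Global},   {"DEBUG", 5, Level::Debug},
        {"INFO", 4, Level::Info},       {"WARNING", 7, Level::Warning},
        {"ERROR", 5, Level::Error},     {"FATAL", 5, Level::Fatal},
        {"VERBOSE", 7, Level::Verbose}, {"TRACE", 5, Level::Trace},
    };
    constexpr std::size_t kMinLen = 4, kMaxLen = 7;  // shortest and longest name

    if (!levelStr) return Level::Unknown;
    const std::size_t len = std::strlen(levelStr);
    if (len < kMinLen || len > kMaxLen) return Level::Unknown;

    // Uppercase copy; len <= kMaxLen, so the buffer cannot overflow.
    char upper[kMaxLen];
    for (std::size_t i = 0; i < len; ++i)
        upper[i] = static_cast<char>(std::toupper(static_cast<unsigned char>(levelStr[i])));

    // One memcmp per name of the same length, never past either buffer.
    for (const Entry& e : kEntries)
        if (e.len == len && std::memcmp(upper, e.name, len) == 0) return e.level;
    return Level::Unknown;
}
```

With this variant gLoBAl maps to Level::Global and anything shorter than 4 or longer than 7 characters is rejected before any comparison.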

3

Re: About string comparison

Hello, VTT, you wrote:

VTT> The code with memcmp reads past the end of the array when levelStr is short.

I wrote about that and gave a corrected variant.

VTT> The code with strcmp is broken too, since gLoBAl no longer matches.

Should it? The original task does not require it.

VTT> You can check the length of levelStr; if it falls within the range of lengths of the names being checked

That is almost what I did in the penultimate code fragment... Almost...

4

Re: About string comparison

Hello, niXman, you wrote:

X> Hello, VTT, you wrote:
VTT>> The code with memcmp reads past the end of the array when levelStr is short.
X> I wrote about that and gave a corrected variant.

And when I first looked at the post, it was much shorter...

VTT>> The code with strcmp is broken too, since gLoBAl no longer matches.
X> Should it? The original task does not require it.

Well, there is no problem statement there either, and there do not seem to be any tests to check against. But since both cases are checked, this check can probably be considered case-insensitive.

5

Re: About string comparison

Hello, niXman, you wrote:

X> I would solve it roughly like this:

But you do realize that your code produces different answers? For example, in your variant convertFromString("WTF") == Level::Warning, whereas the original returns Level::Unknown.

X> I will not consider the question of efficiency

Then what is the question? If you do not care about efficiency, the original variant will do as well. If efficiency matters, just use a PHF (perfect hash function). For example, in the form of the ready-made utility gperf, which takes a list of strings and generates code similar to yours, but with far fewer bugs (that is, it correctly handles the case where the string does not match anything at all).
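The "WTF" problem goes away if the first-character switch is used only to pick a candidate, which is then verified with a full comparison. The sketch below is hand-written, not gperf output; for this particular set of names the first letter happens to be unique, so it can play the role of the hash. The Level enumerator list is assumed from the names in the thread, and convertFromStringChecked and matches are hypothetical names.

```cpp
#include <cstring>

// Assumed definition; the thread never shows the real enum.
enum class Level { Global, Debug, Info, Warning, Error, Fatal, Verbose, Trace, Unknown };

// Exact comparison against the two spellings the original code accepted.
static bool matches(const char* s, const char* upper, const char* lower) {
    return std::strcmp(s, upper) == 0 || std::strcmp(s, lower) == 0;
}

Level convertFromStringChecked(const char* s) {
    if (!s) return Level::Unknown;
    // The first character selects at most one candidate; the full
    // comparison then confirms or rejects it.
    switch (*s) {
        case 'G': case 'g': return matches(s, "GLOBAL", "global") ? Level::Global : Level::Unknown;
        case 'D': case 'd': return matches(s, "DEBUG", "debug") ? Level::Debug : Level::Unknown;
        case 'I': case 'i': return matches(s, "INFO", "info") ? Level::Info : Level::Unknown;
        case 'W': case 'w': return matches(s, "WARNING", "warning") ? Level::Warning : Level::Unknown;
        case 'E': case 'e': return matches(s, "ERROR", "error") ? Level::Error : Level::Unknown;
        case 'F': case 'f': return matches(s, "FATAL", "fatal") ? Level::Fatal : Level::Unknown;
        case 'V': case 'v': return matches(s, "VERBOSE", "verbose") ? Level::Verbose : Level::Unknown;
        case 'T': case 't': return matches(s, "TRACE", "trace") ? Level::Trace : Level::Unknown;
        default: return Level::Unknown;
    }
}
```

This keeps the answers of the original strcmp version (including rejecting mixed case such as "Warning") while performing at most two full comparisons per call.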

6

Re: About string comparison

Hello, VTT, you wrote:

VTT> And when I first looked at the post, it was much shorter...

The post was not edited; otherwise it would be visible.

VTT> Well, there is no problem statement there either, and there do not seem to be any tests to check against.
VTT> But since both cases are checked, this check can probably be considered case-insensitive.

Well. I am not a fan of guessing; I went by the source code.

7

Re: About string comparison

Hello, watchmaker, you wrote:

W> But you do realize that your code produces different answers?
W> For example, in your variant convertFromString("WTF") == Level::Warning, whereas the original returns Level::Unknown.

Well... yes. In practice, this function [...].

X>> I will not consider the question of efficiency
W> Then what is the question?

I was talking about the assembly code.

8

Re: About string comparison

Hello, niXman, you wrote:

X> .....
X> I have always been surprised by constructions like this: branching based on string comparison, where sometimes a whole series of unnecessary comparisons is performed before the needed branch is reached.

I will join in here. Code like this always depresses me. Personally, in such cases I prefer an associative container plus a lookup. Pseudocode:

    map/flat_map/unordered_map<string, Level> levelsIndex = {
        {"global", Level::Global},
        {"info", Level::Info},
        {"error", Level::Error}
    };

    Level detectLevel(const string& level) {
        auto lowercaseLevel = toLower(level);
        auto it = levelsIndex.find(lowercaseLevel);
        return (it == levelsIndex.end()) ? Level::Unknown : it->second;
    }

A single if is much simpler than 8 ifs. It is declarative and compact to write. And it scales easily as the number of variants grows (there will still be just 1 if).

So my preferences, in order:
1) use switch
2) use map
3) use many ifs
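A self-contained version of the lookup idea, as a sketch only. The post leaves the container choice and toLower open; here std::unordered_map is picked arbitrarily, toLower is a hypothetical helper, and the Level enumerator list is assumed from the names used in the thread.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <unordered_map>

// Assumed definition; the thread never shows the real enum.
enum class Level { Global, Debug, Info, Warning, Error, Fatal, Verbose, Trace, Unknown };

// Hypothetical helper: ASCII lowercase copy of the input.
static std::string toLower(std::string s) {
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return s;
}

Level detectLevel(const std::string& level) {
    // Built once on first call; keys are stored in lowercase.
    static const std::unordered_map<std::string, Level> levelsIndex = {
        {"global", Level::Global},   {"debug", Level::Debug},
        {"info", Level::Info},       {"warning", Level::Warning},
        {"error", Level::Error},     {"fatal", Level::Fatal},
        {"verbose", Level::Verbose}, {"trace", Level::Trace},
    };
    const auto it = levelsIndex.find(toLower(level));
    return it == levelsIndex.end() ? Level::Unknown : it->second;
}
```

Note that lowercasing the input makes the lookup fully case-insensitive, so "Info" is accepted here although the original strcmp version rejects it.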

9

Re: About string comparison

Hello, niXman, you wrote:

X> About comparing period

Good thing the typo was only in a single letter.

10

Re: About string comparison

Hello, uzhas, you wrote:

U> Personally, in such cases I prefer an associative container plus a lookup

Are you sure that a string as the key for map/unordered_map is the right choice?

11

Re: About string comparison

Hello, niXman, you wrote:

X> I arrived at these tests after reading the code of this project. I have always been surprised by constructions like this: branching based on string comparison, where sometimes a whole series of unnecessary comparisons is performed before the needed branch is reached.

Most strings in programs, and in parsers in particular, are no longer than an SSE2/AVX register. Such strings load entirely into these registers without problems and are compared with a single assembly instruction; you only need to write an implementation of strcmp/memcmp specifically for comparing short strings.
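A rough sketch of such a short-string comparison with SSE2 intrinsics. It is illustrative only: the function name shortStringEquals and the macro SHORT_STRING_USE_SSE2 are hypothetical, both lengths are assumed to be known already, and the strings are first copied into zero-padded 16-byte buffers so that the 16-byte loads never read past the end of the input. On targets without SSE2 it falls back to memcmp.

```cpp
#include <cstddef>
#include <cstring>

#if defined(__SSE2__) || defined(_M_X64)
#include <emmintrin.h>
#define SHORT_STRING_USE_SSE2 1
#endif

// Equality test for strings of at most 16 bytes with known lengths.
// Longer or differently sized inputs are reported as not equal.
bool shortStringEquals(const char* a, std::size_t aLen,
                       const char* b, std::size_t bLen) {
    if (aLen != bLen || aLen > 16) return false;

    // Zero-padded copies make a full 16-byte load safe.
    alignas(16) char pa[16] = {};
    alignas(16) char pb[16] = {};
    std::memcpy(pa, a, aLen);
    std::memcpy(pb, b, bLen);

#ifdef SHORT_STRING_USE_SSE2
    const __m128i va = _mm_load_si128(reinterpret_cast<const __m128i*>(pa));
    const __m128i vb = _mm_load_si128(reinterpret_cast<const __m128i*>(pb));
    // Byte-wise equality, then collect one bit per byte: 0xFFFF means all 16 match.
    return _mm_movemask_epi8(_mm_cmpeq_epi8(va, vb)) == 0xFFFF;
#else
    return std::memcmp(pa, pb, 16) == 0;
#endif
}
```

The comparison itself is one byte-wise compare plus one mask extraction; the padding copies are the price paid here for not reading outside the caller's buffers.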

12

Re: About string comparison

Hello, niXman, you wrote:

String comparison is not efficient anyway. But personally I would use (and almost always do use) the ternary operator for this kind of "pattern matching":

    static Level convertFromString(const char* levelStr) {
        return levelStr == nullptr ? Level::Unknown
            : strcasecmp(levelStr, "global") == 0 ? Level::Global
            : strcasecmp(levelStr, "debug") == 0 ? Level::Debug
            : strcasecmp(levelStr, "info") == 0 ? Level::Info
            : strcasecmp(levelStr, "warning") == 0 ? Level::Warning
            : strcasecmp(levelStr, "error") == 0 ? Level::Error
            : strcasecmp(levelStr, "fatal") == 0 ? Level::Fatal
            : strcasecmp(levelStr, "verbose") == 0 ? Level::Verbose
            : strcasecmp(levelStr, "trace") == 0 ? Level::Trace
            : Level::Unknown;
    }

13

Re: About string comparison

Hello, niXman, you wrote:

By the way, for those in the know: is there anything in Boost specifically for speeding up string comparison?

14

Re: About string comparison

Hello, antropolog, you wrote:

A> Hello, niXman, you wrote:
A> String comparison is not efficient anyway. But personally I would use (and almost always do use) the ternary operator for this kind of "pattern matching":
A>
A> static Level convertFromString(const char* levelStr) {
A>     return levelStr == nullptr ? Level::Unknown
A>         : strcasecmp(levelStr, "global") == 0 ? Level::Global
A>         : strcasecmp(levelStr, "debug") == 0 ? Level::Debug
A>         : strcasecmp(levelStr, "info") == 0 ? Level::Info
A>         : strcasecmp(levelStr, "warning") == 0 ? Level::Warning
A>         : strcasecmp(levelStr, "error") == 0 ? Level::Error
A>         : strcasecmp(levelStr, "fatal") == 0 ? Level::Fatal
A>         : strcasecmp(levelStr, "verbose") == 0 ? Level::Verbose
A>         : strcasecmp(levelStr, "trace") == 0 ? Level::Trace
A>         : Level::Unknown;
A> }
A>

This code is not equivalent to the original at the start of the thread. Here "Global" works correctly, and in the original it does not.

15

Re: About string comparison

Hello, smeeld, you wrote:

S> By the way, for those in the know: is there anything in Boost specifically for speeding up string comparison?

I have not seen anything like that, but I did see a fast comparison for boost::uuid: https://github.com/boostorg/uuid/blob/d … id_x86.hpp

16

Re: About string comparison

Hello, niXman, you wrote:

X> ....
X>
X> I would solve it roughly like this:
X>
X> Level convertFromString(const char* levelStr) {
X>     if (!levelStr) return Level::Unknown;
X>     switch (*levelStr) {
X>         case 'G': case 'g': return Level::Global;
X>         case 'D': case 'd': return Level::Debug;
X>         case 'I': case 'i': return Level::Info;
X>         case 'W': case 'w': return Level::Warning;
X>         case 'E': case 'e': return Level::Error;
X>         case 'F': case 'f': return Level::Fatal;
X>         case 'V': case 'v': return Level::Verbose;
X>         case 'T': case 't': return Level::Trace;
X>         default: return Level::Unknown;
X>     }
X> }
X>

There is a ready-made compiler for this kind of fun: http://www.colm.net/open-source/ragel/

17

Re: About string comparison

Hello, smeeld, you wrote:

S> By the way, for those in the know: is there anything in Boost specifically for speeding up string comparison?

Here: https://github.com/WojciechMula/simd-string