What's the fastest way to interpret a bool string into a number in C?

Question

I googled it and all the results were about C++ and C# so I am asking a C specific question.
// Str to lower
if (!memcmp(Str, "true", 4) || !memcmp(Str, "1", 1) || ...) {
     return 1;
} else if (!memcmp(Str, "false", 5) || !memcmp(Str, "0", 1) || ...) {
     return 0;
}
return -1;

That's one way to do it. But I'm not quite sure that's the most efficient way to do it. What's the most efficient way to interpret a bool string (eg. "true") into the equivalent value 1?

wovano · Answer

Comparison and benchmark results
Since a number of implementations have been posted here without any benchmarks, I took the liberty to compile them all and make a performance comparison.
Interestingly, most of the answers are actually slower than the code from the question (at least at my machine)!
Benchmarking of all implementations is performed in the same way, by executing them 500000000 times in a loop and measuring the CPU time. Tests are performed with all four mentioned valid values ("true", "false", "0" and "1") and an invalid value. Minimum, maximum and average execution time over all executions are determined.
I measured the time of the following implementations:

empty function (reference): just an empty function that always returns -1, used as reference (execution time includes benchmarking overhead)
code from question by user13783520: code from the question, slightly modified to make it work.
1st answer of chux
1st answer of user13783520 (NB: added strlen() because length of string is not known in advance)
answer of Zan Lynx
2nd answer of chux
2nd answer of user13783520 (NB: added strlen() because length of string is not known in advance, but code requires more changes to include boundary checking)
answer of wovano

Note that it's difficult to make a completely fair comparison between the implementations for at least the following reasons:

Two implementations are actually invalid (resulting in Undefined Behavior) because the boundary of the input string is not checked. One implementation resulted in a crash, so that I couldn't measure the time in the same way as I did for all other implementations.
Some implementations do not check for invalid values. They always return 0 or 1, never -1.
Some implementations require the length of the input string to be known in advance. If that's not the case, it should be determined using strlen() (which I added to the code), making the implementation slower off course.
Performance may vary depending on target platform, user input, etc.

Benchmark results
(tests performed on Intel Core i7-6500U, on Ubuntu for Windows, compiled with gcc -O3)

wovano · Answer

I want to start by saying that I agree with earlier comments that it's not really useful to optimize this function. We're talking about saving nanoseconds on user interaction that typically takes seconds or more. The processing time is probably less than the time it takes for the "enter" key to be released.
Having said that, here's my implementation. It's a pretty simple implementation, avoiding unnecessary calls to library functions, and giving the compiler enough freedom to optimize the code.
On my machine (Intel Core i7-6500U, compiled with gcc -O3) this implementation is faster than all current answers.
int str_to_bool(const char *str)
{
    if ((str[0] & 0xFE) == 48) { // ch == '0' or '1'
        if (str[1] == '') {
            return str[0] - 48;
        }
    } else if (str[0] == 't') {
        if (str[1] == 'r' && str[2] == 'u' && str[3] == 'e' && str[4] == '') {
            return 1;
        }
    } else if (str[0] == 'f') {
        if (str[1] == 'a' && str[2] == 'l' && str[3] == 's' && str[4] == 'e' && str[5] == '') {
            return 0;
        }
    }
    return -1;
}

UPDATATED version
The following versions works with the updated requirements that were not mentioned in the question but in comments. This handles "true", "false", "yes", "no", "t", "f", "y", "n", "1" and "0" and the first letter may be uppercase as well. It's a bit more verbose but still very fast.
int str_to_bool(const char *str)
{
    if ((str[0] & 0xFE) == 48) { // ch == '0' or '1'
        if (str[1] == '') {
            return str[0] - 48;
        }
    } else if ((str[0] | 32) == 't') {
        if (str[1] == '') {
            return 1;
        }
        if (str[1] == 'r' && str[2] == 'u' && str[3] == 'e' && str[4] == '') {
            return 1;
        }
    } else if ((str[0] | 32) == 'f') {
        if (str[1] == '') {
            return 0;
        }
        if (str[1] == 'a' && str[2] == 'l' && str[3] == 's' && str[4] == 'e' && str[5] == '') {
            return 0;
        }
    } else if ((str[0] | 32) == 'y') {
        if (str[1] == '') {
            return 1;
        }
        if (str[1] == 'e' && str[2] == 's' && str[3] == '') {
            return 1;
        }
    } else if ((str[0] | 32) == 'n') {
        if (str[1] == '') {
            return 0;
        }
        if (str[1] == 'o' && str[2] == '') {
            return 0;
        }
    }
    return -1;
}

Q&A (explanation and background info)
Some additional information to answer questions that were asked in comments:
Q: Why is this faster than using memcmp()? I've been told to use library functions when possible.
A: In general it's a good practice to make use of standard library functions such as memcmp(). They're heavily optimized for their intended use and for the targeted platform. For example, on modern cpu architectures memory alignment heavily influences performance, so a memcmp() implementation for such platform will make efforts to read data using the optimal memory alignment. Consequently, the start and end of the memory buffer may need to be handled differently, because they are not guaranteed to be aligned. This causes some overhead, making the implementation slower for small buffers and faster for large buffers. In this case only 1-5 bytes are compared, so using memcmp is not really advantageous. Besides, using the function introduces some calling overhead as well. So, in this case, doing the comparison manually will be much more efficient.
Q: Isn't the use of a switch statement faster than an if-else ladder?
A: It could be, but there's no guarantee. First of all, it depends on the compiler how the switch statement is translated. A common method is to use a jump table. However, this is only feasible if the values used in the case statements are close too each other, otherwise the jump table would be too big to fit in memory. Also note that the jump table implementation is reasonably expensive to execute. My guess is that it's starting to be efficient to use if there at least five cases. Secondly, a good compiler is able to implement a jump table as separate if statements, but it's also able to implement an if-else ladder as a jump table if that would be more efficient. So it really shouldn't matter what you use in C, as long as you make sure that the compiler has enough information and freedom to make such optimizations. (For proof, compile this code for armv7-a using clang 10.0.0 and you'll see that it generates a jump table.)
Q: Isn't it bad to use strcmp() if you already know the length of the string?
A: Well, that depends...

If the length of the string is known in advance, using memcmp() would make more sense indeed, because it probably is slightly faster. However, this is not guaranteed, so you should really benchmark it to know for sure. I can think of a number of reasons why strcmp() could be faster in this case.
If the length of the string is not known, it must be determined (using strlen()) before you can use memcmp(), or access the data otherwise. However, calling strlen() is quite expensive. It could take more time than the above full function to execute.
Note that executing memcmp(Str, "false", 5) is illegal if the buffer is less than 5 bytes. According to the C standard, this results in Undefined Behavior, meaning that the application could crash or give other unexpected results.

Finally, note that my algorithm basically works like a tree. It first checks the first character. If that's a valid character, it will continue with the second character. As soon as a character is found that's not valid, the function returns -1. So it reads every character only once (if the compiler does it's job correctly), in contrast to some of the other implementations that read the input data multiple times.

user13783520 · Answer

I also decided that you can, for short strings such as booleans, convert it into a number by copying the memory and then switching the result:
#include <stdint.h>
signed char BoolFromStrCandidate2(const char *const StrIn, register const unsigned char Len) {
    int64_t Word = 0;
    memcpy(&Word, StrIn, Len);
    switch (Word|32) {
        case '0':
        case 'f':
        case 0x65736c6166:
        case 'n':
        case 0x6f6e:
            return 0;
        case '1':
        case 't':
        case 0x65757274:
        case 'y':
        case 0x736579:
            return 1;
    }
    return -1;
}

chux - Reinstate Monica · Answer

fastest way to interpret a bool string into a number in C

How about taking advantage of ASCII and that '0', '1', 'f', 't' can be hashed to [0-3]?
     (hash & 4) ? ((hash >> 4)&3) : hash & 1
'0'  0
'1'  1
'f'  2
't'  3

int bool_str_decode(const char *s) {
  const char *tf[4] = { "0", "1", "false", "true"};
  unsigned hash = *s;
  hash = (hash & 4) ? ((hash >> 4)&3) : hash & 1;
  if (strcmp(tf[hash], s) == 0) return hash & 1;
  return 0;
}

chux - Reinstate Monica · Answer

int Bool_str_decode(const char *Str) {
  uint_least64_t sum = 0;
  do {
    sum <<= 8;
    sum |= *(unsigned char*) Str;
  } while (*Str++ && (sum & 0xFF0000000000) == 0);  // loop to  or 6 characters

if (sum == T1 || sum == Ttrue) return 1;
  if (sum == T0 || sum == Tfalse) return 0;
  return -1;
}

Zan Lynx · Answer

Try this one. I think it looks pretty good in assembly, especially clang: https://godbolt.org/z/KcYMf8
Update! I HAVE BENCHMARKED IT, along with most everyone else's here.
Results are at https://github.com/zlynx/truth-match-test
#include <stdio.h>

int tobool(const char *s) {
  char lower[16] = {(s[0] | 0x20), (s[1] | 0x20), (s[2] | 0x20),
                    (s[3] | 0x20), (s[4] | 0x20), s[5] | 0x20};
  int match_1 = ((lower[0] == ('1' | 0x20)) & (lower[1] == ('' | 0x20)));
  int match_0 = ((lower[0] == ('0' | 0x20)) & (lower[1] == ('' | 0x20)));
  int match_true = ((lower[0] == 't') & (lower[1] == 'r') & (lower[2] == 'u') &
                    (lower[3] == 'e') & (lower[4] == ('' | 0x20)));
  int match_false =
      ((lower[0] == 'f') & (lower[1] == 'a') & (lower[2] == 'l') &
       (lower[3] == 's') & (lower[4] == 'e') & (lower[5] == ('' | 0x20)));

int is_true = (match_1 | match_true);
  int is_false = (match_0 | match_false);
  return is_true - !(is_true | is_false);
}

const char *outputs[3] = {"invalid", "false", "true"};

int main(int argc, char *argv[]) {
  if (argc < 2)
    return 1;
  int result = tobool(argv[1]);
  puts(outputs[result + 1]);
  return 0;
}

user13783520 · Answer

My personal solution:
#include <ctype.h>
signed char BoolFromStr(const char *const StrIn, register const unsigned char Len) {
    if (!Len || Len > 5 || !StrIn) {
        return -1;
    }
    switch (tolower(*StrIn)) {
        case '0':
            if (Len == 1) {
                return 0;
            }
            break;
        case 'f':
            if (Len == 1 || (Len == 5 && !memcmp(StrIn+1, (const char[]){'a', 'l', 's', 'e'}, 4))) {
                return 0;
            }
            break;
        case 'n':
            if (Len == 1 || (Len == 2 && StrIn[1] == 'o')) {
                return 0;
            }
            break;
        case '1':
            if (Len == 1) {
                return 1;
            }
            break;
        case 'y':
            if (Len == 1 || (Len == 3 && !memcmp(StrIn+1, (const char[]){'e', 's'}, 2))) {
                return 1;
            }
            break;
        case 't':
            if (Len == 1 || (Len == 4 && !memcmp(StrIn+1, (const char[]){'r', 'u', 'e'}, 3))) {
                return 1;
            }
            break;
    }
    return -1;
}

Acorn · Answer

Since in your example it looks like you are returning -1 for invalid inputs, we can assume they aren't always valid, so you will have to check the entirety of the string no matter what you do.
However, whether a chain of memcmp calls (that start from the beginning but are usually very well optimized) or a decision tree is faster will depend on what the options are, how many there are, the target architecture and hardware, etc.

What's the fastest way to interpret a bool string into a number in C?

8 Answers

Add your own answers!

Ask a Question