Does the C++ standard allow for an uninitialized bool to crash a program?




















I know that "undefined behaviour" in C++ can pretty much allow the compiler to do anything it wants. However, I had a crash that surprised me, as I assumed that the code was safe enough.

In this case, the real problem happened only on a specific platform using a specific compiler, and only if optimization was enabled.

I tried several things in order to reproduce the problem and simplify it as much as possible. Here's an extract of a function called Serialize that takes a bool parameter and copies the string "true" or "false" to an existing destination buffer.

If this function came up in a code review, there would be no way to tell that it could, in fact, crash if the bool parameter was an uninitialized value.



// Zero-filled global buffer of 16 characters
char destBuffer[16];

void Serialize(bool boolValue) {
    // Determine which string to print based on boolValue
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    const size_t len = strlen(whichString);

    // Copy string into destination buffer, which is zero-filled (thus already null-terminated)
    memcpy(destBuffer, whichString, len);
}


If this code is compiled with clang 5.0.0 with optimizations enabled, it can crash.

The ternary expression boolValue ? "true" : "false" looked safe enough to me; I was assuming, "Whatever garbage value is in boolValue doesn't matter, since it will evaluate to true or false anyhow."

I have set up a Compiler Explorer example that shows the problem in the disassembly; here is the complete example. Note: to reproduce the issue, the combination I found that works is Clang 5.0.0 with -O2 optimisation.



#include <iostream>
#include <cstring>

// Simple struct, with an empty constructor that doesn't initialize anything
struct FStruct {
    bool uninitializedBool;

    __attribute__ ((noinline)) // Note: the constructor must be declared noinline to trigger the problem
    FStruct() {}
};

char destBuffer[16];

// Small utility function that copies the string "true" or "false" into destBuffer, depending on the value of the parameter
void Serialize(bool boolValue) {
    // Determine which string to copy depending on whether 'boolValue' evaluates to true or false
    const char* whichString = boolValue ? "true" : "false";

    // Compute the length of the string we selected
    size_t len = strlen(whichString);

    memcpy(destBuffer, whichString, len);
}

int main()
{
    // Locally construct an instance of our struct here on the stack. The bool member uninitializedBool is uninitialized.
    FStruct structInstance;

    // Copy "true" or "false" into destBuffer (reading the uninitialized member here is the undefined behaviour)
    Serialize(structInstance.uninitializedBool);
    return 0;
}


The problem arises because of the optimizer: it was clever enough to deduce that the strings "true" and "false" differ in length by exactly 1. So instead of really calculating the length, it uses the value of the bool itself, which should technically be either 0 or 1, and goes like this:



const size_t len = strlen(whichString); // original code
const size_t len = 5 - boolValue; // clang clever optimization


While this is "clever", so to speak, my question is: Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?

Or is this a case of implementation-defined behaviour, in which case the implementation assumed that all its bools will only ever contain 0 or 1, and any other value is undefined behaviour territory?






























  • 182





    It's a great question. It's a solid illustration of how undefined behavior isn't just a theoretical concern. When people say anything can happen as a result of UB, that "anything" can really be quite surprising. One might assume that undefined behavior still manifests in predictable ways, but these days with modern optimizers that's not at all true. OP took the time to create a MCVE, investigated the problem thoroughly, inspected the disassembly, and asked a clear, straightforward question about it. Couldn't ask for more.

    – John Kugelman
    Jan 10 at 2:04








  • 6





    Observe that the requirement that “non-zero evaluates to true” is a rule about Boolean operations including “assignment to a bool” (which might implicitly invoke a static_cast<bool>() depending on specifics). It is however not a requirement about the internal representation of a bool chosen by the compiler.

    – Euro Micelli
    Jan 10 at 3:48












  • 3





    On a very related note, this is a "fun" source of binary incompatibility. If you have an ABI A that zero-pads values before calling a function, but compiles functions such that it assumes parameters are zero-padded, and an ABI B that's the opposite (doesn't zero-pad, but doesn't assume zero-padded parameters), it'll mostly work, but a function using the B ABI will cause issues if it calls a function using the A ABI that takes a 'small' parameter. IIRC you have this on x86 with clang and ICC.

    – TLW
    Jan 12 at 19:36






  • 1





    @TLW: Although the Standard does not require that implementations provide any means of calling or being called by outside code, it would have been helpful to have a means of specifying such things for implementations where they are relevant (implementations where such details aren't relevant could ignore such attributes).

    – supercat
    Jan 12 at 22:14

























c++ llvm undefined-behavior abi














edited Jan 27 at 16:52
double-beep

asked Jan 10 at 1:39
Remz




















5 Answers














Yes, ISO C++ allows (but doesn't require) implementations to make this choice.



But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.



It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.





You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register [1]. In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.



(An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)



ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like implementing !mybool with xor eax,1 to flip the low bit (see "Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction"), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this; see "Boolean values as 8 bit in compilers. Are operations on them inefficient?".
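
To make those two shortcuts concrete, here is a small sketch that models the byte holding a bool as a std::uint8_t, so that every operation stays well-defined even for "impossible" values. The names BoolByte, notViaXor and andViaBitwise are made up for this illustration; they are not anything a compiler exposes.

```cpp
#include <cstdint>

// Models the byte that holds a bool under an ABI that only permits 0 or 1.
// Using std::uint8_t keeps every operation well-defined for any input.
using BoolByte = std::uint8_t;

// What "!b" can compile to when the byte is known to be 0 or 1: flip the low bit.
constexpr BoolByte notViaXor(BoolByte b) {
    return static_cast<BoolByte>(b ^ 1u);
}

// What "a && b" can compile to under the same assumption: a bitwise AND.
constexpr BoolByte andViaBitwise(BoolByte a, BoolByte b) {
    return static_cast<BoolByte>(a & b);
}

// For the two valid representations, the shortcuts agree with the logical operators.
static_assert(notViaXor(0) == 1 && notViaXor(1) == 0, "xor matches ! for 0 and 1");
static_assert(andViaBitwise(1, 1) == 1 && andViaBitwise(1, 0) == 0, "& matches && for 0 and 1");

// For any other byte they do not: !2 would be 0, but 2 ^ 1 is 3,
// and 2 && 1 would be true, but 2 & 1 is 0.
static_assert(notViaXor(2) == 3, "a byte outside {0, 1} breaks the xor shortcut");
static_assert(andViaBitwise(2, 1) == 0, "a byte outside {0, 1} breaks the AND shortcut");
```

The shortcuts are only correct because the ABI promises the byte is 0 or 1; feed them anything else and the result is no longer a valid bool either.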



In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)



The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to 5U - boolValue. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data [2].)



Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)





Your __attribute__((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and, for various reasons, about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.



5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.
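
The wrap-around can be checked without invoking any UB by passing the raw byte in explicitly. This is only a model of the arithmetic: lengthFromRawByte is a made-up name, and it does the subtraction in std::size_t rather than reproducing the exact instruction sequence clang emits.

```cpp
#include <cstddef>
#include <cstdint>

// Models `len = 5 - boolValue` with the bool's raw byte passed in explicitly.
// The arithmetic is done in std::size_t, so it wraps modulo 2^N and never invokes UB.
constexpr std::size_t lengthFromRawByte(std::uint8_t rawBool) {
    return static_cast<std::size_t>(5) - rawBool;
}

static_assert(lengthFromRawByte(0) == 5, "matches strlen of false");
static_assert(lengthFromRawByte(1) == 4, "matches strlen of true");

// Any byte above 5 wraps around to a value near SIZE_MAX,
// far larger than the 16-byte destination buffer.
static_assert(lengthFromRawByte(6) == SIZE_MAX, "5 - 6 wraps to the largest size_t");
```

Bytes 2 through 5 merely give a too-short length; anything from 6 upward asks memcpy for an enormous copy.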





Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.



ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)
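
A minimal sketch of that kind of inspection, for an initialized bool, could look like this. The helper name objectBytes is made up for the example, and sizeof(bool) is not assumed to be 1.

```cpp
#include <array>
#include <cstring>

// Copies the object representation of an initialized bool into unsigned chars.
// memcpy into unsigned char is a well-defined way to look at the raw bytes.
inline std::array<unsigned char, sizeof(bool)> objectBytes(bool value) {
    std::array<unsigned char, sizeof(bool)> bytes{};
    std::memcpy(bytes.data(), &value, sizeof(bool));
    return bytes;
}
```

On x86-64 System V you would expect a single byte holding 0x00 for false and 0x01 for true, but the standard itself only implies that the two representations differ.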



You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{} definition, so all translation units must have the same definition, like with the inline keyword.)



So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)



Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.



GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non-inlined version of your function might look in asm.



Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.
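One way to read or write a 16-bit value at an address that might be misaligned, without ever forming a misaligned uint16_t*, is to go through memcpy. This is a sketch with hypothetical helper names, not code from the linked question.

```cpp
#include <cstdint>
#include <cstring>

// Read a uint16_t from an address that may not be 2-byte aligned.
// memcpy has no alignment requirement on its arguments.
std::uint16_t load_u16_unaligned(const unsigned char* p) {
    std::uint16_t value;
    std::memcpy(&value, p, sizeof value);
    return value;
}

// Store counterpart: write the two bytes of value at a possibly odd address.
void store_u16_unaligned(unsigned char* p, std::uint16_t value) {
    std::memcpy(p, &value, sizeof value);
}
```

Because the compiler is never promised 2-byte alignment here, it cannot build an auto-vectorized loop on that assumption.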



See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.



Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.



Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)
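If wrapping is not what you want and you need to stay inside defined behaviour, one option is to test for overflow before adding. A minimal sketch; add_would_overflow is a hypothetical helper name.

```cpp
#include <limits>

// Returns true if a + b would overflow int. The check itself never overflows.
bool add_would_overflow(int a, int b) {
    constexpr int kMax = std::numeric_limits<int>::max();
    constexpr int kMin = std::numeric_limits<int>::min();
    if (b > 0) return a > kMax - b;  // kMax - b cannot overflow when b > 0
    if (b < 0) return a < kMin - b;  // kMin - b cannot overflow when b < 0
    return false;
}
```

Calling this before a + b keeps the addition itself well-defined, so the compiler's "signed overflow never happens" assumption is actually true for your program.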



Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.



Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't let them optimize a<0 as always-true, only that tmp is always negative. (So they don't backtrack from the inputs of a calculation to derive range info, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this is intentional user-friendliness or simply a missed optimization.)



Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. See: Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?



GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.
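If you would rather not depend on a compiler extension, a portable alternative is to shift the bit pattern as an unsigned value. A minimal sketch with a hypothetical helper name; it assumes the shift count is in the range 0..31 and that unsigned int is at least 32 bits wide.

```cpp
#include <cstdint>

// Left-shift the bit pattern of a possibly negative value without shifting a signed type.
// The shift happens on uint32_t, where it is defined for counts 0..31.
std::uint32_t shift_left_bits(std::int32_t value, unsigned count) {
    return static_cast<std::uint32_t>(value) << count;
}
```

The result is returned as uint32_t on purpose, so the sketch does not rely on how an out-of-range unsigned value converts back to a signed type.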



There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except ones they choose to define) to optimize better can be nearly indistinguishable at times.





Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.



(Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)



For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.
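The arithmetic behind that caller-side shortcut can be checked with a small model. The function names are hypothetical, and the 64-bit integer only stands in for the argument register.

```cpp
#include <cstdint>

// Model of the callee's view: only the low 8 bits of the argument register matter for a bool.
std::uint8_t low_byte(std::uint64_t reg) {
    return static_cast<std::uint8_t>(reg & 0xFFu);
}

// True when the low byte left behind by `a & 0x01010101` already equals a & 1,
// so the caller needs no extra masking instruction before the call.
bool masked_low_byte_matches(std::uint32_t a) {
    std::uint64_t reg = a & 0x01010101u;  // what the caller already computed in RDI
    return low_byte(reg) == (a & 1u);
}
```

masked_low_byte_matches returns true for every a, because masking with 0x01010101 and then keeping the low byte is the same as masking with 1. The garbage that may remain in the upper bytes is exactly what the callee is required to ignore.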



Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.



This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits; see Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI? So I was interested to see that it doesn't do the same thing for bool.)





Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.



OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. See the L(between_4_7): block in glibc's memcpy/memmove. Or at least, go the same way for either boolean in memcpy's branching to select a chunk size.
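The overlapping-copy idea can be sketched in C++ as follows. This only illustrates the technique and is not glibc's code; the function name and the precondition 4 <= n <= 7 are assumptions of the sketch.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copy n bytes, where 4 <= n <= 7, using two 4-byte moves that may overlap.
// Both loads happen before either store, so the branch-free body works for any n in range.
void copy_4_to_7(char* dst, const char* src, std::size_t n) {
    std::uint32_t head, tail;
    std::memcpy(&head, src, 4);          // first 4 bytes
    std::memcpy(&tail, src + n - 4, 4);  // last 4 bytes, overlapping head when n < 8
    std::memcpy(dst, &head, 4);
    std::memcpy(dst + n - 4, &tail, 4);
}
```

For "true" (n = 4) the two moves cover the same 4 bytes; for "false" (n = 5) they overlap by 3 bytes. Either way the same instructions run, which is why the length can vary without a branch on the boolean.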



If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.



Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.





Tools for detecting UB and usage of uninitialized values



In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though. (Because it doesn't increase type sizes to make room for an "uninitialized" bit).



See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/



To find usage of uninitialized data, there's Memory Sanitizer in clang/LLVM. (Address Sanitizer targets out-of-bounds and use-after-free accesses rather than uninitialized reads.) https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)




It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.



MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).




It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.



Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.



Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.



























  • 1





    I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

    – Joshua
    Jan 11 at 3:27






  • 5





    xkcd.com/499 is pretty good explanation of what UB is.

    – val
    Jan 11 at 4:30






  • 7





    Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

    – The_Sympathizer
    Jan 11 at 7:04








  • 1





    And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

    – davidbak
    Jan 12 at 2:45








  • 3





    @The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

    – supercat
    Jan 12 at 22:23



















55














The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.



So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You have been warned:




50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)
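If a flag arrives as raw bytes (from a file, the network, or a C struct), one way to stay within the rules is to read it as an unsigned char and convert explicitly, so that a bool object only ever holds true or false. A minimal sketch; all helper names are hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Convert an untrusted byte to a valid bool: any non-zero byte becomes true.
bool bool_from_byte(unsigned char raw) {
    return raw != 0;
}

// Same selection as Serialize in the question, but fed from a raw byte.
const char* bool_name_from_byte(unsigned char raw) {
    return bool_from_byte(raw) ? "true" : "false";
}

// Length of the selected string; always 4 or 5, whatever the byte was.
std::size_t name_length_from_byte(unsigned char raw) {
    return std::strlen(bool_name_from_byte(raw));
}
```

The comparison raw != 0 produces a genuine bool, so whatever the compiler assumes about bool's representation holds from that point on.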




























  • 11





    The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that a bool's actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

    – ShadowRanger
    Jan 10 at 2:08








  • 3





    @ShadowRanger You can always inspect the object representation directly.

    – T.C.
    Jan 10 at 2:12






  • 6





    @shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

    – rici
    Jan 10 at 2:28








  • 3





    Yes, optimizing compilers (i.e. real-world C++ implementations) will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. Examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

    – Peter Cordes
    Jan 10 at 8:21






  • 4





    The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

    – Holger
    Jan 10 at 10:47



















48














The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.



The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).



Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.



NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.

























  • 2





    Is the first clause true? Does merely copying an uninitialized bool trigger UB?

    – Joshua Green
    Jan 10 at 3:25






  • 10





    @JoshuaGreen see [dcl.init]/12 "If an indeterminate value is produced by an evaluation, the behaviour is undefined except in the following cases:" (and none of those cases have an exception for bool). Copying requires evaluating the source

    – M.M
    Jan 10 at 3:34








  • 8





    @JoshuaGreen And the reason for that is that you might have a platform that triggers a hardware fault if you access some invalid values for some types. These are sometimes called "trap representations".

    – David Schwartz
    Jan 10 at 11:15






  • 4





    Itanium, while obscure, is a CPU that's still in production, has trap values, and has two at least semi-modern C++ compilers (Intel/HP). It literally has true, false and not-a-thing values for booleans.

    – MSalters
    Jan 10 at 20:03






  • 3





    On the flip side, the answer to "Does the standard require all compilers to process something a certain way" is generally "no", even/especially in cases where it's obvious that any quality compiler should do so; the more obvious something is, the less need there should be for the authors of the Standard to actually say it.

    – supercat
    Jan 10 at 21:23



















22














A bool is only allowed to hold the values 0 or 1, and the generated code can assume that it will only hold one of these two values. The code generated for the ternary in the assignment could use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:



// the compiler could make asm that "looks" like this, from your source
static const char *strings[] = {"false", "true"};   // array of two pointers, indexed by the bool
const char *whichString = strings[boolValue];


If boolValue is uninitialized, it could actually hold any integer value, which would then cause accessing outside the bounds of the strings array.



























  • 1





    @SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.

    – Barmar
    Jan 10 at 2:09






  • 1





    You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.

    – Remz
    Jan 10 at 2:25






  • 3





    @Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.

    – Barmar
    Jan 10 at 2:28








  • 1





    @Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.

    – Havenard
    Jan 10 at 2:57








  • 2





    @MSalters: std::bitset<8> doesn't give me nice names for all my different flags. Depending on what they are, that may be important.

    – Martin Bonner
    Jan 11 at 15:13



















15














Summarising your question a lot, you are asking Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?



The standard says nothing about the internal representation of a bool. It only defines what happens when casting a bool to an int (or vice versa). Mostly, because of these integral conversions (and the fact that people rely rather heavily on them), the compiler will use 0 and 1, but it doesn't have to (although it has to respect the constraints of any lower level ABI it uses).



So, the compiler, when it sees a bool is entitled to consider that said bool contains either of the 'true' or 'false' bit patterns and do anything it feels like. So if the values for true and false are 1 and 0, respectively, the compiler is indeed allowed to optimise strlen to 5 - <boolean value>. Other fun behaviours are possible!
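For the two valid values, the rewrite really is equivalent to the source, which is what entitles the compiler to make it. A minimal sketch comparing the two forms; the function names are hypothetical, and the arithmetic form assumes true converts to 1, which the standard's integral conversions guarantee for a valid bool.

```cpp
#include <cstddef>
#include <cstring>

// What the source says: pick a string, then measure it.
std::size_t length_by_strlen(bool b) {
    return std::strlen(b ? "true" : "false");
}

// What the optimizer may emit instead, given that a valid bool converts to 0 or 1.
std::size_t length_by_arithmetic(bool b) {
    return 5u - static_cast<unsigned>(b);
}
```

Both functions agree for false and true, so under the as-if rule either can stand in for the other. They only diverge when the bool holds neither value, and at that point the program is already outside what the standard describes.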



As gets repeatedly stated here, undefined behaviour has undefined results. Including but not limited to




  • Your code working as you expected it to

  • Your code failing at random times

  • Your code not being run at all.


See What every programmer should know about undefined behavior





































    263














    Yes, ISO C++ allows (but doesn't require) implementations to make this choice.



    But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.



    It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.





    You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register (see Footnote 1). In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.



    (An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)



    ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like !mybool with xor eax,1 to flip the low bit (see: Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this; see: Boolean values as 8 bit in compilers. Are operations on them inefficient?
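A small model shows why those shortcuts are only valid for 0 and 1. The bool's storage is represented here as an unsigned char so that invalid bytes can be tried without UB; the function names are hypothetical.

```cpp
// Flip by XOR with 1, the way `xor eax, 1` would implement !mybool.
unsigned char flip_low_bit(unsigned char stored) {
    return static_cast<unsigned char>(stored ^ 1u);
}

// Logical AND computed bitwise, as a compiler may do for two bools.
unsigned char bitwise_and(unsigned char a, unsigned char b) {
    return static_cast<unsigned char>(a & b);
}
```

For stored values 0 and 1 these match ! and && exactly. For a byte of 2, flip_low_bit gives 3 (still "true" if tested against zero), and bitwise_and(1, 2) gives 0 even though both inputs are non-zero.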



    In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)



    The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to 5U - boolValue. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data; see Footnote 2.)



    Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)





    Your __attribute((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reasons about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.



    5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.





    Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.



    ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)



    You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition in inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)



    So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)



    Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.



    GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non inlined version of your function might look in asm.



    Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.



    See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.



    Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.



    Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)



    Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.



    Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't let them optimize a<0 as always-true, only that tmp is always negative. (So they don't backtrack from the inputs of a calculation to derive range info, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this is intentional user-friendliness or simply a missed optimization.)



    Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?



    GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.



    There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except ones they choose to define) to optimize better can be nearly indistinguishable at times.





    Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.



    (Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)



    For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.



    Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.



    This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits; see "Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI?" So I was interested to see that it doesn't do the same thing for bool.)





    Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.



    OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. See the L(between_4_7): block in glibc's memcpy/memmove. Or at least, memcpy's branching to select a chunk size goes the same way for either value of the boolean.
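The overlapping-copy idea can be sketched in C++ roughly as follows. This is only an illustration of the technique, not glibc's actual code (which is hand-written asm); the function name copy_4_to_7 is made up for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical sketch: copy 4..7 bytes with two 4-byte chunks.
// The first chunk covers bytes [0, 4), the second covers [len-4, len).
// They overlap whenever len < 8, so the same code path handles every
// length in the range without branching on the exact value.
inline void copy_4_to_7(void* dst, const void* src, std::size_t len) {
    assert(len >= 4 && len <= 7);
    const auto* s = static_cast<const unsigned char*>(src);
    auto* d = static_cast<unsigned char*>(dst);
    std::uint32_t head, tail;
    std::memcpy(&head, s, 4);            // load first 4 bytes
    std::memcpy(&tail, s + len - 4, 4);  // load last 4 bytes (may overlap head)
    std::memcpy(d, &head, 4);            // store first 4 bytes
    std::memcpy(d + len - 4, &tail, 4);  // store last 4 bytes
}
```

Both "true" (4 bytes) and "false" (5 bytes) go through exactly the same instructions here; only the offset of the second chunk differs.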



    If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.



    Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.





    Tools for detecting UB and usage of uninitialized values



    In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though (because it doesn't increase type sizes to make room for an "uninitialized" bit).



    See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/



    To find usage of uninitialized data, there's MemorySanitizer in clang/LLVM. (AddressSanitizer targets out-of-bounds and use-after-free accesses, not reads of uninitialized memory.) https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)




    It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.



    MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).




    It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.



    Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.



    Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.



























    • 1





      I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

      – Joshua
      Jan 11 at 3:27






    • 5





      xkcd.com/499 is pretty good explanation of what UB is.

      – val
      Jan 11 at 4:30






    • 7





      Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

      – The_Sympathizer
      Jan 11 at 7:04








    • 1





      And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

      – davidbak
      Jan 12 at 2:45








    • 3





      @The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

      – supercat
      Jan 12 at 22:23
















    263














    Yes, ISO C++ allows (but doesn't require) implementations to make this choice.



    But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.



    It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.





    You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register (see Footnote 1). In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.



    (An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)



    ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like implementing !mybool with xor eax,1 to flip the low bit (see "Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction"), or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage of this; see "Boolean values as 8 bit in compilers. Are operations on them inefficient?".
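To see why these shortcuts are only valid for 0 and 1, here is a small sketch that models the bool's storage byte as a plain std::uint8_t, so the demo itself contains no UB. The function names are invented for illustration; they mimic what xor eax,1 and a bitwise and compute on the raw byte.

```cpp
#include <cassert>
#include <cstdint>

// What `xor eax, 1` computes on the raw byte: equals logical NOT only for 0 and 1.
constexpr std::uint8_t flip_low_bit(std::uint8_t raw) {
    return static_cast<std::uint8_t>(raw ^ 1u);
}

// What a bitwise AND computes: equals logical AND only when both bytes are 0 or 1.
constexpr std::uint8_t and_bitwise(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a & b);
}

inline void demo_bool_shortcuts() {
    assert(flip_low_bit(0) == 1 && flip_low_bit(1) == 0);  // valid bools behave
    assert(flip_low_bit(2) == 3);    // "true" garbage stays non-zero after NOT
    assert(and_bitwise(2, 1) == 0);  // two non-zero bytes AND to "false"
}
```

With a byte value of 2, "not" still looks true and "2 and 1" looks false, which is exactly the kind of inconsistency an uninitialized bool can produce once the compiler relies on the ABI guarantee.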



    In general, the as-if rule allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)



    The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to 5U - boolValue. (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpy as stores of immediate data; see Footnote 2.)



    Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)





    Your __attribute((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reasons about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.



    5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.
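The wrap-around can be reproduced safely by modelling the bool's byte as an ordinary integer. This is a sketch under the assumption that the subtraction happens at the width of size_t; length_as_optimized is a hypothetical name, not something from the compiler output.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Models the optimized length computation on the raw byte. Using a plain
// integer keeps this demo free of UB; unsigned subtraction wraps modulo 2^N.
constexpr std::size_t length_as_optimized(std::uint8_t raw) {
    return std::size_t{5} - raw;
}

inline void demo_length() {
    assert(length_as_optimized(0) == 5);  // false -> strlen("false")
    assert(length_as_optimized(1) == 4);  // true  -> strlen("true")
    // Any byte above 5 wraps to a huge length, far beyond the 16-byte buffer.
    assert(length_as_optimized(200) > 16);
}
```

A memcpy with a length like that keeps writing (and reading) until it hits an unmapped page, which matches the crash described in the question.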





    Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other than what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.



    ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)
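A minimal sketch of inspecting the object representation of a (properly initialized) bool with memcpy. The helper name object_byte is made up, and the static_assert records the assumption that bool occupies one byte, which holds on mainstream ABIs but is not guaranteed by ISO C++.

```cpp
#include <cassert>
#include <cstring>

static_assert(sizeof(bool) == 1, "this sketch assumes a 1-byte bool");

// Copies the object representation of b into an unsigned char.
// No booleanization happens here, unlike `unsigned char c = b;`.
inline unsigned char object_byte(bool b) {
    unsigned char raw;
    std::memcpy(&raw, &b, sizeof raw);
    return raw;
}

inline void demo_object_byte() {
    // On x86-64 System V (and other mainstream ABIs) these are 0 and 1;
    // ISO C++ itself does not promise these particular values.
    assert(object_byte(false) == 0);
    assert(object_byte(true) == 1);
}
```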



    You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition is inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)



    So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)



    Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.



    GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non-inlined version of your function might look in asm.



    Another fun example is "Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?". x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.
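One common way to stay within the rules when the data may be misaligned is to load through memcpy instead of dereferencing a uint16_t*. The sketch below uses a hypothetical helper name, load_u16_unaligned; compilers typically turn the memcpy into a single load on x86.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Reads a uint16_t from an address with no alignment requirement.
// memcpy makes no alignment assumption, so the compiler cannot assume
// the pointer is 2-byte aligned when it vectorizes surrounding code.
inline std::uint16_t load_u16_unaligned(const void* p) {
    assert(p != nullptr);
    std::uint16_t v;
    std::memcpy(&v, p, sizeof v);
    return v;
}
```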



    See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.



    Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.



    Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)



    Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.



    Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't let them optimize a<0 as always-false (a must be non-negative for the addition not to overflow), only that tmp is always negative. (So they don't backtrack from the inputs of a calculation to derive range info, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this is intentional user-friendliness or simply a missed optimization.)
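The range reasoning in the a+INT_MIN example can be checked with a small overflow predicate. This is an illustrative sketch with made-up names (add_would_overflow, checked_add); it tests for overflow before adding, so it never executes the UB it is guarding against.

```cpp
#include <cassert>
#include <climits>

// Returns true if a + b would overflow int. The comparisons themselves
// cannot overflow: INT_MAX - b is only evaluated for b > 0, and
// INT_MIN - b only for b < 0.
constexpr bool add_would_overflow(int a, int b) {
    return (b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b);
}

// With b == INT_MIN, the addition is well-defined exactly when a >= 0.
static_assert(add_would_overflow(-1, INT_MIN), "negative a overflows");
static_assert(!add_would_overflow(0, INT_MIN), "a == 0 is fine");

// Adds only when the result is representable; asserts otherwise.
inline int checked_add(int a, int b) {
    assert(!add_would_overflow(a, b));
    return a + b;
}
```

The two static_asserts spell out the fact a compiler is entitled to assume: after tmp = a+INT_MIN, a cannot have been negative in any execution free of UB.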



    Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. See "Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?"

















    263












    263








    263







    Yes, ISO C++ allows (but doesn't require) implementations to make this choice.



    But also note that ISO C++ allows a compiler to emit code that crashes on purpose (e.g. with an illegal instruction) if the program encounters UB, e.g. as a way to help you find errors. (Or because it's a DeathStation 9000. Being strictly conforming is not sufficient for a C++ implementation to be useful for any real purpose). So ISO C++ would allow a compiler to make asm that crashed (for totally different reasons) even on similar code that read an uninitialized uint32_t. Even though that's required to be a fixed-layout type with no trap representations.



    It's an interesting question about how real implementations work, but remember that even if the answer was different, your code would still be unsafe because modern C++ is not a portable version of assembly language.





    You're compiling for the x86-64 System V ABI, which specifies that a bool as a function arg in a register is represented by the bit-patterns false=0 and true=1 in the low 8 bits of the register1. In memory, bool is a 1-byte type that again must have an integer value of 0 or 1.



    (An ABI is a set of implementation choices that compilers for the same platform agree on so they can make code that calls each other's functions, including type sizes, struct layout rules, and calling conventions.)



    ISO C++ doesn't specify it, but this ABI decision is widespread because it makes bool->int conversion cheap (just zero-extension). I'm not aware of any ABIs that don't let the compiler assume 0 or 1 for bool, for any architecture (not just x86). It allows optimizations like !mybool with xor eax,1 to flip the low bit: Any possible code that can flip a bit/integer/bool between 0 and 1 in single CPU instruction. Or compiling a&&b to a bitwise AND for bool types. Some compilers do actually take advantage Boolean values as 8 bit in compilers. Are operations on them inefficient?.



    In general, the as-if rule allows allows the compiler to take advantage of things that are true on the target platform being compiled for, because the end result will be executable code that implements the same externally-visible behaviour as the C++ source. (With all the restrictions that Undefined Behaviour places on what is actually "externally visible": not with a debugger, but from another thread in a well-formed / legal C++ program.)



    The compiler is definitely allowed to take full advantage of an ABI guarantee in its code-gen, and make code like you found which optimizes strlen(whichString) to
    5U - boolValue.
    (BTW, this optimization is kind of clever, but maybe shortsighted vs. branching and inlining memcpyas stores of immediate data2.)



    Or the compiler could have created a table of pointers and indexed it with the integer value of the bool, again assuming it was a 0 or 1. (This possibility is what @Barmar's answer suggested.)





    Your __attribute((noinline)) constructor with optimization enabled led to clang just loading a byte from the stack to use as uninitializedBool. It made space for the object in main with push rax (which is smaller and for various reason about as efficient as sub rsp, 8), so whatever garbage was in AL on entry to main is the value it used for uninitializedBool. This is why you actually got values that weren't just 0.



    5U - random garbage can easily wrap to a large unsigned value, leading memcpy to go into unmapped memory. The destination is in static storage, not the stack, so you're not overwriting a return address or something.





    Other implementations could make different choices, e.g. false=0 and true=any non-zero value. Then clang probably wouldn't make code that crashes for this specific instance of UB. (But it would still be allowed to if it wanted to.) I don't know of any implementations that choose anything other what x86-64 does for bool, but the C++ standard allows many things that nobody does or even would want to do on hardware that's anything like current CPUs.



    ISO C++ leaves it unspecified what you'll find when you examine or modify the object representation of a bool. (e.g. by memcpying the bool into unsigned char, which you're allowed to do because char* can alias anything. And unsigned char is guaranteed to have no padding bits, so the C++ standard does formally let you hexdump object representations without any UB. Pointer-casting to copy the object representation is different from assigning char foo = my_bool, of course, so booleanization to 0 or 1 wouldn't happen and you'd get the raw object representation.)



    You've partially "hidden" the UB on this execution path from the compiler with noinline. Even if it doesn't inline, though, interprocedural optimizations could still make a version of the function that depends on the definition of another function. (First, clang is making an executable, not a Unix shared library where symbol-interposition can happen. Second, the definition in inside the class{} definition so all translation units must have the same definition. Like with the inline keyword.)



    So a compiler could emit just a ret or ud2 (illegal instruction) as the definition for main, because the path of execution starting at the top of main unavoidably encounters Undefined Behaviour. (Which the compiler can see at compile time if it decided to follow the path through the non-inline constructor.)



    Any program that encounters UB is totally undefined for its entire existence. But UB inside a function or if() branch that never actually runs doesn't corrupt the rest of the program. In practice that means that compilers can decide to emit an illegal instruction, or a ret, or not emit anything and fall into the next block / function, for the whole basic block that can be proven at compile time to contain or lead to UB.



    GCC and Clang in practice do actually sometimes emit ud2 on UB, instead of even trying to generate code for paths of execution that make no sense. Or for cases like falling off the end of a non-void function, gcc will sometimes omit a ret instruction. If you were thinking that "my function will just return with whatever garbage is in RAX", you are sorely mistaken. Modern C++ compilers don't treat the language like a portable assembly language any more. Your program really has to be valid C++, without making assumptions about how a stand-alone non inlined version of your function might look in asm.



    Another fun example is Why does unaligned access to mmap'ed memory sometimes segfault on AMD64?. x86 doesn't fault on unaligned integers, right? So why would a misaligned uint16_t* be a problem? Because alignof(uint16_t) == 2, and violating that assumption led to a segfault when auto-vectorizing with SSE2.



    See also What Every C Programmer Should Know About Undefined Behavior #1/3, an article by a clang developer.



    Key point: if the compiler noticed the UB at compile time, it could "break" (emit surprising asm) the path through your code that causes UB even if targeting an ABI where any bit-pattern is a valid object representation for bool.



    Expect total hostility toward many mistakes by the programmer, especially things modern compilers warn about. This is why you should use -Wall and fix warnings. C++ is not a user-friendly language, and something in C++ can be unsafe even if it would be safe in asm on the target you're compiling for. (e.g. signed overflow is UB in C++ and compilers will assume it doesn't happen, even when compiling for 2's complement x86, unless you use clang/gcc -fwrapv.)



    Compile-time-visible UB is always dangerous, and it's really hard to be sure (with link-time optimization) that you've really hidden UB from the compiler and can thus reason about what kind of asm it will generate.



    Not to be over-dramatic; often compilers do let you get away with some things and emit code like you're expecting even when something is UB. But maybe it will be a problem in the future if compiler devs implement some optimization that gains more info about value-ranges (e.g. that a variable is non-negative, maybe allowing it to optimize sign-extension to free zero-extension on x86-64). For example, in current gcc and clang, doing tmp = a+INT_MIN doesn't let them optimize a<0 as always-true, only that tmp is always negative. (So they don't backtrack from the inputs of a calculation to derive range info, only on the results based on the assumption of no signed overflow: example on Godbolt. I don't know if this is intentional user-friendliness or simply a missed optimization.)



    Also note that implementations (aka compilers) are allowed to define behaviour that ISO C++ leaves undefined. For example, all compilers that support Intel's intrinsics (like _mm_add_ps(__m128, __m128) for manual SIMD vectorization) must allow forming mis-aligned pointers, which is UB in C++ even if you don't dereference them. __m128i _mm_loadu_si128(const __m128i *) does unaligned loads by taking a misaligned __m128i* arg, not a void* or char*. Is `reinterpret_cast`ing between hardware vector pointer and the corresponding type an undefined behavior?



    GNU C/C++ also defines the behaviour of left-shifting a negative signed number (even without -fwrapv), separately from the normal signed-overflow UB rules. (This is UB in ISO C++, while right shifts of signed numbers are implementation-defined (logical vs. arithmetic); good quality implementations choose arithmetic on HW that has arithmetic right shifts, but ISO C++ doesn't specify). This is documented in the GCC manual's Integer section, along with defining implementation-defined behaviour that C standards require implementations to define one way or another.



    There are definitely quality-of-implementation issues that compiler developers care about; they generally aren't trying to make compilers that are intentionally hostile, but taking advantage of all the UB potholes in C++ (except ones they choose to define) to optimize better can be nearly indistinguishable at times.





    Footnote 1: The upper 56 bits can be garbage which the callee must ignore, as usual for types narrower than a register.



    (Other ABIs do make different choices here. Some do require narrow integer types to be zero- or sign-extended to fill a register when passed to or returned from functions, like MIPS64 and PowerPC64. See the last section of this x86-64 answer which compares vs. those earlier ISAs.)



    For example, a caller might have calculated a & 0x01010101 in RDI and used it for something else, before calling bool_func(a&1). The caller could optimize away the &1 because it already did that to the low byte as part of and edi, 0x01010101, and it knows the callee is required to ignore the high bytes.



    Or if a bool is passed as the 3rd arg, maybe a caller optimizing for code-size loads it with mov dl, [mem] instead of movzx edx, [mem], saving 1 byte at the cost of a false dependency on the old value of RDX (or other partial-register effect, depending on CPU model). Or for the first arg, mov dil, byte [r10] instead of movzx edi, byte [r10], because both require a REX prefix anyway.



    This is why clang emits movzx eax, dil in Serialize, instead of sub eax, edi. (For integer args, clang violates this ABI rule, instead depending on the undocumented behaviour of gcc and clang to zero- or sign-extend narrow integers to 32 bits. See: Is a sign or zero extension required when adding a 32bit offset to a pointer for the x86-64 ABI? So I was interested to see that it doesn't do the same thing for bool.)
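    As a rough model of what the callee does with that register, the sketch below zero-extends only the low byte and then applies the 5U - boolValue length computation discussed earlier in this answer. It is a simulation in portable C++ under my own assumptions, not the code clang generates; modeled_length and its 64-bit "register" parameter are hypothetical names for illustration.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical model of the callee's view of a bool argument register.
// Only the low 8 bits are meaningful; the callee zero-extends them
// (like `movzx eax, dil`) and ignores the upper 56 bits.
std::size_t modeled_length(std::uint64_t arg_register) {
    const std::uint8_t low_byte = static_cast<std::uint8_t>(arg_register);
    // Models the `5U - boolValue` length: 5 for false (0), 4 for true (1).
    // Any other byte gives a wrong length, and bytes above 5 make the
    // unsigned subtraction wrap around to a huge value.
    return std::size_t{5} - low_byte;
}
```

    With garbage in the upper bits but 0 or 1 in the low byte, the result is still 5 or 4. A low byte such as 0xAA yields an enormous length, which is the kind of value that sends memcpy into unmapped memory.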





    Footnote 2: After branching, you'd just have a 4-byte mov-immediate, or a 4-byte + 1-byte store. The length is implicit in the store widths + offsets.



    OTOH, glibc memcpy will do two 4-byte loads/stores with an overlap that depends on length, so this really does end up making the whole thing free of conditional branches on the boolean. Or at least, memcpy's branching to select a chunk size goes the same way for either boolean value. See the L(between_4_7): block in glibc's memcpy/memmove.
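    The overlapping-copy idea can be sketched in portable C++ as follows. This is an illustration of the technique only, not glibc's implementation (which is hand-written asm), and copy_between_4_and_7 is a hypothetical name.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>

// Copies `len` bytes for 4 <= len <= 7 (it also happens to work for 8).
// The first 4 bytes and the last 4 bytes are copied; the two chunks
// overlap when len < 8, so no branch on the exact length is needed.
void copy_between_4_and_7(char* dst, const char* src, std::size_t len) {
    std::uint32_t head, tail;
    std::memcpy(&head, src, 4);            // bytes [0, 4)
    std::memcpy(&tail, src + len - 4, 4);  // bytes [len-4, len)
    // Both loads happen before either store, as memmove would require.
    std::memcpy(dst, &head, 4);
    std::memcpy(dst + len - 4, &tail, 4);
}
```

    For "true" (length 4) the two chunks coincide completely; for "false" (length 5) they overlap by three bytes. Either way the same instruction sequence runs.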



    If inlining, you could use 2x mov-immediate + cmov and a conditional offset, or you could leave the string data in memory.



    Or if tuning for Intel Ice Lake (with the Fast Short REP MOV feature), an actual rep movsb might be optimal. glibc memcpy might start using rep movsb for small sizes on CPUs with that feature, saving a lot of branching.





    Tools for detecting UB and usage of uninitialized values



    In gcc and clang, you can compile with -fsanitize=undefined to add run-time instrumentation that will warn or error out on UB that happens at runtime. That won't catch uninitialized variables, though, because it doesn't increase type sizes to make room for an "uninitialized" bit.



    See https://developers.redhat.com/blog/2014/10/16/gcc-undefined-behavior-sanitizer-ubsan/



    To find usage of uninitialized data, there's Memory Sanitizer in clang/LLVM. (Address Sanitizer targets a different class of bugs: out-of-bounds and use-after-free accesses.) https://github.com/google/sanitizers/wiki/MemorySanitizer shows examples of clang -fsanitize=memory -fPIE -pie detecting uninitialized memory reads. It might work best if you compile without optimization, so all reads of variables end up actually loading from memory in the asm. They show it being used at -O2 in a case where the load wouldn't optimize away. I haven't tried it myself. (In some cases, e.g. not initializing an accumulator before summing an array, clang -O3 will emit code that sums into a vector register that it never initialized. So with optimization, you can have a case where there's no memory read associated with the UB. But -fsanitize=memory changes the generated asm, and might result in a check for this.)




    It will tolerate copying of uninitialized memory, and also simple logic and arithmetic operations with it. In general, MemorySanitizer silently tracks the spread of uninitialized data in memory, and reports a warning when a code branch is taken (or not taken) depending on an uninitialized value.



    MemorySanitizer implements a subset of functionality found in Valgrind (Memcheck tool).




    It should work for this case because the call to glibc memcpy with a length calculated from uninitialized memory will (inside the library) result in a branch based on length. If it had inlined a fully branchless version that just used cmov, indexing, and two stores, it might not have worked.



    Valgrind's memcheck will also look for this kind of problem, again not complaining if the program simply copies around uninitialized data. But it says it will detect when a "Conditional jump or move depends on uninitialised value(s)", to try to catch any externally-visible behaviour that depends on uninitialized data.



    Perhaps the idea behind not flagging just a load is that structs can have padding, and copying the whole struct (including padding) with a wide vector load/store is not an error even if the individual members were only written one at a time. At the asm level, the information about what was padding and what is actually part of the value has been lost.
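    A small sketch of the padding case, under the assumption that the struct has padding between its members (typical on mainstream ABIs, but not guaranteed). Padded and copy_whole_struct are hypothetical names.

```cpp
#include <cstring>

struct Padded {
    char tag;    // typically followed by padding bytes before `value`
    int  value;
};

// Copies the whole object representation, padding included. The padding
// bytes may never have been written, but merely copying them is not an
// error; only the named members carry meaning.
Padded copy_whole_struct(const Padded& src) {
    Padded dst;
    std::memcpy(&dst, &src, sizeof(Padded));
    return dst;
}
```

    A checker that flagged every load of an unwritten byte would complain about this copy even when the caller has assigned both tag and value, which is presumably why the tools wait until a branch or other visible behaviour depends on the data.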
































    edited Jan 13 at 14:08

























    answered Jan 10 at 9:42









    Peter Cordes








    • 1





      I've seen a worse case where the variable took a value not in range of an 8 bit integer, but only of the entire CPU register. And Itanium has a worse one yet, use of an uninitialized variable can crash outright.

      – Joshua
      Jan 11 at 3:27






    • 5





      xkcd.com/499 is pretty good explanation of what UB is.

      – val
      Jan 11 at 4:30






    • 7





      Moreover, this also illustrates why the UB featurebug was introduced in the design of the languages C and C++ in the first place: because it gives the compiler exactly this kind of freedom, which has now permitted the most modern compilers to perform these high-quality optimizations that make C/C++ such high-performance mid-level languages.

      – The_Sympathizer
      Jan 11 at 7:04








    • 1





      And so the war between C++ compiler writers and C++ programmers trying to write useful programs continues. This answer, totally comprehensive in answering this question, could also be used as is as convincing ad copy for vendors of static analysis tools ...

      – davidbak
      Jan 12 at 2:45








    • 3





      @The_Sympathizer: UB was included to allow implementations to behave in whatever ways would be most useful to their customers. It was not intended to suggest that all behaviors should be considered equally useful.

      – supercat
      Jan 12 at 22:23



























    55














    The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.



    So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You have been warned:




    50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)
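    For reference, this is one way to look at the implementation's chosen representation without invoking UB, by copying the bytes of an initialized bool into unsigned char storage. It is a minimal sketch; BoolBytes and object_bytes are hypothetical names, and the byte values you see are implementation-defined.

```cpp
#include <array>
#include <cstring>

using BoolBytes = std::array<unsigned char, sizeof(bool)>;

// Copies the object representation of an initialized bool into raw bytes.
// Unlike `char c = b;`, this does not convert the value to 0 or 1 first.
BoolBytes object_bytes(bool b) {
    BoolBytes raw{};
    std::memcpy(raw.data(), &b, sizeof(bool));
    return raw;
}
```

    On the x86-64 System V ABI you would expect a single byte holding 0 or 1, but the standard only lets you observe the representation; it does not dictate it.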




























    • 11





      The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bool's actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

      – ShadowRanger
      Jan 10 at 2:08








    • 3





      @ShadowRanger You can always inspect the object representation directly.

      – T.C.
      Jan 10 at 2:12






    • 6





      @shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

      – rici
      Jan 10 at 2:28








    • 3





      Yes, optimizing compilers (i.e. real-world C++ implementations) will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. Examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

      – Peter Cordes
      Jan 10 at 8:21






    • 4





      The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

      – Holger
      Jan 10 at 10:47
















    55














    The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.



    So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You had been warned:




    50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)





















    55



















    The compiler is allowed to assume that a boolean value passed as an argument is a valid boolean value (i.e. one which has been initialised or converted to true or false). The true value doesn't have to be the same as the integer 1 -- indeed, there can be various representations of true and false -- but the parameter must be some valid representation of one of those two values, where "valid representation" is implementation-defined.



    So if you fail to initialise a bool, or if you succeed in overwriting it through some pointer of a different type, then the compiler's assumptions will be wrong and Undefined Behaviour will ensue. You have been warned:




    50) Using a bool value in ways described by this International Standard as “undefined”, such as by examining the value of an uninitialized automatic object, might cause it to behave as if it is neither true nor false. (Footnote to para 6 of §6.9.1, Fundamental Types)















    edited Jan 10 at 2:32

























    answered Jan 10 at 1:59









    rici








    • 11





      The "true value doesn't have to be the same as the integer 1" is kind of misleading. Sure, the actual bit pattern could be something else, but when implicitly converted/promoted (the only way you'd see a value other than true/false), true is always 1, and false is always 0. Of course, such a compiler would also be unable to use the trick this compiler was trying to use (using the fact that bools actual bit pattern could only be 0 or 1), so it's kind of irrelevant to the OP's problem.

      – ShadowRanger
      Jan 10 at 2:08








    • 3





      @ShadowRanger You can always inspect the object representation directly.

      – T.C.
      Jan 10 at 2:12






    • 6





      @shadowranger: my point is that the implementation is in charge. If it limits valid representations of true to the bit pattern 1, that's its prerogative. If it chooses some other set of representations, then it indeed could not use the optimisation noted here. If it does choose that particular representation, then it can. It only needs to be internally consistent. You can examine the representation of a bool by copying it into a byte array; that is not UB (but it is implementation-defined)

      – rici
      Jan 10 at 2:28








    • 3





      Yes, optimizing compilers (i.e. real-world C++ implementations) will sometimes emit code that depends on a bool having a bit-pattern of 0 or 1. They don't re-booleanize a bool every time they read it from memory (or a register holding a function arg). That's what this answer is saying. Examples: gcc4.7+ can optimize return a||b to or eax, edi in a function returning bool, or MSVC can optimize a&b to test cl, dl. x86's test is a bitwise and, so if cl=1 and dl=2 test sets flags according to cl&dl = 0.

      – Peter Cordes
      Jan 10 at 8:21






    • 4





      The point about undefined behavior is that the compiler is allowed to draw far more conclusions about it, e.g. to assume that a code path which would lead to accessing an uninitialized value is never taken at all, as ensuring that is precisely the responsibility of the programmer. So it’s not just about the possibility that the low level values could be different than zero or one.

      – Holger
      Jan 10 at 10:47

























    48














    The function itself is correct, but in your test program, the statement that calls the function causes undefined behaviour by using the value of an uninitialized variable.



    The bug is in the calling function, and it could be detected by code review or static analysis of the calling function. Using your compiler explorer link, the gcc 8.2 compiler does detect the bug. (Maybe you could file a bug report against clang that it doesn't find the problem).



    Undefined behaviour means anything can happen, which includes the program crashing a few lines after the event that triggered the undefined behaviour.



    NB. The answer to "Can undefined behaviour cause _____ ?" is always "Yes". That's literally the definition of undefined behaviour.
















    answered Jan 10 at 2:12









    M.M








    • 2





      Is the first clause true? Does merely copying an uninitialized bool trigger UB?

      – Joshua Green
      Jan 10 at 3:25






    • 10





      @JoshuaGreen see [dcl.init]/12 "If an indeterminate value is produced by an evaluation, the behaviour is undefined except in the following cases:" (and none of those cases have an exception for bool). Copying requires evaluating the source

      – M.M
      Jan 10 at 3:34








    • 8





      @JoshuaGreen And the reason for that is that you might have a platform that triggers a hardware fault if you access some invalid values for some types. These are sometimes called "trap representations".

      – David Schwartz
      Jan 10 at 11:15






    • 4





      Itanium, while obscure, is a CPU that's still in production, has trap values, and has two at least semi-modern C++ compilers (Intel/HP). It literally has true, false and not-a-thing values for booleans.

      – MSalters
      Jan 10 at 20:03






    • 3





      On the flip side, the answer to "Does the standard require all compilers to process something a certain way" is generally "no", even/especially in cases where it's obvious that any quality compiler should do so; the more obvious something is, the less need there should be for the authors of the Standard to actually say it.

      – supercat
      Jan 10 at 21:23

























    22














    A bool is only allowed to hold the values 0 or 1, and the generated code can assume that it will only hold one of these two values. The code generated for the ternary in the assignment could use the value as the index into an array of pointers to the two strings, i.e. it might be converted to something like:



    // the compiler could make asm that "looks" like this, from your source
    static const char *strings[] = {"false", "true"};
    const char *whichString = strings[boolValue];


    If boolValue is uninitialized, it could actually hold any integer value, which would then cause accessing outside the bounds of the strings array.














    edited Jan 10 at 9:45









    Peter Cordes











    answered Jan 10 at 2:02









    Barmar








    • 1





      @SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.

      – Barmar
      Jan 10 at 2:09






    • 1





      You are right, and your example will also crash. However it is "visible" to a code review that you are using an uninitialized variable as an index to an array. Also, it would crash even in debug (for example some debugger/compiler will initialize with specific patterns to make it easier to see when it crashes). In my example, the surprising part is that the usage of the bool is invisible: The optimizer decided to use it in a calculation not present in the source code.

      – Remz
      Jan 10 at 2:25






    • 3





      @Remz I'm just using the array to show what the generated code could be equivalent to, not suggesting that anyone would actually write that.

      – Barmar
      Jan 10 at 2:28








    • 1





      @Remz Recast the bool to int with *(int *)&boolValue and print it for debugging purposes, see if it is anything other than 0 or 1 when it crashes. If that's the case, it pretty much confirms the theory that the compiler is optimizing the inline-if as an array which explains why it is crashing.

      – Havenard
      Jan 10 at 2:57








    • 2





      @MSalters: std::bitset<8> doesn't give me nice names for all my different flags. Depending on what they are, that may be important.

      – Martin Bonner
      Jan 11 at 15:13














    • 1





      @SidS Thanks. Theoretically, the internal representations could be the opposite of how they cast to/from integers, but that would be perverse.

      – Barmar
      Jan 10 at 2:09

















    15














    Summarising your question a lot, you are asking: "Does the C++ standard allow a compiler to assume a bool can only have an internal numerical representation of '0' or '1' and use it in such a way?"



    The standard says nothing about the internal representation of a bool. It only defines what happens when converting a bool to an int (or vice versa). In practice, because of these integral conversions (and the fact that people rely rather heavily on them), the compiler will use 0 and 1, but it doesn't have to (although it has to respect the constraints of any lower-level ABI it uses).



    So the compiler, when it sees a bool, is entitled to assume that the bool contains either the 'true' or the 'false' bit pattern, and to generate whatever code it likes on that basis. If the values for true and false are 1 and 0, respectively, the compiler is indeed allowed to optimise strlen to 5 - <boolean value>. Other fun behaviours are possible!
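
    The 5 - <boolean value> rewrite mentioned above can be modelled with well-defined code. This is a minimal sketch, not the compiler's actual output: the function name length_as_optimized is hypothetical, the stored byte is passed in as an unsigned char, and it assumes true is stored as 1 and false as 0.

```cpp
#include <cstddef>

// Hypothetical model of the optimized length computation. `raw` stands for
// the byte stored in the bool. Assumption: true is stored as 1, false as 0.
std::size_t length_as_optimized(unsigned char raw) {
    // strlen("false") is 5 and strlen("true") is 4, so for raw in {0, 1}
    // this matches strlen(raw ? "true" : "false").
    // For any other raw value the unsigned subtraction may wrap around.
    return std::size_t{5} - std::size_t{raw};
}
```

    For raw values 0 and 1 the result is 5 and 4. With a stray byte such as 255, the subtraction wraps around to a length far larger than the 16-byte destBuffer, which is consistent with the memcpy crash described in the question.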



    As gets repeatedly stated here, undefined behaviour has undefined results, including but not limited to:




    • Your code working as you expected it to

    • Your code failing at random times

    • Your code not being run at all.


    See What every programmer should know about undefined behavior












































        answered Jan 10 at 11:48









        Tom Tanner



















