C++ - Explanation of the UB while changing data
I was trying to demonstrate to a work pal that one can change the value of a const-qualified variable if one wants to (and knows how to) by using some trickery. During my demonstration, I discovered that there are two "flavours" of constant values: the ones that you cannot change whatever you do, and the ones that you can change by using dirty tricks.
A constant value is unchangeable when the compiler uses the literal value instead of the value stored on the stack (read here). Here is a piece of code that shows what I mean:
    // test 1
    #include <cstring>
    #include <iostream>

    // Prints the address and value of the const and the non-const view
    #define log(index, cv, ncv) std::cout \
        << std::dec << index << ".- address = " \
        << std::hex << &cv << "\tvalue = " << cv << '\n' \
        << std::dec << index << ".- address = " \
        << std::hex << &ncv << "\tvalue = " << ncv << '\n'

    int main()
    {
        const unsigned int const_value = 0xcafe01e;

        // try with no-const reference
        unsigned int &no_const_ref = const_cast<unsigned int &>(const_value);
        no_const_ref = 0xfabada;
        log(1, const_value, no_const_ref);

        // try with no-const pointer
        unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
        *no_const_ptr = 0xb0bada;
        log(2, const_value, (*no_const_ptr));

        // try with c-style cast
        no_const_ptr = (unsigned int *)&const_value;
        *no_const_ptr = 0xdeda1;
        log(3, const_value, (*no_const_ptr));

        // try with memcpy
        unsigned int brute_force = 0xba51c;
        std::memcpy(no_const_ptr, &brute_force, sizeof(const_value));
        log(4, const_value, (*no_const_ptr));

        // try with union
        union bad_idea
        {
            const unsigned int *const_ptr;
            unsigned int *no_const_ptr;
        } u;
        u.const_ptr = &const_value;
        *u.no_const_ptr = 0xbeb1da;
        log(5, const_value, (*u.no_const_ptr));

        return 0;
    }
This produces the following output:
    1.- address = 0xbfffbe2c    value = cafe01e
    1.- address = 0xbfffbe2c    value = fabada
    2.- address = 0xbfffbe2c    value = cafe01e
    2.- address = 0xbfffbe2c    value = b0bada
    3.- address = 0xbfffbe2c    value = cafe01e
    3.- address = 0xbfffbe2c    value = deda1
    4.- address = 0xbfffbe2c    value = cafe01e
    4.- address = 0xbfffbe2c    value = ba51c
    5.- address = 0xbfffbe2c    value = cafe01e
    5.- address = 0xbfffbe2c    value = beb1da
Since I'm relying on UB (changing the value of const data), I expected the program to act weird, but this weirdness is more than I was expecting.
Let's suppose that the compiler is using the literal value. Then, when the code reaches the instruction that changes the value of the constant (by reference, by pointer or by memcpy), it simply ignores the order as long as the value is a literal (it is undefined behaviour, though). This explains why the value remains unchanged, but:
- Why is the memory address the same for both variables while the contained value differs?
AFAIK the same memory address cannot hold two different values, so one of the outputs is lying:
- What is really happening? Which memory address is the fake one (if any)?
By making a few changes to the code above we can try to avoid the use of the literal value, so that the trickery does its work (source here):
    // test 2 (uses the log macro defined in test 1)

    // try with no-const reference
    void change_with_no_const_ref(const unsigned int &const_value)
    {
        unsigned int &no_const_ref = const_cast<unsigned int &>(const_value);
        no_const_ref = 0xfabada;
        log(1, const_value, no_const_ref);
    }

    // try with no-const pointer
    void change_with_no_const_ptr(const unsigned int &const_value)
    {
        unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
        *no_const_ptr = 0xb0bada;
        log(2, const_value, (*no_const_ptr));
    }

    // try with c-style cast
    void change_with_cstyle_cast(const unsigned int &const_value)
    {
        unsigned int *no_const_ptr = (unsigned int *)&const_value;
        *no_const_ptr = 0xdeda1;
        log(3, const_value, (*no_const_ptr));
    }

    // try with memcpy
    void change_with_memcpy(const unsigned int &const_value)
    {
        unsigned int *no_const_ptr = const_cast<unsigned int *>(&const_value);
        unsigned int brute_force = 0xba51c;
        std::memcpy(no_const_ptr, &brute_force, sizeof(const_value));
        log(4, const_value, (*no_const_ptr));
    }

    // try with union
    void change_with_union(const unsigned int &const_value)
    {
        union bad_idea
        {
            const unsigned int *const_ptr;
            unsigned int *no_const_ptr;
        } u;
        u.const_ptr = &const_value;
        *u.no_const_ptr = 0xbeb1da;
        log(5, const_value, (*u.no_const_ptr));
    }

    int main(int argc, char **argv)
    {
        unsigned int value = 0xcafe01e;
        change_with_no_const_ref(value);
        change_with_no_const_ptr(value);
        change_with_cstyle_cast(value);
        change_with_memcpy(value);
        change_with_union(value);
        return 0;
    }
Which produces the following output:
    1.- address = 0xbff0f5dc    value = fabada
    1.- address = 0xbff0f5dc    value = fabada
    2.- address = 0xbff0f5dc    value = b0bada
    2.- address = 0xbff0f5dc    value = b0bada
    3.- address = 0xbff0f5dc    value = deda1
    3.- address = 0xbff0f5dc    value = deda1
    4.- address = 0xbff0f5dc    value = ba51c
    4.- address = 0xbff0f5dc    value = ba51c
    5.- address = 0xbff0f5dc    value = beb1da
    5.- address = 0xbff0f5dc    value = beb1da
As we can see, the const-qualified variable was changed on each change_with_* call, and the behaviour is the same as before except for this fact. So I was tempted to assume that the weird behaviour of the memory address manifests when the const data is used as a literal instead of as a value.
So, in order to check this assumption, I made a last test, changing the unsigned int value in main to const unsigned int value:
    // test 3 (body of main; the change_with_* functions are those of test 2)
    const unsigned int value = 0xcafe01e;
    change_with_no_const_ref(value);
    change_with_no_const_ptr(value);
    change_with_cstyle_cast(value);
    change_with_memcpy(value);
    change_with_union(value);
Surprisingly, the output is the same as in test 2 (code here), so I suppose that the data is passed as a variable and not as a literal value because it is used as a parameter. This makes me wonder:
- What makes the compiler decide to optimize a const value into a literal value?
In brief, my questions are:
- In test 1:
  - Why do the const value and the no-const value share the same memory address while their contained values differ?
  - What steps does the program follow to produce this output? Which memory address is the fake one (if any)?
- In test 3:
  - What makes the compiler decide to optimize a const value into a literal value?
In general, it is pointless to analyse undefined behaviour, because there is no guarantee that you can transfer the results of your analysis to a different program.
In this case, the behaviour can be explained by assuming that the compiler has applied the optimisation technique called constant propagation. In that technique, if you use the value of a const variable whose value the compiler knows, the compiler replaces that use of the const variable with the value of the variable (as it is known at compile time). Other uses of the variable, such as taking its address, are not replaced.
This optimisation is valid precisely because changing a variable that was defined as const results in undefined behaviour, and the compiler is allowed to assume that the program does not invoke undefined behaviour.
So, in test 1, the addresses are the same, because it is all the same variable, and the values differ because the first of each pair reflects what the compiler presumes (rightly) to be the value of the variable, while the second reflects what is actually stored there. In test 2 and test 3, the compiler can't make the optimisation, because it can't be 100% sure that the function argument will refer to a constant value (and in test 2, it doesn't).