Modulo operator (%) gives a different result for different .NET versions in C#


89

I am encrypting the user's input to generate a string for a password, but one line of code gives different results in different versions of the framework. Partial code, with the value of the key pressed by the user:

Key pressed: 1. The variable ascii is 49. Values of 'e' and 'n' after some calculation:

e = 103, 
n = 143,

Math.Pow(ascii, e) % n

Result of above code:

  • In .NET 3.5 (C#)

    Math.Pow(ascii, e) % n
    

    gives 9.0.

  • In .NET 4 (C#)

    Math.Pow(ascii, e) % n
    

    gives 77.0.

Math.Pow() gives the correct (same) result in both versions.

What is the cause, and is there a solution?


12
Of course, both answers in the question are wrong. The fact that you don't seem to care about that is, well, worrying.
David Heffernan

34
You need to go back several steps. "I am encrypting the user's input for generating a string for password" this part is already dubious. What do you actually want to do? Do you want to store a password in encrypted or hashed form? Do you want to use this as entropy to generate a random value? What are your security goals?
CodesInChaos

49
While this question illustrates an interesting issue with floating point arithmetic, if the OP's goal is "encrypting the user's input for generating a string for password", I don't think rolling your own encryption is a good idea, so I wouldn't recommend actually implementing any of the answers.
Harrison Paine

18
Nice demonstration why other languages forbid the use of % with floating-point numbers.
Ben Voigt

5
While the answers are good, none of them answer the question of what has changed between .NET 3.5 and 4 that is causing the different behaviour.
msell

Answers:


160

Math.Pow works on double-precision floating-point numbers; thus, you shouldn't expect more than the first 15–17 digits of the result to be accurate:

All floating-point numbers also have a limited number of significant digits, which also determines how accurately a floating-point value approximates a real number. A Double value has up to 15 decimal digits of precision, although a maximum of 17 digits is maintained internally.

However, modulo arithmetic requires all digits to be accurate. In your case, you are computing 49^103, whose result consists of 175 digits, making the modulo operation meaningless in both your answers.
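To make the loss of precision concrete, here is a minimal sketch that compares the exact value of 49^103 with the value Math.Pow returns, both printed as full integers. The class name PrecisionDemo and the helper MatchingLeadingDigits are hypothetical names introduced only for this illustration; the exact count printed depends on the runtime's Math.Pow, so no specific output is claimed.

```csharp
using System;
using System.Numerics;

public static class PrecisionDemo
{
    // Hypothetical helper: counts how many leading decimal digits two integers share.
    public static int MatchingLeadingDigits(BigInteger x, BigInteger y)
    {
        string a = x.ToString();
        string b = y.ToString();
        int i = 0;
        while (i < a.Length && i < b.Length && a[i] == b[i])
        {
            i++;
        }
        return i;
    }

    public static void Main()
    {
        // Exact 175-digit integer.
        BigInteger exact = BigInteger.Pow(49, 103);

        // The double returned by Math.Pow, converted to the exact integer it represents.
        BigInteger fromDouble = new BigInteger(Math.Pow(49, 103));

        Console.WriteLine(exact);
        Console.WriteLine(fromDouble);
        Console.WriteLine("Matching leading digits: " + MatchingLeadingDigits(exact, fromDouble));
    }
}
```

Only the leading digits agree; everything after them in the second number is an artifact of the binary representation, and those trailing digits are exactly the ones a modulo operation depends on.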

To work out the correct value, you should use arbitrary-precision arithmetic, as provided by the BigInteger class (introduced in .NET 4.0).

int val = (int)(BigInteger.Pow(49, 103) % 143);   // gives 114

Edit: As pointed out by Mark Peters in the comments below, you should use the BigInteger.ModPow method, which is intended specifically for this kind of operation:

int val = (int)BigInteger.ModPow(49, 103, 143);   // gives 114

20
+1 for pointing out the real problem, namely that the code in the question is plain wrong
David Heffernan

36
It's worth noting that BigInteger provides a ModPow() method that performs (in my quick test just now) about 5 times faster for this operation.
Mark Peters

8
+1 With the edit. ModPow is not just fast, it is numerically stable!
Ray

2
@maker No, the answer is meaningless, not invalid.
Cody Gray

3
@makerofthings7: I agree with you in principle. However, imprecision is inherent to floating-point arithmetic, and it is deemed more practical to expect developers to be aware of the risks, than to impose restrictions on operations in general. If one wanted to be truly "safe", then the language would also need to forbid floating-point equality comparisons, to avoid unexpected results such as 1.0 - 0.9 - 0.1 == 0.0 evaluating to false.
Douglas

72

Apart from the fact that your hashing function is not a very good one *, the biggest problem with your code is not that it returns a different number depending on the version of .NET, but that in both cases it returns an entirely meaningless number: the correct answer to the problem is

49^103 mod 143 = 114 (link to Wolfram Alpha)

You can use this code to compute this answer:

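private static int PowMod(int a, int b, int mod) {
    // Base case: a^0 = 1 (reduced, so that mod == 1 yields 0).
    if (b == 0) {
        return 1 % mod;
    }
    // Square-and-multiply: a^b = (a^(b/2))^2 * a^(b%2).
    // long intermediates with a reduction after each step avoid int overflow for larger inputs.
    long tmp = PowMod(a, b/2, mod);
    tmp = tmp * tmp % mod;
    if (b%2 != 0) {
        tmp = tmp * (a % mod) % mod;
    }
    return (int)tmp;
}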

The reason why your computation produces a different result is that in order to produce an answer, you use an intermediate value that drops most of the significant digits of the number 49^103: only the first 16 of its 175 digits are correct!

1230824813134842807283798520430636310264067713738977819859474030746648511411697029659004340261471771152928833391663821316264359104254030819694748088798262075483562075061997649

The remaining 159 digits are all wrong. The mod operation, however, seeks a result that requires every single digit to be correct, including the very last ones. Therefore, even the tiniest improvement to the precision of Math.Pow that may have been implemented in .NET 4, would result in a drastic difference of your calculation, which essentially produces an arbitrary result.

* Since this question talks about raising integers to high powers in the context of password hashing, it may be a very good idea to read this answer (link) before deciding if your current approach should be changed for a potentially better one.


20
Good answer. The real point is that this is a terrible hash function. OP needs to rethink the solution and use a more appropriate algorithm.
david.pfx

1
Isaac Newton: Is it possible that the moon is attracted to the earth in the same way that the apple is attracted to the earth? @david.pfx: The real point is that this is a terrible way to pick apples. Newton needs to rethink the solution and perhaps hire a man with a ladder.
jwg

2
@jwg David's comment got that many upvotes for a reason. The original question made it clear that the algorithm was being used to hash passwords, and it is indeed a terrible algorithm for that purpose - it is extremely likely to break between versions of the .NET framework, as has already been demonstrated. Any answer that doesn't mention that the OP needs to replace his algorithm rather than "fix" it is doing him a disservice.
Chris

@Chris Thanks for the comment, I edited to include David's suggestion. I didn't word it as strongly as you, because OP's system may be a toy or a throw-away piece of code that he builds for his own amusement. Thanks!
Sergey Kalinichenko

27

What you see is rounding error in double. Math.Pow works with double and the difference is as below:

.NET 2.0 and 3.5 => var powerResult = Math.Pow(ascii, e); returns:

1.2308248131348429E+174

.NET 4.0 and 4.5 => var powerResult = Math.Pow(ascii, e); returns:

1.2308248131348427E+174

Notice the last digit before E: that one-digit difference is what causes the different results. It's not the modulus operator (%).
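As a rough illustration of why one digit matters, the sketch below takes the .NET 4 value quoted above, steps to the next representable double, and reduces the exact integer value of each modulo 143. The names UlpDemo, NextUp and ExactMod are hypothetical, introduced only for this example. For doubles of this magnitude the gap between neighbours is 2^526 (about 2.2E+158), which is not a multiple of 143, so neighbouring doubles necessarily land on different remainders.

```csharp
using System;
using System.Numerics;

public static class UlpDemo
{
    // Next representable double above a positive, finite value
    // (incrementing the raw bit pattern moves one step up).
    public static double NextUp(double x)
    {
        long bits = BitConverter.DoubleToInt64Bits(x);
        return BitConverter.Int64BitsToDouble(bits + 1);
    }

    // Exact integer value of a large integral double, reduced modulo n.
    // BigInteger is used so the reduction itself is not subject to rounding.
    public static int ExactMod(double x, int n)
    {
        return (int)(new BigInteger(x) % n);
    }

    public static void Main()
    {
        double lower = 1.2308248131348427E+174; // value reported for .NET 4.0 and 4.5
        double upper = NextUp(lower);           // its immediate neighbour

        Console.WriteLine("Gap between neighbours: " + (new BigInteger(upper) - new BigInteger(lower)));
        Console.WriteLine("lower mod 143: " + ExactMod(lower, 143));
        Console.WriteLine("upper mod 143: " + ExactMod(upper, 143));
    }
}
```

Neither remainder has any relation to the true value of 49^103 mod 143; each simply reflects which double the power function happened to round to.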


3
holy cow is this the ONLY answer to the OPs question? I read all the meta "blah blah security wrong question I know more than you n00b" and still wondered "why the consistent discrepancy between 3.5 and 4.0? Ever stubbed your toe on a rock while looking at the moon and asked "what kind of rock is this?" Only to be told "Your real problem is not looking at your feet" or "What do you expect when wearing home-made sandals at a night?!!!" THANKS!
Michael Paulukonis

1
@MichaelPaulukonis: That's a false analogy. Study of rocks is a legitimate pursuit; performing arbitrary-precision arithmetic using fixed-precision data types is just plain wrong. I'd compare this to a software recruiter inquiring why dogs are worse than cats at writing C#. If you're a zoologist, the question might hold some merit; for everyone else, it's pointless.
Douglas

24

Floating-point precision can vary from machine to machine, and even on the same machine.

.NET does provide a virtual machine for your apps, but its behavior still changes from version to version.

Therefore you shouldn't rely on it to produce consistent results. For encryption, use the classes that the Framework provides rather than rolling your own.
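What "the classes that the Framework provides" means depends on the actual goal, which the question leaves unclear. If the goal is to store a password verifier, one possible sketch uses PBKDF2 from System.Security.Cryptography. This assumes .NET 6 or later (for Rfc2898DeriveBytes.Pbkdf2 and the static RandomNumberGenerator.GetBytes); the class name PasswordHashSketch and the salt size, output size and iteration count are illustrative assumptions, not values taken from this thread.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordHashSketch
{
    // Illustrative parameters only (assumptions); tune them for a real system.
    private const int SaltSize = 16;
    private const int HashSize = 32;
    private const int Iterations = 100_000;

    // A fresh random salt per password, from the framework's CSPRNG.
    public static byte[] NewSalt()
    {
        return RandomNumberGenerator.GetBytes(SaltSize);
    }

    // Derives a fixed-length hash from the password and salt using PBKDF2-HMAC-SHA256.
    public static byte[] Hash(string password, byte[] salt)
    {
        return Rfc2898DeriveBytes.Pbkdf2(password, salt, Iterations, HashAlgorithmName.SHA256, HashSize);
    }

    // Recomputes the hash and compares it in constant time.
    public static bool Verify(string password, byte[] salt, byte[] expected)
    {
        return CryptographicOperations.FixedTimeEquals(Hash(password, salt), expected);
    }
}
```

Because the derivation works on bytes and integers only, its output does not depend on floating-point behavior and stays the same across framework versions.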


10

There are a lot of answers about the way the code is bad. However, as to why the result is different…

Intel's FPUs use the 80-bit format internally to get more precision for intermediate results. So if a value is in the processor register it gets 80 bits, but when it is written to the stack it gets stored at 64 bits.

I expect that the newer version of .NET has a better optimizer in its Just in Time (JIT) compilation, so it is keeping a value in a register rather than writing it to the stack and then reading it back from the stack.

It may be that the JIT can now return a value in a register rather than on the stack. Or pass the value to the MOD function in a register.

See also Stack Overflow question What are the applications/benefits of an 80-bit extended precision data type?

Other processors, e.g. ARM, may give different results for this code.


6

Maybe it's best to calculate it yourself using only integer arithmetic. Something like:

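int n = 143;
int e = 103;
int result = 1;
int ascii = (int) 'a';   // 97; use '1' (49) to reproduce the key from the question

// Reduce after every multiplication so the intermediate value stays small.
for (int i = 0; i < e; ++i)
    result = result * ascii % n;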

You can compare the performance with the performance of the BigInteger solution posted in the other answers.
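One way to run that comparison is sketched below, using the question's values (49, 103, 143). The class name ModPowComparison, the helper LoopModPow and the round count are hypothetical choices for this illustration, and Stopwatch timings of this kind are only indicative, so no timing results are claimed.

```csharp
using System;
using System.Diagnostics;
using System.Numerics;

public static class ModPowComparison
{
    // Integer-only loop: multiply and reduce e times.
    // Safe from overflow only while (n - 1) * b fits in an int.
    public static int LoopModPow(int b, int e, int n)
    {
        int result = 1;
        for (int i = 0; i < e; ++i)
        {
            result = result * b % n;
        }
        return result;
    }

    public static void Main()
    {
        const int rounds = 100000; // arbitrary repetition count for timing
        int viaLoop = 0;
        int viaBigInteger = 0;

        Stopwatch sw = Stopwatch.StartNew();
        for (int r = 0; r < rounds; r++)
        {
            viaLoop = LoopModPow(49, 103, 143);
        }
        sw.Stop();
        Console.WriteLine("Loop:       " + viaLoop + " in " + sw.ElapsedMilliseconds + " ms");

        sw.Restart();
        for (int r = 0; r < rounds; r++)
        {
            viaBigInteger = (int)BigInteger.ModPow(49, 103, 143);
        }
        sw.Stop();
        Console.WriteLine("BigInteger: " + viaBigInteger + " in " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Both approaches return 114 for these inputs; which one is faster will depend on the runtime and on how large the exponent and modulus are.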


7
That would require 103 multiplications and modulus reductions. One can do better by computing e2=e*e % n, e4=e2*e2 % n, e8=e4*e4 % n, etc. and then result = e *e2 %n *e4 %n *e32 %n *e64 %n. A total of 11 multiplications and modulus reductions. Given the size of numbers involved, one could eliminate a few more modulus reductions, but that would be minor compared to reducing 103 operations to 11.
supercat

2
@supercat Nice mathematics, but in practice only relevant if you're running this on a toaster.
alextgordon

7
@alextgordon: Or if one is planning to use larger exponent values. Expanding the exponent value to e.g. 65521 would take about 28 multiplies and modulus reductions if one uses strength reduction, versus 65,520 if one doesn't.
supercat

+1 for giving an accessible solution where it's clear exactly how the calculation is done.
jwg

2
@Supercat: you're absolutely right. It's easy to improve the algorithm, which is relevant if either it is calculated very often or the exponents are large. But the main message is that it can and should be calculated using integer arithmetic.
Ronald
Licensed under cc by-sa 3.0 with attribution required.