What is idempotence anyway?
I had difficulty spelling the word correctly until I read the Wikipedia page and realized that idem means identical, or the same, and potence means power. In mathematical notation, a function is idempotent if its output does not change no matter how many times the function is applied to the input, or
f(x) = f(f(x)),
where f is the function and x is the input. If the output does not change when the function is applied twice, it will not change when the function is applied more times either, since f(f(f(x))) = f(f(x)) = f(x), and so on.
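As a rough illustration of the definition, the sketch below checks f(f(x)) == f(x) on a handful of sample inputs. The helper name holds_on_samples and the sample values are made up for this example. Passing the check only says the property holds on those samples; it is not a proof.

```python
from typing import Callable, Iterable, TypeVar

T = TypeVar("T")


def holds_on_samples(f: Callable[[T], T], samples: Iterable[T]) -> bool:
    """Return True if f(f(x)) == f(x) for every sample x (hypothetical helper)."""
    return all(f(f(x)) == f(x) for x in samples)


numbers = [-3, 0, 2, 7]

# abs is idempotent: abs(abs(n)) == abs(n).
print(holds_on_samples(abs, numbers))               # True

# Adding one is not: (n + 1) + 1 != n + 1.
print(holds_on_samples(lambda n: n + 1, numbers))   # False

# Stripping whitespace is idempotent on these strings.
print(holds_on_samples(str.strip, ["  a ", "b"]))   # True
```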
Idempotence issue for string sanitizers
Unlike mathematical functions, whose idempotence can be proved, it is very difficult to establish the idempotence of a sanitizer by testing, because sanitizers are constructed case by case. The issue only shows up when a problematic string is actually supplied as input. It becomes more complicated when a string passes through several sanitizers on its way from user input to browser rendering.
Idempotentise it
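To make the problem concrete first, here is a small sketch of a sanitizer that looks idempotent on ordinary input and fails on one crafted string. The function strip_script_tags and both input strings are invented for illustration and do not come from any real library.

```python
def strip_script_tags(text: str) -> str:
    """Hypothetical case-based sanitizer: delete every literal "<script>" in one pass."""
    return text.replace("<script>", "")


ordinary = "hello <script>world"
crafted = "hello <scr<script>ipt>world"

# Ordinary input: the second pass changes nothing.
once = strip_script_tags(ordinary)
twice = strip_script_tags(once)
print(once == twice)   # True

# Crafted input: removing the inner tag assembles a new one,
# so the second pass still changes the string.
once = strip_script_tags(crafted)
twice = strip_script_tags(once)
print(repr(once))      # 'hello <script>world'
print(repr(twice))     # 'hello world'
print(once == twice)   # False
```

A test suite that never contains a string like the crafted one would report this sanitizer as idempotent.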
The solution turns out to be really simple. We just need a wrapper function for any given sanitizer such that
w(w(s(x))) = w(s(x)),
where s is the sanitizer function and w is the wrapper function. For any sanitizer, the wrapper can be implemented by applying s repeatedly, up to k times:
w(x) = s(s(s(…s(x))))
where s is applied until s(w(x)) = w(x). If s(w(x)) != w(x) still holds after k applications, then let w(x) be the empty string. In that case we will also want to log the input string and harden the sanitizer function so that it converges within k steps.
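The wrapper described above could be sketched roughly as follows. The names idempotentise, strip_script_tags, safe_strip and safe_escape, the logger name, and the default k = 5 are all assumptions made for this example. html.escape from the Python standard library stands in for a sanitizer that never converges on input containing "&".

```python
import html
import logging
from typing import Callable

logger = logging.getLogger("sanitizer")  # hypothetical logger name

Sanitizer = Callable[[str], str]


def idempotentise(s: Sanitizer, k: int = 5) -> Sanitizer:
    """Wrap s so that w(x) is a fixed point of s, or "" if none is reached."""

    def w(x: str) -> str:
        current = s(x)                      # s applied once
        for _ in range(k):
            following = s(current)          # one extra call to verify
            if following == current:        # s(w(x)) == w(x): converged
                return current
            current = following
        # No result of 1..k applications was a fixed point: fail closed.
        logger.warning("sanitizer did not converge within %d applications; input=%r", k, x)
        return ""

    return w


def strip_script_tags(text: str) -> str:
    """Hypothetical sanitizer that converges after a few passes."""
    return text.replace("<script>", "")


safe_strip = idempotentise(strip_script_tags)
safe_escape = idempotentise(html.escape)    # "&" -> "&amp;" -> "&amp;amp;" ...

print(repr(safe_strip("hi <scr<script>ipt>there")))   # 'hi there'
print(repr(safe_escape("a & b")))                     # '' (and a warning is logged)
```

Returning the empty string keeps w idempotent only if s maps the empty string to itself, which is true for both sanitizers here. The value of k is a trade-off: larger values tolerate slower convergence but cost more calls to s per input.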