Using Bit operations to find non duplicated integers

I had a really hard time understanding this particular solution for finding non duplicated integers in an array - in constant time and constant memory. It involved bit operations on counts stored on bits that were distributed across integers. This rather lengthy post is my attempt to understand it.,

https://leetcode.com/problems/single-number-ii/discuss/43295/Detailed-explanation-and-generalization-of-the-bitwise-operation-method-for-single-numbers

The solution is quite simple and elegant, however arriving at this solution from scratch was quite difficult

    public int singleNumber(int[] nums) {
        int x1 = 0, x2 = 0, mask = 0;
         
        for (int i : nums) {
            x2 ^= x1 & i;
            x1 ^= i;
            mask = ~(x1 & x2);
            x2 &= mask;
            x1 &= mask;
        }

        return x1;  // Since p = 1, in binary form p = '01', then p1 = 1, so we should return x1. 
                    // If p = 2, in binary form p = '10', then p2 = 1, and we should return x2.
                    // Or alternatively we can simply return (x1 | x2).
    }

Understanding the problem

https://leetcode.com/problems/single-number-ii/

  • The problem asks us to find a number that is not duplicated.
  • It also guarantees the number of dupes that will be present
  • Should be done in constant time and memory

This would normally be a simple problem to solve with a hash table that tracks counts, however this problem calls for a solution in constant memory. I was unable to come up with a solution but came across the solution that was in my opinion very challenging to understand. I therefore decided to diagram it in an attempt to comprehend it.


Using only 1 bit numbers

The solution first tries to establish certain truths that are then combined to form part of the solution. The first of which is, to break down the problem by only considering single bit numbers.

  • How do we count one bit number ?
  • How do we track it ?
  • How do we reset the count when we reach the guaranteed dupe count specified ?

Since this is exploring a solution rooted in binary operations, the counts will be stored as binary numbers

Observation 1

The first observation of binary number addition is that a bit flips from 0 → 1 → 0. This only happens if the value is non zero. 

We can represent this as an equation in 2 different ways with both OR and XOR operators since we know these will always result in a zero if X is zero -

    X OR 0 = 0    if X == 0
and the same is possible if we use an XOR operator
    X XOR 0 = 0   if X == 0

Observation 2

The bit flip from 0 → 1 or 1 → 0 during addition only happens when the element is 1. 

We therefore are limited to using the XOR Operator since that would work with this constraint.

    0 XOR 1 = 1

    0 XOR 0 = 0 

    1 XOR 1 = 0 (flipped, would not have flipped with OR)

    1 XOR 0 = 1 

Observation 3

The most significant bit of the binary number is set to 1 when ALL least significant bits have filled up and are all set to 1. Representing this in an equation involves checking all least significant bits

Tying the Observations together

Combining all the observations above, we can generate a formula for what the values for each counter bit should be

    value of counter bit = (previous value XOR element value) AND (All least significant bits are 1)

Representing the binary number as separate bits

At this point, it was important for me to represent the counters in separate bits as this would let me understand the remainder of the solution

Resetting the Counter

The problem has guaranteed the number of dupes. For this example I am going to use 3.

  • We know that a bit can be reset to 0 by AND-ing any bit with 0.
  • We want to be intentional and only do this is if the count is 3

Applying a Mask

Being intentional, is where a mask comes into play. Applying a mask with the AND operation can reset the counter to 0, however the mask needs to be dynamically generated based on the previous count. We know that the binary number that represents 3 is 11. Since the count is now represented in bits we can actually safely create a mask based on those values. 

In this case, for the binary number 11 our mask is -

    NOT (x1  &  x2)       where x1-first bit  x2- second bit

If and only if - the both bits are 1 (the integer 3), the above equation will result in 0.

Generalizing the solutions to an Integer

Ok we know how to count using bits and how to reset a counter based on the number of dupes. But how do we expand this to integers. Considering integers are 32 bits - we can track the count for each of these bits

We can rearrange the bits above into a new diagram (below)

We know that bit operations only operate on a single bit. So if you have a 1 bits number or a 32 bit number, the bit operation is exclusive to a single bit. There is no carry over like integer math. We can thus safely refactor the solution and only use 2 integers (for this example of 3 dupes)

But this unfortunately, is where things got confusing for me. According to the solution at the end of the iteration the second integer should hold our non duplicate integer.

In order to understand this I did draw out some simple example and ran through the iteration. They all did have the correct non duplicated value in the integer 2. 

My understanding is that

  • we keep track of the 1s.
  • we apply a mask when the counter has hit the duplicate count (3 in this case)
  • This reset may happen early but that is fine because the inverse of the result is actually stored in integer 1, and revealed when the final mask is applied



Notes

The leetcode solution goes into a much deeper analysis that I do in this post. I have also simplified this post to deal with 3 element dupes while the leetcode solution also answers how to deal with an arbitrary number of dupes.

It also accounts for finding the "unique" number that only appear a certain number of times.

Example - unique number appears 2 times whilst all other numbers appear 3 times scenarios.

I would highly recommend checking it out -

https://leetcode.com/problems/single-number-ii/discuss/43295/Detailed-explanation-and-generalization-of-the-bitwise-operation-method-for-single-numbers