LeetCode 528: Random Pick with Weight

Question Given an array of positive numbers, return an index in this array based on the weight of the number occupying that index. For example, for w = [1, 3] , the probability of picking the index 0 is 1 / (1 + 3) = 0.25 (i.e 25%) while the probability of picking the index 1 is 3 / (1 + 3) = 0.75 (i.e 75%).

More formally, the probability of picking index i is w[i] / sum(w).

Thought Process A brute force is to construct a new vector and repeat each element of the original vector as many times as its weight. Now, pick a random index in [0, size_of_new_vector-1] and return the number present there. Find this number in the original vector and return its index. But this bumps up the space complexity enormously. Also, finding the element in the original unsorted array is expensive. We can sort it once but sorting destroys the original indexes of the numbers. Hence we need to remember the indexes somehow before sorting! Complex solution.

Instead, we can slightly modify the idea of repeating the numbers based on their weight: Just add the weights and create a new sum_ vector. This vector is sorted and hence looking up anything is cheap. Now pick a random number between [0, sum_[last]]. Find the upper_bound of this random number and map the index of the upper bound to the original vector.

Solution

 1  class Solution {                                                                                 
 2  public:                                                                                          
 3    Solution(std::vector<int>& w) {                                                                   
 4      sum_.resize(w.size());                                                                          
 5      for(int i = 0; i < w.size(); i++) {                                                             
 6        sum_[i] = w[i] + (i > 0? sum_[i-1] : 0);                                                      
 7      }                                                                                               
 8      original_ = w;                                                                                  
 9    }                                                                                                 
10                                                                                                      
11    int pickIndex() {                                                                                 
12      int rand_no = rand() % sum_[sum_.size()-1];                                                     
13      auto itr = std::upper_bound(sum_.begin(), sum_.end(), rand_no);                                 
14      return itr-sum_.begin();                                                                        
15    }                                                                                                 
16  private:                                                                                            
17    std::vector<int> sum_;                                                                            
18    std::vector<int> original_;                                                                       
19  };

Gotchas

  1. Line 4: I had missed resizing the sum_ vector. We are so used to NOT resizing a vector by virtue of using push_back() that its easy to miss it and hit a strange segfault when trying to access the vector using the [] operator

  2. Line 6: I had missed enclosing the ?: operator in parentheses and hit into a stranger segfault. Always remember that ?: has extremely low operator precedence. Higher only than the , operator. It's better to be safe using parentheses than rely on operator precedence.

  3. Line 14: Initially, I was returning original_[itr-sum_.begin()]. But if you read the problem carefully, it asks us to return the index and NOT the value of the element in the array.