Fair partitioning of elements of a list

Note: Edited to better handle the case when the sum of all numbers is odd.

Backtracking is one way to solve this problem.

It examines all the possibilities depth-first, without needing a large amount of memory.

The search stops as soon as an optimal solution is found: sum = 0, where sum is the difference between the sum of the elements of set A and the sum of the elements of set B. EDIT: it now stops as soon as |sum| < 2, to handle the case where the sum of all numbers is odd, which corresponds to a minimum difference of 1. If the global sum is even, the minimum difference cannot be equal to 1, because the difference always has the same parity as the global sum.
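The parity argument can be checked by brute force on a small list. The following Python sketch is only an illustration: the helper name all_differences is hypothetical and is not part of the code in this answer.

```python
from itertools import product

def all_differences(values):
    """Return the set of |sum(A) - sum(B)| over every split of values into A and B."""
    diffs = set()
    for signs in product((1, -1), repeat=len(values)):
        # sign +1 puts the element in A, sign -1 puts it in B
        diffs.add(abs(sum(s * v for s, v in zip(signs, values))))
    return diffs

even_total = [5, 6, 2, 10, 2, 3, 4]   # total is 32
odd_total = [5, 6, 2, 10, 2, 3, 5]    # total is 33

# Every achievable difference has the same parity as the total.
print(all(d % 2 == 0 for d in all_differences(even_total)))   # True
print(all(d % 2 == 1 for d in all_differences(odd_total)))    # True
```

This is why a difference of 0 or 1 can never be improved, and why the test "< 2" is a safe stopping condition.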

It also allows a simple form of early abandonment (pruning):
at any point, if the absolute value of sum is greater than the sum of all remaining elements (i.e. those not yet placed in A or B) plus the current minimum difference, we can give up on the current path without examining the remaining elements. This procedure is made more effective by two choices:

- Sort the input data in decreasing order.
- At each step, first examine the most promising choice (the one that brings sum closer to 0): this quickly leads to a near-optimal solution.
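The abandon test can be sketched in Python as follows. The names sum_back and min_diff mirror the ones used in this answer, but the two helper functions (suffix_sums, can_abandon) are hypothetical and exist only for illustration.

```python
def suffix_sums(values):
    """sum_back[i] = values[i] + values[i+1] + ... + values[-1]."""
    sum_back = [0] * (len(values) + 1)   # extra trailing 0 simplifies the loop
    for i in range(len(values) - 1, -1, -1):
        sum_back[i] = sum_back[i + 1] + values[i]
    return sum_back

def can_abandon(current_sum, i, sum_back, min_diff):
    """True if no placement of the elements from index i on can beat min_diff.

    Even if every remaining element pulls the difference toward 0,
    the best reachable value is abs(current_sum) - sum_back[i].
    """
    return abs(current_sum) > sum_back[i] + min_diff

values = sorted([5, 6, 2, 10, 2, 3, 4], reverse=True)   # [10, 6, 5, 4, 3, 2, 2]
sum_back = suffix_sums(values)
print(sum_back)                          # [32, 22, 16, 11, 7, 4, 2, 0]
print(can_abandon(21, 3, sum_back, 2))   # True:  21 > 11 + 2
print(can_abandon(11, 3, sum_back, 2))   # False: 11 <= 11 + 2
```

Sorting in decreasing order helps because the large elements are decided first, so the remaining sum shrinks quickly and the test fires earlier.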

Here is the pseudocode:

Initialization:

Sort the elements a[] in decreasing order
Calculate the sum of remaining elements: sum_back[i] = sum_back[i+1] + a[i];
Set the min "difference" to its maximum value: min_diff = sum_back[0];
Put a[0] in A -> the index i of the examined element is set to 1
Set up_down = true: this boolean indicates whether we are currently going forward (true) or backward (false)

While loop:

If (up_down): forward
- Test for early abandonment, with the help of sum_back
- Select the most promising value and adjust sum according to this choice
- If (i == n-1): LEAF -> test whether the optimum value is improved and return if the new value is equal to 0 (EDIT: if (... < 2)); otherwise go backward
- If not in a leaf: continue going forward
If (!up_down): backward
- If we arrive at i == 0: return
- If only the first value has been tried at this node: select the second value, go forward
- Else (both values already tried): continue going backward
- In both cases: recalculate the new sum value

Here is the code, in C++ (sorry, I don't know Python):

#include    <iostream>
#include    <vector>
#include    <algorithm>
#include    <functional>   // std::greater
#include    <cstdlib>      // std::abs
#include    <tuple>

std::tuple<int, std::vector<int>> partition(std::vector<int> &a) {
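    int n = a.size();
    // Guard: the search below assumes at least two elements
    if (n == 0) return std::make_tuple(0, std::vector<int>());
    if (n == 1) return std::make_tuple(a[0], std::vector<int>(1, 0));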
    std::vector<int> parti (n, -1);     // current partition studies
    std::vector<int> parti_opt (n, 0);  // optimal partition
    std::vector<int> sum_back (n, 0);   // sum of remaining elements
    std::vector<int> n_poss (n, 0);     // number of possibilities already examined at position i

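    // Sort first: sum_back must be built from the sorted array,
    // because the search below indexes a[] in sorted order.
    std::sort(a.begin(), a.end(), std::greater<int>());

    sum_back[n-1] = a[n-1];
    for (int i = n-2; i >= 0; --i) {
        sum_back[i] = sum_back[i+1] + a[i];
    }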
    parti[0] = 0;       // a[0] in A always !
    int sum = a[0];     // current sum

    int i = 1;          // index of the element being examined (we force a[0] to be in A !)
    int min_diff = sum_back[0];
    bool up_down = true;

    while (true) {
        if (up_down) {      // UP
            if (std::abs(sum) > sum_back[i] + min_diff) {  //premature abandon
                i--;
                up_down = false;
                continue;
            }
            n_poss[i] = 1;
            if (sum > 0) {
                sum -= a[i];
                parti[i] = 1;
            } else {
                sum += a[i];
                parti[i] = 0;
            }

            if (i == (n-1)) {           // leaf
                if (std::abs(sum) < min_diff) {
                    min_diff = std::abs(sum);
                    parti_opt = parti;
                    if (min_diff < 2) return std::make_tuple (min_diff, parti_opt);   // EDIT: if (... < 2) instead of (... == 0)
                }
                up_down = false;
                i--;
            } else {
                i++;        
            }

        } else {            // DOWN
            if (i == 0) break;
            if (n_poss[i] == 2) {
                if (parti[i]) sum += a[i];
                else sum -= a[i];
                //parti[i] = 0;
                i--;
            } else {
                n_poss[i] = 2;
                parti[i] = 1 - parti[i];
                if (parti[i]) sum -= 2*a[i];
                else sum += 2*a[i];
                i++;
                up_down = true;
            }
        }
    }
    return std::make_tuple (min_diff, parti_opt);
}

int main () {
    std::vector<int> a = {5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42};
    int diff;
    std::vector<int> parti;
    std::tie (diff, parti) = partition (a);

    std::cout << "Difference = " << diff << "\n";

    std::cout << "set A: ";
    for (int i = 0; i < a.size(); ++i) {
        if (parti[i] == 0) std::cout << a[i] << " ";
    }
    std::cout << "\n";

    std::cout << "set B: ";
    for (int i = 0; i < a.size(); ++i) {
        if (parti[i] == 1) std::cout << a[i] << " ";
    }
    std::cout << "\n";
}
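For readers who prefer Python, the same idea can be sketched recursively. This is not a line-by-line translation of the C++ code above: it uses recursion instead of the explicit up/down loop, the name fair_partition is hypothetical, and it assumes non-negative integers. It keeps the same ingredients: decreasing sort, suffix sums in sum_back, most promising choice first, and the "< 2" stopping test.

```python
def fair_partition(values):
    """Split values into two lists whose sums are as close as possible.

    Returns (min_diff, set_a, set_b). Assumes non-negative integers.
    """
    a = sorted(values, reverse=True)
    n = len(a)
    if n == 0:
        return 0, [], []

    # sum_back[i] = a[i] + ... + a[n-1], with sum_back[n] = 0
    sum_back = [0] * (n + 1)
    for i in range(n - 1, -1, -1):
        sum_back[i] = sum_back[i + 1] + a[i]

    min_diff = sum_back[0] + 1   # larger than any reachable difference
    best_sides = []
    sides = [0] * n              # 0 -> set A, 1 -> set B; a[0] stays in A

    def search(i, diff):
        """Place a[i:], where diff = sum(A) - sum(B) so far. True means stop."""
        nonlocal min_diff, best_sides
        if i == n:                               # leaf
            if abs(diff) < min_diff:
                min_diff = abs(diff)
                best_sides = sides[:]
            return min_diff < 2                  # 0 or 1 cannot be improved
        if abs(diff) > sum_back[i] + min_diff:   # early abandonment
            return False
        first = 1 if diff > 0 else 0             # most promising side first
        for side in (first, 1 - first):
            sides[i] = side
            if search(i + 1, diff + (a[i] if side == 0 else -a[i])):
                return True
        return False

    search(1, a[0])
    set_a = [x for x, s in zip(a, best_sides) if s == 0]
    set_b = [x for x, s in zip(a, best_sides) if s == 1]
    return min_diff, set_a, set_b

diff, set_a, set_b = fair_partition([5, 6, 2, 10, 2, 3, 4, 13, 17, 38, 42])
print(diff)                      # 0
print(sum(set_a), sum(set_b))    # 71 71
```

The recursion depth equals the number of elements, so very long lists would need the iterative form or a higher recursion limit.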

I think that you should do the next exercise by yourself, otherwise you won't learn much. As for this one, here is a solution that tries to implement the advice from your instructor:

def partition(ratings):

    def split(lst, bits):
        ret = ([], [])
        for i, item in enumerate(lst):
            ret[(bits >> i) & 1].append(item)
        return ret

    target = sum(ratings) // 2
    best_distance = float("inf")  # so the first split examined is always recorded
    best_split = ([], [])
    for bits in range(0, 1 << len(ratings)):
        parts = split(ratings, bits)
        distance = abs(sum(parts[0]) - target)
        if best_distance > distance:
            best_distance = distance
            best_split = parts
    return best_split

ratings = [5, 6, 2, 10, 2, 3, 4]
print(ratings)
print(partition(ratings))

Output:

[5, 6, 2, 10, 2, 3, 4]
([5, 2, 2, 3, 4], [6, 10])

Note that this output is different from your desired one, but both are correct.

This algorithm is based on the fact that, to pick all possible subsets of a given set with N elements, you can generate all integers with N bits, and select the I-th item depending on the value of the I-th bit. I leave it to you to add a couple of lines to stop as soon as best_distance is zero (because it can't get any better, of course).
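To make the correspondence between integers and subsets concrete, here is a small sketch on a three-item list. The function name subsets_by_bits is hypothetical and is used only for this illustration.

```python
def subsets_by_bits(items):
    """Return every subset of items, in the order of the integers 0 .. 2**N - 1."""
    result = []
    for bits in range(1 << len(items)):
        # keep item i when bit i of `bits` is set
        result.append([item for i, item in enumerate(items) if (bits >> i) & 1])
    return result

for bits, chosen in enumerate(subsets_by_bits(["a", "b", "c"])):
    print(f"{bits:03b} -> {chosen}")
# 000 -> []
# 001 -> ['a']
# 010 -> ['b']
# 011 -> ['a', 'b']
# 100 -> ['c']
# 101 -> ['a', 'c']
# 110 -> ['b', 'c']
# 111 -> ['a', 'b', 'c']
```

Note that bit 0 is the rightmost digit, so item 0 is controlled by the last digit of the binary number.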

A bit on bits (note that 0b is the prefix for a binary number in Python):

A binary number: 0b0111001 == 0·2⁶+1·2⁵+1·2⁴+1·2³+0·2²+0·2¹+1·2⁰ == 57

Right shifted by 1: 0b0111001 >> 1 == 0b011100 == 28

Left shifted by 1: 0b0111001 << 1 == 0b01110010 == 114

Right shifted by 4: 0b0111001 >> 4 == 0b011 == 3

Bitwise & (and): 0b00110 & 0b10101 == 0b00100

To check whether the 5th bit (index 4) is 1: (0b0111001 >> 4) & 1 == 0b011 & 1 == 1

A one followed by 7 zeros: 1 << 7 == 0b10000000

7 ones: (1 << 7) - 1 == 0b10000000 - 1 == 0b1111111

All 3-bit combinations: 0b000==0, 0b001==1, 0b010==2, 0b011==3, 0b100==4, 0b101==5, 0b110==6, 0b111==7 (note that 0b111 + 1 == 0b1000 == 1 << 3)
