Find the minimum number of edits to balance parentheses?
Given a string consisting of left and right parentheses, we are asked to balance it by performing a minimal number of delete, insert, and replace operations.
To begin with, let's look at the input string and distinguish matched pairs from unmatched characters. We can mark all the characters belonging to matched pairs by executing the following algorithm:
- Find an unmarked '(' that is followed by an unmarked ')', with zero or more marked characters between the two.
- If there is no such pair of characters, terminate the algorithm.
- Otherwise, mark the '(' and the ')'.
- Return to step 1.
The marked pairs are already balanced at zero cost, so the optimal course of action is to do nothing further with them.
Now let's consider the unmarked characters. Notice that no unmarked '(' is followed by an unmarked ')', or else the pair would have been marked. Therefore, if we scan the unmarked characters from left to right, we will find zero or more ')' characters followed by zero or more '(' characters.
To balance the sequence of ')' characters, it is optimal to rewrite every other one to '(', starting with the first one and excluding the last one. If there is an odd number of ')' characters, it is optimal to delete the last one.
As for the sequence of '(' characters, it is optimal to rewrite every other one to ')', starting with the second one. If there is a leftover '(' character, we delete it. The following Python code implements the steps described above and displays the intermediate results.
def balance(s): # s is a string of '(' and ')' characters in any order
n = len(s)
print('original string: %s' % s)
# Mark all matched pairs
marked = n * [ False ]
left_parentheses = []
for i, ch in enumerate(s):
if ch == '(':
left_parentheses.append(i)
else:
if len(left_parentheses) != 0:
marked[i] = True
marked[left_parentheses.pop()] = True
# Display the matched pairs and unmatched characters.
matched, remaining = [], []
for i, ch in enumerate(s):
if marked[i]:
matched.append(ch)
remaining.append(' ')
else:
matched.append(' ')
remaining.append(ch)
print(' matched pairs: %s' % ''.join(matched))
print(' unmatched: %s' % ''.join(remaining))
cost = 0
deleted = n * [ False ]
new_chars = list(s)
# Balance the unmatched ')' characters.
right_count, last_right = 0, -1
for i, ch in enumerate(s):
if not marked[i] and ch == ')':
right_count += 1
if right_count % 2 == 1:
new_chars[i] = '('
cost += 1
last_right = i
if right_count % 2 == 1: # Delete the last ')' if we couldn't match it.
deleted[last_right] = True # The cost was incremented during replacement.
# Balance the unmatched '(' characters.
left_count, last_left = 0, -1
for i, ch in enumerate(s):
if not marked[i] and ch == '(':
left_count += 1
if left_count % 2 == 0:
new_chars[i] = ')'
cost += 1
else:
last_left = i
if left_count % 2 == 1: # Delete the last '(' if we couldn't match it.
deleted[last_left] = True # This character wasn't replaced, so we must
cost += 1 # increment the cost now.
# Display the outcome of replacing and deleting.
balanced = []
for i, ch in enumerate(new_chars):
if marked[i] or deleted[i]:
balanced.append(' ')
else:
balanced.append(ch)
print(' balance: %s' % ''.join(balanced))
# Display the cost of balancing and the overall balanced string.
print(' cost: %d' % cost)
result = []
for i, ch in enumerate(new_chars):
if not deleted[i]: # Skip deleted characters.
result.append(ch)
print(' new string: %s' % ''.join(result))
balance(')()(()())))()((())((')
For the test case ')()(()())))()((())(('
, the output is as follows.
original string: )()(()())))()((())((
matched pairs: ()(()()) () (())
unmatched: ) )) ( ((
balance: ( ) ( )
cost: 4
new string: (()(()()))()((()))
The idea is simple:
min edit to balance will be to change half close brackets to open brackets.
So minEdit = closeBracketCount/2
. (2) If it is of odd length:
min edit to balance will be to do above step 1 and remove the remaining 1 bracket.
So minEdit = closeBracketCount/2 + 1
(1) if it is of even length:
min edit to balance will be to change half open brackets to close brackets.
So minEdit = openBracketCount/2.
(2) If it is of odd length:
min edit to balance will be to do above step 1 and remove the remaining 1 bracket.
So minEdit = openBracketCount/2 + 1
Here is the running code: http://codeshare.io/bX1Dt
Let me know your thoughts.