Separate multiple SQL statements using Regex in Java
You can try the following regular expression:
\s*;\s*(?=([^']*'[^']*')*[^']*$)
Here is the example:
public static void main(String[] args) {
String input = "select * from table1 where col1 = 'abc;de'; select * from table2;";
System.out.println(java.util.Arrays.toString(
input.split("\\s*;\\s*(?=([^']*'[^']*')*[^']*$)")
)); // prints "[select * from table1 where col1 = 'abc;de', select * from table2]"
}
Explanation of the regular expression:
NODE EXPLANATION
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
; ';'
--------------------------------------------------------------------------------
\s* whitespace (\n, \r, \t, \f, and " ") (0 or
more times (matching the most amount
possible))
--------------------------------------------------------------------------------
(?= look ahead to see if there is:
--------------------------------------------------------------------------------
( group and capture to \1 (0 or more times
(matching the most amount possible)):
--------------------------------------------------------------------------------
[^']* any character except: ''' (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
' '\''
--------------------------------------------------------------------------------
[^']* any character except: ''' (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
' '\''
--------------------------------------------------------------------------------
)* end of \1 (NOTE: because you are using a
quantifier on this capture, only the
LAST repetition of the captured pattern
will be stored in \1)
--------------------------------------------------------------------------------
[^']* any character except: ''' (0 or more
times (matching the most amount
possible))
--------------------------------------------------------------------------------
$ before an optional \n, and the end of
the string
--------------------------------------------------------------------------------
) end of look-ahead