Python Regex Match Before Character AND Ignore White Space
That's a little bit tricky. Start matching at the first non-whitespace character, then keep matching lazily up to the position that is immediately followed by optional spaces and a slash:
\S.*?(?= *\/)
If a slash could be the first non-whitespace character in the input string, replace \S with [^\s\/]:
[^\s\/].*?(?= *\/)
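As a rough sketch of how these two patterns behave in Python (the sample strings and the helper name first_match are made up for illustration and are not part of the question):

```python
import re

# Hypothetical helper: returns the first match of the pattern, or None.
def first_match(text, pattern=r"\S.*?(?= *\/)"):
    match = re.search(pattern, text)
    return match.group(0) if match else None

print(first_match("  123 / some text 123"))          # 123
print(first_match("anything else / some text 123"))  # anything else

# When the slash itself is the first non-whitespace character, \S matches it,
# so the stricter character class is the safer choice:
print(first_match(" / 123 / x"))                         # / 123
print(first_match(" / 123 / x", r"[^\s\/].*?(?= *\/)"))  # 123
```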
This expression is what you might want to explore:
^(.*?)(\s+\/.*)$
Here we have two capturing groups: the first collects your desired output, and the second matches the part you want to discard. The ^ and $ anchors are only there to be safe and can be removed if you want:
(.*?)(\s+\/.*)
Python Test
# coding=utf8
# the above tag defines encoding for this document and is for Python 2.x compatibility
import re
regex = r"^(.*?)(\s+\/.*)$"
test_str = ("123 / some text 123\n"
"anything else / some text 123")
subst = "\\1"
# You can limit the number of replacements with the count argument (0 replaces all)
result = re.sub(regex, subst, test_str, count=0, flags=re.MULTILINE)
if result:
    print(result)
# Note: for Python 2.7 compatibility, use ur"" to prefix the regex and u"" to prefix the test string and substitution.
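If you would rather extract the wanted text than substitute the rest away, a minimal sketch (assuming the same test string as above, with a variable name chosen here for illustration) reads the first capturing group directly:

```python
import re

pattern = re.compile(r"^(.*?)(\s+\/.*)$", re.MULTILINE)
test_str = "123 / some text 123\nanything else / some text 123"

# Group 1 is the wanted text; group 2 runs from the whitespace
# before the slash to the end of the line.
wanted = [m.group(1) for m in pattern.finditer(test_str)]
print(wanted)  # ['123', 'anything else']
```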
JavaScript Demo
const regex = /^(.*?)(\s+\/.*)$/gm;
const str = `123 / some text 123
anything else / some text 123`;
const subst = `$1`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
RegEx
If this wasn't your desired expression, you can modify/change your expressions in regex101.com.
RegEx Circuit
You can also visualize your expressions in jex.im.
Spaces
For spaces before your desired output, we can simply add an optional capturing group that consumes the leading whitespace:
^(\s+)?(.*?)(\s+\/.*)$
JavaScript Demo
const regex = /^(\s+)?(.*?)(\s+\/.*)$/gm;
const str = ` 123 / some text 123
anything else / some text 123
123 / some text 123
anything else / some text 123`;
const subst = `$2`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
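A Python counterpart of this demo might look like the following sketch; the test string is a shortened assumption, not the exact input used above:

```python
import re

pattern = re.compile(r"^(\s+)?(.*?)(\s+\/.*)$", re.MULTILINE)
test_str = "  123 / some text 123\nanything else / some text 123"

# Keep only group 2, dropping the leading whitespace (group 1)
# and everything from the slash onward (group 3).
result = pattern.sub(r"\2", test_str)
print(result)
```

Each line should be reduced to the text in front of the slash, with leading spaces removed.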
Here is a possible solution
Regex
(?<!\/)\S.*\S(?=\s*\/)
Example
import re  # the third-party regex module works the same way here
string = ' 123 / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints '123'
string = 'a test / some text 123'
test = re.search(r'(?<!\/)\S.*\S(?=\s*\/)', string)
print(test.group(0))
# prints 'a test'
Short explanation
(?<!\/) says that a possible match cannot be immediately preceded by a / symbol.
\S.*\S matches anything greedily (.*) while making sure the match does not start or end with whitespace (\S).
(?=\s*\/) means a possible match must be followed by a / symbol, or by whitespace and then a /.
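One thing worth checking with this pattern: \S.*\S needs at least two non-whitespace characters, so a single character in front of the slash is not matched. The sketch below illustrates that and tries one possible variant, \S(?:.*\S)?, which is an assumption of this example and not part of the answer above:

```python
import re

original = re.compile(r"(?<!\/)\S.*\S(?=\s*\/)")
# Hypothetical variant: the second \S is optional, so one character is enough.
variant = re.compile(r"(?<!\/)\S(?:.*\S)?(?=\s*\/)")

def extract(pattern, text):
    m = pattern.search(text)
    return m.group(0) if m else None

print(extract(original, " 123 / some text 123"))  # 123
print(extract(original, "a / some text"))         # None
print(extract(variant, "a / some text"))          # a
print(extract(variant, " 123 / some text 123"))   # 123
```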