Detect and remove URLs from textarea

Try (Corrected and improved after comments):

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg, '');

Peeling the expression from end to start:

An address might have two or three 'parts', besides the scheme
An address might start with www or not
It may be preceded by http:// or https://
It may be enclosed inside [url=...]...[/url]
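
To see these parts in action, here is a minimal sketch that runs the expression on a made-up textarea value. The sample text and the names sample, urlLine and cleaned are assumptions for illustration only.

```javascript
// Hypothetical textarea content: two URL lines between two plain lines.
const sample = [
  "hello world",
  "http://www.example.com",
  "[url=http://example.com]link[/url]",
  "bye"
].join("\n");

// The expression from the answer, unchanged.
const urlLine = /^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg;

// Each matching line is removed together with the line break after it.
const cleaned = sample.replace(urlLine, "");
console.log(JSON.stringify(cleaned)); // "hello world\nbye"
```

A line that contains a space, such as "hello world", is left alone because the pattern has to cover the whole line using non-space characters only.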

This expression does not enforce fully correct URL syntax; that would be a much tougher regex to write.
A few improvements you might want:

1. Awareness of spaces

value = value.replace(/^\s*(\[\s*url\s*=\s*)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+\s*$\s*/mg, '');

2. Enforce no dots on the last part

value = value.replace(/^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?[^.\s]+$\s*/mg, '');
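
The effect of the two variants is easiest to see side by side. The sketch below is only an illustration: the inputs padded and abbrev, and the names base, spaced and noDots, are made up for this example.

```javascript
// The three expressions from the answer, given hypothetical names.
const base   = /^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+$\s*/mg;
const spaced = /^\s*(\[\s*url\s*=\s*)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?\S+\s*$\s*/mg;
const noDots = /^(\[url=)?(https?:\/\/)?(www\.|\S+?\.)(\S+?\.)?[^.\s]+$\s*/mg;

// A URL line padded with spaces: only the space-aware variant removes it.
const padded = "  www.example.com  \nkeep me";
const paddedBase = padded.replace(base, "");     // unchanged
const paddedSpaced = padded.replace(spaced, ""); // "keep me"

// A line ending in a dot: the base expression treats it as an address,
// the "no dots on the last part" variant does not.
const abbrev = "e.g.\nkeep me";
const abbrevBase = abbrev.replace(base, "");     // "keep me"
const abbrevNoDots = abbrev.replace(noDots, ""); // unchanged

console.log([paddedBase, paddedSpaced, abbrevBase, abbrevNoDots].map(s => JSON.stringify(s)));
```

The "e.g." case also shows why none of these expressions should be taken as real URL validation: any line made up of a single dotted token can match.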

Regarding your attempt at checking whether there is a URL in the textarea:

if ($('textarea[name="test"]').val().indexOf('[url') >= 0 ||
    $('textarea[name="test"]').val().match(/^http([s]?):\/\/.*/) ||
    $('textarea[name="test"]').val().match(/^www.[0-9a-zA-Z',-]./)) {

Firstly, rather than getting the textarea value three times with repeated function calls, it would be better to store it in a variable before checking, i.e.

var value = $('textarea[name="test"]').val();

Because of the ^, /^http([s]?):\/\/.*/ will only match if the "http://..." is found right at the beginning of the textarea value. The same applies to the ^www. in the third check. Adding the multiline flag m to the end of the regex makes ^ match at the start of each line, rather than just at the start of the string.
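
A small sketch of the difference the m flag makes, using a made-up value whose URL sits on the second line (the variable names are hypothetical):

```javascript
// Hypothetical textarea value: the URL is not on the first line.
const value = "some notes\nhttp://example.com";

// Without m, ^ only matches at the very start of the string.
const anchoredToString = /^http([s]?):\/\/.*/.test(value);

// With m, ^ also matches right after each line break.
const anchoredToLine = /^http([s]?):\/\/.*/m.test(value);

console.log(anchoredToString, anchoredToLine); // false true
```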

The .* in /^http([s]?):\/\/.*/ serves no purpose in a yes/no check, as it matches zero or more characters and so never changes whether the regex matches. The ([s]?) is better written as s?.

In /^www.[0-9a-zA-Z',-]./, the . needs to be escaped as \. if your intention is to match a literal dot, because an unescaped . matches any character. I also assume you mean to match more than one of the characters in the character class, so you need to follow it with +.
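
The sketch below contrasts the original check with an escaped version. Escaping both dots and adding + is an assumption about what was intended; the names loose and strict and the test strings are made up.

```javascript
// Original check: both dots match ANY character.
const loose = /^www.[0-9a-zA-Z',-]./;

// Assumed intent: literal dots, one or more characters between them.
const strict = /^www\.[0-9a-zA-Z',-]+\./;

const looseOnJunk = loose.test("wwwXab");           // true: "X" and "b" satisfy the dots
const strictOnJunk = strict.test("wwwXab");         // false: no literal dot after "www"
const strictOnUrl = strict.test("www.example.com"); // true

console.log(looseOnJunk, strictOnJunk, strictOnUrl); // true false true
```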

It is more efficient to use the RegExp test method rather than match when the actual matches are not required. Combining the above, you could have:

if ( /^(\[url|https?:\/\/|www\.)/m.test( value ) ) {

There is little point in the check anyway if you are only using it to decide whether you need to call replace, because the check is implicit in the replace call itself: if nothing matches, replace returns the string unchanged.
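
As a rough illustration of both points, the sketch below compares test, match and replace on a value without any URL. The sample value, the variable names and the removal pattern passed to replace are assumptions made for this example.

```javascript
// Hypothetical textarea value containing no URL at all.
const value = "no links here\njust text";

const check = /^(\[url|https?:\/\/|www\.)/m;

const found = check.test(value);    // false: a plain boolean
const matches = value.match(check); // null: match builds an array only on success

// Calling replace without checking first is harmless:
// with no match, the string comes back as it was.
const cleaned = value.replace(/^(?:\[url|https?:\/\/|www\.)\S*\s*/gm, "");

console.log(found, matches, cleaned === value); // false null true
```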

Using the simple criterion that any string of non-space characters at the start of a line and beginning with http://, https://, [url or www. should be removed, you could use

value = value.replace( /^(?:https?:\/\/|\[url|www\.)\S+\s*/gm, '' );

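If the URLs can appear anywhere, you could use \b, meaning word boundary, instead of ^, and remove the m flag. No \b is placed before \[url, because [ is not a word character and \b would then only match directly after a letter, digit or underscore.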

value = value.replace( /(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g, '' );
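
Here is a short sketch of that expression on two made-up inputs. The names anywhere, inline and bbcode are hypothetical; the second input is chosen to show a limitation.

```javascript
const anywhere = /(?:\bhttps?:\/\/|\bwww\.|\[url)\S+\s*/g;

// URLs in the middle of a sentence are removed with the whitespace after them.
const inline = "See http://example.com for details and www.test.org too.";
const inlineCleaned = inline.replace(anywhere, "");
console.log(inlineCleaned); // See for details and too.

// A [url] tag whose link text contains a space is only partly removed,
// because \S+ stops at the first space.
const bbcode = "[url=http://a.com]my link[/url] end";
const bbcodeCleaned = bbcode.replace(anywhere, "");
console.log(bbcodeCleaned); // link[/url] end
```

Since \S+ takes every non-space character, punctuation attached to a URL, such as a closing bracket or a full stop, is removed along with it.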

It would be a waste of effort to try to offer a better regex solution without precise details of what forms of url may appear in the textarea, where they may appear and what characters may adjoin them.

If any valid URL can appear anywhere in the textarea and be surrounded by any other characters, then there is no watertight solution.
