Rails: What's a good way to validate links (URLs)?

Validating an URL is a tricky job. It's also a very broad request.

What do you want to do, exactly? Do you want to validate the format of the URL, the existence, or what? There are several possibilities, depending on what you want to do.

A regular expression can validate the format of the URL. But even a complex regular expression cannot ensure you are dealing with a valid URL.

For instance, if you take a simple regular expression, it will probably reject the following host

http://invalid##host.com

but it will allow

http://invalid-host.foo

that is a valid host, but not a valid domain if you consider the existing TLDs. Indeed, the solution would work if you want to validate the hostname, not the domain because the following one is a valid hostname

http://host.foo

as well the following one

http://localhost

Now, let me give you some solutions.

If you want to validate a domain, then you need to forget about regular expressions. The best solution available at the moment is the Public Suffix List, a list maintained by Mozilla. I created a Ruby library to parse and validate domains against the Public Suffix List, and it's called PublicSuffix.

If you want to validate the format of an URI/URL, then you might want to use regular expressions. Instead of searching for one, use the built-in Ruby URI.parse method.

require 'uri'

def valid_url?(uri)
  uri = URI.parse(uri) && uri.host
rescue URI::InvalidURIError
  false
end

You can even decide to make it more restrictive. For instance, if you want the URL to be an HTTP/HTTPS URL, then you can make the validation more accurate.

require 'uri'

def valid_url?(url)
  uri = URI.parse(url)
  uri.is_a?(URI::HTTP) && !uri.host.nil?
rescue URI::InvalidURIError
  false
end

Of course, there are tons of improvements you can apply to this method, including checking for a path or a scheme.

Last but not least, you can also package this code into a validator:

class HttpUrlValidator < ActiveModel::EachValidator

  def self.compliant?(value)
    uri = URI.parse(value)
    uri.is_a?(URI::HTTP) && !uri.host.nil?
  rescue URI::InvalidURIError
    false
  end

  def validate_each(record, attribute, value)
    unless value.present? && self.class.compliant?(value)
      record.errors.add(attribute, "is not a valid HTTP URL")
    end
  end

end

# in the model
validates :example_attribute, http_url: true

I use a one liner inside my models:

validates :url, format: URI::regexp(%w[http https])

I think is good enough and simple to use. Moreover it should be theoretically equivalent to the Simone's method, as it use the very same regexp internally.


Following Simone's idea, you can easily create you own validator.

class UrlValidator < ActiveModel::EachValidator
  def validate_each(record, attribute, value)
    return if value.blank?
    begin
      uri = URI.parse(value)
      resp = uri.kind_of?(URI::HTTP)
    rescue URI::InvalidURIError
      resp = false
    end
    unless resp == true
      record.errors[attribute] << (options[:message] || "is not an url")
    end
  end
end

and then use

validates :url, :presence => true, :url => true

in your model.