How to filter IPv4 and IPv6 addresses?

IPv4 and IPv6 addresses can be written in several common notations. Here's an extended regular expression, written for Perl's m//x (which ignores whitespace and comments inside the pattern), that matches the usual ones. If you remove the comments and whitespace, you can use it with grep -E, awk, or any other utility that uses POSIX extended regular expressions (ERE).

^(
  ([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3]?[0-7]{1,2}|0x0*[0-9a-fA-F]{1,2})
  (\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3]?[0-7]{1,2}|0x0*[0-9a-fA-F]{1,2})){3}   # IPv4 dotted quad
| 0x[0-9a-fA-F]{1,8}             # IPv4 hexadecimal
| 0+[0-7]{0,10} | 0+[1-3][0-7]{10}    # IPv4 octal
| [1-9][0-9]{0,8}                # IPv4 decimal, small
| [1-3][0-9]{9}                  # IPv4 decimal, medium
| 4[0-9]{9}                      # IPv4 decimal, large (needs a further range check)
| [0-9a-fA-F]{1,4}(:[0-9a-fA-F]{1,4}){7}            # IPv6 with all groups
| ([0-9a-fA-F]{1,4}:){1,1}(:[0-9a-fA-F]{1,4}){1,6}  # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}  # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}  # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}  # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}  # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,6}(:[0-9a-fA-F]{1,4}){1,1}  # IPv6 with 1-6 middle groups omitted
)$
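
As a rough illustration of how such a pattern might be used, here is a minimal Perl sketch. It assumes the interval quantifiers are written with commas ({1,8} and so on), keeps the pattern in a string so that the same text can be compiled with /x and flattened for ERE tools, and uses variable names ($pattern, $re, $ere) and sample addresses that are made up for the example.

```perl
use strict;
use warnings;

# The commented pattern, kept as a plain string (single-quoted heredoc,
# so nothing inside it is interpolated).
my $pattern = <<'END';
^(
  ([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3]?[0-7]{1,2}|0x0*[0-9a-fA-F]{1,2})
  (\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3]?[0-7]{1,2}|0x0*[0-9a-fA-F]{1,2})){3}   # IPv4 dotted quad
| 0x[0-9a-fA-F]{1,8}                  # IPv4 hexadecimal
| 0+[0-7]{0,10} | 0+[1-3][0-7]{10}    # IPv4 octal
| [1-9][0-9]{0,8}                     # IPv4 decimal, small
| [1-3][0-9]{9}                       # IPv4 decimal, medium
| 4[0-9]{9}                           # IPv4 decimal, large (needs a further range check)
| [0-9a-fA-F]{1,4}(:[0-9a-fA-F]{1,4}){7}            # IPv6 with all groups
| ([0-9a-fA-F]{1,4}:){1,1}(:[0-9a-fA-F]{1,4}){1,6}  # IPv6 with middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5}
| ([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4}
| ([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3}
| ([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2}
| ([0-9a-fA-F]{1,4}:){1,6}(:[0-9a-fA-F]{1,4}){1,1}
)$
END

# Compile for Perl: /x makes the whitespace and "# ..." comments insignificant.
my $re = qr/$pattern/x;

# Flatten for grep -E or awk: drop comments, then all whitespace.
# This is safe here only because the pattern contains no literal '#' or space.
(my $ere = $pattern) =~ s/#.*//g;
$ere =~ s/\s+//g;

# Sample inputs (illustrative only).
for my $addr (qw(192.168.0.1 0xC0A80001 3232235521 2001:db8::1
                 256.1.1.1 ::1 4294967296)) {
    printf "%-12s %s\n", $addr, $addr =~ $re ? 'match' : 'no match';
}
```

In this sketch 256.1.1.1 is reported as no match, and so is ::1, because every IPv6 alternative in the pattern expects at least one group on each side of the ::. The value 4294967296 does match syntactically, which is why a separate range check is needed for decimal values. The flattened $ere string is what you would pass to grep -E.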

For a decimal value, you need a further range check, because the pattern accepts any ten-digit number that starts with 4. You could express the check as a regexp, but it would be long:

if (!/[^0-9]/ && /^[^0]/) { # if it's a decimal number (digits only, no leading zero)
    die if $_ > 4294967295; # reject numbers above 2^32-1
}

If the tool you use only supports 32-bit numbers, apply the check only to numbers that start with 4, and strip the leading 4 before comparing, so that the value being compared has at most nine digits:

if (!/[^0-9]/ && /^4/) { # if it's a decimal number beginning with 4
    my $tmp = $_;
    $tmp =~ s/^4//;
    die if $tmp > 294967295;
}
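
The two checks above could be wrapped into one helper. The following is only a sketch: the function name decimal_ipv4_in_range is hypothetical, and it assumes the input is meant to be a plain decimal IPv4 value with no leading zero. It never compares a number larger than 999999999, so it should also behave correctly where only 32-bit integers are available.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if $s is a decimal number without a leading
# zero that is at most 4294967295 (2^32-1), and 0 otherwise.
sub decimal_ipv4_in_range {
    my ($s) = @_;
    return 0 unless $s =~ /^[1-9][0-9]{0,9}\z/;  # decimal, at most ten digits
    return 1 if length($s) < 10;                 # at most 999999999
    return 1 if $s =~ /^[1-3]/;                  # ten digits, below 4000000000
    return 0 unless $s =~ /^4/;                  # ten digits starting with 5-9
    (my $rest = $s) =~ s/^4//;                   # nine digits remain
    return $rest <= 294967295 ? 1 : 0;           # 4294967295 is the largest valid value
}

# Boundary values around 2^32-1.
for my $n (qw(42 3999999999 4294967295 4294967296 5000000000)) {
    printf "%-11s %s\n", $n, decimal_ipv4_in_range($n) ? 'ok' : 'out of range';
}
```

Unlike the snippets above, the helper also rejects ten-digit numbers that start with 5 to 9 on its own, so it does not rely on the regular expression having filtered them out first.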

Tags: IP, Perl