How to filter IPv4 and IPv6 addresses?
There are several usual notations for IPv4 and IPv6 addresses. Here's an extended regular expression, suitable for Perl m//x, that captures the usual notations. If you remove the comments and whitespace, you can use it with grep -E, awk, or any other utility that uses extended regular expressions (ERE).
^(
([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3][0-7][0-7]|0x0*[0-9a-fA-F][0-9a-fA-F])
(\.([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3][0-7][0-7]|0x0*[0-9a-fA-F][0-9a-fA-F])){3} # IPv4 dotted quad
| 0x[0-9a-fA-F]{1,8} # IPv4 hexadecimal
| 0+[0-7]{0,10} | 0+[1-3][0-7]{10} # IPv4 octal
| [1-9][0-9]{0,8} # IPv4 decimal, small
| [1-3][0-9]{9} # IPv4 decimal, medium
| 4[0-9]{9} # IPv4 decimal, large (needs a further range check)
| [0-9a-fA-F]{1,4}(:[0-9a-fA-F]{1,4}){7} # IPv6 with all groups
| ([0-9a-fA-F]{1,4}:){1,1}(:[0-9a-fA-F]{1,4}){1,6} # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,2}(:[0-9a-fA-F]{1,4}){1,5} # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,3}(:[0-9a-fA-F]{1,4}){1,4} # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,4}(:[0-9a-fA-F]{1,4}){1,3} # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,5}(:[0-9a-fA-F]{1,4}){1,2} # IPv6 with 1-6 middle groups omitted
| ([0-9a-fA-F]{1,4}:){1,6}(:[0-9a-fA-F]{1,4}){1,1} # IPv6 with 1-6 middle groups omitted
)$
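As a rough sketch of the grep -E usage mentioned above, the expression can be assembled from shell variables once the comments and whitespace are gone. The names octet, h, ipv4, ipv6, ip_re and is_ip are made up for this example. It assumes interval bounds written with commas ({1,8}), as ERE requires, an octal branch limited to octal digits, and a "small decimal" branch that also accepts a single digit.

```shell
#!/bin/sh
# Hypothetical building blocks; the names are not from the original answer.
octet='([0-9]|[1-9][0-9]|1[0-9][0-9]|2[0-4][0-9]|25[0-5]|0+[0-3][0-7][0-7]|0x0*[0-9a-fA-F][0-9a-fA-F])'
h='[0-9a-fA-F]{1,4}'

# IPv4: dotted quad, hexadecimal, octal, decimal (small, medium, large).
ipv4="$octet(\\.$octet){3}"
ipv4="$ipv4|0x[0-9a-fA-F]{1,8}"
ipv4="$ipv4|0+[0-7]{0,10}|0+[1-3][0-7]{10}"
ipv4="$ipv4|[1-9][0-9]{0,8}|[1-3][0-9]{9}|4[0-9]{9}"

# IPv6: all eight groups, or 1-6 middle groups replaced by "::".
ipv6="$h(:$h){7}"
ipv6="$ipv6|($h:){1,1}(:$h){1,6}"
ipv6="$ipv6|($h:){1,2}(:$h){1,5}"
ipv6="$ipv6|($h:){1,3}(:$h){1,4}"
ipv6="$ipv6|($h:){1,4}(:$h){1,3}"
ipv6="$ipv6|($h:){1,5}(:$h){1,2}"
ipv6="$ipv6|($h:){1,6}(:$h){1,1}"

ip_re="^($ipv4|$ipv6)\$"

# Succeeds (exit status 0) if the argument matches the expression as a whole.
is_ip() {
    printf '%s\n' "$1" | grep -Eq "$ip_re"
}

# Usage: report which candidate strings match.
for candidate in 192.168.0.1 0x7f000001 fe80::1 256.1.1.1 hello; do
    if is_ip "$candidate"; then
        echo "$candidate: match"
    else
        echo "$candidate: no match"
    fi
done
```

To filter a file instead of testing single strings, the same variable can be passed directly, as in grep -E "$ip_re" file. A ten-digit decimal starting with 4 still needs the separate range check.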
A decimal value needs a further range check, because the expression accepts any ten-digit number starting with 4 (a regexp could do this check too, but it would be big):
if (!/[^0-9]/ && /^[^0]/) { # if it's a decimal number
    die if $_ > 4294967295; # reject numbers above 2^32-1
}
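If the tool you use only supports 32-bit numbers, run the test only when the number starts with 4, and strip the leading 4 before comparing. This relies on the regular expression having already limited such numbers to ten digits, so the remaining nine digits fit in 32 bits.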
if (!/[^0-9]/ && /^4/) { # if it's a decimal number beginning with 4
    my $tmp = $_;
    $tmp =~ s/^4//;
    die if $tmp > 294967295;
}
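For shell users, the same strip-the-4 idea might look like the sketch below. The function name decimal_in_range is hypothetical, and the length checks are an addition so that the function is safe to call even on input that has not been through the regular expression. It never compares a number larger than nine digits.

```shell
#!/bin/sh
# Succeeds if $1 is not in plain decimal notation (nothing to check here),
# or if it is a decimal number no greater than 4294967295 (2^32-1).
decimal_in_range() {
    case $1 in
        '' | 0* | *[!0-9]*) return 0 ;;   # empty, octal/hex, or not all digits
    esac
    if [ "${#1}" -lt 10 ]; then return 0; fi   # at most 9 digits: always in range
    if [ "${#1}" -gt 10 ]; then return 1; fi   # more than 10 digits: always too big
    case $1 in
        [1-3]*) return 0 ;;               # 10 digits below 4000000000
        4*)
            rest=${1#4}                   # strip the leading 4, leaving 9 digits
            if [ "$rest" -le 294967295 ]; then return 0; else return 1; fi
            ;;
        *) return 1 ;;                    # 10 digits starting with 5-9
    esac
}

# Usage: the largest valid value and the first invalid one.
for n in 4294967295 4294967296; do
    if decimal_in_range "$n"; then
        echo "$n: in range"
    else
        echo "$n: out of range"
    fi
done
```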