c# UK postcode splitting
I've written something similar in the past. I think you can just split before the last digit, i.e. remove all spaces, find the last digit and then insert a space before it:
static readonly char[] Digits = "0123456789".ToCharArray();
...
string noSpaces = original.Replace(" ", "");
int lastDigit = noSpaces.LastIndexOfAny(Digits);
if (lastDigit == -1)
{
    throw new ArgumentException("No digits!");
}
string normalized = noSpaces.Insert(lastDigit, " ");
The Wikipedia entry has a lot of detail, including regular expressions for validation (after normalisation).
I'm not sure how UK postcodes work, so is the last part always the last 3 characters, with the first part being everything before it?
If it is, something like this should work, assuming you've already handled appropriate validation (edited thanks to Jon Skeet's comment):
string postCode = "AB111AD".Replace(" ", "");
string firstPart = postCode.Substring(0, postCode.Length - 3);
That will return the postcode minus its last 3 characters ("AB11" for this input).
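If you need both halves rather than just the first, the same idea extends naturally. This is only a rough sketch: the class name PostcodeSplitter and the Outward/Inward names are my own choices for illustration, and it assumes the input has already been validated.

```csharp
using System;

static class PostcodeSplitter
{
    // Hypothetical helper: splits a postcode into the part before the
    // last three characters (Outward) and the last three (Inward).
    // Assumes validation has already been done elsewhere.
    public static (string Outward, string Inward) Split(string postCode)
    {
        string compact = postCode.Replace(" ", "");
        if (compact.Length <= 3)
        {
            // Nothing would be left for the first part.
            throw new ArgumentException("Too short to split.", nameof(postCode));
        }

        int splitAt = compact.Length - 3;
        return (compact.Substring(0, splitAt), compact.Substring(splitAt));
    }
}
```

Calling PostcodeSplitter.Split("AB111AD") gives ("AB11", "1AD"), and the result is the same whether or not the input already contains the space.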
UK-postcodes format explained:
Ref: http://www.mrs.org.uk/pdf/postcodeformat.pdf
POSTCODE FORMAT
A Postcode is made up of the following elements:
PO1 3AX
- PO the area. There are 124 postcode areas in the UK
- 1 the district. There are approximately 20 Postcode districts in an area
- 3 the sector. There are approximately 3000 addresses in a sector.
- AX the Unit. There are approximately 15 addresses per unit.
The following list shows all valid Postcode formats. "A" indicates an alphabetic character and "N" indicates a numeric character.
FORMAT EXAMPLE:
AN NAA - M1 1AA
ANN NAA - M60 1NW
AAN NAA - CR2 6XH
AANN NAA - DN55 1PT
ANA NAA - W1A 1HQ
AANA NAA - EC1A 1BB
Please note the following:
- The letters Q, V and X are not used in the first position.
- The letters I, J and Z are not used in the second position.
- The only letters to appear in the third position are A, B, C, D, E, F, G, H, J, K, S, T, U and W.
- The second half of the postcode is always in a consistent numeric, alpha, alpha format, and the letters C, I, K, M, O and V are never used.
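The formats and letter rules listed above can be turned into a validation regex fairly directly. What follows is a sketch built only from that list, not an official pattern: the class name PostcodeFormat is made up for illustration, it expects an upper-case postcode with the single space already in place, and since the list does not restrict the fourth character of the AANA format, any letter is allowed there. Anything outside the six listed formats is rejected.

```csharp
using System;
using System.Text.RegularExpressions;

static class PostcodeFormat
{
    // Built from the six listed formats:
    //   first letter:            anything except Q, V, X
    //   AN / ANN:                one or two digits
    //   ANA:                     digit, then a letter from A B C D E F G H J K S T U W
    //   AAN / AANN / AANA:       second letter anything except I, J, Z
    //   second half (NAA):       digit, then two letters excluding C, I, K, M, O, V
    private static readonly Regex Pattern = new Regex(
        @"^[A-PR-UWYZ]" +
        @"(?:[0-9]{1,2}|[0-9][ABCDEFGHJKSTUW]|[A-HK-Y][0-9]{1,2}|[A-HK-Y][0-9][A-Z])" +
        @" [0-9][ABD-HJLNP-UW-Z]{2}\z");

    // Returns true if the normalised postcode matches one of the listed formats.
    public static bool IsValid(string normalisedPostCode)
    {
        return normalisedPostCode != null && Pattern.IsMatch(normalisedPostCode);
    }
}
```

All of the examples in the list (M1 1AA, M60 1NW, CR2 6XH, DN55 1PT, W1A 1HQ, EC1A 1BB) pass this check, while something like QA1 1AA fails on the first-letter rule and M1 1AC fails on the second-half rule.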
And it is safe to assume that the space will be the fourth character from the end, i.e. if a postcode is missing its space (SW109RL), you can blindly insert one before the last three characters (SW10 9RL).