Drawbacks of storing an integer as a string in a database
It really depends on what kind of id you are talking about. If it's a code like a phone number it would actually be better to use a varchar for the id and then have your own id to be a serial for the db and use for primary key. In a case where the integer have no numerical value, varchars are generally prefered.
Do NOT consider performance. Consider meaning.
ID "numbers" are not numeric except that they are written with an alphabet of all digits.
If I have part number 12 and part number 14, what is the difference between the two? Is part number 2 or -2 meaningful? No.
Part numbers (and anything that doesn't have units of measure) are not "numeric". They're just strings of digits.
Zip codes in the US, for example. Phone numbers. Social security numbers. These are not numbers. In my town the difference between zip code 12345 and 12309 isn't the distance from my house to downtown.
Do not conflate numbers -- with units -- where sums and differences mean something with strings of digits without sums or differences.
Part ID numbers are -- properly -- strings. Not integers. They'll never be integers because they don't have sums, differences or averages.
Unless you really need the features of an integer (that is, the ability to do arithmetic), then it is probably better for you to store the product IDs as strings. You will never need to do anything like add two product IDs together, or compute the average of a group of product IDs, so there is no need for an actual numeric type.
It is unlikely that storing product IDs as strings will cause a measurable difference in performance. While there will be a slight increase in storage size, the size of a product ID string is likely to be much smaller than the data in the rest of your database row anyway.
Storing product IDs as strings today will save you much pain in the future if the data provider decides to start using alphabetic or symbol characters. There is no real downside.
I've just spent the last year dealing with a database that has almost all IDs as strings, some with digits only, and others mixed. These are the problems:
- Grossly restricted ID space. A 4 char (digit-only) ID has capacity for 10,000 unique values. A 4 byte numeric has capacity for over 4 billion.
- Unpredictable ID space coverage. Once IDs start including non-digits it becomes hard to predict where you can create new IDs without collisions.
- Conversion and display problems in certain circumstances, when scripting or on export for instance. If the ID gets interpreted as a number and there is a leading zero, the ID gets altered.
- Sorting problems. You can't rely on the natural order being helpful.
Of course, if you run out of IDs, or don't know how to create new IDs, your app is dead. I suggest that if you can't control the format of your incoming IDs then you need to create your own (numeric) IDs and relate the user provided ID to that. You can then ensure that your own ID is reliable and unique (and numeric) but provide a user-viewable ID that can have whatever format your users want, and doesn't even have to be unique across the whole app. This is more work, but if you'd been through what I have you'd know which way to go.
Anil G