SQL address data is messy, how to clean it up in a query?
Using group by soundex(name)
you will get result like this. You have to test on your data to figure out if this is helpful in your situation or not. I can not test this on SQL Server 2000 so I am not sure if soundex is available.
declare @T table (Code char(5), Name varchar(50), Address1 varchar(50))
insert into @T values
('10003', 'AMERICAN NUTRITON INC', '2183 BALL STREET'),
('10003', 'AMEICAN NUTRITION INC', '2183 BALL STREET'),
('10003', 'AMERICAN NUTRITION INC', '2183 BALL STREET'),
('10003', 'AMERICAN NUTRITION INC', '2183 BALL STREET'),
('10003', 'Samantha Brooks', '506 S. Main Street'),
('10003', 'BEMIS COMPANY', '1401 W. FOURTH PLAIN BLVD.'),
('10003', 'CEI', '597 VANDYRE BOULEVARD'),
('10003', 'Pacific Pet', '28th Avenue'),
('10003', 'PETSMART, INC.', '16091 NORTH 25TH STREET'),
('10003', 'THE PET FIRM', '16418 NORTH 37TH STREET')
select
min(Code) as Code,
min(Name) as Name,
min(Address1) as Address1
from @T
group by soundex(Name)
________________________________________________________
Code Name Address1
10003 AMEICAN NUTRITION INC 2183 BALL STREET
10003 AMERICAN NUTRITION INC 2183 BALL STREET
10003 BEMIS COMPANY 1401 W. FOURTH PLAIN BLVD.
10003 CEI 597 VANDYRE BOULEVARD
10003 Pacific Pet 28th Avenue
10003 PETSMART, INC. 16091 NORTH 25TH STREET
10003 Samantha Brooks 506 S. Main Street
10003 THE PET FIRM 16418 NORTH 37TH STREET
The best solution is to use a CASS certified address standardization program or service that will format and validate the address. Beyond the USPS which has tools for this, there are many third-party programs or services which provide this functionality. Address parsing is far more complicated than you might imagine and thus trying whip up a few queries to do it will be fraught with peril.
Google's Geocoding is another place to look.. Apparently Google requires you display the results to use their Geocoding service. That leaves using dedicated address parsers like the USPS or a third-party program.