I have to write a little script that goes through a set of strings and compares them. The script basically counts the strings and will return any row that is duplicate (or has a count > 1)
select data_string, count(data_string)
from #facts
group by data_string
having count(data_string) > 1
This works well for most cases, what we realized is that some users enter their strings with new line breaks and others don't. The below are two strings exactly the same and should be consolidated into one count.
However the first one has a new line break after it and therefore it is not consolidating them into a single count.
ZPT3Z02-C4-988
ZPT3Z02-C4-988
Is there a way to remove/limit these?
I tried doing this
select RTRIM(LTRIM(data_string)), count (RTRIM(LTRIM(data_string)))
from #facts
group by data_string
having count (RTRIM(LTRIM(data_string))) > 1
and this
select RTRIM(LTRIM(REPLACE(data_string, CHAR(13), ''))), count (RTRIM(LTRIM(REPLACE(data_string, CHAR(13), ''))))
from #facts
group by data_string
--having count (RTRIM(LTRIM(data_string))) > 1
It's not working. Please help
I think the second version should work . . . with the right GROUP BY
:
select RTRIM(LTRIM(REPLACE(data_string, CHAR(13), ''))),
count(*)
from #facts
group by RTRIM(LTRIM(REPLACE(data_string, CHAR(13), '')))
having count(*) > 1;
However, SQL Server supports multi-line strings, so you can also do:
select RTRIM(LTRIM(REPLACE(data_string, '
', ''))),
count(*)
from #facts
group by RTRIM(LTRIM(REPLACE(data_string, '
', '')))
having count(*) > 1;