I'm trying to get a count on how many times BNxxxx has been commented in the comments cell. So far, I can make each cell be counted once, but there may be multiple comments in a cell containing BNxxxx.
For example, this:
-------
BN0012
-------
BN0012
-------
BN0012
BN0123
-------
should show an output of BN0012 3 times and BN0123 once. Instead, I get BN0012 3 times only.
Here's my code:
select COMMENTS, count(*) as TOTAL
from NOTE
Where COMMENTS like '%BN%' AND CREATE_DATE between '01/1/2015' AND '11/03/2015'
group by COMMENTS
order by Total desc;
Any ideas?
edit
My code now looks like
select BRIDGE_NO, count(*)
from IACD_ASSET b join
IACD_NOTE c
on c.COMMENTS like concat(concat('BN',b.BRIDGE_NO),'%')
Where c.CREATE_DATE between '01/1/2015' AND '11/03/2015' AND length(b.BRIDGE_NO) > 1
group by b.BRIDGE_NO
order by count(*);
Problem with this is the BN44 is the same as BN4455 .. have tried concat(concat('BN',b.BRIDGE_NO),'_') comes back with nothing , any ideas how i can get exact likes
I'm going to assume that your COMMENTS
table has a primary key column (such as comment_id
) or at least that comments
isn't a CLOB
. If it is a CLOB
then you're not going to be able to use GROUP BY
on that column.
You can accomplish this as follows without even a lookup table of BN....
values. No guarantees as to the performance:
WITH d1 AS (
SELECT 1 AS comment_id, 'BN0123 is a terrible thing BN0121 also BN0000' AS comments
, date'2015-01-03' AS create_date
FROM dual
UNION ALL
SELECT 2 AS comment_id, 'BN0125 is a terrible thing BN0120 also BN1000' AS comments
, date'2015-02-03' AS create_date
FROM dual
)
SELECT comment_id, comments, COUNT(*) AS total FROM (
SELECT comment_id, comments, TRIM(REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i')) AS bn
FROM d1
WHERE create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
CONNECT BY REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i') IS NOT NULL
AND PRIOR comment_id = comment_id
AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL
) GROUP BY comment_id, comments;
Note that I corrected your filter:
CREATE_DATE between '01/1/2015' AND '11/03/2015'
First, you should be using ANSI date literals (e.g., date'2015-01-01'
); second, using BETWEEN
for dates is often a bad idea as Oracle DATE
values contain a time portion. So this should be rewritten as:
create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
Note that the later date is November 4, to make sure we capture all possible comments that were made on November 3.
If you want to see the matched comments without aggregating the counts, then do the following (taking out the outer query, basically):
WITH d1 AS (
SELECT 1 AS comment_id, 'BN0123 is a terrible thing BN0121 also BN0000' AS comments
, date'2015-01-03' AS create_date
FROM dual
UNION ALL
SELECT 2 AS comment_id, 'BN0125 is a terrible thing BN0120 also BN1000' AS comments
, date'2015-02-03' AS create_date
FROM dual
)
SELECT comment_id, comments, TRIM(REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i')) AS bn
FROM d1
WHERE create_date >= date'2015-01-01'
AND create_date < date'2015-11-04'
CONNECT BY REGEXP_SUBSTR(comments, '(^|\s)BN\d+(\s|$)', 1, LEVEL, 'i') IS NOT NULL
AND PRIOR comment_id = comment_id
AND PRIOR DBMS_RANDOM.VALUE IS NOT NULL;
Given the edits to your question, I think you want something like the following:
SELECT b.bridge_no, COUNT(*) AS comment_cnt
FROM iacd_asset b INNER JOIN iacd_note c
ON REGEXP_LIKE(c.comments, '(^|\W)BN' || b.bridge_no || '(\W|$)', 'i')
WHERE c.create_dt >= date'2015-01-01'
AND c.create_dt < date'2015-03-12' -- It just struck me that your dates are dd/mm/yyyy
AND length(b.bridge_no) > 1
GROUP BY b.bridge_no
ORDER BY comment_cnt;
Note that I am using \W
in the regex above instead of \s
as I did earlier to make sure that it captures things like BN1234/BN6547
.