Search code examples
sqlsql-servercharindexstring-agg

SQL Server [PATSTAT] query | Multiple charindex values &


Hello Stack Overflow Community.

I am retrieving data with SQL from PATSTAT (patent data base from the European Patent Office). I have two issues (see below). For your info the PATSAT sql commands are quite limited.


I. Charindex with multiple values

I am looking for specific two specific patent groups ["Y02E" and "Y02C"] and want to retrieve data on these. I have found that using the charindex function works if I insert one group;

and charindex ('Y02E', cpc_class_symbol) > 0

But if I want to use another charindex function the query just times out;

and charindex ('Y02E', cpc_class_symbol) > 0 or charindex ('Y02C', cpc_class_symbol) >0

I am an absolute SQL rookie but would really appreciate your help!


II. List values from column in one cell with comma separation

Essentially I want to apply what I found as the "string_agg"-command, however, it does not work for this database. I have entries with a unique ID, which have multiple patent categories. For example:

appln_nr_epodoc | cpc_class_symbol

EP20110185794 | Y02E  10/125

EP20110185794 | Y02E  10/127

I would like to have it like this, however:

appln_nr_epodoc | cpc_class_symbol

EP20110185794 | Y02E  10/125, Y02E  10/127

Again, I am very new to sql, so any help is appreciated! Thank you!

I will also attach the full code here for transparency

SELECT  a.appln_nr_epodoc, a.appln_nr_original, psn_name, person_ctry_code, person_name, person_address, appln_auth+appln_nr,  
appln_filing_date, cpc_class_symbol
FROM
tls201_appln a
join tls207_pers_appln b on a.appln_id = b.appln_id
join tls206_person c on b.person_id = c.person_id
join tls801_country on c.person_ctry_code= tls801_country.ctry_code
join tls224_appln_cpc on a.appln_id = tls224_appln_cpc.appln_id
WHERE appln_auth = 'EP'
and appln_filing_year between 2005 and 2012
and eu_member = 'Y'
and granted = 'Y'
and psn_sector = 'company'
and charindex ('Y02E', cpc_class_symbol) > 0

Solution

  • For your part 2 here is a sample data i created And here is the code. It gives me YOUR requested output.

    create table #test_1 (
                appln_nr_epodoc varchar(20) null
                ,cpc_class_symbol varchar(20) null
                )
    
                insert into #test_1 values
                ('EP20110185794','Y02E  10/125')
                ,('EP20110185794','Y02E  10/127')
                ,('EP20110185795','Y02E  10/130')
                ,('EP20110185796','Y02E  20/140')
                ,('EP20110185796','Y02E  21/142')
    
    
    
                with CTE_1 as (select *
                from (
                select * 
                ,R1_1 =  Rank() over(partition by appln_nr_epodoc order by cpc_class_symbol )
                from #test_1
                ) as a
                where R1_1 = 1
                )
    
                ,CTE_2 as (select *
                from (
                select * 
                ,R1_1 =  Rank() over(partition by appln_nr_epodoc order by cpc_class_symbol )
                from #test_1
                ) as a
                where R1_1 = 2 ) 
    
                select a.appln_nr_epodoc
                ,a.cpc_class_symbol+','+c.cpc_class_symbol
                from CTE_1 a
                join CTE_2 c on c.appln_nr_epodoc = a.appln_nr_epodoc
    

    Out put enter image description here