
Filter non-equal values in PySpark using where(array_contains())


I have the following PySpark code:

from pyspark.sql.functions import array_contains, to_date

condition_no_hypertension = condition.\
where(array_contains('clinicalStatus.coding.code', 'active')).\
where(array_contains('verificationStatus.coding.code', 'confirmed')).\
where(array_contains('code.coding.code', '38341003')).\
where(condition.onsetDateTime > '1900-01-01').\
withColumn('condition_status', condition['clinicalStatus.coding.code'].getItem(0)).\
withColumn('verification_status', condition['verificationStatus.coding.code'].getItem(0)).\
withColumn('snomed_code', condition['code.coding.code'].getItem(0)).\
withColumn('snomed_name', condition['code.coding.display'].getItem(0)).\
select(\
   (condition['subject.reference'].substr(10, 40).alias('patient_id')),
   'condition_status',\
   'verification_status',\
   'snomed_code', \
   'snomed_name',\
   to_date(condition['onsetDateTime']).alias('first_observation_date'))

How can I change this code so that it picks up every row except those whose code.coding.code contains '38341003'?

I tried

where(array_contains('code.coding.code', !='38341003')).\

but it does not work: putting != inside the function call is not valid Python.


Solution

  • You can negate the condition with the ~ (logical NOT) operator:

    where(~array_contains('code.coding.code', '38341003'))
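
    To see how the negated filter behaves, here is a minimal sketch on toy data. The DataFrame, its patient_id and codes columns, and the second code value are made up for illustration and are not from the question. It also shows one caveat: array_contains returns null when the array itself is null, so ~array_contains(...) is null as well and where() drops that row. If those rows should be kept, wrapping the check in coalesce(..., lit(False)) is one way to do it.

    ```python
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import array_contains, coalesce, lit

    spark = (
        SparkSession.builder
        .master("local[1]")
        .appName("negate-array-contains")
        .getOrCreate()
    )

    # Hypothetical toy data: one array of codes per patient.
    rows = [
        ("p1", ["38341003"]),              # contains the code -> excluded
        ("p2", ["44054006"]),              # does not contain it -> kept
        ("p3", ["38341003", "44054006"]),  # contains it among others -> excluded
        ("p4", None),                      # null array -> see caveat above
    ]
    df = spark.createDataFrame(rows, "patient_id string, codes array<string>")

    # Plain negation: rows whose array is null evaluate to null and are dropped.
    excluded = df.where(~array_contains("codes", "38341003"))
    excluded_ids = sorted(r.patient_id for r in excluded.collect())

    # Treat a null result as "does not contain", so null arrays are kept.
    keep_nulls = df.where(
        ~coalesce(array_contains("codes", "38341003"), lit(False))
    )
    keep_null_ids = sorted(r.patient_id for r in keep_nulls.collect())

    print(excluded_ids)
    print(keep_null_ids)

    spark.stop()
    ```

    Note that array_contains matches if any element of the array equals the value, so the negated form keeps only rows where no element matches.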