I'm trying to run the example code from this Microsoft documentation, but I get a package-not-found error. I'm on a Mac, and my friend had the same problem on his machine too. I'm sure I have installed the transformers package; I can import it with no error. I'm working in a virtual environment, in a Jupyter notebook in VS Code.
If I remove the config.yaml file, the code runs with no errors, so maybe the problem is something in it. But it is pretty much the same version as the one in the documentation.
Code:
from presidio_analyzer import AnalyzerEngine, RecognizerRegistry
from presidio_analyzer.nlp_engine import NlpEngineProvider
conf_file = 'config.yaml'
provider = NlpEngineProvider(conf_file=conf_file)
nlp_engine = provider.create_engine()
analyzer = AnalyzerEngine(
    nlp_engine=nlp_engine,
    supported_languages=["en"]
)
results_english = analyzer.analyze(text="My name is Morris", language="en")
print(results_english)
Error stack:
ValueError Traceback (most recent call last)
Cell In[3], line 6
4 # Create NLP engine based on configuration
5 provider = NlpEngineProvider(conf_file=conf_file)
----> 6 nlp_engine = provider.create_engine()
8 # Pass the created NLP engine and supported_languages to the AnalyzerEngine
9 analyzer = AnalyzerEngine(
10 nlp_engine=nlp_engine,
11 supported_languages=["en"]
12 )
File ~/Projects/pii/lib/python3.12/site-packages/presidio_analyzer/nlp_engine/nlp_engine_provider.py:81, in NlpEngineProvider.create_engine(self)
79 nlp_engine_name = self.nlp_configuration["nlp_engine_name"]
80 if nlp_engine_name not in self.nlp_engines:
---> 81 raise ValueError(
82 f"NLP engine '{nlp_engine_name}' is not available. "
83 "Make sure you have all required packages installed"
84 )
85 try:
86 nlp_engine_class = self.nlp_engines[nlp_engine_name]
ValueError: NLP engine 'transformers' is not available. Make sure you have all required packages installed
My config.yaml:
nlp_engine_name: transformers
models:
  -
    lang_code: en
    model_name:
      spacy: en_core_web_sm
      transformers: StanfordAIMI/stanford-deidentifier-base
ner_model_configuration:
  labels_to_ignore:
  - O
  aggregation_strategy: simple # "simple", "first", "average", "max"
  stride: 16
  alignment_mode: strict # "strict", "contract", "expand"
  model_to_presidio_entity_mapping:
    PER: PERSON
    LOC: LOCATION
    EMAIL: EMAIL
    PHONE: PHONE_NUMBER
  low_confidence_score_multiplier: 0.4
  low_score_entity_names:
  - ID
You will need to install the Presidio Analyzer package with the transformers extra dependency specifier:
pip install "presidio-analyzer[transformers]"
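This installs the extra dependencies that the transformers-based NLP engine needs. Having the transformers package on its own is not enough, which is why importing it succeeds while the engine is still reported as unavailable.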
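If the error persists after installing, it may help to confirm from inside the notebook that the kernel can actually see the extra packages. The snippet below is only a rough sketch: the module names in ASSUMED_MODULES are my assumption about what the transformers extra provides and may differ between Presidio versions, and missing_modules is a hypothetical helper, not part of Presidio.

```python
import importlib.util
import sys

# Assumption: top-level modules provided by the "transformers" extra.
# Adjust this tuple if your Presidio version lists different dependencies.
ASSUMED_MODULES = ("transformers", "spacy_huggingface_pipelines")


def missing_modules(names):
    """Return the module names that cannot be found in the current environment."""
    missing = []
    for name in names:
        try:
            found = importlib.util.find_spec(name) is not None
        except (ImportError, ValueError):
            # find_spec can raise for broken or half-initialised modules
            found = False
        if not found:
            missing.append(name)
    return missing


# Show which interpreter the notebook kernel is running on
print("Interpreter:", sys.executable)
# Show which of the assumed modules that interpreter cannot find
print("Missing:", missing_modules(ASSUMED_MODULES))
```

If the interpreter path printed here is not the one inside your virtual environment, the pip install probably went to a different environment than the one the notebook kernel uses.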