How to run batch sentiment inference with PyABSA


I followed this tutorial to run sentiment inference. The code works with a list of sentences (the Aspect Sentiment Inference section of the Colab notebook). However, I don't know how to modify the following code (from the Batch Sentiment Inference section) to infer sentiment for a file of my own, which contains just 2 lines, each with two sentences.

# inference_sets = ABSADatasetList.Phone # original code
inference_sets = 'test.dat.apc' # my own file; I want to infer sentiment for each sentence in it
results = sent_classifier.batch_infer(target_file=inference_sets,
                     print_result=True,
                     save_result=True,
                     ignore_error=False,
                     )

Running the modified code raised the following error:

RuntimeError                              Traceback (most recent call last)
Input In [56], in <cell line: 2>()
      1 test = 'test.dat.apc'
----> 2 results = sent_classifier.batch_infer(target_file=test,
      3                      print_result=False,
      4                      save_result=True,
      5                      ignore_error=False,
      6                      )

File ~\Anaconda3\envs\spacy\lib\site-packages\pyabsa-1.16.15-py3.9.egg\pyabsa\core\apc\prediction\sentiment_classifier.py:197, in SentimentClassifier.batch_infer(self, target_file, print_result, save_result, ignore_error, clear_input_samples)
    193     self.clear_input_samples()
    195 save_path = os.path.join(os.getcwd(), 'apc_inference.result.json')
--> 197 target_file = detect_infer_dataset(target_file, task='apc')
    198 if not target_file:
    199     raise FileNotFoundError('Can not find inference datasets!')

File ~\Anaconda3\envs\spacy\lib\site-packages\pyabsa-1.16.15-py3.9.egg\pyabsa\functional\dataset\dataset_manager.py:302, in detect_infer_dataset(dataset_path, task)
    300     if os.path.isdir(dataset_path.dataset_name):
    301         print('No inference set found from: {}, unrecognized files: {}'.format(dataset_path, ', '.join(os.listdir(dataset_path.dataset_name))))
--> 302     raise RuntimeError(
    303         'Fail to locate dataset: {}. If you are using your own dataset, you may need rename your dataset according to {}'.format(
    304             dataset_path,
    305             'https://github.com/yangheng95/ABSADatasets#important-rename-your-dataset-filename-before-use-it-in-pyabsa')
    306     )
    307 if len(dataset_path) > 1:
    308     print(colored('Please DO NOT mix datasets with different sentiment labels for training & inference !', 'yellow'))

RuntimeError: Fail to locate dataset: ['test.dat.apc']. If you are using your own dataset, you may need rename your dataset according to https://github.com/yangheng95/ABSADatasets#important-rename-your-dataset-filename-before-use-it-in-pyabsa

What did I do wrong?


Solution

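  • Rename your dataset file so that its name ends with .inference (for example, test.dat.apc becomes test.dat.apc.inference). The RuntimeError is raised because detect_infer_dataset does not recognize test.dat.apc as an inference set; the link in the error message describes the naming rules.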
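
The rename can also be done from Python before calling batch_infer. The following is a minimal sketch, not part of PyABSA: make_inference_copy is a hypothetical helper that copies the file to a sibling whose name ends with .inference and leaves the original untouched. The commented-out call assumes sent_classifier was loaded as in the tutorial.

```python
from pathlib import Path
import shutil


def make_inference_copy(src: str) -> str:
    """Copy `src` to a sibling file whose name ends with '.inference'.

    Hypothetical helper for illustration; it is not a PyABSA function.
    """
    src_path = Path(src)
    if src_path.name.endswith(".inference"):
        # Already named the way the answer above suggests.
        return str(src_path)
    dst_path = src_path.with_name(src_path.name + ".inference")
    shutil.copyfile(src_path, dst_path)
    return str(dst_path)


# Usage, assuming `sent_classifier` exists as in the question:
# inference_set = make_inference_copy("test.dat.apc")
# results = sent_classifier.batch_infer(target_file=inference_set,
#                                       print_result=True,
#                                       save_result=True,
#                                       ignore_error=False)
```

Copying instead of renaming keeps the original file in place, at the cost of a duplicate on disk.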