@nbertagnolli
Thanks for the great library! When running this without a GPU I had problems, and I think there is a simple fix. The simpletransformers NER model defaults to enabling CUDA. This PR allows the user to pass a dictionary of arguments specifically for the simpletransformers NER model, so you can now run the code on a CPU by initializing rpunct like so:

rpunct = RestorePuncts(ner_args={"use_cuda": False})
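For illustration, here is a minimal sketch of the keyword-forwarding pattern the PR uses, based on the `punctuate.py` snippet in the traceback below. `NERModel` is stubbed out so the example runs without simpletransformers installed; the real class lives in `simpletransformers.ner`.

```python
class NERModel:
    """Stand-in stub for simpletransformers.ner.NERModel (real class omitted)."""
    def __init__(self, model_type, model_name, labels=None, args=None, use_cuda=True):
        # In the real library, use_cuda defaults to True and raises a
        # ValueError when CUDA is unavailable.
        self.use_cuda = use_cuda

class RestorePuncts:
    def __init__(self, wrds_per_pred=250, ner_args=None):
        if ner_args is None:
            ner_args = {}
        # Any extra keyword arguments (e.g. use_cuda=False) are splatted
        # through to NERModel, overriding its defaults.
        self.model = NERModel(
            "bert", "felflare/bert-restore-punctuation",
            args={"silent": True, "max_seq_length": 512}, **ner_args)

rpunct = RestorePuncts(ner_args={"use_cuda": False})
print(rpunct.model.use_cuda)  # False
```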

Before this change, running the rpunct example on a CPU produced the following error:

from rpunct import RestorePuncts
# The default language is 'english'
rpunct = RestorePuncts()
rpunct.punctuate("""in 2018 cornell researchers built a high-powered detector that in combination with an algorithm-driven process called ptychography set a world record
by tripling the resolution of a state-of-the-art electron microscope as successful as it was that approach had a weakness it only worked with ultrathin samples that were
a few atoms thick anything thicker would cause the electrons to scatter in ways that could not be disentangled now a team again led by david muller the samuel b eckert
professor of engineering has bested its own record by a factor of two with an electron microscope pixel array detector empad that incorporates even more sophisticated
3d reconstruction algorithms the resolution is so fine-tuned the only blurring that remains is the thermal jiggling of the atoms themselves""")


ValueError Traceback (most recent call last)
/var/folders/hx/dhzhl_x51118fm5cd13vzh2h0000gn/T/ipykernel_10548/194907560.py in
1 from rpunct import RestorePuncts
2 # The default language is 'english'
----> 3 rpunct = RestorePuncts()
4 rpunct.punctuate("""in 2018 cornell researchers built a high-powered detector that in combination with an algorithm-driven process called ptychography set a world record
5 by tripling the resolution of a state-of-the-art electron microscope as successful as it was that approach had a weakness it only worked with ultrathin samples that were

~/repos/rpunct/rpunct/punctuate.py in init(self, wrds_per_pred, ner_args)
19 if ner_args is None:
20 ner_args = {}
---> 21 self.model = NERModel("bert", "felflare/bert-restore-punctuation", labels=self.valid_labels,
22 args={"silent": True, "max_seq_length": 512}, **ner_args)
23

~/repos/transformers/transformer-env/lib/python3.8/site-packages/simpletransformers/ner/ner_model.py in init(self, model_type, model_name, labels, args, use_cuda, cuda_device, onnx_execution_provider, **kwargs)
209 self.device = torch.device(f"cuda:{cuda_device}")
210 else:
--> 211 raise ValueError(
212 "'use_cuda' set to True when cuda is unavailable."
213 "Make sure CUDA is available or set use_cuda=False."

ValueError: 'use_cuda' set to True when cuda is unavailable.Make sure CUDA is available or set use_cuda=False.

@MisterCapi

@Felflare any chance you could merge it soon? I've encountered the same problem on a non-CUDA device.
