Description
I am trying to run the "LSTM to LSTM auto-encoder with word embedding (RNN to RNN architecture)" example. I have already trained my own word2vec model with gensim and saved it with:
model.save('/home/estathop/Documents/word2vecmodel/w2v1model') #save model
When I then try to load it with
# load Gensim word2vec from word2vec_model_path
word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')
the following error occurs:
Traceback (most recent call last):
  File "", line 5, in
    word2vec = GensimWord2vec('/home/estathop/Documents/word2vecmodel/w2v1model')
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/seq2vec/word2vec/gensim_word2vec.py", line 9, in __init__
    model_path, binary=True
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/keyedvectors.py", line 1120, in load_word2vec_format
    limit=limit, datatype=datatype)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/models/utils_any2vec.py", line 174, in _load_word2vec_format
    header = utils.to_unicode(fin.readline(), encoding=encoding)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/site-packages/gensim/utils.py", line 359, in any2unicode
    return unicode(text, encoding, errors=errors)
  File "/home/estathop/anaconda2/envs/tensorflow/lib/python2.7/encodings/utf_8.py", line 16, in decode
    return codecs.utf_8_decode(input, errors, True)
UnicodeDecodeError: 'utf8' codec can't decode byte 0x80 in position 0: invalid start byte
Any ideas on how to fix or work around this?
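One possible explanation, offered as a guess rather than a confirmed diagnosis: the traceback shows the wrapper calling `load_word2vec_format(model_path, binary=True)`, which expects the word2vec C format and begins by reading a text header line. gensim's `model.save()` writes a pickle instead, and a pickle written with protocol 2 or later starts with the byte 0x80, the same byte the error reports at position 0. The sketch below reproduces only that symptom with the standard library. The dictionary is a hypothetical stand-in for the saved model, not the real file.

```python
import pickle

# Hypothetical stand-in for a saved model: any object pickled with
# protocol >= 2 begins with the PROTO opcode, which is the byte 0x80.
payload = pickle.dumps({"vocab": ["hello", "world"]}, protocol=2)

first_byte = bytearray(payload)[0]
print(hex(first_byte))

# Decoding the start of the file as UTF-8 (as a text header would be)
# fails on that first byte, because 0x80 cannot start a UTF-8 sequence.
try:
    payload.decode("utf8")
    decoded = True
    error_start = None
except UnicodeDecodeError as err:
    decoded = False
    error_start = err.start
    print(err)
```

If that is indeed the cause, exporting the vectors with `model.wv.save_word2vec_format(path, binary=True)` and passing that path to `GensimWord2vec` might work, since that is the format `load_word2vec_format` reads. I have not verified this against this setup.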