NLLB-200 inference in Google Colab for language translation
NLLB (No Language Left Behind) 200 is a Facebook AI model for language translation that can translate directly between any pair of its 200 supported languages. It is a real breakthrough in the field of AI.
We have put together an NLLB inference notebook in Google Colab, using a GitHub repo for the inference code. Many thanks to Facebook and to the owner of that repo for such a good explanation.
First, we need to install all the dependencies.
!pip install fairscale
Install fairscale.
!git clone https://github.com/facebookresearch/fairseq
%cd fairseq
!git checkout nllb
!pip install -e .
%cd ..
Clone the fairseq repo from facebookresearch, check out the nllb branch, and install it in editable mode from inside the cloned fairseq directory.
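Before moving on, it can help to confirm that the installs are visible to Python. The snippet below is a minimal sketch, not part of the original notebook: missing_packages is a hypothetical helper name, and it only checks that the packages can be located, not that they work.

```python
import importlib.util


def missing_packages(names):
    """Return the names from `names` that Python cannot locate for import."""
    # find_spec returns None for a top-level package that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# fairscale and fairseq are the two packages installed in the steps above.
missing = missing_packages(["fairscale", "fairseq"])
if missing:
    print("Not importable yet:", ", ".join(missing))
else:
    print("fairscale and fairseq are both importable")
```

If a package is reported as missing right after installation, restarting the Colab runtime and running the check again is a reasonable first thing to try.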
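!git clone https://github.com/pytorch/fairseq
%cd /content/fairseq/
!python -m pip install --editable .
%cd /content
!echo $PYTHONPATH
import os
os.environ['PYTHONPATH'] = os.environ.get('PYTHONPATH', '') + ":/content/fairseq/"
!echo $PYTHONPATH
This cell installs fairseq in editable mode from /content/fairseq and then adds the fairseq path to the PYTHONPATH environment variable, so that the commands we run later in the Google Colab notebook can find it. If /content/fairseq already exists from the previous step, git clone only reports that the destination already exists and the existing checkout is used. The two echo commands print PYTHONPATH before and after the change.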
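The PYTHONPATH step can also be written so that it is safe to run more than once. This is a minimal sketch under our own assumptions: append_to_pythonpath is a hypothetical helper, not something provided by fairseq or Colab. It appends the path only if it is not already present and copes with PYTHONPATH being unset.

```python
import os


def append_to_pythonpath(path, env=None):
    """Append `path` to PYTHONPATH in `env` (os.environ by default) if absent."""
    env = os.environ if env is None else env
    # Split the current value into entries, dropping empty ones.
    entries = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    if path not in entries:
        entries.append(path)
    env["PYTHONPATH"] = os.pathsep.join(entries)
    return env["PYTHONPATH"]


# /content/fairseq/ is the path used in the cell above.
print(append_to_pythonpath("/content/fairseq/"))
```

Note that changing os.environ affects processes started afterwards (for example commands run with !), not the import path of the Python interpreter that is already running.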
%cd ..
!git clone https://github.com/google/sentencepiece.git
%cd sentencepiece
!mkdir build
%cd build
!cmake ..
!make -j $(nproc)
!sudo make install
!sudo ldconfig -v
%cd ..
%cd ..
Next, we clone the sentencepiece repo, build it, and install it. pip install sentencepiece did not work for us here, because the pip package is just the Python wrapper around the SentencePiece C++ library, so we clone the repo and build it from source.
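After make install, a quick way to see whether the build succeeded is to look for the SentencePiece command-line tools on PATH. The snippet below is a small sketch of that check; find_tools is a hypothetical helper name, and the three tool names are the standard binaries that a SentencePiece source build installs.

```python
import shutil


def find_tools(names):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


# Print where each SentencePiece tool was found, if anywhere.
for tool, location in find_tools(["spm_train", "spm_encode", "spm_decode"]).items():
    print(f"{tool}: {location or 'not found on PATH'}")
```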
!wget --trust-server-names https://tinyurl.com/nllb200densedst600mcheckpoint
Now download the weight file (the model checkpoint) from the URL above.
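Since the checkpoint is a large download, it is worth checking that the file is actually there before running the translation. This is a minimal sketch: checkpoint_size_mib is a hypothetical helper, and /content/checkpoint.pt is the path used in the translate command later in this post, so adjust it if your file was saved under a different name.

```python
import os


def checkpoint_size_mib(path):
    """Return the size of the file at `path` in MiB, or None if it does not exist."""
    if not os.path.isfile(path):
        return None
    return os.path.getsize(path) / (1024 * 1024)


size = checkpoint_size_mib("/content/checkpoint.pt")
print("checkpoint missing" if size is None else f"checkpoint size: {size:.1f} MiB")
```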
!git clone https://github.com/pluiez/NLLB-inference
Then clone the inference repo provided by this GitHub user (pluiez). It is an awesome repo.
!bash translate.sh /content/checkpoint.pt eng_Latn hin_Deva <<< 'You are very very good person'
The command above uses the translate.sh script from the inference repo. The script takes the following arguments: /content/checkpoint.pt is the weight file, eng_Latn is the source language, and hin_Deva is the target language. The sentence to translate is passed on standard input (here with <<<), so you can translate any sentence you like.
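If you want to call the script from Python instead of typing the command each time, something like the following sketch could work. It assumes translate.sh takes exactly the three arguments described above and reads the sentence from standard input. The helper names and the language-code pattern (three lowercase letters, an underscore, and a four-letter script name, as in eng_Latn and hin_Deva) are our own assumptions, not part of the inference repo.

```python
import re
import subprocess

# Assumed shape of codes such as eng_Latn or hin_Deva.
LANG_CODE = re.compile(r"[a-z]{3}_[A-Z][a-z]{3}")


def build_translate_command(checkpoint, src_lang, tgt_lang, script="translate.sh"):
    """Build the argument list for translate.sh after checking the language codes."""
    for code in (src_lang, tgt_lang):
        if not LANG_CODE.fullmatch(code):
            raise ValueError(f"unexpected language code: {code!r}")
    return ["bash", script, checkpoint, src_lang, tgt_lang]


def translate(sentence, checkpoint, src_lang, tgt_lang, script="translate.sh"):
    """Run translate.sh with `sentence` on standard input and return its stdout."""
    command = build_translate_command(checkpoint, src_lang, tgt_lang, script)
    result = subprocess.run(
        command, input=sentence + "\n", capture_output=True, text=True, check=True
    )
    return result.stdout


# Only build and show the command here; call translate() to actually run it.
print(build_translate_command("/content/checkpoint.pt", "eng_Latn", "hin_Deva"))
```

Calling translate("You are very very good person", "/content/checkpoint.pt", "eng_Latn", "hin_Deva") from the directory that contains translate.sh would then run the same command as above.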
All of the above is available on this Google Colab page.