Use mirror to download models and datasets¶
While the official Hugging Face repository offers numerous high-quality models and datasets, they may not be always accessible due to network issues. To make the access easier, MindNLP enables you to download models and datasets from a variety of huggingface mirrors or other model repositories.
Here we show you how to set your desired mirror.
You can either set the Hugging Face mirror through the environment variable, or more locally, specify the mirror in the from_pretrained
method when downloading models.
Set Hugging Face mirror through the environment variable¶
The Huggingface mirror used in MindNLP is controlled throught the HF_ENDPOINT
environment variable.
You can either set this variable in the terminal before excuting your python script:
export HF_ENDPOINT="https://hf-mirror.com"
os
package:
import os
os.environ['HF_ENDPOINT'] = 'https://hf-mirror.com'
If the HF_ENDPOINT
variable is not set explicitly by the user, MindNLP will use 'https://hf-mirror.com' by default. You can change this to the official Huggingface repository, 'https://huggingface.co'.
Important:
The URL should not include the last '/'. Setting the varialble to 'https://hf-mirror.com' will work, while setting it to 'https://hf-mirror.com/' will result in an error.
Important:
As the HF_ENDPOINT
variable is read during the initial import of MindNLP, it is important to set the HF_ENDPOINT
before importing MindNLP. If you are in a Jupyter Notebook, and MindNLP package is already imported, you may need to restart the notebook for the change to take effect.
Now you can download the model you want, for example:
from mindnlp.transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased')
Specify Hugging Face mirror in the from_pretrained
method¶
Instead of setting the Hugging Face mirror globally through the environment variable, you can also specify the mirror for a single download operation in the from_pretrained
method.
For example:
from mindnlp.transformers import AutoModelForSequenceClassification
model = AutoModelForSequenceClassification.from_pretrained('bert-base-uncased', mirror='modelscope', revision='master')
MindNLP accepts the following options for the mirror
argument:
-
'huggingface'
Download from the Hugging Face mirror specified through the
HF_ENDPOINT
environment variable. By default, it points to HF-Mirror. -
'modelscope'
Download from ModelScope.
-
'wisemodel'
Download from 始智AI.
-
'gitee'
Dowload from the Gitee AI Hugging Face repository.
-
'aifast'
Download from AI快站.
Note that not all models can be found from a single mirror, you may need to check whether the model you want to download is actually provided by the mirror you choose.
In addition to specifying the mirror, you also need to specify the revision
argument. The revision
argument can either be 'master' or 'main' depending on the mirror you choose. By default, revision='main'
.
-
If the
mirror
is 'huggingface', 'wisemodel' or 'gitee', setrevision='main'
. -
If the
mirror
is 'modelscope', setrevision='master'
. -
If the
mirror
is 'aifast',revision
does not need to be specified.