hfmirror.storage.huggingface

HuggingfaceStorage

class hfmirror.storage.huggingface.HuggingfaceStorage(repo: str, repo_type: str = 'dataset', revision: str = 'main', hf_client: Optional[huggingface_hub.hf_api.HfApi] = None, access_token: Optional[str] = None, namespace: Optional[Union[List[str], str]] = None)
__init__(repo: str, repo_type: str = 'dataset', revision: str = 'main', hf_client: Optional[huggingface_hub.hf_api.HfApi] = None, access_token: Optional[str] = None, namespace: Optional[Union[List[str], str]] = None)

Initialize self. See help(type(self)) for accurate signature.

hf_local_upload_check

hfmirror.storage.huggingface.hf_local_upload_check(uploads: List[Tuple[Optional[str], str]], repo_id: str, repo_type='dataset', revision='main', chunk_for_hash: int = 1048576, session=None) → List[Tuple[bool, str]]
Overview:

Compare each listed file in the Hugging Face repository with its local counterpart to determine whether an upload is necessary.

Parameters:
  • uploads – List of upload tuples. The first item is the local file and the second item is the file in the repo. A first item of None means the file should be deleted from the repo.

  • repo_id – Repository ID, the same as in the huggingface library.

  • repo_type – Repository type, the same as in the huggingface library.

  • revision – Revision of the repository, the same as in the huggingface library.

  • chunk_for_hash – Chunk size used when computing file hashes.

  • session – A requests session. One is created automatically when not given.
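
The chunk_for_hash parameter suggests that local files are hashed in fixed-size pieces rather than read into memory at once. The sketch below illustrates that general technique only. The helper name sha256_in_chunks, the choice of SHA-256, and the reading of the chunk size as a byte count are assumptions made for illustration, not details confirmed by hfmirror.

```python
import hashlib


def sha256_in_chunks(path: str, chunk_size: int = 1048576) -> str:
    """Hash a file piece by piece (hypothetical helper, not part of hfmirror).

    Assumption: chunk_size is a byte count, mirroring the default
    of chunk_for_hash in the signature above.
    """
    digest = hashlib.sha256()
    with open(path, 'rb') as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:  # an empty read means end of file
                break
            digest.update(chunk)
    return digest.hexdigest()
```

The digest does not depend on the chunk size, so a larger value only trades memory for fewer read calls.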

Returns:

A list of tuples whose boolean item indicates whether the upload is necessary.
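
A minimal usage sketch, following only the signature documented above. The file paths and the helpers split_uploads, count_necessary, and run_check are hypothetical names introduced for illustration. The second item of each result tuple is not described in this page, so the sketch ignores it.

```python
from typing import List, Optional, Tuple

# (local_file, file_in_repo); a local_file of None requests deletion in the repo.
Upload = Tuple[Optional[str], str]


def split_uploads(uploads: List[Upload]) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Separate real uploads from deletion requests (hypothetical helper)."""
    to_upload = [(local, remote) for local, remote in uploads if local is not None]
    to_delete = [remote for local, remote in uploads if local is None]
    return to_upload, to_delete


def count_necessary(results: List[Tuple[bool, str]]) -> int:
    """Count entries whose boolean flag marks the upload as necessary."""
    return sum(1 for necessary, _ in results if necessary)


def run_check(uploads: List[Upload], repo_id: str) -> List[Tuple[bool, str]]:
    """Call the documented function; needs hfmirror installed and network access."""
    from hfmirror.storage.huggingface import hf_local_upload_check
    return hf_local_upload_check(
        uploads, repo_id=repo_id, repo_type='dataset', revision='main'
    )


# Hypothetical paths, for illustration only.
uploads: List[Upload] = [
    ('local/data.csv', 'data/data.csv'),  # upload or update this file
    (None, 'data/old.csv'),               # delete this file from the repo
]
```

Calling run_check(uploads, repo_id) with a real repository ID would then return the list described under Returns, which count_necessary can summarise.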