Created by: suchenzang
Splitting https://github.com/facebookresearch/metaseq/pull/231 up into 2 PRs.
Removed:
- unused `moe_disable_padding` arg
- unused `from_pretrained` methods, since we currently depend on `load_model_ensemble_and_task` from `checkpoint_utils` (not great but saving that for another PR)
- unused `hub_models` method