Created by: ruanslv
Fixes https://github.com/facebookresearch/metaseq/issues/529, where the problem is very well documented.
1/ best_of should control the beam size, while n controls the number of generations returned.
2/ nbest doesn't do anything; the number of generations returned is controlled in hub_utils.
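
For reviewers, the intended relationship between best_of and n can be illustrated with a toy sketch. This is not the metaseq code: `select_generations` and the `(text, score)` candidate format are hypothetical, and it assumes that n must not exceed best_of and that a higher score is better.

```python
from typing import List, Tuple


def select_generations(
    candidates: List[Tuple[str, float]], n: int, best_of: int
) -> List[str]:
    """Return the n highest-scoring texts out of best_of candidates.

    Hypothetical helper for illustration only; not part of metaseq.
    """
    if n > best_of:
        raise ValueError("n must not exceed best_of")
    # best_of plays the role of the beam size: only this many
    # candidates are considered at all.
    beam = candidates[:best_of]
    # n only decides how many of those candidates are handed back,
    # ranked by score with the highest first.
    ranked = sorted(beam, key=lambda c: c[1], reverse=True)
    return [text for text, _ in ranked[:n]]


candidates = [("a", -1.2), ("b", -0.3), ("c", -2.5), ("d", -0.9)]
print(select_generations(candidates, n=2, best_of=4))  # ['b', 'd']
```

With best_of=4 and n=2, four candidates are scored but only the top two are returned.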