I have modified the model_inference script to use vLLM for faster inference, and I would like to contribute it here. I have also updated the naming convention to align it more closely with HF repo IDs.