Feature request
Is it possible to build a LoRA that doesn't inject into the transformer? This would allow reusing the same base transformer with multiple adapters in the same process while saving GPU memory (probably at the expense of some speed).
Motivation
We've started using PEFT with LoRA for tasks such as sentiment analysis and constituency parsing in Stanza, and one thing we found is that there are currently no memory savings compared to using fully finetuned transformers.
For example, if the transformer loaded for sentiment analysis takes 3GB, then with no finetuning we can reuse the same transformer weights for constituency parsing, making for a total of 3GB plus the prediction heads of the two models. If we use fully finetuned transformers, that obviously increases to 6GB, assuming those are our only two tasks.
AFAIK, PEFT with LoRA uses inject_adapter_in_model to update the model's modules with the A and B matrices, meaning that loading those two models still takes 6GB. If we could have a version of the transformer that does inference with the A and B matrices not injected, but instead wrapping the base transformer's tensors, it would almost certainly be noticeably slower, but it would allow for a much smaller memory footprint.
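To make the idea concrete, here is a minimal sketch of what we have in mind (purely hypothetical, not an existing PEFT API): a LoRA linear layer that keeps a reference to the shared, frozen base layer and adds the low-rank update at forward time instead of injecting it into the base module.

```python
import torch
import torch.nn as nn

class SharedBaseLoRALinear(nn.Module):
    """Hypothetical non-injecting LoRA layer: computes base(x) + scaling * B(A(x))
    while holding a reference to the shared base layer rather than copying or
    modifying its weights."""
    def __init__(self, base_linear: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base_linear  # shared and frozen, not duplicated per adapter
        self.lora_A = nn.Linear(base_linear.in_features, r, bias=False)
        self.lora_B = nn.Linear(r, base_linear.out_features, bias=False)
        nn.init.zeros_(self.lora_B.weight)  # usual LoRA init: B starts at zero
        self.scaling = alpha / r

    def forward(self, x):
        # Base weights stay untouched, so many adapters can share one base model.
        return self.base(x) + self.scaling * self.lora_B(self.lora_A(x))

# Two task-specific adapters reusing the same base layer in one process
base = nn.Linear(768, 768)  # stand-in for one layer of the shared transformer
sentiment_layer = SharedBaseLoRALinear(base, r=8)
parser_layer = SharedBaseLoRALinear(base, r=8)
x = torch.randn(1, 768)
y_sentiment, y_parser = sentiment_layer(x), parser_layer(x)
```

Since both adapters hold a reference to the same base module, the base weights exist in memory only once; only the small A/B matrices and the task-specific heads are duplicated. (A real implementation would also need to keep the base weights out of each adapter's state_dict and handle the other module types PEFT targets.)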
Thanks for the extremely useful library, BTW
Your contribution
I probably don't have much time to investigate this in the next couple of months, but in the long term it is something I could attempt with some guidance on where to look.