I very often need to compute per-sample gradients with vmap in my projects. However, vmap doesn't work when the model uses torch.utils.checkpoint, which is commonly used in most of the recent ...
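For context, the vmap-based per-sample-gradient pattern referred to here usually looks something like the minimal sketch below. It assumes the `torch.func` API (PyTorch 2.x). The toy `Linear` model, the random data, and the `sample_loss` helper are hypothetical and only for illustration. The sketch does not use `torch.utils.checkpoint`, so it shows the working baseline rather than the failing case.

```python
import torch
from torch.func import functional_call, grad, vmap

torch.manual_seed(0)

# Hypothetical toy model and data, for illustration only.
model = torch.nn.Linear(4, 1)
inputs = torch.randn(8, 4)
targets = torch.randn(8, 1)

# Detached copies of the parameters, passed to the model functionally.
params = {name: p.detach() for name, p in model.named_parameters()}

def sample_loss(params, x, y):
    # vmap hands us a single sample, so add a batch dimension of 1
    # because the model expects batched input.
    pred = functional_call(model, params, (x.unsqueeze(0),))
    return torch.nn.functional.mse_loss(pred, y.unsqueeze(0))

# grad differentiates with respect to params (argument 0);
# vmap maps over dim 0 of inputs and targets, but not over params.
per_sample_grads = vmap(grad(sample_loss), in_dims=(None, 0, 0))(
    params, inputs, targets
)

for name, g in per_sample_grads.items():
    # Each gradient gains a leading dimension equal to the batch size.
    print(name, tuple(g.shape))
```

Each entry of `per_sample_grads` has the shape of the corresponding parameter with an extra leading batch dimension, so row `i` is the gradient of the loss on sample `i` alone.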