
Mask2Former _init_weights #35877

@tommiekerssies

Description


System Info

  • transformers version: 4.45.2
  • Platform: Linux-6.8.0-51-generic-x86_64-with-glibc2.39
  • Python version: 3.11.9
  • Huggingface_hub version: 0.24.6
  • Safetensors version: 0.4.4
  • Accelerate version: not installed
  • Accelerate config: not found
  • PyTorch version (GPU?): 2.4.0+cu121 (True)
  • Tensorflow version (GPU?): not installed (NA)
  • Flax version (CPU?/GPU?/TPU?): not installed (NA)
  • Jax version: not installed
  • JaxLib version: not installed
  • Using distributed or parallel set-up in script?: no
  • Using GPU in script?: yes
  • GPU type: NVIDIA RTX A6000

Who can help?

No response

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

Run _init_weights

Expected behavior

The _init_weights method of Mask2Former has multiple problems. It initializes nn.Embedding modules with a std of 0.02, whereas the original Mask2Former code uses PyTorch's default initialization (std of 1.0). Similarly, the mask MLP is incorrectly initialized with zero biases. Finally, the initialization of the multi-scale deformable attention is overwritten by the branch for Mask2FormerPixelDecoderEncoderOnly.
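The embedding-scale mismatch can be illustrated with a minimal sketch (assuming PyTorch is installed; the shapes and the 0.02 value stand in for the config's initializer_range and are illustrative, not copied from the library):

```python
import torch
from torch import nn

torch.manual_seed(0)

# PyTorch's default nn.Embedding init draws weights from N(0, 1),
# i.e. std ~= 1.0 -- the scale the original Mask2Former code relies on.
emb = nn.Embedding(1000, 256)
default_std = emb.weight.std().item()

# A _init_weights-style override with std=0.02 (the usual
# initializer_range default) shrinks the embeddings by ~50x.
nn.init.normal_(emb.weight, mean=0.0, std=0.02)
overridden_std = emb.weight.std().item()

print(f"default std ~= {default_std:.3f}, overridden std ~= {overridden_std:.3f}")
```

Loading pretrained checkpoints is unaffected (saved weights replace the init), but training from scratch with the overridden scale diverges from the original implementation.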

Metadata

Assignees

No one assigned

    Labels

    Core: Modeling, Vision, WIP, bug
