Why doesn't LayoutLMv3 use image data when fine-tuning on XFUND? #1731

@cqray1990

Description

Expected behavior

  • Platform:
  • Python version: 3.11
  • PyTorch version (GPU?): 2.8
  • transformers version: 4.56.2

Here the images key is removed from features (each feature's images entry is popped before padding):

    def __call__(self, features):
        label_name = "label" if "label" in features[0].keys() else "labels"
        labels = [feature[label_name] for feature in features] if label_name in features[0].keys() else None

        images = None
        if "images" in features[0]:
            images = torch.stack([torch.tensor(d.pop("images")) for d in features])
            IMAGE_LEN = int(images.shape[-1] / 16) * int(images.shape[-1] / 16) + 1

        batch = self.tokenizer.pad(
            features,
            padding=self.padding,
            max_length=self.max_length,
            pad_to_multiple_of=self.pad_to_multiple_of,
            # Conversion to tensors will fail if we have labels as they are not of the same length yet.
            return_tensors="pt" if labels is None else None,
        )
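To make the question concrete, here is a minimal, dependency-free sketch of the pattern the snippet appears to follow: pop the images so the padding step only sees text fields, then re-attach them to the batch afterwards. The excerpt above stops before that point, so whether the real collator re-attaches the images is not shown here. The names `image_token_count` and `collate` are hypothetical, plain lists stand in for tensors, and extending the attention mask by the visual token count is an assumption about how a value like IMAGE_LEN might be used.

```python
def image_token_count(image_size: int, patch_size: int = 16) -> int:
    """Visual token count as computed for IMAGE_LEN: patches per side squared, plus one."""
    per_side = image_size // patch_size
    return per_side * per_side + 1


def collate(features, pad_id=0):
    """Toy collator (hypothetical): pop images, pad text fields, re-attach images."""
    images = None
    if "images" in features[0]:
        # pop() removes the key from each per-example dict,
        # so the padding step below only sees text fields.
        images = [f.pop("images") for f in features]

    max_len = max(len(f["input_ids"]) for f in features)
    batch = {
        "input_ids": [
            f["input_ids"] + [pad_id] * (max_len - len(f["input_ids"]))
            for f in features
        ],
        "attention_mask": [
            [1] * len(f["input_ids"]) + [0] * (max_len - len(f["input_ids"]))
            for f in features
        ],
    }

    if images is not None:
        # Assumption: the images go back into the batch after padding.
        batch["images"] = images
        # Images are nested lists shaped C x H x W; the last dimension is the width.
        width = len(images[0][0][0])
        n_visual = image_token_count(width)
        # Assumption: the attention mask is extended to cover the visual tokens.
        batch["attention_mask"] = [m + [1] * n_visual for m in batch["attention_mask"]]
    return batch


def fake_image(size):
    """Build a 3 x size x size image of zeros as nested lists."""
    return [[[0] * size for _ in range(size)] for _ in range(3)]


features = [
    {"input_ids": [1, 2, 3], "images": fake_image(32)},
    {"input_ids": [4, 5], "images": fake_image(32)},
]
batch = collate(features)
print(sorted(batch.keys()))
print(len(batch["attention_mask"][0]))
```

With a 224-pixel input and 16-pixel patches, the IMAGE_LEN formula gives 14 * 14 + 1 = 197. If the full collator has a step like the re-attachment above after `self.tokenizer.pad`, the pop on its own would not mean the image data is dropped.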
