Thanks for making this open-source! The following function checks for the `_pad_token` attribute:

```python
def _tokenize(self, text_sample):
    if self.tokenizer._pad_token is None:
        # Some tokenizers (e.g. GPT2 tokenizer) have no padding token which causes bugs
        raise RuntimeError("If tokenizing on-the-fly, tokenizer must have a pad_token_id")
    return self.tokenizer(text_sample["text"], truncation=True, padding="max_length", max_length=self.max_seq_len)
```
But shouldn't it simply check for `pad_token_id`? My tokenizer has `pad_token_id` and `pad_token`, but no `_pad_token`.
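For illustration, here is a sketch of the suggested change, checking the public `pad_token_id` attribute instead of the private `_pad_token`. The `DummyTokenizer` and `Dataset` classes below are stand-ins I made up to keep the example self-contained, not part of the original code:

```python
class DummyTokenizer:
    # Hypothetical stand-in: exposes pad_token_id like a Hugging Face
    # tokenizer, but has no private _pad_token attribute.
    pad_token_id = 0

    def __call__(self, text, truncation=True, padding="max_length", max_length=8):
        ids = [ord(c) % 100 for c in text][:max_length]
        ids += [self.pad_token_id] * (max_length - len(ids))  # pad to max_length
        return {"input_ids": ids}


class Dataset:
    # Hypothetical wrapper mirroring the shape of the original method.
    def __init__(self, tokenizer, max_seq_len=8):
        self.tokenizer = tokenizer
        self.max_seq_len = max_seq_len

    def _tokenize(self, text_sample):
        # Check the public attribute rather than the private one
        if self.tokenizer.pad_token_id is None:
            raise RuntimeError("If tokenizing on-the-fly, tokenizer must have a pad_token_id")
        return self.tokenizer(
            text_sample["text"],
            truncation=True,
            padding="max_length",
            max_length=self.max_seq_len,
        )


out = Dataset(DummyTokenizer())._tokenize({"text": "hi"})
```

With this version, a tokenizer like mine (public `pad_token_id` set, no `_pad_token`) passes the check instead of raising.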