Describe the problem
I'm currently using GLM4 with the latest version of HuggingFace's transformers library in a P-Tuning experiment. While preparing input batches, I encountered the following error:
This happens when I try to use .pad(padding_side="right") — a common approach in HuggingFace to pad a batch of tokenized inputs.
My use case
I'm following the HuggingFace-style batching process for fine-tuning, where .pad() is typically used to ensure consistent input shapes. But the GLM4 tokenizer appears to lack support for padding_side, and perhaps even .pad() behavior in general.
What I’ve tried
- Looked into the tokenizer code: it seems that GLMTokenizer does not inherit the usual pad() method behavior from PreTrainedTokenizerFast.
- Tried manually padding the input sequences, but I'm concerned about whether that matches GLM4's expected behavior, particularly for attention_mask and position_ids.
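For reference, this is roughly the manual padding I tried, as a minimal plain-Python sketch. The pad id of 0, and the choice to repeat the last position id in padded slots, are my own assumptions; whether GLM4 actually expects this is exactly what I'm unsure about:

```python
def pad_batch(sequences, pad_token_id=0):
    """Right-pad a list of token-id lists to the longest length in the batch.

    NOTE: pad_token_id=0 and the position_ids scheme below are assumptions,
    not something confirmed from the GLM4 tokenizer/model code.
    """
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask, position_ids = [], [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        input_ids.append(seq + [pad_token_id] * n_pad)
        # 1 for real tokens, 0 for padding
        attention_mask.append([1] * len(seq) + [0] * n_pad)
        # sequential positions for real tokens; padded slots repeat the
        # last real position (one common convention, possibly not GLM4's)
        position_ids.append(list(range(len(seq))) + [len(seq) - 1] * n_pad)
    return {
        "input_ids": input_ids,
        "attention_mask": attention_mask,
        "position_ids": position_ids,
    }
```

For example, `pad_batch([[5, 6, 7], [8, 9]])` pads the second sequence to length 3 and masks out its final slot.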
Questions
- What's the recommended way to apply padding when using GLM4 tokenizer?
- Is there a compatible data collator or tokenizer wrapper that supports HuggingFace-style padding?
- Would manually implementing padding + masks be sufficient, or is there a better way to ensure compatibility?
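To make the second question concrete, this is the kind of collator interface I have in mind: a hypothetical minimal stand-in for HuggingFace's DataCollatorWithPadding, written in plain Python (the class name and pad id are mine; a real version would return torch tensors and use the tokenizer's actual pad token):

```python
class SimplePaddingCollator:
    """Hypothetical minimal collator that right-pads input_ids and builds
    attention_mask, mimicking the shape of DataCollatorWithPadding.
    pad_token_id=0 is an assumption."""

    def __init__(self, pad_token_id=0):
        self.pad_token_id = pad_token_id

    def __call__(self, features):
        # features: list of dicts, each with an "input_ids" list
        max_len = max(len(f["input_ids"]) for f in features)
        input_ids, attention_mask = [], []
        for f in features:
            ids = f["input_ids"]
            n_pad = max_len - len(ids)
            input_ids.append(ids + [self.pad_token_id] * n_pad)
            attention_mask.append([1] * len(ids) + [0] * n_pad)
        return {"input_ids": input_ids, "attention_mask": attention_mask}
```

Something like this could presumably be passed as a data_collator during fine-tuning, but I'd rather use a supported GLM4-compatible collator if one exists.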
Environment:
- GLM model: GLM4
- Transformers version: latest
- OS: Ubuntu
Thanks a lot for your help!