TypeError: unsupported operand type(s) for //: 'int' and 'NoneType' #77

@vody-am

Description

Running the multilabel example from the notebooks, I run into this error:

Traceback (most recent call last):
  File "/Users/user/dev/torchTextClassifiers/torchTextClassifiers/examples/multilabel.py", line 68, in <module>
    ttc_ragged = torchTextClassifiers(
        tokenizer=tokenizer,
        model_config=model_config,
        ragged_multilabel=True,  # Key for ragged list input!
    )
  File "/Users/user/dev/torchTextClassifiers/torchTextClassifiers/torchTextClassifiers.py", line 164, in __init__
    self.text_embedder = TextEmbedder(
                         ~~~~~~~~~~~~^
        text_embedder_config=text_embedder_config,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/user/dev/torchTextClassifiers/torchTextClassifiers/model/components/text_embedder.py", line 48, in __init__
    self.label_attention_module = LabelAttentionClassifier(self.config)
                                  ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^
  File "/Users/user/dev/torchTextClassifiers/torchTextClassifiers/model/components/text_embedder.py", line 312, in __init__
    self.head_dim = self.embedding_dim // self.n_head
                    ~~~~~~~~~~~~~~~~~~~^^~~~~~~~~~~~~
TypeError: unsupported operand type(s) for //: 'int' and 'NoneType'
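For what it's worth, the traceback points at `self.head_dim = self.embedding_dim // self.n_head`, so `n_head` is apparently `None` when `LabelAttentionClassifier` is constructed — presumably because no attention configuration (e.g. an `AttentionConfig` with `n_head` set) reaches the `TextEmbedder` in the ragged-multilabel path. That diagnosis is a guess on my part, but the failing operation itself is easy to reproduce in isolation:

```python
embedding_dim = 96
n_head = None  # what LabelAttentionClassifier apparently ends up with here

try:
    head_dim = embedding_dim // n_head
except TypeError as e:
    print(e)  # unsupported operand type(s) for //: 'int' and 'NoneType'
```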

Here is a minimal script to reproduce:

import numpy as np
import torch

from torchTextClassifiers import ModelConfig, TrainingConfig, torchTextClassifiers
from torchTextClassifiers.dataset import TextClassificationDataset
from torchTextClassifiers.model import TextClassificationModel, TextClassificationModule
from torchTextClassifiers.model.components import (
    AttentionConfig,
    CategoricalVariableNet,
    ClassificationHead,
    TextEmbedder,
    TextEmbedderConfig,
)
from torchTextClassifiers.tokenizers import HuggingFaceTokenizer

# Note: %load_ext autoreload and %autoreload 2 are specific to IPython/Notebooks 
# and are omitted here for a standard Python script.

# ==========================================
# 1. Ragged-lists approach
# ==========================================

# In multilabel classification, each instance can be assigned multiple labels simultaneously.
# Let's use fake data where labels is a list of lists (ragged array).
sample_text_data = [
    "This is a positive example",
    "This is a negative example",
    "Another positive case",
    "Another negative case",
    "Good example here",
    "Bad example here",
]

# Each inner list contains labels for the corresponding instance
labels_ragged = [[0, 1, 5], [0, 4], [1, 5], [0, 1, 4], [1, 5], [0]]

# Note: labels_ragged is a jagged (ragged) array: np.array(labels_ragged) cannot
# produce a standard numeric matrix, since the rows have different lengths.
# However, torchTextClassifiers handles this input directly.

# Load a pre-trained tokenizer
tokenizer = HuggingFaceTokenizer.load_from_pretrained(
    "google-bert/bert-base-uncased", output_dim=126
)

X = np.array(sample_text_data)
Y_ragged = labels_ragged 

# Configure the model and training
# We use BCEWithLogitsLoss for multilabel tasks to treat each label 
# as a separate binary classification problem.
embedding_dim = 96
num_classes = max(max(label_list) for label_list in labels_ragged) + 1

model_config = ModelConfig(
    embedding_dim=embedding_dim,
    num_classes=num_classes,
)

training_config = TrainingConfig(
    lr=1e-3,
    batch_size=4,
    num_epochs=1,
    loss=torch.nn.BCEWithLogitsLoss(),  # Essential for multilabel
)

# Initialize the classifier with ragged_multilabel=True
ttc_ragged = torchTextClassifiers(
    tokenizer=tokenizer,
    model_config=model_config,
    ragged_multilabel=True,  # Key for ragged list input!
)

print("Starting training with ragged labels...")
ttc_ragged.train(
    X_train=X,
    y_train=Y_ragged,
    training_config=training_config,
)

# Behind the scenes, the ragged lists are converted into a binary matrix (one-hot version).

# ==========================================
# 2. One-hot / multidimensional output approach
# ==========================================

# You can also provide a one-hot/multidimensional array (or float probabilities).
# Here, each row is a vector of size equal to the number of labels.
labels_one_hot = [
    [1., 1., 0., 0., 0., 1.],
    [1., 0., 0., 0., 1., 0.],
    [0., 1., 0., 0., 0., 1.],
    [1., 1., 0., 0., 1., 0.],
    [0., 1., 0., 0., 0., 1.],
    [1., 0., 0., 0., 1., 0.]
]
Y_one_hot = np.array(labels_one_hot)

# When using one-hot/dense arrays, set ragged_multilabel=False (default)
ttc_dense = torchTextClassifiers(
    tokenizer=tokenizer,
    model_config=model_config,
)

print("\nStarting training with one-hot labels...")
ttc_dense.train(
    X_train=X,
    y_train=Y_one_hot,
    training_config=training_config,
)

# Final Note: 
# - Use BCEWithLogitsLoss for multilabel settings.
# - Use CrossEntropyLoss for "soft" multiclass (where probabilities sum to 1).
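For reference, here is a plain-NumPy sketch of the ragged-to-binary ("multi-hot") conversion the script's comment alludes to. This is only an illustration of the expected behaviour, not the library's actual implementation:

```python
import numpy as np

labels_ragged = [[0, 1, 5], [0, 4], [1, 5], [0, 1, 4], [1, 5], [0]]
num_classes = max(max(row) for row in labels_ragged) + 1  # 6 classes (0..5)

# Multi-hot encoding: one row per instance, 1.0 at each assigned label index.
multi_hot = np.zeros((len(labels_ragged), num_classes), dtype=np.float32)
for i, row in enumerate(labels_ragged):
    multi_hot[i, row] = 1.0

print(multi_hot[0])  # labels {0, 1, 5} -> [1. 1. 0. 0. 0. 1.]
```

A matrix of this shape is exactly what BCEWithLogitsLoss expects as a target: one independent binary problem per column.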
