Issue: Replace iterrows().to_dict() with apply(...).tolist() for better performance #35

@SaFE-APIOpt

Description

return [row.to_dict() for _, row in df.iterrows()]

Current implementation:
data = [row.to_dict() for _, row in df.iterrows()]
Recommended replacement:
data = df.apply(lambda row: row.to_dict(), axis=1).tolist()
iterrows() is slow because it constructs a new Series object for every row, and to_dict() is then called on each one from a Python-level loop. This creates a large number of temporary objects, so the cost grows noticeably as the DataFrame gets larger.

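By contrast, df.apply(lambda row: row.to_dict(), axis=1) hands the row loop to pandas. This is still row-based and not vectorized: with axis=1, pandas passes each row to the lambda as a Series and the lambda itself runs in Python. The approach may reduce per-row overhead relative to iterrows(), but the size of the gain depends on the pandas version and the shape of the data, so it should be confirmed with a benchmark. The output structure is preserved: List[Dict[str, Any]].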
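
To check the claim on real data, the two variants can be compared directly. The following is a minimal sketch, not a definitive benchmark: the helper names (rows_via_iterrows, rows_via_apply, make_frame) and the sample columns are hypothetical and stand in for the project's actual DataFrame.

```python
import timeit
from typing import Any, Dict, List

import pandas as pd


def rows_via_iterrows(df: pd.DataFrame) -> List[Dict[str, Any]]:
    """Current implementation from this issue."""
    return [row.to_dict() for _, row in df.iterrows()]


def rows_via_apply(df: pd.DataFrame) -> List[Dict[str, Any]]:
    """Proposed replacement from this issue."""
    return df.apply(lambda row: row.to_dict(), axis=1).tolist()


def make_frame(n_rows: int) -> pd.DataFrame:
    # Hypothetical sample data; substitute a frame shaped like the real one.
    return pd.DataFrame({
        "id": range(n_rows),
        "name": [f"item-{i}" for i in range(n_rows)],
        "score": [i * 0.5 for i in range(n_rows)],
    })


if __name__ == "__main__":
    df = make_frame(5_000)

    # Both variants should produce the same list of row dictionaries.
    assert rows_via_iterrows(df) == rows_via_apply(df)

    # Average wall-clock time over a few runs of each variant.
    for fn in (rows_via_iterrows, rows_via_apply):
        seconds = timeit.timeit(lambda: fn(df), number=3) / 3
        print(f"{fn.__name__}: {seconds:.3f} s per call")
```

The timings will vary with the pandas version, column dtypes, and row count, so it is worth running this against a frame that resembles production data before merging the change. pandas also provides df.to_dict(orient="records"), which is not part of this proposal; it may be worth measuring too, with the caveat that its value types can differ from the row-wise versions because it does not upcast values across a row.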
