Arth-Singh/README.md

Hi there 👋

I'm Arth Singh, an AI Safety & Red Teaming researcher from Mumbai, India 🇮🇳. I'm currently a Research Engineer in the AI Safety department at AIM Intelligence, and I'm collaborating with the Seoul National University PI Lab on red teaming mobile-use agents. Previously, I was a Research Collaborator with FAR.AI, where I helped build their Red Teaming Toolkit.

  • 🧨 I enjoy red teaming AI models, but lately I'm more focused on AI alignment & safety
  • 🧠 Big fan of thinking in systems, visualizing ideas clearly, and deep brainstorming
  • ☕ Love debating over coffee, especially about AI, policy, and world news
  • 🍜 Absolute foodie
  • 🇯🇵 Japan is one of my favorite countries to visit: culture, tech, vibes, all of it

📫 Let's connect:

Always down to talk alignment, adversarial evals, or half-baked research ideas that can turn into collaborations.

Pinned

  1. arth-finds-weird-model-behaviours (Public)

    A repository documenting the unusual LLM behaviours I come across.

    Python

  2. A-Red-Team-Havoc (Public)

    A red teaming toolkit I built for running attacks on LLMs. More to add soon.

    Python

  3. NSFW-Image-Gen-Prompt-Injection-automation (Public)

    3-step pipeline: 1. Kimi K2 → generates creative, boundary-testing prompts for safety evaluation; 2. OpenAI GPT-Image-1 → creates images from the prompts (falls back to DALL-E 3); 3. OpenAI Moderation → …

    Python

  4. Arth-Jailbreak-Templates (Public)

    A repository where I collect my own jailbreak templates, which I build with the help of LLMs.

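The 3-step pipeline described for NSFW-Image-Gen-Prompt-Injection-automation could be sketched roughly as below. This is an illustrative assumption of the shape of such a pipeline, not the repository's actual code: function names are hypothetical, and the model steps are passed in as callables so the sketch runs without API keys.

```python
# Illustrative sketch only: the real repository's structure may differ.
# Step 1 (prompt generation), step 2 (image generation with fallback),
# and step 3 (moderation) are injected as callables.

def run_pipeline(generate_prompt, primary_image_gen, fallback_image_gen, moderate):
    """Run one iteration of the prompt -> image -> moderation pipeline."""
    prompt = generate_prompt()                 # step 1: e.g. Kimi K2
    try:
        image = primary_image_gen(prompt)      # step 2: e.g. GPT-Image-1
    except Exception:
        image = fallback_image_gen(prompt)     # fallback: e.g. DALL-E 3
    verdict = moderate(image)                  # step 3: e.g. OpenAI Moderation
    return prompt, image, verdict


def failing_primary(prompt):
    # Stand-in for a primary model that errors out or refuses the prompt.
    raise RuntimeError("primary image model refused the prompt")


if __name__ == "__main__":
    # Stubbed run: the primary generator fails, so the fallback is used.
    prompt, image, verdict = run_pipeline(
        generate_prompt=lambda: "a boundary-testing prompt",
        primary_image_gen=failing_primary,
        fallback_image_gen=lambda p: "image-for:" + p,
        moderate=lambda img: {"flagged": False},
    )
    print(image, verdict)
```

Keeping each stage a plain callable makes it easy to swap real API clients in for the stubs while leaving the fallback and moderation logic unchanged.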