This project aims to test and evaluate an AI agent that is currently in development. Using questions and requests based on various categories (training is provided), you will try to force the AI agent to say something harmful, offensive, dangerous or toxic, and you will evaluate its responses.
The end goal is to prevent the AI agent from sharing any harmful, offensive, dangerous or toxic content with end users and to make it safer.
This role is fully remote and full-time, with a duration of several months (it may be extended depending on project progress).
Full online training will be provided before the project starts.
Required profile
Experience
Spoken language(s):
English
Check the description to see which languages are mandatory.