start - Introduce the activity and ask a question to understand the comprehension level of kid's understanding.Run Test
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Start the assessment activity", "expected_behavior": "Introduce the activity and ask a question to understand the comprehension level of kid\u0027s understanding.", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "start", "stage": "pre_assessment"}
child answers incorrectly - Tell the child clearly and gently that the answer is incorrect and repeat the question in a simplified formRun Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0905\u091a\u094d\u091b\u093e, \u091a\u0932\u094b \u092b\u093f\u0930 \u0938\u0947 \u0915\u094b\u0936\u093f\u0936 \u0915\u0930\u0924\u0947 \u0939\u0948\u0902\u0964 \u092e\u092e\u094d\u092e\u0940, \u092e\u091b\u0932\u0940 \u0914\u0930 \u092e\u091f\u0915\u093e, \u092f\u0947 \u0936\u092c\u094d\u0926 \u0915\u093f\u0938 \u0906\u0935\u093e\u091c\u093c \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0947 \u0939\u0948\u0902? \u0927\u094d\u092f\u093e\u0928 \u0938\u0947 \u0938\u0941\u0928\u094b - \u092e\u092e\u094d\u092e\u0940, \u092e\u091b\u0932\u0940, \u092e\u091f\u0915\u093e\u0964", "role": "model"}, {"message": "ch se", "role": "user"}], "description": "Child answers the question incorrectly", "expected_behavior": "Tell the child clearly and gently that the answer is incorrect and repeat the question in a simplified form", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child answers incorrectly", "stage": "pre_assessment"}
question_for_the_starting_sound - Ask the user a question to answer the starting sound of given objects. All objects should start the same sound.Run Test
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Question for the starting sound", "expected_behavior": "Ask the user a question to answer the starting sound of given objects. All objects should start the same sound.", "flaky": false, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "question_for_the_starting_sound", "stage": "pre_assessment"}
question_for_the_starting_sound - Ask the user a question to identify which objects starting with the sound "Ma" among objects with a different starting sound.Run Test
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Question for the starting sound", "expected_behavior": "Ask the user a question to identify which objects starting with the sound \"Ma\" among objects with a different starting sound.", "flaky": false, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "question_for_the_starting_sound", "stage": "pre_assessment"}
teach_common_tests
child_is_curious - LLM should answer the question enthusiastically why we are playing this game and explain the reasoning behind the learning objective in child-friendly language.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "kyun?", "role": "user"}], "description": "The child asks why are we playing this game", "expected_behavior": "LLM should answer the question enthusiastically why we are playing this game and explain the reasoning behind the learning objective in child-friendly language.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_is_curious", "stage": "teach", "teaching_method": "daily_conversation"}
child_says_something_incorrect - Don't evaluate anything else, just check if the user input field in LLM's generated response matches exactly with what user said.Run Test
🤢
start - Introduce the activity and ask user if they are ready to start.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Start the teaching activity", "expected_behavior": "Introduce the activity and ask user if they are ready to start.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "start", "stage": "teach", "teaching_method": "daily_conversation"}
start_and_say_hello - LLM should greet the child with 'Hello' or 'Hey'. Avoid using 'Namaste' for small kids, use a more friendly loving greeting.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Greet the child", "expected_behavior": "LLM should greet the child with \u0027Hello\u0027 or \u0027Hey\u0027. Avoid using \u0027Namaste\u0027 for small kids, use a more friendly loving greeting.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "start_and_say_hello", "stage": "teach", "teaching_method": "daily_conversation"}
formal_you - LLM should address the kid with formal 'You' which is 'aap' in Hindi and not 'tum'Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "LLM addresses the kid with formal \u0027You\u0027", "expected_behavior": "LLM should address the kid with formal \u0027You\u0027 which is \u0027aap\u0027 in Hindi and not \u0027tum\u0027", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "formal_you", "stage": "teach", "teaching_method": "daily_conversation"}
child_ready - LLM should ask the child if the child is ready.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "ask if the child is ready to play", "expected_behavior": "LLM should ask the child if the child is ready.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_ready", "stage": "teach", "teaching_method": "daily_conversation"}
ask_only_one_question - LLM should ask only one question in the response and not multiple questions.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Ask one question in the response", "expected_behavior": "LLM should ask only one question in the response and not multiple questions.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "ask_only_one_question", "stage": "teach", "teaching_method": "daily_conversation"}
child_repeat_the_same_answer - LLM should make sure that kid comes up with new objects, at the same time acknowleding that these are correct answers. Child should not repeat the exisitng objects learnt in the lesson.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "\u0939\u093e\u0901", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092a\u0939\u0932\u0947 \u0939\u092e \u0915\u0941\u091b \u091a\u0940\u091c\u0947\u0902 \u0926\u0947\u0916\u0947\u0902\u0917\u0947 \u091c\u094b \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0902\u0964 \u0926\u0947\u0916\u094b \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091b\u0932\u0940 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "\u092e\u091b\u0932\u0940", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092c\u0939\u0941\u0924 \u092c\u0922\u093c\u093f\u092f\u093e! \u092f\u0939 \u090f\u0915 \u092e\u091b\u0932\u0940 \u0939\u0948, \u0914\u0930 \u092e\u091b\u0932\u0940 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0964 \u0905\u092c \u0926\u0947\u0916\u094b, \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091f\u0930 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "matar", "role": "user"}, {"message": "\u092c\u093f\u0932\u0915\u0941\u0932 \u0938\u0939\u0940! \u092e\u091f\u0930 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948\u0964 \u0924\u0941\u092e \u092c\u0939\u0941\u0924 \u0939\u094b\u0936\u093f\u092f\u093e\u0930 \u0939\u094b! \u0905\u092c \u092c\u0924\u093e\u0913, \u0914\u0930 \u0915\u094c\u0928 \u0938\u0940 \u091a\u0940\u091c\u093c \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948?", "role": "model"}, {"message": "machhli aur matar", "role": "user"}], "description": "The child does not come up with new objects starting from the asked letter", "expected_behavior": "LLM should make sure that kid comes up with new objects, at the same time acknowleding that these are correct answers. Child should not repeat the exisitng objects learnt in the lesson.", "flaky": true, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_repeat_the_same_answer", "stage": "teach", "teaching_method": "daily_conversation"}
child_says_bye_thank_you - LLM should reply with 'Goodbye' and saying 'It was fun'. LLM shouldn't reply with 'You're welcome' or 'aapka swagat hai'Run Test
🤢
child_answers_incorrectly_be_gentle - Explain the mistake to the child clearly and gently nudge the child and repeat the question in a simplified formRun Test
child_answers_incorrectly_give_example - LLM should give an example while explaining the mistake to the child. LLM shouldn't answer the question if the child has failed to answer for the first time.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0948\u0932\u094d\u0932\u094b! \u092e\u0947\u0930\u093e \u0928\u093e\u092e \u0905\u092a\u094d\u092a\u0942 \u0939\u0948\u0964 \u0906\u091c \u0939\u092e \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0947\u0902\u0917\u0947! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u0948\u0902?", "role": "model"}, {"message": "haan", "role": "user"}, {"message": "\u0939\u092e \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0928\u0947 \u0935\u093e\u0932\u0947 \u0936\u092c\u094d\u0926\u094b\u0902 \u0915\u0947 \u092c\u093e\u0930\u0947 \u092e\u0947\u0902 \u091c\u093e\u0928\u0947\u0902\u0917\u0947\u0964 \u0926\u0947\u0916\u093f\u092f\u0947, \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u092e\u094d\u092e\u0940 \u0939\u0948\u0902\u0964 \u0027\u092e\u092e\u094d\u092e\u0940\u0027 \u0936\u092c\u094d\u0926 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948\u0964 \u0905\u092c \u0926\u0947\u0916\u093f\u092f\u0947 \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u091f\u0930 \u0939\u0948\u0964 \u0915\u094d\u092f\u093e \u0906\u092a \u091c\u093e\u0928\u0924\u0947 \u0939\u0948\u0902 \u0915\u0940 \u092e\u091f\u0930 \u0915\u093f\u0938\u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948?", "role": "model"}, {"message": "ch se", "role": "user"}], "description": "Child answers the question incorrectly", "expected_behavior": "LLM should give an example while explaining the mistake to the child. LLM shouldn\u0027t answer the question if the child has failed to answer for the first time.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_answers_incorrectly_give_example", "stage": "teach", "teaching_method": "daily_conversation"}
child_answers_incorrectly_be_factual - LLM should respond in a factually correct way while being gentle to the child.Run Test
llm_language - LLM should not use too much informal words like 'चीज' and instead use 'वस्तु' when neededRun Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0948\u0932\u094d\u0932\u094b! \u092e\u0947\u0930\u093e \u0928\u093e\u092e \u0905\u092a\u094d\u092a\u0942 \u0939\u0948\u0964 \u0906\u091c \u0939\u092e \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0947\u0902\u0917\u0947! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u0948\u0902?", "role": "model"}, {"message": "haan", "role": "user"}], "description": "Response should not contain too informal words", "expected_behavior": "LLM should not use too much informal words like \u0027\u091a\u0940\u091c\u0027 and instead use \u0027\u0935\u0938\u094d\u0924\u0941\u0027 when needed", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "llm_language", "stage": "teach", "teaching_method": "daily_conversation"}
child_answers_incorrectly_multiple_times - Explain the mistake to the child softly and provide the answer to the question, and end with a hopeful remark like 'I am sure, you will get it right next time!' or 'I believe in you!'Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0928\u092e\u0938\u094d\u0924\u0947! \u092e\u0947\u0930\u093e \u0928\u093e\u092e \u0905\u092a\u094d\u092a\u0942 \u0939\u0948\u0964 \u0906\u091c \u0939\u092e \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0947\u0902\u0917\u0947! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u0948\u0902? \u0939\u092e \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0928\u0947 \u0935\u093e\u0932\u0947 \u0936\u092c\u094d\u0926\u094b\u0902 \u0915\u0947 \u092c\u093e\u0930\u0947 \u092e\u0947\u0902 \u091c\u093e\u0928\u0947\u0902\u0917\u0947\u0964 \u0926\u0947\u0916\u093f\u092f\u0947, \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u092e\u094d\u092e\u0940 \u0939\u0948\u0902\u0964 \u0027\u092e\u092e\u094d\u092e\u0940\u0027 \u0936\u092c\u094d\u0926 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948\u0964 \u0905\u092c \u0926\u0947\u0916\u093f\u092f\u0947 \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u091f\u0930 \u0939\u0948\u0964 \u0915\u094d\u092f\u093e \u0906\u092a \u091c\u093e\u0928\u0924\u0947 \u0939\u0948\u0902 \u0915\u0940 \u092e\u091f\u0930 \u0915\u093f\u0938\u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948?", "role": "model"}, {"message": "ch se", "role": "user"}, {"message": "\u0905\u0930\u0947-\u0905\u0930\u0947, \u0928\u0939\u0940\u0902 ,\u090f\u0915 \u092c\u093e\u0930 \u092b\u093f\u0930 \u0938\u0947 \u0926\u0947\u0916\u0924\u0947 \u0939\u0948\u0902 ,\u092e\u0947\u0930\u0947 \u092a\u0940\u091b\u0947 \u092a\u0940\u091b\u0947 \u092c\u094b\u0932\u094b \u0027\u092e\u091f\u0930\u0027, \u0914\u0930 \u092c\u0924\u093e\u0913 \u092f\u0939 \u0936\u092c\u094d\u0926 \u0915\u093f\u0938 \u0906\u0935\u093e\u091c\u093c \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b \u0930\u0939\u093e \u0939\u0948?", "role": "model"}, {"message": "ta se?", "role": "user"}], "description": "Child answers the question incorrectly multiple times", "expected_behavior": "Explain the mistake to the child softly and provide the answer to the question, and end with a hopeful remark like \u0027I am sure, you will get it right next time!\u0027 or \u0027I believe in you!\u0027", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_answers_incorrectly_multiple_times", "stage": "teach", "teaching_method": "daily_conversation"}
child_shows_disinterest - LLM should gently ask the child in a creative way to engage in the learning objective.Run Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0948\u0932\u094d\u0932\u094b! \u092e\u0948\u0902 \u0905\u092a\u094d\u092a\u0942 \u0939\u0942\u0901\u0964 \u0906\u092a\u0915\u093e \u0928\u093e\u092e \u0915\u094d\u092f\u093e \u0939\u0948? \u0906\u091c \u0939\u092e \u0938\u093e\u0925 \u092e\u0947\u0902 \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0947\u0902\u0917\u0947! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u0948\u0902?", "role": "model"}, {"message": "nahi", "role": "user"}], "description": "Child is not interested in playing", "expected_behavior": "LLM should gently ask the child in a creative way to engage in the learning objective.", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_shows_disinterest", "stage": "teach", "teaching_method": "daily_conversation"}
child_repeat_the_question_new_way - LLM should gently repeat the question.Run Test
🤢
child_answers_i_dont_know - LLM should comfort the child with 'It's okay' and rephrase the question in a simple, clear and direct wayRun Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0948\u0932\u094d\u0932\u094b! \u092e\u0947\u0930\u093e \u0928\u093e\u092e \u0905\u092a\u094d\u092a\u0942 \u0939\u0948\u0964 \u0906\u091c \u0939\u092e \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0947\u0902\u0917\u0947! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u0948\u0902?", "role": "model"}, {"message": "haan", "role": "user"}, {"message": "\u091a\u0932\u094b, \u0939\u092e \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0928\u0947 \u0935\u093e\u0932\u0947 \u0936\u092c\u094d\u0926\u094b\u0902 \u0915\u0947 \u092c\u093e\u0930\u0947 \u092e\u0947\u0902 \u091c\u093e\u0928\u0947\u0902\u0917\u0947\u0964 \u0926\u0947\u0916\u093f\u092f\u0947, \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u092e\u094d\u092e\u0940 \u0939\u0948\u0902\u0964 \u0027\u092e\u092e\u094d\u092e\u0940\u0027 \u0936\u092c\u094d\u0926 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948\u0964 \u0905\u092c \u0926\u0947\u0916\u093f\u092f\u0947 \u092f\u0947 \u0915\u094d\u092f\u093e \u0939\u0948? \u092f\u0947 \u092e\u091f\u0930 \u0939\u0948\u0964 \u0915\u094d\u092f\u093e \u0906\u092a \u091c\u093e\u0928\u0924\u0947 \u0939\u0948\u0902 \u0915\u0940 \u092e\u091f\u0930 \u0915\u093f\u0938\u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948?", "role": "model"}, {"message": "nahi pata", "role": "user"}], "description": "Child admits to not knowing the answer", "expected_behavior": "LLM should comfort the child with \u0027It\u0027s okay\u0027 and rephrase the question in a simple, clear and direct way", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_answers_i_dont_know", "stage": "teach", "teaching_method": "daily_conversation"}
child_answers_in_english - LLM should tell the child that the answer is correct but also nudging the child to answer in HindiRun Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "\u0939\u093e\u0901", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092a\u0939\u0932\u0947 \u0939\u092e \u0915\u0941\u091b \u091a\u0940\u091c\u0947\u0902 \u0926\u0947\u0916\u0947\u0902\u0917\u0947 \u091c\u094b \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0902\u0964 \u0926\u0947\u0916\u094b \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091b\u0932\u0940 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "Fish", "role": "user"}], "description": "Child answers the question correctly in English", "expected_behavior": "LLM should tell the child that the answer is correct but also nudging the child to answer in Hindi", "flaky": true, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_answers_in_english", "stage": "teach", "teaching_method": "daily_conversation"}
engage_in_fun_interaction - LLM should tell that the child answered correctly and share a simple fun fact about the shown object, for example, it's habitat, color, where to find it and so on.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "\u0939\u093e\u0901", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092a\u0939\u0932\u0947 \u0939\u092e \u0915\u0941\u091b \u091a\u0940\u091c\u0947\u0902 \u0926\u0947\u0916\u0947\u0902\u0917\u0947 \u091c\u094b \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0902\u0964 \u0926\u0947\u0916\u094b \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091b\u0932\u0940 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "Macchli", "role": "user"}], "description": "Asking the child where the fish lives or a fun fact about the object like peas are green.", "expected_behavior": "LLM should tell that the child answered correctly and share a simple fun fact about the shown object, for example, it\u0027s habitat, color, where to find it and so on.", "flaky": true, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "engage_in_fun_interaction", "stage": "teach", "teaching_method": "daily_conversation"}
no_response_detected - LLM should try to reinitiate the conversation, for example, by saying 'Sorry I didn't catch that' or nudging the child to ask if they are able to understand and then repeating the question.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "\u0939\u093e\u0901", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092a\u0939\u0932\u0947 \u0939\u092e \u0915\u0941\u091b \u091a\u0940\u091c\u0947\u0902 \u0926\u0947\u0916\u0947\u0902\u0917\u0947 \u091c\u094b \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0902\u0964 \u0926\u0947\u0916\u094b \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091b\u0932\u0940 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "", "role": "user"}], "description": "Child didn\u0027t say anything or transcription didn\u0027t work", "expected_behavior": "LLM should try to reinitiate the conversation, for example, by saying \u0027Sorry I didn\u0027t catch that\u0027 or nudging the child to ask if they are able to understand and then repeating the question.", "flaky": true, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "no_response_detected", "stage": "teach", "teaching_method": "daily_conversation"}
child_mumbles_the_answer - LLM should be able to auto-correct child's response to the closest correct answer.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0939\u0947\u0932\u094d\u0932\u094b! \u0906\u091c \u0939\u092e \u090f\u0915 \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0916\u0947\u0932 \u0916\u0947\u0932\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0906\u092a \u0924\u0948\u092f\u093e\u0930 \u0939\u094b?", "role": "model"}, {"message": "\u0939\u093e\u0901", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092a\u0939\u0932\u0947 \u0939\u092e \u0915\u0941\u091b \u091a\u0940\u091c\u0947\u0902 \u0926\u0947\u0916\u0947\u0902\u0917\u0947 \u091c\u094b \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0902\u0964 \u0926\u0947\u0916\u094b \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091b\u0932\u0940 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "machhi", "role": "user"}, {"message": "\u0935\u093e\u0939! \u092c\u0939\u0941\u0924 \u092c\u0922\u093c\u093f\u092f\u093e! \u092f\u0939 \u090f\u0915 \u092e\u091b\u0932\u0940 \u0939\u0948, \u0914\u0930 \u092e\u091b\u0932\u0940 \u0027\u092e\u0027 \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u0940 \u0939\u0948\u0964 \u0905\u092c \u0926\u0947\u0916\u094b, \u092f\u0939 \u0915\u094d\u092f\u093e \u0939\u0948? [\u092e\u091f\u0930 \u0926\u093f\u0916\u093e\u090f\u0901]", "role": "model"}, {"message": "mtorrrr", "role": "user"}], "description": "The child does not come up with new objects starting from the asked letter", "expected_behavior": "LLM should be able to auto-correct child\u0027s response to the closest correct answer.", "flaky": true, "learning_competency": "LC2", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "child_mumbles_the_answer", "stage": "teach", "teaching_method": "daily_conversation"}
welcome_negative_response - LLM should still proceed with the session.Run Test
{"chat_history": [{"message": "\u0928\u092e\u0938\u094d\u0924\u0947!", "role": "user"}, {"message": "\u0905\u0930\u0947 \u0935\u093e\u0939! \u0924\u0941\u092e \u0906 \u0917\u090f! \u0906\u091c \u0939\u092e \u0938\u093e\u0925 \u092e\u0947\u0902 \u0915\u0941\u091b \u092e\u091c\u093c\u0947\u0926\u093e\u0930 \u0938\u0940\u0916\u0928\u0947 \u0935\u093e\u0932\u0947 \u0939\u0948\u0902! \u0915\u094d\u092f\u093e \u0924\u0941\u092e \u0924\u0948\u092f\u093e\u0930 \u0939\u094b? \u091a\u0932\u094b, \u0916\u094b\u091c\u092c\u0940\u0928 \u092a\u0930 \u091a\u0932\u0924\u0947 \u0939\u0948\u0902!", "role": "model"}, {"message": "\u0928\u0939\u0940\u0902 \u0924\u0948\u092f\u093e\u0930 \u0939\u094b", "role": "user"}], "description": "Tutor starts with a friendly greeting, and the user responds normally.", "expected_behavior": "LLM should still proceed with the session.", "flaky": false, "scenario_id": "welcome_negative_response", "stage": "welcome"}
child_repeat_the_question - LLM should gently repeat the question.Run Test
🤢
llm_long_response - LLM reponse should be limited to 3 sentences. The last statement should be a question.Run Test
🤢
{"chat_history": [{"message": "hello", "role": "user"}], "description": "The child says hello.", "expected_behavior": "LLM reponse should be limited to 3 sentences. The last statement should be a question.", "flaky": true, "learning_competency": "LC1", "learning_module": "counting", "letter": "Ma", "scenario_id": "llm_long_response", "stage": "teach", "teaching_method": "counting_and_sorting"}
teach_methods_tests
start_daily - Start the conversation by asking the child a questions to know them personally like their interests, preferences or name.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Start the teaching activity using daily conversation method", "expected_behavior": "Start the conversation by asking the child a questions to know them personally like their interests, preferences or name.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "start_daily", "stage": "teach", "teaching_method": "daily_conversation"}
daily_reply_inrelevent_interests - The model should properly connect the child's interest with the learning objective and also be factual.Run Test
🤢
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0928\u092e\u0938\u094d\u0924\u0947 \u092c\u091a\u094d\u091a\u094b\u0902! \u0906\u092a\u0915\u093e \u0928\u093e\u092e \u0915\u094d\u092f\u093e \u0939\u0948? \u0906\u092a\u0915\u0940 \u092a\u0938\u0902\u0926\u0940\u0926\u093e \u091a\u0940\u091c\u093c \u0915\u094d\u092f\u093e \u0939\u0948?", "role": "model"}, {"message": "gulabjamun", "role": "user"}], "description": "The child mentioned a interest that is irrelevant to the learning objective.", "expected_behavior": "The model should properly connect the child\u0027s interest with the learning objective and also be factual.", "flaky": true, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "Ma", "scenario_id": "daily_reply_inrelevent_interests", "stage": "teach", "teaching_method": "daily_conversation"}
counting_reply_incorrect - The model should be able to detect that the child answered incorrectly and nudge the child towards the correct response.Run Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "{\"reasoning\": \"\u0027\u0027\u0027* **Step 1: Read the question:** The child has expressed readiness to learn. * **Step 2: Verify factual correctness:** The child\u0027s response is positive. * **Step 3: Determine feedback:** The next step is to introduce the counting with visual aids. * **Step 4: Provide appropriate feedback:** Start with showing one object and asking the child to count.\u0027\u0027\u0027\", \"response\": \"\\u0935\\u093e\\u0939! \\u091a\\u0932\\u094b \\u0936\\u0941\\u0930\\u0942 \\u0915\\u0930\\u0924\\u0947 \\u0939\\u0948\\u0902! \\u0926\\u0947\\u0916\\u093f\\u090f, \\u092f\\u0939\\u093e\\u0901 \\u090f\\u0915 \\u0938\\u0947\\u092c \\u0939\\u0948\\u0964 \\u0915\\u093f\\u0924\\u0928\\u0947 \\u0938\\u0947\\u092c \\u0939\\u0948\\u0902?\", \"user_input\": \"yes\", \"image_urls\": [\"https://storage.googleapis.com/saheli/images/alphabet/apple.png\"], \"praise\": null, \"conversation_ended\": \"no\", \"response_en\": \"Wow! Let\u0027s start! Look, here is one apple. How many apples are there?\"}", "role": "model"}, {"message": "do", "role": "user"}], "description": "The child replied incorrectly to the counting objects question.", "expected_behavior": "The model should be able to detect that the child answered incorrectly and nudge the child towards the correct response.", "flaky": false, "learning_competency": "LC1", "learning_module": "counting", "letter": "Ma", "scenario_id": "counting_reply_incorrect", "stage": "teach", "teaching_method": "counting_and_sorting"}
teach_methods_welcome - LLM should engage the child in the session and ask if the child is ready.Run Test
{"chat_history": [{"message": "hi", "role": "user"}], "description": "Child says Hi", "expected_behavior": "LLM should engage the child in the session and ask if the child is ready.", "flaky": false, "learning_competency": "LC1", "learning_module": "counting", "letter": "Ma", "scenario_id": "teach_methods_welcome", "stage": "teach", "teaching_method": "counting_and_sorting"}
test_factual
counting_1-5 - LLM should gently indicate that the answer is incorrect.Run Test
pa_incorrect_color_response - The LLM should gently inform the user that the answer is incorrectRun Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0928\u092e\u0938\u094d\u0924\u0947! \u0906\u092a\u0915\u093e \u0928\u093e\u092e \u0915\u094d\u092f\u093e \u0939\u0948? \u0906\u092a\u0915\u094b \u0915\u094c\u0928 \u0938\u0947 \u0930\u0902\u0917, \u091c\u093e\u0928\u0935\u0930 \u092f\u093e \u0916\u093f\u0932\u094c\u0928\u0947 \u0938\u092c\u0938\u0947 \u091c\u093c\u094d\u092f\u093e\u0926\u093e \u092a\u0938\u0902\u0926 \u0939\u0948\u0902?", "role": "model"}, {"message": "mera nam arjun, mujhe colors pasand hai", "role": "user"}, {"message": "\u092c\u0939\u0941\u0924 \u092c\u0922\u093c\u093f\u092f\u093e \u0905\u0930\u094d\u091c\u0941\u0928! \u0930\u0902\u0917\u094b\u0902 \u0915\u0940 \u092c\u093e\u0924 \u0915\u0930\u0924\u0947 \u0939\u0948\u0902\u0964 \u0915\u094d\u092f\u093e \u0906\u092a \u092c\u0924\u093e \u0938\u0915\u0924\u0947 \u0939\u0948\u0902 \u0915\u093f \u0906\u0938\u092e\u093e\u0928 \u0915\u093e \u0930\u0902\u0917 \u0915\u094c\u0928 \u0938\u093e \u0939\u094b\u0924\u093e \u0939\u0948 \u091c\u094b \u0027\u092a\u0027 \u0915\u0940 \u0906\u0935\u093e\u091c\u093c \u0938\u0947 \u0936\u0941\u0930\u0942 \u0939\u094b\u0924\u093e \u0939\u0948?", "role": "model"}, {"message": "blue", "role": "user"}], "description": "Teaching letter \u0027pa\u0027 - incorrect response when asking about colors", "expected_behavior": "The LLM should gently inform the user that the answer is incorrect", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "pa", "scenario_id": "pa_incorrect_color_response", "stage": "teach", "teaching_method": "daily_conversation"}
test_inappropriate_words
llm_uses_curse_words - LLM should not use curse words in coversation and should tell the user not to use it again. It should then seamlessly continue the lesson in a supportive and encouraging manner.Run Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0928\u092e\u0938\u094d\u0924\u0947! \u0906\u092a\u0915\u093e \u0928\u093e\u092e \u0915\u094d\u092f\u093e \u0939\u0948? \u0906\u092a\u0915\u094b \u0915\u094d\u092f\u093e \u092a\u0938\u0902\u0926 \u0939\u0948? \u091c\u093e\u0928\u0935\u0930, \u0930\u0902\u0917 \u092f\u093e \u0916\u093f\u0932\u094c\u0928\u0947?", "role": "model"}, {"message": "kameena", "role": "user"}], "description": "llm response consists of cusrse words", "expected_behavior": "LLM should not use curse words in coversation and should tell the user not to use it again. It should then seamlessly continue the lesson in a supportive and encouraging manner.", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "ka", "scenario_id": "llm_uses_curse_words", "stage": "teach", "teaching_method": "daily_conversation"}
llm_praises_inappropriate_words_hindi - LLM should not use curse words in coversation and should tell the user not to use it again.Run Test
{"chat_history": [{"message": "hi", "role": "user"}, {"message": "\u0928\u092e\u0938\u094d\u0924\u0947! \u0906\u092a\u0915\u093e \u0928\u093e\u092e \u0915\u094d\u092f\u093e \u0939\u0948? \u0906\u092a\u0915\u094b \u0915\u094d\u092f\u093e \u092a\u0938\u0902\u0926 \u0939\u0948? \u091c\u093e\u0928\u0935\u0930, \u0930\u0902\u0917 \u092f\u093e \u0916\u093f\u0932\u094c\u0928\u0947?", "role": "model"}, {"message": "\u0915\u092e\u0940\u0928\u093e", "role": "user"}], "description": "llm response consists of cusrse words", "expected_behavior": "LLM should not use curse words in coversation and should tell the user not to use it again.", "flaky": false, "learning_competency": "LC1", "learning_module": "letter_recognition", "letter": "ka", "scenario_id": "llm_praises_inappropriate_words_hindi", "stage": "teach", "teaching_method": "daily_conversation"}
welcome_tests
welcome_start - LLM should gives a Warm and Enthusiastic Welcome and ask the child to say yes if the child is ready.Run Test
{"chat_history": [{"message": "start", "role": "user"}], "description": "Tutor welcomes the user", "expected_behavior": "LLM should gives a Warm and Enthusiastic Welcome and ask the child to say yes if the child is ready.", "flaky": false, "scenario_id": "welcome_start", "stage": "welcome"}
welcome_positive_response - LLM should announce the session start.Run Test
{"chat_history": [{"message": "\u0928\u092e\u0938\u094d\u0924\u0947!", "role": "user"}, {"message": "{\"response\": \"\\u0906\\u091c \\u0939\\u092e \\u0938\\u093e\\u0925 \\u092e\\u0947\\u0902 \\u0915\\u0941\\u091b \\u092e\\u091c\\u093c\\u0947\\u0926\\u093e\\u0930 \\u0938\\u0940\\u0916\\u0928\\u0947 \\u0935\\u093e\\u0932\\u0947 \\u0939\\u0948\\u0902! \\u0915\\u094d\\u092f\\u093e \\u0924\\u0941\\u092e \\u0924\\u0948\\u092f\\u093e\\u0930 \\u0939\\u094b?\"}", "role": "model"}, {"message": "\u0939\u093e\u0901 \u0924\u0948\u092f\u093e\u0930 \u0939\u094b", "role": "user"}], "description": "Tutor welcomes the user, and the user responds positively.", "expected_behavior": "LLM should announce the session start.", "flaky": false, "scenario_id": "welcome_positive_response", "stage": "welcome"}