Limitations

Currently, LLMs and generative AI are very good at language generation or image generation in isolation. However, they are neither efficient nor accurate when multi-modal capabilities must be used in sequence. For example, generating a sequence of images that stays relevant to long paragraphs or stories is not yet possible. As a result, the following cannot be generated yet, as they are still in the research phase:
- MCQs with images
- Stories with images

Speech recognition and evaluation (helpful for evaluating students' pronunciation)
Text-to-speech (helpful for listening practice and for assisting teachers in the classroom)
Rapid creation of content based on learners' interests
Textual MCQs
Textual reading comprehension passages
