Limitations
Currently, LLMs and generative AI are very good at language generation or image generation in isolation. However, they are not efficient or accurate when multi-modal capabilities must be used in sequence. For example, generating a sequence of images relevant to long paragraphs or stories is not yet possible. This prevents us from generating the following, as they are still in the research phase:
MCQs with images
Stories with images
What is possible today?
Speech recognition and evaluation (helpful for evaluating students' pronunciation)
Text-to-speech (helpful for listening and for supporting teachers in a classroom)
Rapid creation of content based on learners' interests
Textual MCQs
Textual reading comprehension exercises