# Limitations

* Currently, LLMs and generative AI are very good at generating language or images in isolation. However, they are not efficient or accurate at multi-modal generation in sequence. For example, generating a sequence of images that stays relevant to a long paragraph or story is not yet possible. This prevents us from generating the following, which are still in the research phase:
  * MCQs with images
  * Stories with images

### What is possible today?

* Speech recognition and evaluation (helpful for evaluating students’ pronunciation)
* Text-to-speech (helpful for listening and for supporting teachers in a classroom)
* Rapid creation of content based on learners’ interests
* Textual MCQs
* Textual reading comprehension exercises
