Nevertheless, the problem of determining what plugins it needs is actually harder than writing these plugins itself.
That is very arguable.
You can have unit tests, execution-speed tests, and all kinds of other tests to see whether a plugin is good or not.
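To make that concrete, here is a rough Python sketch of the kind of check I mean. The names (`evaluate_plugin`, `good_plugin`, `bad_plugin`, `CASES`) and the idea that a plugin is just a one-argument function are my own assumptions for illustration, not anything from an actual plugin system:

```python
import time

def evaluate_plugin(plugin, cases, max_seconds):
    """Return (accepted, elapsed_seconds) for a candidate plugin.

    A plugin is accepted only if it gives the expected result for every
    test case AND the whole run fits inside the time budget.
    """
    start = time.perf_counter()
    passed = all(plugin(arg) == expected for arg, expected in cases)
    elapsed = time.perf_counter() - start
    return passed and elapsed <= max_seconds, elapsed

# Hypothetical candidates: both are supposed to double their input.
def good_plugin(x):
    return x * 2

def bad_plugin(x):
    return x + 2  # wrong for most inputs

# (input, expected output) pairs acting as the unit tests.
CASES = [(0, 0), (1, 2), (5, 10)]

if __name__ == "__main__":
    for candidate in (good_plugin, bad_plugin):
        accepted, elapsed = evaluate_plugin(candidate, CASES, max_seconds=1.0)
        print(candidate.__name__, "accepted" if accepted else "rejected")
```

Of course this only tells you whether a given plugin is good, not which plugins are needed in the first place.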
A program writing programs? I don't think we have that yet (well, there are evolutionary algorithms,
https://en.wikipedia.org/wiki/Evolutionary_algorithm, but from what I read in another DAC-related thread, they take huge computational power).