I looked at a couple of the example submissions, and my initial reaction is: isn't your self-learning network supposed to find those correlations itself, given enough data? Shouldn't you just be throwing everything at it and seeing what sticks? Anything related to the economy, markets, technology, and the cryptocurrency and token service industries, for starters. Widely distributed media are obviously more interesting than items nobody reads.
The legend for news data suggests the model could be fed seemingly irrelevant news that doesn't mention the asset in question. Over time, the model should de-emphasize news that has no effect and emphasize news that does have an effect. It shouldn't depend on humans deciding what's important and what's not. That defeats the purpose.
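
To make that concrete, here is a rough sketch of what I mean by a model de-emphasizing irrelevant inputs on its own. It is not your network or your data: it is a toy linear model with L1 regularization on synthetic numbers, and everything in it (the four feature columns, the helper names, the penalty and learning-rate values) is made up for illustration. Two columns drive the target and two are pure noise, standing in for news that has nothing to do with the asset.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero; anything within the threshold becomes 0."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_l1_linear(rows, targets, l1=0.1, lr=0.05, epochs=500):
    """Fit a linear model with an L1 penalty using proximal gradient descent."""
    n, d = len(rows), len(rows[0])
    weights = [0.0] * d
    for _ in range(epochs):
        # Gradient of the halved mean squared error.
        grads = [0.0] * d
        for x, y in zip(rows, targets):
            err = sum(w * xi for w, xi in zip(weights, x)) - y
            for j in range(d):
                grads[j] += err * x[j] / n
        # Gradient step, then the L1 shrinkage step.
        weights = [
            soft_threshold(w - lr * g, lr * l1)
            for w, g in zip(weights, grads)
        ]
    return weights


def make_synthetic_data(n_samples=400, seed=0):
    """Hypothetical data: columns 0 and 1 matter, columns 2 and 3 are noise."""
    rng = random.Random(seed)
    rows, targets = [], []
    for _ in range(n_samples):
        x = [rng.gauss(0, 1) for _ in range(4)]
        y = 2.0 * x[0] - 1.0 * x[1] + rng.gauss(0, 0.1)
        rows.append(x)
        targets.append(y)
    return rows, targets


rows, targets = make_synthetic_data()
weights = fit_l1_linear(rows, targets)
for name, w in zip(["relevant_a", "relevant_b", "noise_a", "noise_b"], weights):
    print(f"{name}: {w:+.3f}")
```

Nobody told the model which columns were noise. The penalty pushes their weights toward zero because they don't help predict the target, while the two useful columns keep large weights. A real network on real news would be far messier, but the principle is the same: feed it broadly and let training sort out what matters.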