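The problem with neural networks (and with logistic regression, if you train it with a non-convex cost such as squared error instead of the usual log loss) is the following: gradient-based training can easily get stuck in a local optimum of the cost function.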
Two ways to deal with it:
1. Randomly initialize your weights several times and see which run performs best (i.e., which one ends up in the best local optimum)
2. Switch to a different learning algorithm
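
To make option 1 concrete, here is a minimal sketch of random restarts. The one-dimensional `cost` below is a made-up toy function with two minima, standing in for a real network's loss; the names `gradient_descent` and `best_of_restarts`, the learning rate, and the step count are all assumptions chosen for illustration.

```python
import random

def cost(w):
    # Toy non-convex cost with a local and a global minimum
    # (hypothetical stand-in for a network's loss surface).
    return w**4 - 3 * w**2 + w

def grad(w):
    # Derivative of the toy cost above.
    return 4 * w**3 - 6 * w + 1

def gradient_descent(w0, lr=0.01, steps=500):
    # Plain gradient descent from a single starting weight.
    w = w0
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def best_of_restarts(starts):
    # Train once per initial weight and keep the run with the lowest final cost.
    finals = [gradient_descent(w0) for w0 in starts]
    return min(finals, key=cost)

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
starts = [rng.uniform(-2.0, 2.0) for _ in range(10)]
best_w = best_of_restarts(starts)
print(f"best w = {best_w:.3f}, cost = {cost(best_w):.3f}")
```

A single run started at w = 2 settles in the shallower minimum, while a run started at w = -2 finds the deeper one; keeping the best of several starts makes it much more likely that at least one run lands in the better basin, though it is still not a guarantee.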
Btw, I don't understand why people avoid linear regression with polynomial features. At least its squared-error cost is convex, so it converges to the global optimum.
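
As a rough illustration of that last point, the sketch below fits a degree-2 polynomial with plain gradient descent on the mean squared error. The data, the `features` and `fit` helpers, and the hyperparameters are all hypothetical; the point is only that the model is linear in its weights, so different random starting points should end up at the same solution.

```python
import random

# Hypothetical noise-free data generated from y = 1 + 2x + 3x^2.
xs = [-1.0, -0.5, 0.0, 0.5, 1.0]
ys = [1 + 2 * x + 3 * x**2 for x in xs]

def features(x):
    # Degree-2 polynomial features; the model stays linear in the weights.
    return [1.0, x, x**2]

def fit(w0, lr=0.1, steps=2000):
    # Gradient descent on the mean squared error, starting from w0.
    w = list(w0)
    n = len(xs)
    for _ in range(steps):
        g = [0.0] * len(w)
        for x, y in zip(xs, ys):
            phi = features(x)
            err = sum(wi * pi for wi, pi in zip(w, phi)) - y
            for j, pj in enumerate(phi):
                g[j] += 2 * err * pj / n
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

rng = random.Random(1)  # fixed seed only so the sketch is repeatable
for _ in range(3):
    w0 = [rng.uniform(-5, 5) for _ in range(3)]
    print([round(wi, 4) for wi in fit(w0)])
```

Each run should recover weights close to 1, 2 and 3 regardless of where it starts, which is exactly the behaviour random restarts are trying to approximate for non-convex models.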