Prithiviraj, thanks for the kind words. As for NER systems for English text only recognizing named entities in title case, that’s usually a function of how they are trained. If they’re trained on sentences from grammatical long-form documents, then it’s reasonable for them to expect named entities to be in title case. Indeed, restricting their attention to title case strings and phrases is a great way to improve both accuracy and efficiency.

But you can certainly train a model that ignores case. For example, Stanford NLP includes a model for caseless English. And, as you’ve noted, Google also ignores case in its search queries.

How you train the model should be appropriate to where you will apply it. If you’re analyzing news articles, it’s probably a good idea to take advantage of capitalization as a signal. In contrast, most English search queries are lowercase.
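
To make that concrete, here is a minimal sketch in Python of how a token-level feature extractor for an NER tagger might switch capitalization features on or off. The function name `token_features`, the `use_case` flag, and the feature names are all hypothetical and not taken from any particular library; a real system would use many more features.

```python
def token_features(tokens, i, use_case=True):
    """Build a feature dict for tokens[i] (illustrative feature names)."""
    word = tokens[i]
    # Case-insensitive features are always available.
    feats = {
        "lower": word.lower(),
        "suffix3": word.lower()[-3:],
    }
    if use_case:
        # Capitalization signals: useful for edited text such as news,
        # uninformative (or misleading) for lowercase search queries.
        feats["is_title"] = word.istitle()
        feats["is_upper"] = word.isupper()
        feats["sentence_start"] = i == 0
    return feats


# News-style input: keep the capitalization signal.
news = token_features(["Visit", "Paris", "today"], 1, use_case=True)

# Query-style input: train and apply a caseless feature set.
query = token_features(["visit", "paris", "today"], 1, use_case=False)
```

With `use_case=False`, "Paris" and "paris" produce identical features, so the model has to rely on the words themselves and their context rather than on capitalization.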

Hope that helps!

