Prithiviraj, thanks for the kind words. As for NER systems for English text only recognizing named entities in title case, that’s usually a function of how they are trained. If they’re trained on sentences from grammatical long-form documents, then it’s reasonable for them to expect named entities to be in title case. Indeed, restricting their attention to title-case strings and phrases is a great way to improve both accuracy and efficiency on that kind of text.

But you can certainly train a model that ignores case. For example, Stanford NLP includes a model for caseless English: https://stanfordnlp.github.io/CoreNLP/download.html. And, as you’ve noted, Google also ignores case in its search queries.

How you train the model should be appropriate to where you will apply it. If you’re analyzing news articles, it’s probably a good idea to take advantage of capitalization as a signal. In contrast, most English search queries are lowercase.
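To make the trade-off concrete, here is a minimal sketch of what "restricting attention to title-case phrases" could look like as a candidate filter. This is only an illustration, not how any particular NER system works: the regular expression, the function name `title_case_candidates`, and the two sample strings are all made up for this example.

```python
import re

# Hypothetical heuristic: a candidate span is a run of one or more
# capitalized words (e.g. "Stanford University"). Real NER systems
# typically treat capitalization as one feature among many.
TITLE_CASE_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def title_case_candidates(text: str) -> list[str]:
    """Return runs of capitalized words as candidate entity spans."""
    return TITLE_CASE_RUN.findall(text)


# Made-up inputs: one news-style sentence, one lowercase search query.
news = "Yesterday, Barack Obama visited Stanford University in California."
query = "barack obama stanford visit"

print(title_case_candidates(news))
# ['Yesterday', 'Barack Obama', 'Stanford University', 'California']
print(title_case_candidates(query))
# []
```

On the news sentence, the filter narrows the text down to a handful of spans (plus the sentence-initial "Yesterday", which a real model would still have to reject). On the lowercase query it finds nothing at all, which is why a model meant for search queries needs to be trained without relying on case.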

Hope that helps!

High-Class Consultant.
