Category: Programming

  • NLP – Natural language processing

    Natural Language Processing, or NLP, is broadly defined as the automatic manipulation of natural language, such as speech and text, by software. One of the first steps in NLP is extracting tokens from text. Tokenization splits text into tokens, which are typically words. Usually, tokens are split based upon […]
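
    As a rough illustration of tokenization, here is a minimal sketch that splits on runs of whitespace. The excerpt is cut off before it names the delimiters, so whitespace is an assumption here, and the class and method names (SimpleTokenizer, tokenize) are hypothetical rather than taken from the post.

    ```java
    import java.util.ArrayList;
    import java.util.List;

    class SimpleTokenizer {
        // Assumption: tokens are separated by whitespace only.
        // Punctuation stays attached to the neighbouring word.
        static List<String> tokenize(String text) {
            List<String> tokens = new ArrayList<>();
            for (String piece : text.trim().split("\\s+")) {
                // An empty or all-whitespace input yields one empty piece; skip it.
                if (!piece.isEmpty()) {
                    tokens.add(piece);
                }
            }
            return tokens;
        }

        public static void main(String[] args) {
            // prints [Tokenization, splits, text, into, tokens.]
            System.out.println(tokenize("Tokenization splits text into tokens."));
        }
    }
    ```

    Note that the final token keeps its trailing period; a real NLP tokenizer would usually treat punctuation separately.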

  • Counting characters in Java

    There are many ways to count the number of characters in a String. Below is a simple, naive approach:
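
    The code from the original post is not included in this excerpt. One possible naive version, written here as an assumption, walks the string and counts every character that is not whitespace. The names CharCounter and countNonWhitespace are hypothetical.

    ```java
    class CharCounter {
        // Naive loop: counts every UTF-16 code unit that is not whitespace.
        // Characters outside the Basic Multilingual Plane count as two units.
        static int countNonWhitespace(String text) {
            int count = 0;
            for (int i = 0; i < text.length(); i++) {
                if (!Character.isWhitespace(text.charAt(i))) {
                    count++;
                }
            }
            return count;
        }

        public static void main(String[] args) {
            String sample = "hello world";
            System.out.println(sample.length());            // 11, includes the space
            System.out.println(countNonWhitespace(sample)); // 10
        }
    }
    ```

    If whitespace should be counted too, String.length() already gives the answer without a loop.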

  • Counting words in Java

    This is a simple way to count words in a string in Java. StringTokenizer automatically takes care of whitespace for us, such as tabs and carriage returns. In some cases, as in “he-man”, we’d want “he” and “man” to be counted as different words, but since there’s no whitespace between them, the defaults fail us. Fortunately, we can […]
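
    A minimal sketch of the StringTokenizer approach described above. The excerpt is truncated before it says how the post deals with “he-man”; passing an explicit delimiter string to StringTokenizer's two-argument constructor is shown here only as one possible way, not necessarily the author's. WordCounter and countWords are hypothetical names.

    ```java
    import java.util.StringTokenizer;

    class WordCounter {
        // Uses the default delimiters: space, tab, newline, carriage return, form feed.
        static int countWords(String text) {
            return new StringTokenizer(text).countTokens();
        }

        // Assumption: the caller supplies the full delimiter set, e.g. whitespace plus '-'.
        static int countWords(String text, String delimiters) {
            return new StringTokenizer(text, delimiters).countTokens();
        }

        public static void main(String[] args) {
            String sample = "he-man\tand  she-ra";
            System.out.println(countWords(sample));                // 3
            System.out.println(countWords(sample, " \t\n\r\f-"));  // 5
        }
    }
    ```

    With the default delimiters, “he-man” counts as one word; adding the hyphen to the delimiter string splits it into “he” and “man”.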