Category: Java
-
Apache OpenNLP – Tokenization
Tokenization is the process of segmenting a string into smaller parts called tokens (substrings). Usually, these tokens are words, numbers, or […]
-
Counting characters in Java
There are many ways to count the number of characters in a String. Below is a simple, naive approach:
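A minimal sketch of what such a naive approach might look like, not necessarily the one from the original post. The class name `CharCounter` and both method names are made up for illustration, and skipping whitespace in the second method is an assumption about what "characters" should mean here.

```java
class CharCounter {
    // Hypothetical helper: counts every UTF-16 char by walking the string.
    // This gives the same result as text.length().
    static int countAll(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            count++;
        }
        return count;
    }

    // Hypothetical helper: counts only the chars that are not whitespace.
    static int countNonWhitespace(String text) {
        int count = 0;
        for (int i = 0; i < text.length(); i++) {
            if (!Character.isWhitespace(text.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        String sample = "Hello World";
        System.out.println(countAll(sample));
        System.out.println(countNonWhitespace(sample));
    }
}
```

Both methods count UTF-16 code units, so a character outside the Basic Multilingual Plane (many emoji, for example) counts as two. If that matters, `text.codePointCount(0, text.length())` counts code points instead.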
-
Counting words in Java
This is a simple way to count words in a string in Java. StringTokenizer automatically takes care of whitespace for […]
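As a rough sketch of the idea, `java.util.StringTokenizer` with its default delimiters (space, tab, newline, carriage return, form feed) can be asked for its token count directly. The class name `WordCounter` and the method `countWords` are hypothetical, and treating every whitespace-separated run as a "word" is an assumption.

```java
import java.util.StringTokenizer;

class WordCounter {
    // Hypothetical helper: counts whitespace-separated tokens.
    // Leading, trailing, and repeated whitespace produce no empty tokens.
    static int countWords(String text) {
        return new StringTokenizer(text).countTokens();
    }

    public static void main(String[] args) {
        System.out.println(countWords("  the quick   brown fox "));
    }
}
```

Punctuation stays attached to the neighbouring word under this definition, so "fox," and "fox" are each one token. A `null` argument would throw a `NullPointerException`, so callers may want to check for that first.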