What is semantic similarity measurement?

So you are interested in Semantic Similarity...

Semantic similarity is one of the most challenging research topics faced by researchers and practitioners in the Artificial Intelligence community [1]. As a result, we are currently seeing a boom of new semantic similarity measures designed to complete this task effectively and efficiently.

The capability to automatically assess the degree of semantic similarity between text fragments has implications across a wide range of computational disciplines, including query expansion, machine translation, schema and ontology matching, document classification, and real-time chatbots.

It is therefore important to develop new methods that can address this challenge. Researchers currently have many methods at their disposal, some of them new and sophisticated, such as the calculation of word embeddings from a large text corpus using deep learning techniques.
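Once words have been mapped to embedding vectors, their similarity is typically scored with the cosine of the angle between them. The following is a minimal sketch of that idea; the three-dimensional vectors below are toy values chosen for illustration, not real embeddings (which usually have hundreds of dimensions learned from a corpus):

```python
from math import sqrt

def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = sqrt(sum(a * a for a in u))
    norm_v = sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Toy 3-dimensional "embeddings" for illustration only.
embeddings = {
    "car":    [0.90, 0.10, 0.00],
    "auto":   [0.85, 0.15, 0.05],
    "banana": [0.00, 0.20, 0.95],
}

print(cosine_similarity(embeddings["car"], embeddings["auto"]))    # close to 1
print(cosine_similarity(embeddings["car"], embeddings["banana"]))  # close to 0
```

The key point is that the score depends only on the direction of the vectors, not their length, so related words end up with scores near 1 regardless of how frequent they are.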

Of all the existing techniques, a small set stands out: the use of dictionaries such as WordNet, the so-called Google similarity distance, the fuzzy learning of semantic similarity controllers [2], and, most recently, semantic similarity aggregators.
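To make one of these concrete, the Google similarity distance (the Normalized Google Distance of Cilibrasi and Vitányi) scores two terms from search-engine page counts: terms that almost always co-occur get a distance near 0, while unrelated terms get a larger value. A minimal sketch follows; the hit counts in the example are hypothetical numbers chosen for illustration, not real search results:

```python
from math import log

def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from page-hit counts.

    fx, fy -- number of pages containing each term alone
    fxy    -- number of pages containing both terms
    n      -- (approximate) total number of indexed pages
    Returns 0 when the terms always co-occur; larger values mean less related.
    """
    lfx, lfy, lfxy = log(fx), log(fy), log(fxy)
    return (max(lfx, lfy) - lfxy) / (log(n) - min(lfx, lfy))

# Hypothetical hit counts: two terms that co-occur on half a million pages.
print(ngd(fx=1_000_000, fy=800_000, fxy=500_000, n=10_000_000_000))
```

Note that the distance is built entirely from logarithms of counts, so it needs no dictionary or corpus of its own, only frequency statistics from a large index.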

The purpose of this website is to explain, in a simple way, each of these methods, which are used in academia and industry by thousands of people every day. We will illustrate them with examples and figures that make the sometimes complex ins and outs behind them much easier to understand.

References:
[1] Jorge Martinez-Gil: An overview of textual semantic similarity measures based on web intelligence. Artif. Intell. Rev. 42(4): 935-943 (2014)

[2] Jorge Martinez-Gil, Jose M. Chaves-Gonzalez: Automatic design of semantic similarity controllers based on fuzzy logics. Expert Syst. Appl. 131: 45-59 (2019)