# Towards Neuro-Symbolic Language Understanding

![alt text](https://www.flexudy.com/wp-content/uploads/2021/09/conceptor.png "Flexudy's conceptor")

At [Flexudy](https://flexudy.com), we look for ways to unify symbolic and sub-symbolic methods to improve model interpretation and inference.

## Problem

1. Word embeddings are awesome 🚀. However, no one really knows what an array of 768 numbers means.
2. Text/token classification is also awesome ❤️. Still, classifying things into a finite set of concepts is rather limited.
3. Last but not least, how do I know that the word *cat* is a **mammal** and also an **animal** if my neural network is only trained to predict whether something is an animal or not?

## Solution

1. It would be cool if my neural network would just know that **cat** is an **animal**, right? *∀x.Cat(x) ⇒ Animal(x)*. Or, for example, *∀x.SchöneBlumen(x) ⇒ Blumen(x)* (in English: for all x, if x is a beautiful flower, then x is still a flower).
2. All of a sudden, tasks like **Question Answering**, **Summarization**, **Named Entity Recognition** or even **Intent Classification** become easier, right?

Well, one would probably still need time to build a good and robust solution that is not as large as **GPT3**. Like [Peter Gärdenfors, author of conceptual spaces](https://www.goodreads.com/book/show/1877443.Conceptual_Spaces), we are trying to find ways to navigate between the symbolic and the sub-symbolic by thinking in concepts. Should such a solution exist, one could easily leverage true logical reasoning engines on natural language. How awesome would that be? 💡

## Flexudy's Conceptor

1. We developed a poor man's implementation of the ideal solution described above.
2. Though it is a poor man's model, **it is still a useful one** 🤗.

### Usage

No one should have to suffer through a library, especially not one built on top of 🤗 **HF Transformers**.
Go to the [GitHub repo](https://github.com/flexudy/natural-language-logic).

`pip install git+https://github.com/flexudy/natural-language-logic.git@v0.0.1`

```python
from flexudy.conceptor.start import FlexudyConceptInferenceMachineFactory

# Load me only once
concept_inference_machine = FlexudyConceptInferenceMachineFactory.get_concept_inference_machine()

# A list of terms.
terms = ["cat", "dog", "economics and sociology", "public company"]

# If you don't pass the language, a language detector will attempt to predict it for you
# If any error occurs, the language defaults to English.
language = "en"

# Predict concepts
# You can also pass the batch_size=2 and the beam_size=4
concepts = concept_inference_machine.infer_concepts(terms, language=language)
```

Output:

```python
{'cat': ['mammal', 'animal'], 'dog': ['hound', 'animal'], 'economics and sociology': ['both fields of study'], 'public company': ['company']}
```

### How was it trained?

1. Using Google's T5-base and T5-small. Both models are released on the Hugging Face Hub.
2. T5-base was trained for only two epochs while T5-small was trained for 5 epochs.

## Where did you get the data?

1. I extracted and curated a fragment of [ConceptNet](https://conceptnet.io/).
2. In particular, only the IsA relation was used.
3. Note that one term can belong to multiple concepts (which is pretty cool if you think about [Fuzzy Description Logics](https://lat.inf.tu-dresden.de/~stefborg/Talks/QuantLAWorkshop2013.pdf)). Multiple inheritance, however, means that some terms belong to a very large number of concepts. Hence, I decided to randomly throw some away due to the **maximum length limitation**.

### Setup

1. I finally allowed only `2` to `4` concepts at random for each term. This means there is still great potential to make the models generalise better 🚀.
2. I used a total of `279884` training examples and `1260` for testing. Edges, i.e. `IsA(concept u, concept v)`, in the two sets are disjoint.
3. Trained for `15K` steps with linear learning-rate decay at each step, starting at `0.001`.
4. Used the `RAdam` optimiser with weight_decay = `0.01` and batch_size = `36`.
5. Source and target max length were both `64`.

### Multilingual Models

1. The "conceptor" model is multilingual. English, German and French are supported.
2. [ConceptNet](https://conceptnet.io/) supports many languages, but I just chose those three because those are the ones I speak.

### Metrics for flexudy-conceptor-t5-base

| Metric | Score |
| ------------- |:-------------:|
| Exact Match | 36.67 |
| F1 | 43.08 |
| Loss smooth | 1.214 |

Unfortunately, we no longer have the metrics for flexudy-conceptor-t5-small. If I recall correctly, base was just slightly better on the test set (ca. `2%` F1).

## Why not just use the data if you have it structured already?

ConceptNet is very large. Even if you only load a fragment into RAM, say with only 100K edges, this is still a large graph, especially if you think about how to store the node embeddings efficiently for querying. If you prefer this approach, [Milvus](https://github.com/milvus-io/pymilvus) can be of great help. You can compute query embeddings and try to find the best match. From there (after matching), you can navigate through the graph at `100%` precision.
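
To make the "match first, then walk the graph" idea concrete, here is a minimal sketch in plain Python. It is not how the conceptor model or Milvus works. The `IS_A` edges are a hypothetical toy fragment, `embed` is a stand-in that counts character trigrams instead of computing neural embeddings, and the linear scan in `best_match` takes the place of a real vector index. All function names are made up for illustration.

```python
import math
from collections import Counter, deque

# Hypothetical toy fragment of IsA edges; real edges would come from ConceptNet.
IS_A = {
    "cat": ["mammal"],
    "dog": ["mammal"],
    "mammal": ["animal"],
    "animal": [],
}


def embed(term):
    """Stand-in embedding: character trigram counts (an assumption, not a neural model)."""
    padded = f"  {term.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(count * b[gram] for gram, count in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Precompute one vector per graph node (a vector store would hold these).
NODE_VECTORS = {node: embed(node) for node in IS_A}


def best_match(query):
    """Approximate step: pick the node whose vector is closest to the query."""
    query_vector = embed(query)
    return max(NODE_VECTORS, key=lambda node: cosine(query_vector, NODE_VECTORS[node]))


def ancestors(node):
    """Exact step: breadth-first walk along IsA edges from a matched node."""
    seen, queue = [], deque(IS_A.get(node, []))
    while queue:
        current = queue.popleft()
        if current not in seen:
            seen.append(current)
            queue.extend(IS_A.get(current, []))
    return seen


matched = best_match("cats")
print(matched, ancestors(matched))  # cat ['mammal', 'animal']
```

Only the matching step is approximate. Once a node is chosen, every concept returned by `ancestors` follows from stored edges, which is where the `100%` precision comes from.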