t5-base-conceptor / README.md
flexudy's picture
Update README.md
e61b199
|
raw
history blame
No virus
5.24 kB
# Flexudy's Conceptor: Towards Neuro-Symbolic Language Understanding
![alt text](https://www.flexudy.com/wp-content/uploads/2021/09/conceptor.png "Flexudy's conceptor")
At [Flexudy](https://flexudy.com), we look for ways to unify symbolic and sub-symbolic methods to improve model interpretation and inference.
## Problem
1. Word embeddings are awesome 🚀. However, no one really knows what an array of 768 numbers means?
2. Text/Token classification is also awesome ❤️‍. Still, classifying things into a finite set of concepts is rather limited.
3. Last but not least, how do I know that the word *cat* is a **mammal** and also an **animal** if my neural network is only trained to predict whether something is an animal or not?
## Solution
1. It would be cool if my neural network would just know that **cat** is an **animal** right? *∀x.Cat(x) ⇒ Animal(x)*.
Or for example, (*∀x.SchöneBlumen(x) ⇒ Blumen(x)*) -- English meaning: For all x, If x is a beautiful flower, then x is still a flower. --
2. All of a sudden, tasks like **Question Answering**, **Summarization**, **Named Entity Recognition** or even **Intent Classification** etc become easier right?
Well, one might probably still need time to build a good and robust solution that is not as large as **GPT3**.
Like [Peter Gärdenfors, author of conceptual spaces](https://www.goodreads.com/book/show/1877443.Conceptual_Spaces), we are trying to find ways to navigate between the symbolic and the sub-symbolic by thinking in concepts.
Should such a solution exist, one could easily leverage true logical reasoning engines on natural language.
How awesome would that be? 💡
## Flexudy's Conceptor
1. We developed a poor man's implementation of the ideal solution described above.
2. Though it is a poor man's model, **it is still a useful one** 🤗.
### Usage
No library should anyone suffer. Especially not if it is built on top of **HF Transformers**.
Go to the [Github repo](https://github.com/flexudy/natural-language-logic)
```python
from core.conceptor.start import FlexudyConceptInferenceMachineFactory
# Load me only once
concept_inference_machine = FlexudyConceptInferenceMachineFactory.get_concept_inference_machine()
# A list of terms.
terms = ["cat", "dog", "economics and sociology", "public company"]
# If you don't pass the language, a language detector will attempt to predict it for you
# If any error occurs, the language defaults to English.
language = "en"
# Predict concepts
# You can also pass the batch_size=2 and the beam_size=4
concepts = concept_inference_machine.infer_concepts(terms, language=language)
```
Output:
```{<br/>'cat': ['mammal', 'animal'], <br/> 'dog': ['hound', 'animal'], <br/>'economics and sociology': ['both fields of study'], <br/>'public company': ['company']<br/>}```
### How was it trained?``
1. Using Google's T5-base and T5-small. Both models are released on the Hugging Face Hub.
2. T5-base was trained for only two epochs while T5-small was trained for 5 epochs.
## Where did you get the data?
1. I extracted and curated a fragment of [Conceptnet](https://conceptnet.io/)
2. In particular, only the IsA relation was used.
3. Note that one thing can belong to multiple concepts (which is pretty cool if you think about [Fuzzy Description Logics](https://lat.inf.tu-dresden.de/~stefborg/Talks/QuantLAWorkshop2013.pdf)).
Multiple inheritances however mean some terms belong to so many concepts. Hence, I decided to randomly throw away some due to the **maximum length limitation**.
### Setup
1. I finally allowed only `2` to `4` concepts at random for each term. This means, there is still great potential to make the models generalise better 🚀.
3. I used a total of `279884` training examples and `1260` for testing. Edges -- i.e `IsA(concept u, concept v)` -- in both sets are disjoint.
4. Trained for `15K` steps with learning rate linear decay during each step. Starting at `0.001`
5. Used `RAdam Optimiser` with weight_decay =`0.01` and batch_size =`36`.
6. Source and target max length were both `64`.
### Multilingual Models
1. The "conceptor" model is multilingual. English, German and French is supported.
2. [Conceptnet](https://conceptnet.io/) supports many languages, but I just chose those three because those are the ones I speak.
### Metrics for flexudy-conceptor-t5-base
| Metric | Score |
| ------------- |:-------------:|
| Exact Match | 36.67 |
| F1 | 43.08 |
| Loss smooth | 1.214 |
Unfortunately, we no longer have the metrics for flexudy-conceptor-t5-small. If I recall correctly, base was just slightly better on the test set (ca. `2%` F1).
## Why not just use the data if you have it structured already?
Conceptnet is very large. Even if you just consider loading a fragment into your RAM, say with only 100K edges, this is still a large graph.
Especially, if you think about how you will save the node embeddings efficiently for querying.
If you prefer this approach, [Milvus](https://github.com/milvus-io/pymilvus) can be of great help.
You can compute query embeddings and try to find the best match. From there (after matching), you can navigate through the graph at `100%` precision.