LeroyDyer committed on
Commit d4b7c4e
1 Parent(s): 15519b0

Update README.md

Files changed (1)
  1. README.md +25 -20
README.md CHANGED
@@ -36,25 +36,6 @@ transformers.AutoModelForCausalLM.from_pretrained
 ```
 
 ### Model Description
-
-
-
- This is the model card of a 🤗 transformers model that has been pushed to the Hub.
- Previous vision models have been hit-or-miss, as a multimodal model actually requires a lot of memory, GPU, and hard-drive space to create;
- past versions were attempts to merge the vision capabilities into the main Mistral model while still retaining its Mistral tag!
- After reading many Hugging Face articles:
-
- The backbone issue is the main obstacle when creating multimodal models.
-
- With the advent of tiny models, we are able to leverage the decoder's abilities as a single, expert-like component within the model,
- by reducing the decoder to a fully trained tiny model.
- This will only produce decodings, not conversations, so it needs to be smart and respond with defined answers; in general it will produce captions, and with domain-based training it may be specialized in medical imagery, art, etc.
-
- The main LLM still needs to retain these models within it, hence the backbone method of instantiating a VisionEncoderDecoder model instead of a LLaVA model, which still needs wrangling to work correctly without spoiling the original transformers installation.
- Previous experiments proved that the Mistral large model could be used as a decoder, but the total model jumped to 13B; when the tiny model was applied instead, the total was only affected by that model's weight, 248M parameters.
-
-
-
 This is an experiment in vision - the model has been created as a mistral/VisionEncoder/Decoder
 
 Customized from:
@@ -80,6 +61,25 @@ Encoder:
 - **Language(s) (NLP):** [English]
 
 
+ ## Summary
+
+ This is the model card of a 🤗 transformers model that has been pushed to the Hub.
+ Previous vision models have been hit-or-miss, as a multimodal model actually requires a lot of memory, GPU, and hard-drive space to create;
+ past versions were attempts to merge the vision capabilities into the main Mistral model while still retaining its Mistral tag!
+ After reading many Hugging Face articles:
+
+ The backbone issue is the main obstacle when creating multimodal models.
+
+ With the advent of tiny models, we are able to leverage the decoder's abilities as a single, expert-like component within the model,
+ by reducing the decoder to a fully trained tiny model.
+ This will only produce decodings, not conversations, so it needs to be smart and respond with defined answers; in general it will produce captions, and with domain-based training it may be specialized in medical imagery, art, etc.
+
+ The main LLM still needs to retain these models within it, hence the backbone method of instantiating a VisionEncoderDecoder model instead of a LLaVA model, which still needs wrangling to work correctly without spoiling the original transformers installation.
+ Previous experiments proved that the Mistral large model could be used as a decoder, but the total model jumped to 13B; when the tiny model was applied instead, the total was only affected by that model's weight, 248M parameters.
+
+
+
+
 ## How to Get Started with the Model
 
 
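To make the backbone approach described in the summary concrete: below is a minimal sketch of pairing a pretrained vision encoder with a small pretrained causal-LM decoder via `VisionEncoderDecoderModel`. It assumes the standard transformers API; the checkpoint names are placeholders, not the exact ones used for this model.

``` python
from transformers import AutoTokenizer, VisionEncoderDecoderModel

# Pair a pretrained ViT encoder with a small pretrained causal-LM decoder.
# transformers inserts randomly initialised cross-attention layers into the
# decoder, which is what lets the combined model learn captioning.
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    "google/vit-base-patch16-224-in21k",  # placeholder vision encoder
    "path/to/tiny-mistral-decoder",       # hypothetical tiny decoder checkpoint
)

tokenizer = AutoTokenizer.from_pretrained("path/to/tiny-mistral-decoder")

# Generation needs to know how target sequences start and how padding works.
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id or tokenizer.eos_token_id
```

Because only the decoder is swapped for a tiny model, the combined checkpoint grows by roughly that decoder's weight (the 248M figure mentioned above) rather than ballooning to 13B as with the large-decoder experiment.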
@@ -168,7 +168,12 @@ loss = model(pixel_values=pixel_values, labels=labels).loss
 ```
 
 
- ### Model Architecture and Objective
+ ### Model Architecture
+
+
+ Aha! Here is how you create such a model:
+
+
 
 ``` python
 
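# (The original code block is truncated at this point in the diff. What follows
#  is a hedged sketch of the creation-and-use step it introduces, assuming the
#  standard transformers VisionEncoderDecoder API; the checkpoint names are
#  placeholders, not the model's own.)
import torch
from PIL import Image
from transformers import (AutoTokenizer, ViTImageProcessor,
                          VisionEncoderDecoderModel)

encoder_id = "google/vit-base-patch16-224-in21k"  # placeholder encoder
decoder_id = "path/to/tiny-mistral-decoder"       # hypothetical tiny decoder

processor = ViTImageProcessor.from_pretrained(encoder_id)
tokenizer = AutoTokenizer.from_pretrained(decoder_id)
model = VisionEncoderDecoderModel.from_encoder_decoder_pretrained(
    encoder_id, decoder_id
)
model.config.decoder_start_token_id = tokenizer.bos_token_id
model.config.pad_token_id = tokenizer.pad_token_id or tokenizer.eos_token_id

image = Image.open("example.jpg").convert("RGB")
pixel_values = processor(images=image, return_tensors="pt").pixel_values

# Teacher-forced captioning loss, matching the hunk context above:
labels = tokenizer("a caption for this image", return_tensors="pt").input_ids
loss = model(pixel_values=pixel_values, labels=labels).loss

# Inference: generate a caption for the image.
with torch.no_grad():
    generated = model.generate(pixel_values, max_new_tokens=32)
print(tokenizer.batch_decode(generated, skip_special_tokens=True)[0])
```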