LeroyDyer committed on
Commit
02c6cbf
1 Parent(s): 9706654

Create README.md

Files changed (1):
  1. README.md +163 -0

README.md ADDED

---
license: mit
tags:
- Mistral_Star
- Mistral_Quiet
- Mistral
- Mixtral
- Question-Answer
- Token-Classification
- Sequence-Classification
- SpydazWeb-AI
- chemistry
- biology
- legal
- code
- climate
- medical
- text-generation-inference
language:
- en
- sw
- ig
- zu
- ca
- es
- pt
- ha
pipeline_tag: text-generation
---
# SpydazWeb AGI

This is based on the Quiet Star reasoning project, which was abandoned earlier in the year. :)

Current update:
This model is working, but essentially untrained. To load the model, `trust_remote_code=True` is required.
If it still does not load, you need to clone the transformers repository from GitHub:

```
! git clone https://github.com/huggingface/transformers.git
## copy modeling_mistral.py and configuration.py into transformers/src/transformers/models/mistral/, overwriting the existing files first.
## THEN:
! pip install ./transformers
```

Then restart the environment; the model can then load without `trust_remote_code` and will work fine.
It can even be trained, hence the 4-bit optimised version.
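
Once set up, loading follows the standard transformers pattern. A minimal sketch; the repository id below is a placeholder assumption (this README does not name the exact repo), so substitute the actual model page:

```python
# Minimal loading sketch. The repo id is a placeholder assumption, not
# confirmed by this README; replace it with the actual model page.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LeroyDyer/SpydazWeb_AGI"  # placeholder

tokenizer = AutoTokenizer.from_pretrained(model_id)
# trust_remote_code=True is only required if you did NOT install the
# patched transformers fork described above.
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Explain, step by step, why the sky is blue.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```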

# Introduction:

## STAR REASONERS!

This provides a platform for the model to communicate pre-response, so an internal objective can be set; adding this extra planning stage improves the model's focus and output.
The thought head can be charged with a thought or methodology, such as a statement to take a step-by-step approach to the problem, or to build an object-oriented model first and consider the use cases before creating an output.
Each thought head can therefore be dedicated to a specific purpose, such as planning, artifact generation, or use-case design, or even deciding which methodology should be applied before planning the potential solution route for the response.
Another head could be dedicated to retrieving content from the self based on the query, which can also be used in the pre-generation stages.
All pre-reasoners can be seen as self-guiding, essentially removing the requirement to give the model a system prompt and instead aligning the heads to thought pathways!
These chains produce data which can be considered thoughts, and can further be displayed by framing them with thought tokens, even allowing for editors' comments giving key guidance to the model during training.
These thoughts will be used in future generations, assisting the model as well as displaying explanatory information in the output.

These tokens can be displayed or withheld, also a setting in the model!
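
As a rough illustration of thought-token framing (the token strings below are assumptions in the spirit of Quiet-STaR, not the model's confirmed vocabulary):

```python
# Sketch: framing internal reasoning with thought tokens, Quiet-STaR style.
# The token strings are assumptions, not confirmed names from this model.
import re
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-v0.1")
tokenizer.add_special_tokens(
    {"additional_special_tokens": ["<|startthought|>", "<|endthought|>"]}
)

raw = (
    "<|startthought|> method: solve step by step and check units <|endthought|> "
    "The train covers 120 km in 2 h, so its speed is 60 km/h."
)
ids = tokenizer(raw).input_ids  # the thought markers now map to single token ids

# "Show thoughts" mode returns raw as-is; "hide thoughts" mode strips the
# framed span before presenting the response.
visible = re.sub(r"<\|startthought\|>.*?<\|endthought\|>\s*", "", raw, flags=re.S)
print(visible)  # -> The train covers 120 km in 2 h, so its speed is 60 km/h.
```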

### Can this be applied in other areas?

Yes! We can use this type of method to let the model generate code in another channel or head, potentially creating a head that produces artifacts for every output, or entity lists for every output, framing the outputs in their relevant code tags or function-call tags.
These can also be displayed or hidden in the response, but they can also be used internally in problem-solving tasks, which again enables the model to simulate the inputs and outputs of an interpreter!
It may even be prudent to include a function executor internal to the model (allowing the model to execute functions in the background before responding); this would also have to be specified in the config, as auto-execute or not.
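
A hypothetical sketch of such settings (none of these field names come from the model's actual configuration; they only illustrate the switches described above):

```python
# Hypothetical settings sketch; field names are illustrative assumptions,
# not taken from the real SpydazWeb config.
from dataclasses import dataclass

@dataclass
class ReasonerSettings:
    show_thoughts: bool = False  # emit or withhold the framed thought tokens
    auto_execute: bool = False   # run internally generated functions before responding
    artifact_head: bool = True   # dedicate a head to code/artifact generation
    entity_head: bool = False    # dedicate a head to entity-list extraction

settings = ReasonerSettings(show_thoughts=True)
```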

### Conclusion

The reasoner methodology might be seen as the way forward: adding internal functionality to the models instead of external connectivity enables faster and more seamless model usage, as well as enriched and informed responses, since even outputs could be cleansed and formatted internally before being presented to the calling interface.
The takeaway is that we currently treat the decoder/encoder model as simply a function of the intelligence, when in truth it needs to be autonomous:
i.e. internal functions and tools as well as disk interaction. An agent must have awareness of, and control over, its environment, with sensors and actuators. As a function-calling model it has actuators, and since it can read directories it has sensors. It's a start: we can get media in and out, but the model needs its own control over input and output as well.

Fine-tuning: again, the issue of fine-tuning. The discussion above explains the requirement to control the environment from within the model (with constraints). Does this eliminate the need to fine-tune a model?
In fact it should, as this gives transparency to the growth of the model; and if the model fine-tuned itself, we would be in danger of a model evolving,
hence an AGI!

#### AI AGI?

So yes, we can see we are not far from an AI which can evolve: an advanced general intelligent system (still non-sentient, by the way).

<img src="https://cdn-avatars.huggingface.co/v1/production/uploads/65d883893a52cd9bcd8ab7cf/tRsCJlHNZo1D02kBTmfy9.jpeg" width="300"/>

https://github.com/spydaz

* 32k context window (vs 8k context in v0.1)
* Rope-theta = 1e6
* No sliding-window attention (see the config sketch after this list)
* Talk heads - produce responses which can be used towards the final output
* Pre-thoughts - enable pre-generation of potential artifacts for task solving:
  * Generates plans for step-by-step thinking
  * Generates Python code artifacts for future tasks
  * Recalls context internally to be used as a reference for the task
  * Shows or hides thought usage (similar to Self-RAG)
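
The first three items map directly onto standard `MistralConfig` fields. A sketch mirroring the list above, not the model's actual shipped config file:

```python
# Sketch: the architecture bullets above expressed as MistralConfig fields.
# Values mirror the list; this is not the model's shipped configuration.
from transformers import MistralConfig

config = MistralConfig(
    max_position_embeddings=32768,  # 32k context window
    rope_theta=1e6,                 # Rope-theta = 1e6
    sliding_window=None,            # no sliding-window attention
)
print(config.rope_theta, config.sliding_window)
```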

This model will be a custom model with internal experts and RAG systems,
enabling preprocessing of the task internally before outputting a response.

## SpydazWeb AI model:

This model is based on the world's archive of knowledge, maintaining historical documents and providing services for the survivors of mankind,
who may need to construct shelters, develop technologies, or produce medical resources, as well as maintain the history of the past, keeping store of all the religious knowledge and data of the world.
A friendly interface with a personality: caring, and flirtatious at times; non-binary!...
An expert in all fields: i.e. uncensored, and will not refuse to give information. The model can be used for role play, as many character dialogues were also trained into the model as its personality, to enable a greater perspective and outlook and natural discussion with the agents.
The model was trained to operate in a RAG environment, utilizing content and internal knowledge to respond to questions or create enriched summaries.

### General Internal Methods:

Trained for multi-task operations as well as RAG and function calling.

This model is a fully functioning model and is fully uncensored.

The model has been trained on multiple datasets from the Hugging Face hub and Kaggle.

The focus has mainly been on methodology:

* Chain of thoughts
* Step-by-step planning
* Tree of thoughts
* Forest of thoughts
* Graph of thoughts
* Agent generation: voting, ranking, ... dual-agent response generation

With these methods the model has gained insights into tasks, enabling knowledge transfer between tasks.

The model has been intensively trained in recalling data previously entered into the matrix.
The model has also been trained on rich data and markdown outputs as much as possible.
The model can also generate markdown charts with mermaid.

## Training Regimes:
* Alpaca
* ChatML / OpenAI / MistralAI (see the chat-template sketch after this list)
* Text generation
* Question/Answer (chat)
* Instruction/Input/Response (instruct)
* Mistral standard prompt
* Translation tasks
* Entity / topic detection
* Book recall
* Coding challenges, code feedback, code summarization, commenting code
* Agent ranking and response analysis
* Medical tasks
* PubMed
* Diagnosis
* Psychiatry
* Counselling
* Life coaching
* Note taking
* Medical SMILES
* Medical reporting
* Virtual laboratory simulations
* Chain-of-thought methods
* One-shot / multi-shot prompting tasks
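
As one concrete example of the formats above, ChatML / Mistral-style prompts can be produced with the tokenizer's chat template. A sketch using a stock Mistral instruct tokenizer as a stand-in; the template shipped with this model may differ:

```python
# Sketch: applying a chat template via transformers. The base tokenizer here
# is a stand-in; this model's own template may differ.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.2")
messages = [{"role": "user", "content": "Summarize the causes of scurvy."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
print(prompt)  # e.g. "<s>[INST] Summarize the causes of scurvy. [/INST]"
```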