---
license: apache-2.0
datasets:
- cxllin/medinstruct
language:
- en
metrics:
- accuracy
library_name: transformers
pipeline_tag: question-answering
tags:
- medical
---

# cxllin/Llama2-7b-med-v1

## Model Details

### Description

The **cxllin/Llama2-7b-med-v1** model, derived from the Llama 2 7B model, is intended to specialize in natural language processing tasks within the medical domain.

#### Development Details

- **Developer**: Collin Heenan
- **Model Architecture**: Transformer
- **Base Model**: [Llama-2-7b](https://huggingface.co/NousResearch/Nous-Hermes-llama-2-7b)
- **Primary Language**: English
- **License**: Apache 2.0

### Model Source Links

- **Repository**: Not specified
- **Paper**: [Jin, Di, et al. "What Disease does this Patient Have?..."](https://github.com/jind11/MedQA)

### Direct Applications

The model is expected to be applicable to NLP tasks within the medical domain, such as:

- Medical text generation or summarization.
- Question answering on medical topics.

### Downstream Applications

Potential downstream applications include:

- Healthcare chatbot development.
- Information extraction from medical documentation.

### Out-of-Scope Uses

- Rendering definitive medical diagnoses or advice.
- Use in critical healthcare systems without stringent validation.
- Use in any high-stakes or legal context without thorough expert validation.

## Bias, Risks, and Limitations

- **Biases**: The model may perpetuate biases present in the training data, which can affect the neutrality of its outputs.
- **Risks**: The model may produce inaccurate or misleading medical information.
- **Limitations**: Coverage of highly specialized or novel medical topics may be weak.

### Recommendations for Use

Users are urged to:

- Confirm outputs via expert medical review, especially in professional contexts.
- Use the model judiciously, adhering to pertinent legal and ethical guidelines.
- Maintain transparency with end-users regarding the model’s capabilities and limitations.

## Getting Started with the Model

Details regarding model deployment and interaction remain to be provided.

### Training Dataset

- **Dataset Source**: [cxllin/medinstruct](https://huggingface.co/datasets/cxllin/medinstruct)
- **Size**: 10.2k rows
- **Scope**: Medical exam-related question-answering data.

#### Preprocessing Steps

Details regarding data cleaning, tokenization, and special term handling during training are not specified.

---

@article{jin2020disease,
  title={What Disease does this Patient Have? A Large-scale Open Domain Question Answering Dataset from Medical Exams},
  author={Jin, Di and Pan, Eileen and Oufattole, Nassim and Weng, Wei-Hung and Fang, Hanyi and Szolovits, Peter},
  journal={arXiv preprint arXiv:2009.13081},
  year={2020}
}
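
For the Getting Started section, which does not yet give deployment details, the following is a minimal sketch and not the author's documented usage. It assumes the checkpoint loads through the standard `transformers` causal-LM classes (suggested by `library_name: transformers` and the Llama 2 base, but not confirmed by this card). The `build_prompt` template and the `generate_answer` helper are hypothetical names introduced for illustration; the prompt format used during fine-tuning is not specified.

```python
"""Hypothetical loading sketch for cxllin/Llama2-7b-med-v1 (not from the model card)."""

MODEL_ID = "cxllin/Llama2-7b-med-v1"  # repository name taken from this card


def build_prompt(question: str) -> str:
    # Assumed plain question/answer template; the card does not document
    # the prompt format used during fine-tuning.
    return f"Question: {question.strip()}\nAnswer:"


def generate_answer(question: str, max_new_tokens: int = 128) -> str:
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Assumption: the checkpoint is compatible with the causal-LM auto classes.
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

    inputs = tokenizer(build_prompt(question), return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Decode only the newly generated tokens, dropping the echoed prompt.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate_answer("What are common causes of anemia?")` downloads the model weights on first use, and any output should go through the expert review described under Recommendations for Use.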