LLM

Uses

  • Autocomplete
  • Chatbots
  • Semi-autonomous agents: e.g. asking for the list of problems encountered by members in the CRM, classified into categories and counted.
  • Asking it questions about the most frequent errors, what causes them, ...

Basics

https://chat.lmsys.org/

llm

To read

To host or to use an API?

https://ai.plainenglish.io/is-open-ai-cheaper-than-hosting-your-own-llm-lets-find-out-30e861e138f1

Compute services

• Vast.ai: https://cloud.vast.ai/create/
• Lambda Labs: https://lambdalabs.com/

Fine-tuning

PEFT (Parameter-Efficient Fine-Tuning): adapts pre-trained language models to a wide array of downstream applications without changing all of the model's parameters, saving computational resources while retaining the good properties of more generalist models and improving performance on your specific task.
DPO (Direct Preference Optimization):

Mistral-7B: A Step-by-Step guide on how to finetune a Large Language Model into a Medical Chat Doctor using Huggingface🤗, PEFT and Low rank adaptation
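The core idea behind LoRA (the PEFT method used in the guide above) can be sketched in a few lines: the pre-trained weight matrix W stays frozen, and only two small low-rank matrices A and B are trained; at inference the merged weights are W + (alpha / r) · B·A. A minimal pure-Python sketch with toy dimensions, not tied to any real model:

```python
# Toy LoRA sketch (pure Python, illustrative dimensions only).
def matmul(X, Y):
    rows, inner, cols = len(X), len(Y), len(Y[0])
    return [[sum(X[i][k] * Y[k][j] for k in range(inner)) for j in range(cols)]
            for i in range(rows)]

def lora_merge(W, A, B, alpha):
    """Return W + (alpha / r) * B @ A, the merged weights used at inference."""
    r = len(A)                       # LoRA rank
    delta = matmul(B, A)             # low-rank update, same shape as W
    scale = alpha / r
    return [[W[i][j] + scale * delta[i][j] for j in range(len(W[0]))]
            for i in range(len(W))]

d_out, d_in, r = 64, 64, 4
W = [[0.0] * d_in for _ in range(d_out)]   # frozen pre-trained weights
A = [[1.0] * d_in for _ in range(r)]       # trainable, r x d_in
B = [[1.0] * r for _ in range(d_out)]      # trainable, d_out x r
merged = lora_merge(W, A, B, alpha=4.0)

full_params = d_out * d_in                 # what a full fine-tune would update
lora_params = r * (d_in + d_out)           # what LoRA actually trains
```

Even at these toy dimensions the trainable-parameter count drops from 4096 to 512; at real model sizes (d in the thousands, r in the single digits) the saving is far larger.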

RAG (retrieval-augmented generation)

Basic RAG stages:

https://docs.llamaindex.ai/en/stable/getting_started/concepts.html

Advantages: no training (cheap), data stays up to date, shows the retrieved documents.
https://expertbeacon.com/retrieval-augmented-generation/
RAGxplorer: a tool to build Retrieval Augmented Generation (RAG) visualisations. https://github.com/gabrielchua/RAGxplorer
Mistral 7B: https://www.kaggle.com/code/lusfernandotorres/retrieval-augmented-generation-with-mistral-7b
Mixtral 8x7B on an A100: https://medium.com/@prakharsaxena11111/finetuning-mixtral-7bx8-6071b0ebf114
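The basic RAG stages (index, retrieve, augment the prompt, generate) can be sketched without any framework. In this sketch retrieval is naive word overlap rather than embeddings, and the final LLM call is left out (any of the Mistral notebooks above would supply it):

```python
# Minimal RAG sketch: index -> retrieve -> augment prompt -> (generate).
docs = [
    "Ollama runs large language models locally.",
    "RAG retrieves documents and adds them to the prompt.",
    "LoRA fine-tunes a model with low-rank adapters.",
]

def tokenize(text):
    return set(text.lower().replace(".", " ").replace("?", " ").split())

index = [tokenize(d) for d in docs]  # indexing stage (here: bag of words)

def retrieve(query, k=1):
    # Retrieval stage: rank documents by word overlap with the query.
    scores = [len(tokenize(query) & toks) for toks in index]
    ranked = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return [docs[i] for i in ranked[:k]]

def build_prompt(query):
    # Augmentation stage: prepend the retrieved context to the question.
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

prompt = build_prompt("How does RAG use retrieved documents?")
# The prompt (with retrieved context) would now be sent to the LLM.
```

A real pipeline replaces the word-overlap scoring with embedding similarity over a vector store, but the stage boundaries are the same.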

Fine-tuning vs RAG

1. https://symbl.ai/developers/blog/fine-tuning-vs-rag-an-opinion-and-comparative-analysis/
2. https://arxiv.org/abs/2401.08406
3. https://www.rungalileo.io/blog/optimizing-llm-performance-rag-vs-finetune-vs-both
4. https://deci.ai/blog/fine-tuning-peft-prompt-engineering-and-rag-which-one-is-right-for-you/

Evaluation

1. https://docs.parea.ai/blog/eval-metrics-for-llm-apps-in-prod
2. https://towardsdatascience.com/top-evaluation-metrics-for-rag-failures-acb27d2a5485

Techniques

1. LoRaX ("LoRA eXchange"): designed to serve a multitude of fine-tuned models on a single GPU.
2. Prompt format
3. Axolotl: helps with fine-tuning, with sensible default parameter settings.
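The LoRaX idea — one frozen base model shared by many tenants, with a small LoRA adapter selected per request — can be sketched with a toy linear layer. The adapter names and values below are invented for illustration:

```python
# Sketch of LoRA-exchange-style serving: one frozen base weight vector is
# shared; each request names an adapter whose low-rank delta is applied.
base_w = [0.5, 0.5, 0.5, 0.5]  # frozen base weights (shared by all tenants)

# Hypothetical per-tenant adapters: a rank-1 delta written as a scalar 'a'
# times a vector 'b' to keep the sketch one-dimensional.
adapters = {
    "medical": (2.0, [0.1, 0.0, 0.0, 0.0]),
    "legal":   (1.0, [0.0, 0.3, 0.0, 0.0]),
}

def predict(adapter_name, x):
    a, b = adapters[adapter_name]
    w = [wi + a * bi for wi, bi in zip(base_w, b)]  # base + adapter delta
    return sum(wi * xi for wi, xi in zip(w, x))     # linear layer output

x = [1.0, 1.0, 1.0, 1.0]
y_med = predict("medical", x)   # served with the medical adapter
y_leg = predict("legal", x)     # same base weights, different adapter
```

The point is the memory model: the large `base_w` is loaded once, and only the tiny per-adapter deltas differ between requests, which is why many fine-tunes fit on one GPU.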

Deployment

1. Dedicated server
   1. https://github.com/huggingface/text-generation-inference
   2. https://huggingface.co/docs/text-generation-inference/index
   3. https://docs.vllm.ai/en/latest/index.html
2. Serverless
   1. https://www.together.ai/
   2. https://www.anyscale.com/
   3. https://abacus.ai/
   4. https://deepinfra.com/
   5. https://huggingface.co/docs/inference-endpoints/index

LlamaIndex

https://akash-mathur.medium.com/advanced-rag-optimizing-retrieval-with-additional-context-metadata-using-llamaindex-aeaa32d7aa2f
https://akash-mathur.medium.com/advanced-rag-enhancing-retrieval-efficiency-through-evaluating-reranker-models-using-llamaindex-3f104f24607e
https://akash-mathur.medium.com/advanced-rag-query-augmentation-for-next-level-search-using-llamaindex-d362fed7ecc3

LangChain

https://python.langchain.com/docs/expression_language/cookbook/retrieval
https://www.kaggle.com/code/lusfernandotorres/retrieval-augmented-generation-with-mistral-7b/notebook
13 min video: https://www.youtube.com/watch?v=aywZrzNaKjs
https://kirenz.github.io/lab-langchain-rag/slide.html
https://medium.com/@thakermadhav/build-your-own-rag-with-mistral-7b-and-langchain-97d0c92fa146

Ollama

https://colab.research.google.com/github/embedchain/embedchain/blob/main/notebooks/ollama.ipynb#scrollTo=gyJ6ui2vhtMY
mine: https://colab.research.google.com/drive/1qpeJdiTlggVEFDCY1AlrqNK7zm3JUhjV#scrollTo=P8HIYBBtqJnq
https://colab.research.google.com/github/BerriAI/litellm/blob/main/cookbook/liteLLM_Ollama.ipynb#scrollTo=h90vBYmqg80B
**** https://colab.research.google.com/drive/1cRqpPYWJi4eZsf1-BnUoO68d30qrpEdB#scrollTo=OjDYfeUneXTs
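Outside the notebooks, Ollama also exposes a local HTTP API (by default on port 11434). A minimal sketch that builds a request for its `/api/generate` endpoint; the actual call is left commented out because it needs a running `ollama serve`, and the model name is just an example:

```python
import json

def build_generate_request(model, prompt):
    # Payload shape for Ollama's /api/generate endpoint; stream=False asks
    # for a single JSON response instead of a stream of chunks.
    return {
        "url": "http://localhost:11434/api/generate",
        "body": json.dumps({"model": model, "prompt": prompt, "stream": False}),
    }

req = build_generate_request("mistral", "Why is the sky blue?")

# To actually send it (requires a running `ollama serve`):
# import urllib.request
# r = urllib.request.urlopen(urllib.request.Request(
#     req["url"], data=req["body"].encode(),
#     headers={"Content-Type": "application/json"}))
# print(json.loads(r.read())["response"])
```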